JavaScript regex for image attribute

Keywords: regular expression, JavaScript, attribute, image      Updated: 2023-09-26

I am trying to get the image URL from the following pages:

http://www.amazon.co.uk/The-Classics-3xCD-Box-Set/dp/B000W3Q4X2/ref=sr_1_fkmr0_1/277-3029293-0823745?ie=UTF8&qid=1410727619&sr=8-1-fkmr0&keywords=Classic+Euphoria+3xCD+Box+Chicane+Hybrid+++P%26P

http://www.amazon.co.uk/Hinari-HIN172-Digital-Steam-Generator/dp/B00472M9S8/ref=sr_1_fkmr0_1/280-9070877-0582850?ie=UTF8&qid=1410725454&sr=8-1-fkmr0&keywords=Hinari+HIN172+2500+W+Digital+Steam+Generator+BOXED

The image can be found in the data-a-dynamic-image attribute of the img tag inside the div with the id imgTagWrapperId.

The final image should be returned as:

http://ecx.images-amazon.com/images/I/81Vi7ECR9hL.jpg

For example, _SX522_ should be removed from the original image URL http://ecx.images-amazon.com/images/I/81Vi7ECR9hL._SX522_.jpg

I only need one image returned from the source.
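The size-modifier cleanup described in the question can be sketched as a small string replacement. This is only a minimal illustration: `stripSizeModifier` is a hypothetical helper name, and it assumes the modifier always has the form `._..._` and sits directly before the file extension, as in the `_SX522_` example.

```javascript
// Hypothetical helper (not part of the original post): removes a size
// modifier such as "._SX522_" that sits right before the file extension.
function stripSizeModifier(url) {
  // "\._" starts the modifier, "[^/.]+" is its body, "_" closes it, and the
  // lookahead requires that only the extension (e.g. ".jpg") follows.
  return url.replace(/\._[^/.]+_(?=\.[a-z]+$)/i, '');
}

console.log(
  stripSizeModifier('http://ecx.images-amazon.com/images/I/81Vi7ECR9hL._SX522_.jpg')
);
```

A URL that carries no such modifier is returned unchanged, so the helper can be applied to every candidate URL without checking first.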

$html=file_get_contents('http://www.amazon.co.uk/The-Classics-3xCD-Box-Set/dp/B000W3Q4X2/ref=sr_1_fkmr0_1/277-3029293-0823745?ie=UTF8&qid=1410727619&sr=8-1-fkmr0&keywords=Classic+Euphoria+3xCD+Box+Chicane+Hybrid+++P%26P');
$html = preg_replace('/\s{2,}/', ' ', $html); // collapse every run of two or more whitespace characters into a single space
preg_match('/\{\&quot\;(https?\:\/\/[\S]+)\&quot\;/', $html, $matches); // the URL may be either http or https
print_r($matches);

Array
(
    [0] => {"http://ecx.images-amazon.com/images/I/41pi9o3crTL.jpg"
    [1] => http://ecx.images-amazon.com/images/I/41pi9o3crTL.jpg
)

The same regex works in JavaScript:

document.getElementById('imgTagWrapperId').outerHTML.match(/\{\&quot\;(https?\:\/\/[\S]+)\&quot\;/);
["{"http://ecx.images-amazon.com/images/I/41pi9o3crTL.jpg"", "http://ecx.images-amazon.com/images/I/41pi9o3crTL.jpg"]
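One caveat worth illustrating: a greedy match on non-whitespace characters can run past the first URL when the attribute lists several images without spaces between them. The sketch below is a variation, not the answer's exact pattern. It uses a lazy quantifier and works on an HTML string so it can run outside a browser. The function name `firstDynamicImageUrl` and the sample markup (including its dimensions and second URL) are assumptions made up for illustration.

```javascript
// Hypothetical helper: returns the first URL found after {&quot; in the
// serialized markup, or null when nothing matches.
function firstDynamicImageUrl(html) {
  // \S+? is lazy, so the capture stops at the first closing &quot;
  // instead of the last one in the attribute.
  const match = html.match(/\{&quot;(https?:\/\/\S+?)&quot;/);
  return match ? match[1] : null;
}

// Made-up sample resembling the serialized attribute; real pages may differ.
const sample =
  '<div id="imgTagWrapperId"><img data-a-dynamic-image="' +
  '{&quot;http://ecx.images-amazon.com/images/I/41pi9o3crTL.jpg&quot;:[300,300],' +
  '&quot;http://ecx.images-amazon.com/images/I/41pi9o3crTL._SX522_.jpg&quot;:[522,522]}"></div>';

console.log(firstDynamicImageUrl(sample));
```

In a browser, the same function could be given `document.getElementById('imgTagWrapperId').outerHTML`, assuming the page serializes the attribute with `&quot;` entities as in the answer above.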