Attempting to scrape data from an ASX website table

Keywords: data, website, ASX. Updated: 2023-09-26

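I've been trying to grab the current value of a stock from the ASX.com.au website. Specifically, I'm working on getting the current value for ASX itself (stock code ASX). It can be found here: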

http://www.asx.com.au/asx/markets/equityPrices.do?by=asxCodes&asxCodes=asx

The value is in the second td from the left; at the time of writing it is sitting at 30.410.

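I've had a play with some code, but haven't been able to get it to work.

Below is the sample code I've been experimenting with. If anyone can help me get this working, I'd greatly appreciate it!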

<?php
$data = file_get_contents('http://www.asx.com.au/asx/markets/equityPrices.do?by=asxCodes&asxCodes=asx');
$asx = explode('<th class="row" scope="row">ASX: </th>', $data);
$asx = substr($asx[1], 4, strpos($asx[1], '</td>') - 4);
?><div class="asxvalue"><?php echo $asx . "<br />\n";?></div>

Edit

Updated code:

<?php
$data = file_get_contents('http://www.asx.com.au/asx/research/companyInfo.do?by=asxCode&asxCode=DTL');
preg_match('/<td class="last">([^<]*?)<\/td>/i',$data,$matches);
$valueYouWant = $matches[1];
?><div class="data"><?php echo $valueYouWant ?></div>

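Everyone will rightly tell you that you shouldn't parse HTML with regex and should use an HTML parser instead (such as simple_dom), but for your specific problem you can do this:

preg_match('/<td class="last">([^<]*?)<\/td>/i',$data,$matches);
$valueYouWant = $matches[1];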
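
One caveat with the snippet above: if the page layout changes, preg_match returns 0 and $matches[1] is undefined. Here is a minimal sketch of a guarded version. The function name extractLastPrice and the sample markup are made up for illustration and are not taken from the real ASX page.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the text of
// the first <td class="last"> cell, or null when the pattern does not match.
function extractLastPrice(string $html): ?string
{
    // '#' is used as the delimiter so the '/' in '</td>' needs no escaping.
    if (preg_match('#<td class="last">([^<]*?)</td>#i', $html, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

// Sample markup invented for illustration; the real ASX page may differ.
$sample = '<tr><th>ASX</th><td class="last"> 30.410 </td></tr>';
$price = extractLastPrice($sample);
echo $price === null ? "price not found\n" : $price . "\n";
```

In real use you would pass in the result of file_get_contents, after checking that it did not return false.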

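To find the date and the last value on the other page, you can use the following. I really do recommend using Simple_Dom for things like this in future, but until you are comfortable with it, this will work for now: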

$data = file_get_contents('http://www.asx.com.au/asx/research/companyInfo.do?by=asxCode&asxCode=DTL');
preg_match('/id="closing-prices".*?<strong>(.*?)<\/strong>.*?<td class="last">(.*?)<\/td>/s',$data,$matches);
$date = $matches[1];
$lastValue = $matches[2];

I have tested this and it works. To make it more robust I'd suggest using other tools, but this should get you started. Good luck!
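
For anyone who wants to try the parser route, here is a rough sketch using PHP's built-in DOMDocument and DOMXPath rather than Simple_Dom (a swapped-in technique, not what the code above uses). The function name lastPriceFromDom and the sample table are assumptions for illustration. The only details taken from the regex above are the id "closing-prices" and the class "last".

```php
<?php
// Hypothetical helper: finds the first <td class="last"> under the element
// whose id is "closing-prices" and returns its text, or null if absent.
function lastPriceFromDom(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Note: @class="last" matches only when the class attribute is exactly "last".
    $cells = $xpath->query('//*[@id="closing-prices"]//td[@class="last"]');
    if ($cells === false || $cells->length === 0) {
        return null;
    }
    return trim($cells->item(0)->textContent);
}

// Sample markup invented for illustration; the real ASX page may differ.
$sample = '<table id="closing-prices"><tr><th><strong>01/01/2000</strong></th>'
        . '<td class="last">30.410</td></tr></table>';
echo lastPriceFromDom($sample), "\n";
```

Unlike the regex, this does not depend on the exact order of the tags, only on the id and class still being present.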

Thank you - I was able to use this code in a WordPress PHP code widget, and it works a treat for the ASX share price:

<?php
$data = file_get_contents('http://www.asx.com.au/asx/research/companyInfo.do?by=asxCode&asxCode=asx');
preg_match('/id="closing-prices".*?<strong>(.*?)<\/strong>.*?<td class="last">(.*?)<\/td>/s',$data,$matches);
$date = $matches[1];
$lastValue = $matches[2];
?><div class="data">$<?php echo $lastValue ?></div>

I thought I'd put the complete solution together above in case it's useful to someone.

Many thanks for answering this question, hackmartist :)