php - Scrape HTML table data and create XML or JSON doc
I need to scrape a data table from a website and create an XML or JSON document from it, to be used by an app, but I am having a problem getting the data shown below.
The table looks like this:
<table border="0" cellpadding="3" cellspacing="0" bgcolor="#ddeeff" width="100%">
<tr>
  <td width="20%" ><font face="verdana, arial" size="1">src</a></td></font>
  <td width="58%" ><font face="verdana, arial" size="1"><a href="http://example.com/this/news?id=1&by=today" onmouseover="a('open bulletin');return true" onmouseout="b()">welcome</font></a></td>
  <td width="17%" align="center"><font face="verdana, arial" size="1">event</td></font>
</tr>
<tr>
  <td width="20%" ><font face="verdana, arial" size="1">fmd</a></td></font>
  <td width="58%" ><font face="verdana, arial" size="1"><a href="http://example.com/this/news?id=2&by=today" onmouseover="a('open bulletin');return true" onmouseout="b()">another news</font></a></td>
  <td width="17%" align="center"><font face="verdana, arial" size="1">updates</td></font>
</tr>
</td>
I want to create an XML feed or JSON that looks like this:

<bulletins>
  <title>welcome</title>
  <id>1</id>
  <type>news</type>
</bulletins>
<bulletins>
  <title>another news</title>
  <id>2</id>
  <type>updates</type>
</bulletins>

Here is my current code:
<?php
$body = explode('<table border="0" cellpadding="3" cellspacing="0" bgcolor="#ddeeff" width="100%">', $html);
$xml = simplexml_load_string("<?xml version='1.0' encoding='utf-8'?><xml />");
$rows = array();
foreach (array_slice(explode('<tr>', end($body)), 1) as $row) {
    preg_match('#<a.*?href="(.*?)".*?>(.*?)</a>#i', $row, $title);
    preg_match('/<a.*href="(.*)".*>(.*)<\/a>/iu', $row, $id);
    // preg_match('/type">([^<]+)<\/td>/', $row, $type);
    $node = $xml->addChild('bulletins');
    $node->addChild('title', $title[1]);
    $node->addChild('id', $id[1]);
    // $node->addChild('type', $due[1]);
}
header('Content-Type: text/xml');
echo $xml->asXML();
?>

But the problem is that I get this instead:
<xml>
  <bulletins>
    <title>http://example.com/this/news?id=1</title>
    <id>http://example.com/this/news?id=1</id>
  </bulletins>
  <bulletins>
    <title>http://example.com/this/news?id=2</title>
    <id>http://example.com/this/news?id=2</id>
  </bulletins>
</xml>
Here's a quick example to get you started, using the DOM functions:
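$dom = new DOMDocument();
@$dom->loadHTMLFile($url); // $url is the page to scrape; @ hides warnings about the malformed markup
$xpath = new DOMXPath($dom);

$xml = new DOMDocument();
// Well-formed XML has a single root element, so wrap the items in <bulletins>
$root = $xml->appendChild($xml->createElement("bulletins"));

foreach ($xpath->query('//table/tr') as $tr) {
    $bulletin = $root->appendChild($xml->createElement("bulletin"));

    // Title: text of the link in the second cell
    $title = $xpath->query('.//td[2]//a', $tr)->item(0)->nodeValue;
    $bulletin->appendChild($xml->createElement("title", $title));

    // Type: text of the <font> element in the third cell
    $type = $xpath->query('.//td[3]/font', $tr)->item(0)->nodeValue;
    $bulletin->appendChild($xml->createElement("type", $type));
}

echo $xml->saveXML();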
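
The example above leaves out the id and the JSON output asked for in the question. Here is a minimal sketch of one way to add them, not a definitive implementation. The function names scrapeBulletins and bulletinsToXml are made up for illustration, the inline $html is a tidied-up stand-in for the real page (with & written as &amp;), and it assumes the id is always carried in the id parameter of the link's query string.

```php
<?php
// Stand-in markup modelled on the table in the question (an assumption:
// the real page would be fetched and passed in instead).
$html = <<<HTML
<table>
<tr>
  <td><font>src</font></td>
  <td><font><a href="http://example.com/this/news?id=1&amp;by=today">welcome</a></font></td>
  <td><font>event</font></td>
</tr>
<tr>
  <td><font>fmd</font></td>
  <td><font><a href="http://example.com/this/news?id=2&amp;by=today">another news</a></font></td>
  <td><font>updates</font></td>
</tr>
</table>
HTML;

// Hypothetical helper: returns one array (title, id, type) per table row.
function scrapeBulletins(string $html): array
{
    libxml_use_internal_errors(true); // collect parse warnings instead of printing them
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $bulletins = [];
    foreach ($xpath->query('//table//tr') as $tr) {
        $link = $xpath->query('td[2]//a', $tr)->item(0);
        $typeCell = $xpath->query('td[3]', $tr)->item(0);
        if ($link === null || $typeCell === null) {
            continue; // skip rows without a link or a third cell
        }
        // Read the "id" parameter from the link's query string.
        $query = (string) parse_url($link->getAttribute('href'), PHP_URL_QUERY);
        parse_str($query, $params);
        $bulletins[] = [
            'title' => trim($link->textContent),
            'id'    => (string) ($params['id'] ?? ''),
            'type'  => trim($typeCell->textContent),
        ];
    }
    return $bulletins;
}

// Hypothetical helper: serialises the rows under a single <bulletins> root.
function bulletinsToXml(array $bulletins): string
{
    $xml = new DOMDocument('1.0', 'utf-8');
    $xml->formatOutput = true;
    $root = $xml->appendChild($xml->createElement('bulletins'));
    foreach ($bulletins as $bulletin) {
        $node = $root->appendChild($xml->createElement('bulletin'));
        foreach ($bulletin as $name => $value) {
            $child = $node->appendChild($xml->createElement($name));
            // Text nodes are escaped on output, so & and < in a title are safe.
            $child->appendChild($xml->createTextNode($value));
        }
    }
    return $xml->saveXML();
}

$bulletins = scrapeBulletins($html);
echo bulletinsToXml($bulletins);
echo json_encode($bulletins, JSON_PRETTY_PRINT), "\n";
```

Collecting the rows into a plain array first means the same data can feed either output format. Note that the type is read from the third cell, so the first row comes out as event rather than the news shown in the question's desired output.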