php - Scrape HTML table data and create XML or JSON doc


I need to scrape a data table from a website and create an XML or JSON document to be used by an app, and I have a problem getting the data below.

The table looks like this:

<table border="0" cellpadding="3" cellspacing="0" bgcolor="#ddeeff" width="100%">
<tr>
    <td width="20%" ><font face="verdana, arial" size="1">src</a></td></font>
    <td width="58%" ><font face="verdana, arial" size="1"><a href="http://example.com/this/news?id=1&by=today" onmouseover="a('open bulletin');return true" onmouseout="b()">welcome</font></a></td>
    <td width="17%" align="center"><font face="verdana, arial" size="1">event</td></font>
</tr>
<tr>
    <td width="20%" ><font face="verdana, arial" size="1">fmd</a></td></font>
    <td width="58%" ><font face="verdana, arial" size="1"><a href="http://example.com/this/news?id=2&by=today" onmouseover="a('open bulletin');return true" onmouseout="b()">another news</font></a></td>
    <td width="17%" align="center"><font face="verdana, arial" size="1">updates</td></font>
</tr>
</td>

And I want to create an XML feed or JSON that looks like this:

<bulletins>
    <title>welcome</title>
    <id>1</id>
    <type>news</type>
</bulletins>
<bulletins>
    <title>another news</title>
    <id>2</id>
    <type>updates</type>
</bulletins>

Here is my current code:

<?php
$body = explode('<table border="0" cellpadding="3" cellspacing="0" bgcolor="#ddeeff" width="100%">', $html);

$xml = simplexml_load_string("<?xml version='1.0' encoding='utf-8'?><xml />");

$rows = array();
foreach (array_slice(explode('<tr>', end($body)), 1) as $row) {

    preg_match('#<a.*?href="(.*?)".*?>(.*?)</a>#i', $row, $title);
    preg_match('/<a.*href="(.*)".*>(.*)<\/a>/iu', $row, $id);
    // preg_match('/type">([^<]+)<\/td>/', $row, $type);

    $node = $xml->addChild('bulletins');

    $node->addChild('title', $title[1]);
    $node->addChild('id', $id[1]);
    // $node->addChild('type', $due[1]);
}

header('Content-Type: text/xml');
echo $xml->asXML();
?>

But the problem is that I get this:

<xml>
    <bulletins>
        <title>http://example.com/this/news?id=1</title>
        <id>http://example.com/this/news?id=1</id>
    </bulletins>
    <bulletins>
        <title>http://example.com/this/news?id=2</title>
        <id>http://example.com/this/news?id=2</id>
    </bulletins>
</xml>

Here's a quick example to get you started using the DOM functions:

$dom = new DOMDocument();
@$dom->loadHTMLFile($url); // $url: address of the page to scrape (placeholder)
$xpath = new DOMXPath($dom);

$xml = new DOMDocument();
foreach ($xpath->query('//table/tr') as $tr) {
    $bulletin = $xml->appendChild($xml->createElement("bulletin"));
    // The link text in the second cell is the title
    $title = $xpath->query('.//td[2]//a', $tr)->item(0)->nodeValue;
    $bulletin->appendChild($xml->createElement("title", $title));
    // The third cell holds the type
    $type = $xpath->query('.//td[3]/font', $tr)->item(0)->nodeValue;
    $bulletin->appendChild($xml->createElement("type", $type));
}
echo $xml->saveXML();
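
As for why the original attempt printed the URL twice: in both regex patterns, capture group 1 begins at the href value, and the link text only arrives in group 2. The DOM approach avoids that guesswork. Below is a rough sketch of how it could be extended to also pull the id out of the link and emit JSON instead of XML. The function name extractBulletins, the inline sample markup, and the choice to read the id from the href query string with parse_url and parse_str are assumptions made for illustration, not something the question confirms.

```php
<?php
// Sample markup standing in for the scraped page (assumed shape, simplified from the question).
$html = <<<'HTML'
<table>
<tr>
  <td><font>src</font></td>
  <td><font><a href="http://example.com/this/news?id=1&by=today">welcome</a></font></td>
  <td><font>event</font></td>
</tr>
<tr>
  <td><font>fmd</font></td>
  <td><font><a href="http://example.com/this/news?id=2&by=today">another news</a></font></td>
  <td><font>updates</font></td>
</tr>
</table>
HTML;

/**
 * Hypothetical helper: returns one ['title', 'id', 'type'] array per table row.
 */
function extractBulletins(string $html): array
{
    $dom = new DOMDocument();
    // Scraped markup is often malformed; buffer libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $bulletins = [];

    foreach ($xpath->query('//table//tr') as $tr) {
        $link = $xpath->query('.//td[2]//a', $tr)->item(0);
        $typeCell = $xpath->query('.//td[3]', $tr)->item(0);
        if ($link === null || $typeCell === null) {
            continue; // skip rows that lack a link or a type cell
        }

        // Assumption: the id is the "id" parameter of the link's query string.
        $query = (string) parse_url($link->getAttribute('href'), PHP_URL_QUERY);
        parse_str($query, $params);

        $bulletins[] = [
            'title' => trim($link->textContent),
            'id'    => $params['id'] ?? null,
            'type'  => trim($typeCell->textContent),
        ];
    }

    return $bulletins;
}

$bulletins = extractBulletins($html);
echo json_encode($bulletins, JSON_PRETTY_PRINT), PHP_EOL;
```

The same array could just as well be looped over to build the XML feed with createElement, as in the example above. Collecting the rows into a plain array first keeps the scraping step separate from the output format.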
