php - Parsing with Goutte - how to target an element after one containing a text string


I am using https://github.com/friendsofphp/goutte to parse and extract data, and it has been going well...

But I have stumbled upon an unfriendly spot:

<tr>
    <th>website:</th>
    <td>
        <a href="http://www.adres.com" target="_blank">http://www.adres.com</a>
    </td>
</tr>

I am trying to get the text of the td element that follows a th element containing a specific string, website: in this case.

My PHP looks like this:

$client3 = new \Goutte\Client();
$crawler3 = $client3->request('GET', $supplierurl . 'contactinfo.html');

// Check whether a th containing the label is directly followed by a td with a link.
if ($crawler3->filter('th:contains("+website+") + td a')->count() > 0) {
    // Read the text of the td that follows the matching th.
    $parsed_company_website_url = $crawler3->filter('th:contains("website:") + td')->text();
} else {
    $parsed_company_website_url = null;
}
return $parsed_company_website_url;

Problem

My code doesn't work.

Attempts
  • I tried using both "+website+" and "website:" as the search string.
  • I tried targeting the cell by counting the rows of the table, but each DB entry on the target site arranges its items differently, so there is no reliable pattern.

To do

Make the script extract the text from the a element.

It seems that contains() is a jQuery feature and not a standard CSS selector. With CSS, you may inspect an attribute value but not the text node inside the markup.

So, in your case, use an XPath selector, in particular the following-sibling axis (see https://stackoverflow.com/a/29380551/1997849).
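
As a rough illustration of the following-sibling idea, here is a minimal sketch that uses only PHP's built-in DOMDocument and DOMXPath classes, so it runs without Goutte. The helper name findValueAfterLabel is hypothetical and not part of any library, and the sample markup is the row from the question wrapped in a table element. Note that XPath's contains() is case-sensitive, so the label must match the page's actual casing.

```php
<?php
// Sample markup: the row from the question, wrapped in a <table>.
$html = <<<'HTML'
<table>
  <tr>
    <th>website:</th>
    <td>
      <a href="http://www.adres.com" target="_blank">http://www.adres.com</a>
    </td>
  </tr>
</table>
HTML;

/**
 * Hypothetical helper: returns the trimmed text of the first <td> that
 * follows a <th> whose text contains $label, or null if nothing matches.
 * Assumes $label contains no double quotes (it is embedded in the query).
 */
function findValueAfterLabel(string $html, string $label): ?string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world HTML is often not well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Find the th by its text, then step to its first following td sibling.
    $query = sprintf(
        '//th[contains(normalize-space(.), "%s")]/following-sibling::td[1]',
        $label
    );

    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return trim($nodes->item(0)->textContent);
}

$parsed_company_website_url = findValueAfterLabel($html, 'website:');
echo $parsed_company_website_url, PHP_EOL; // prints http://www.adres.com
```

With Goutte itself, the same XPath expression could presumably be passed to the crawler's filterXPath() method (provided by Symfony DomCrawler) in place of filter(), keeping the count() check from the question to handle pages where the row is missing.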

