node.js - Image URLs change while scraping in Node (works in browser console)


I'm using artoo.js for web scraping, but for some reason the scraped image URLs change when I work with cheerio in Node. For example, the original image URL is:

"https://images-na.ssl-images-amazon.com/images/m/mv5bnwu4nmy3mtmtmtbmmi00njfjltkwmmitywzhzwuwndg5m2exxkeyxkfqcgdeqxvynduyotg3njg@._v1_sx300.jpg"

However, after scraping, the URL turns into this:

"http://ia.media-imdb.com/images/g/01/imdb/images/nopicture/156x231/tv-3797070466._cb522736147_.png@._v1_sx300.jpg"

If I scrape in the Chrome browser console using the artoo.js bookmarklet, the URL stays the same as the original. Why does it change when I scrape in Node? Any suggestions?

Update: I think I found the issue, but not a solution. It seems the scraper method runs before the correct images have loaded on the page, so the URL I get is that of a placeholder image. How can I wait until the entire page loads?

It may be caused by JS code on the page. If you are using request + cheerio to scrape the page, then when you make the request in Node the page's JS code does nothing (it's not interpreted). You are getting the original URL before a library or some other piece of code changes it. Try looking at the source code of the page in your browser with Ctrl+U. If the image URL there is "http://ia.media-imdb.com/images/g/01/imdb/images/nopicture/156x231/tv-3797070466._cb522736147_.png@._v1_sx300.jpg", you know that a piece of code is doing the change.
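The same check can be done from Node instead of Ctrl+U. Below is a minimal sketch, not a definitive implementation: it uses only Node's standard library, pulls the img src values out of the raw HTML with a regex (a stand-in for a real parser such as cheerio), and reports the ones that contain "/nopicture/", the marker seen in the placeholder URL quoted above. The names extractImageSrcs, findPlaceholders and checkPage are hypothetical, and checkPage assumes Node 18+ for the global fetch.

```javascript
// Marker taken from the placeholder URL quoted in the question.
const PLACEHOLDER_MARKER = '/nopicture/';

// Collect the src attribute of every <img> tag in a raw HTML string.
// A regex is good enough for a quick diagnostic; it is not a full HTML parser.
function extractImageSrcs(html) {
  const srcs = [];
  // "\ssrc" requires whitespace before "src", so "data-src" is not matched.
  const imgTag = /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/gi;
  let match;
  while ((match = imgTag.exec(html)) !== null) {
    srcs.push(match[2]);
  }
  return srcs;
}

// Keep only the image URLs that look like the placeholder.
function findPlaceholders(html) {
  return extractImageSrcs(html).filter((src) => src.includes(PLACEHOLDER_MARKER));
}

// Hypothetical helper: download the page without running its scripts
// and list the placeholder images found in the raw markup.
async function checkPage(pageUrl) {
  const response = await fetch(pageUrl);
  const html = await response.text();
  return findPlaceholders(html);
}
```

If checkPage(pageUrl) returns a non-empty list, the placeholder is already in the HTML that the server sends, and the real image URL is filled in later by a script that a plain HTTP request never runs.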

Edit

If you absolutely need to run the JS to obtain the URL, you should use PhantomJS. It's a headless browser, so the images will load. You can use it directly from Node.js, or, if you want a simpler way, go with CasperJS. I assume you're not used to scraping complicated web apps; if that's the case, go with CasperJS. It's easy and does the job. It's not as fast as using request + cheerio, but it works, and you can put the code on a server to run.
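Whichever headless browser is used, the underlying idea is to keep reading the image URL until it stops being the placeholder. The following is a rough sketch of that polling step only, assuming a hypothetical async getter that returns the current src from the loaded page; waitForValue and its options are illustrative names, not part of PhantomJS or CasperJS, and the wiring to a real browser is not shown.

```javascript
// Poll an async getter until its value satisfies isReady,
// or reject once the next attempt would fall past the timeout.
async function waitForValue(getValue, isReady, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const value = await getValue();
    if (isReady(value)) {
      return value;
    }
    if (Date.now() + intervalMs > deadline) {
      throw new Error('Timed out waiting for value; last seen: ' + value);
    }
    // Sleep before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A call might look like waitForValue(getPosterSrc, (src) => !src.includes('/nopicture/')), where getPosterSrc is whatever function reads the img src through the headless browser. CasperJS has its own waitFor step that covers the same need inside a Casper script.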

