Download picture from a java webpage with login using R

September 15, 2010

For the record: I have never scraped a webpage before, and I don't have any HTML knowledge.

I need to download thousands of pictures from URLs that require a login. Six months ago I was able to download a single image simply with:

    url <- "https://customer.roamler.com/image/71138bc0-0ff3-405d-9406-5e643b018f7f"
    download.file(url = url, destfile = "myimage.jpg", mode = "wb", method = "libcurl")

I don't know if my computer had a different configuration at that time, but the fact is that the code above doesn't work anymore. When I use it, myimage.jpg is not a picture; it's the page's source code (saved as a .jpg). I thought, "maybe I need to scrape the page in order to get the image". That's why I tried rvest forms and sessions in order to log in.
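One quick way to confirm that symptom is to look at the first bytes of the saved file: JPEG files start with the bytes FF D8 FF, whereas an HTML page starts with text such as `<!DOCTYPE` or `<html`. A minimal sketch, where `is_jpeg()` is a hypothetical helper name and not part of any package:

```r
# Hypothetical helper: TRUE if the file starts with the JPEG signature FF D8 FF.
is_jpeg <- function(path) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = 3L)
  identical(header, as.raw(c(0xFF, 0xD8, 0xFF)))
}

# Two throwaway files for illustration: one with JPEG-like bytes, one with HTML.
fake_jpg <- tempfile(fileext = ".jpg")
writeBin(as.raw(c(0xFF, 0xD8, 0xFF, 0xE0)), fake_jpg)

fake_html <- tempfile(fileext = ".jpg")
writeLines("<!DOCTYPE html><html></html>", fake_html)

is_jpeg(fake_jpg)   # the file with the JPEG signature
is_jpeg(fake_html)  # the HTML page saved under a .jpg name
```

Running the same check over a batch of downloads would flag any file that is really a login or loading page saved under an image name.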

    require("rvest")
    session <- html_session(url)
    form <- html_form(session)[[1]]

But I got this error:

    > form <- html_form(session)[[1]]
    Error in html_form(session)[[1]] : subscript out of bounds
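"Subscript out of bounds" is the error R raises when `[[1]]` is applied to an empty list, which suggests that `html_form()` found no `<form>` elements in the HTML it received. A minimal sketch of guarding against that case, using a plain empty list as a stand-in for the result (`first_or_null()` is a hypothetical helper, not an rvest function):

```r
# Hypothetical helper: return the first element, or NULL if there is none,
# instead of failing with "subscript out of bounds".
first_or_null <- function(x) {
  if (length(x) == 0L) NULL else x[[1]]
}

forms <- list()                 # stand-in for an html_form() result with no forms
first_or_null(forms)            # no error, even though the list is empty
first_or_null(list("a", "b"))   # first element of a non-empty list
```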

And I realized that the session had a different URL from the one I had provided:

    > session
    <session> https://customer.roamler.com/
      Status: 200
      Type:   text/html; charset=utf-8
      Size:   10043

When I read the source code of the URL before logging in to my account, I don't find a login form in it. The funny thing is that when the login page has loaded and I inspect it (Ctrl+Shift+I on Windows + Chrome), I do find the login form (xpath='//*[@id="login-form"]/form'). My conclusion was that I needed to wait for the page to load before getting the login node and passing the credentials, so I tried RSelenium:

    require("RSelenium")
    driver <- rsDriver(browser = "phantomjs")
    remDr <- driver[["client"]]
    remDr$open()
    remDr$setImplicitWaitTimeout(milliseconds = 10000)
    remDr$navigate(url)

Now the URL seems to be kept correctly, but when I look at the page source it is still the loading page's source (the one without the login form):

    > remDr$getCurrentUrl()
    [[1]]
    [1] "https://customer.roamler.com/#/login?returnurl=%2fimage%2f71138bc0-0ff3-405d-9406-5e643b018f7f"

    > remDr$findElements(using = "xpath", '//*[@id="login-form"]/form')
    list()
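The empty `list()` would be consistent with the form being added by JavaScript some time after the initial page load. One approach that might be worth trying in that situation is to poll explicitly until a condition holds, rather than relying only on the implicit wait. A minimal sketch of such a polling loop in plain R; `wait_until()` is a hypothetical helper, and the commented usage at the end assumes an RSelenium client named `remDr`:

```r
# Hypothetical helper: call `check` repeatedly until it returns TRUE or the
# timeout (in seconds) elapses. Returns TRUE on success, FALSE on timeout.
wait_until <- function(check, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(check())) return(TRUE)
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(interval)
  }
}

# Stand-in condition that becomes TRUE on the third call.
calls <- 0
ready <- wait_until(function() {
  calls <<- calls + 1
  calls >= 3
}, timeout = 5, interval = 0.1)

# With an RSelenium client (assumed here to be named remDr), the check could be:
# function() length(remDr$findElements(using = "xpath",
#                                      '//*[@id="login-form"]/form')) > 0
```

Whether this is enough for this particular site is not certain; it only rules out the case where the element is looked up before it exists.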

Can anyone help me out? I don't know what to try next, and I feel I'm losing time on something that should be relatively easy (downloading a picture from a URL).
