Submit an R script to Spark via Livy on a remote standalone server
I have set up a standalone Ubuntu server with Spark 2.2.0, and it is up and running. My aim is to allow several users (clients) to connect to this server and develop locally (from their own computers) in RStudio some code that has to be executed on Spark.
So, I installed Livy on my server (it is also up and running), which allows me to connect to the server from RStudio with:

    config <- livy_config(username = "me", password = "***")
    sc <- spark_connect(master = "http://myserver:8998", method = "livy", config = config)

RStudio sends me a message telling me I'm connected.
From this, I have a few questions:
Can I develop in RStudio locally and send the processing to Spark (e.g. manage a dataframe and perform machine learning)? If yes, how? Do I have to use the sparklyr functions directly? Do I have to install a Spark instance running locally to be able to test my code before sending it to the Spark cluster on the remote server?
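A minimal sketch of what such a local session could look like, assuming the Livy connection shown earlier works and that the installed sparklyr version supports the formula interface of `ml_linear_regression` (older releases use different arguments). The table name `"iris"` and the choice of model are only illustrative.

```r
library(sparklyr)
library(dplyr)

# Assumption: same Livy endpoint and credentials as in the connection call above.
config <- livy_config(username = "me", password = "***")
sc <- spark_connect(master = "http://myserver:8998", method = "livy", config = config)

# Upload a small local data frame. sparklyr replaces the dots in column
# names with underscores (Sepal.Length becomes Sepal_Length).
iris_tbl <- copy_to(sc, iris, "iris", overwrite = TRUE)

# dplyr verbs are translated to Spark SQL and executed on the server;
# collect() brings only the aggregated result back to the local R session.
iris_tbl %>%
  group_by(Species) %>%
  summarise(mean_petal_length = mean(Petal_Length, na.rm = TRUE)) %>%
  collect()

# Fit a model with Spark MLlib on the server side.
fit <- ml_linear_regression(iris_tbl, Petal_Length ~ Petal_Width)
summary(fit)

spark_disconnect(sc)
```

In this sketch only the code and small results travel over the Livy connection; the data itself stays in Spark once it has been uploaded or read there.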
When I use the copy_to function with the iris dataframe, it takes approximately 1 minute. Can I conclude that the connection is too slow to consider developing locally and sending the processing to the server?
It is not possible to use RStudio inside the server directly (because we only have command-line access), and several persons develop at the same time. What is the best solution to develop?
Finally, I'm facing a simple issue: if the best solution is to develop our apps locally, then send them via SSH to the server and execute them directly on the server, how can I run them? I tried to archive a simple R script in a .jar file and run it with spark-submit, but I got a class not found error (no main program found). How can I do that?
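For reference, spark-submit accepts a `.R` file directly in place of a JAR, so no archive is needed. Below is a minimal sketch of such a script written with SparkR, the R package shipped with Spark. The file name `my_job.R`, the app name, and the master URL in the comment are assumptions, not taken from the setup described here.

```r
# my_job.R (hypothetical file name)
#
# Run on the server with something like (master URL is an assumption;
# 7077 is the default port of a standalone master):
#   $SPARK_HOME/bin/spark-submit --master spark://myserver:7077 my_job.R

library(SparkR)

# When launched through spark-submit, the master and other settings come
# from the command line, so none are hard-coded here.
sparkR.session(appName = "my_job")

# Turn a built-in R data frame into a Spark DataFrame.
df <- as.DataFrame(faithful)

# Count rows per waiting time on the cluster, most frequent first,
# and print the first few results.
counts <- summarize(groupBy(df, df$waiting), count = n(df$waiting))
head(arrange(counts, desc(counts$count)))

sparkR.session.stop()
```

A script like this uses the SparkR API rather than sparklyr, so code developed interactively with sparklyr would need to be adapted before being submitted this way.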
Many thanks in advance for your answers.