Submit R script to Spark via Livy on remote standalone server


I set up a standalone Ubuntu server with Spark 2.2.0, and it is up and running. My aim is to allow several users (clients) to connect to this server and develop locally (from their own computers) in RStudio, with the code being executed on Spark.

So I installed Livy on the server (it is also up and running), which allows me to connect to the server from RStudio:

library(sparklyr)
config <- livy_config(username = "me", password = "***")
sc <- spark_connect(master = "http://myserver:8998", method = "livy", config = config)

RStudio then shows a message telling me that I'm connected.

From this, I have a few questions:

  1. Can I develop locally in RStudio and send the processing to Spark (e.g. manipulate a data frame and perform machine learning)? If yes, how? Do I have to use the sparklyr functions directly? Do I have to install a Spark instance running locally to be able to test my code before sending it to the Spark cluster on the remote server?

  2. When I use the copy_to function with the iris data frame, it takes approximately 1 minute. Can I conclude that the connection is too slow to consider developing locally and sending the processing to the server?

  3. It is not possible to use RStudio directly on the server (because we only have command-line access), and several people will develop at the same time. What is the best solution for developing?
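To make question 1 concrete, here is a minimal sketch of the workflow it describes, assuming sparklyr and dplyr are installed on the client. The helpers `connection_args()` and `summarise_and_fit()` are hypothetical names introduced only for illustration. The "local" branch assumes a local Spark install (for example through `spark_install()`), and the Livy branch reuses the URL and credentials from the snippet above. This is one possible layout, not a confirmed answer to the question.

```r
library(sparklyr)
library(dplyr)

# Hypothetical helper: choose connection settings for local testing
# or for the remote Livy endpoint used earlier in this post.
connection_args <- function(target = c("local", "livy")) {
  target <- match.arg(target)
  if (target == "local") {
    # Assumes Spark is installed locally, e.g. spark_install(version = "2.2.0")
    list(master = "local")
  } else {
    list(
      master = "http://myserver:8998",
      method = "livy",
      config = livy_config(username = "me", password = "***")
    )
  }
}

# Hypothetical helper: the same code runs against either connection.
summarise_and_fit <- function(sc) {
  # copy_to() uploads the data frame; sparklyr renames Petal.Length to Petal_Length
  iris_tbl <- copy_to(sc, iris, "iris", overwrite = TRUE)

  # dplyr verbs are translated to Spark SQL and run on the cluster
  means <- iris_tbl %>%
    group_by(Species) %>%
    summarise(mean_petal_length = mean(Petal_Length, na.rm = TRUE)) %>%
    collect()

  # Fit a linear regression with Spark ML
  model <- ml_linear_regression(iris_tbl, Petal_Length ~ Petal_Width)

  list(means = means, model = model)
}

# Usage (not run here because it needs a Spark or Livy instance):
# sc <- do.call(spark_connect, connection_args("local"))
# result <- summarise_and_fit(sc)
# spark_disconnect(sc)
```

Switching the argument from "local" to "livy" would send the same processing to the remote server, so only the connection differs between testing and the real run.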

Finally, I'm facing a simple issue: if the best solution is to develop our apps locally, then send them via SSH to the server and execute them directly on the server, how can I run them? I tried to archive a simple R script into a .jar file and run it with spark_submit, but I got a class not found error (no main program found). How can I do this?
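For context on the last point: Spark's `spark-submit` launcher can take an `.R` file directly, with no .jar packaging, when the script uses SparkR. That is a different package from the sparklyr used above. A minimal sketch of such a script follows. The file name `iris_summary.R`, the application name, and the master URL in the comment are assumptions for illustration.

```r
# iris_summary.R (hypothetical file name)
# Possible invocation on the server, assuming the standalone master
# listens on the default port:
#   spark-submit --master spark://myserver:7077 iris_summary.R

library(SparkR)

# Start (or attach to) the Spark session created by spark-submit
sparkR.session(appName = "iris-summary")

# Dots in column names are replaced by underscores (Petal.Length -> Petal_Length)
df <- createDataFrame(iris)

# Mean petal length per species, computed by Spark
result <- summarize(groupBy(df, df$Species),
                    mean_petal_length = avg(df$Petal_Length))

# Print the aggregated rows to the driver's standard output
showDF(result)

sparkR.session.stop()
```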

Many thanks in advance for your answers.

