r - RHive query inserts multiple entries into a Hive table for a single-row insert when run via an Oozie job
I have an Oozie job that triggers an R script. The R script in turn runs a Hive query that inserts a single row into a Hive table. The problem is that 4 rows are inserted for a single insert statement when I run the job. However, if I run the R script alone, without the Oozie job, it works fine and a single row gets inserted into the table.
Please note: the Oozie job runs on a Cloudera Hadoop distribution. I suspect the problem lies in the big data environment, since processing gets split between nodes. Below is the code in the R script.
    library(RHive)
    # Point R at the Cloudera parcel locations for Hadoop and Hive
    Sys.setenv("HADOOP_HOME"="/opt/cloudera/parcels/cdh/lib/hadoop")
    Sys.setenv("HIVE_HOME"="/opt/cloudera/parcels/cdh/lib/hive")
    Sys.setenv("HADOOP_CMD"="/etc/hadoop")
    library(rhdfs)
    rhive.init()
    # Connect to HiveServer2 and insert one row
    rhive.connect(host="10.223.99.33", port="10000", defaultFS="hdfs://10.223.69.37:8020")
    rhive.execute("insert into table apphalo.errorlogtable values ('2017-08-21 15:00:08','sampling','3657','3658','1','3','112')")

To mitigate the issue, I tried writing the row to a CSV file in HDFS and then loading that CSV file into the Hive table. It returned the same result (i.e. 4 rows were inserted into the Hive table for a single-row insert).