Spark Dataset: Reduce, Agg, Group or GroupByKey for a Dataset<Tuple2> in Java


I have a Dataset<Tuple2<String, Double>> as follows:

<a,1> <b,2> <c,2> <a,2> <b,3> <b,4> 

and I need to reduce it by the String key, summing the values, using the Spark Java API. The final result should look like this:

<a,3> <b,9> <c,2> 

Should I use reduce, agg, groupBy, or groupByKey, and how?

Suppose you have the dataset

Dataset<Tuple2<String, Double>> ds = ..;

Then you can call the groupBy function and sum, as below (col is the static import org.apache.spark.sql.functions.col):

ds.groupBy(col("_1")).sum("_2").show();

Or you can convert it to a Dataset<Row> with named columns and then call the groupBy function:

Dataset<Row> ds1 = ds.toDF("key", "value");
ds1.groupBy(col("key")).sum("value").show();
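
To sanity-check what the groupBy and sum calls compute on the sample data from the question, the same aggregation can be sketched on a plain Java list. This is not Spark code: it uses java.util.stream instead of the Dataset API, and the class name GroupSumSketch and the method sumByKey are hypothetical names chosen only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupSumSketch {

    // Groups the pairs by key and sums the values within each group.
    // Plain-Java analogue of groupBy(key).sum(value); no Spark involved.
    public static Map<String, Double> sumByKey(List<Map.Entry<String, Double>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                TreeMap::new, // sorted keys, so the printed order is stable
                Collectors.summingDouble(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        // Sample data from the question: <a,1> <b,2> <c,2> <a,2> <b,3> <b,4>
        List<Map.Entry<String, Double>> pairs = List.of(
                Map.entry("a", 1.0), Map.entry("b", 2.0), Map.entry("c", 2.0),
                Map.entry("a", 2.0), Map.entry("b", 3.0), Map.entry("b", 4.0));

        System.out.println(sumByKey(pairs)); // prints {a=3.0, b=9.0, c=2.0}
    }
}
```

The totals match the result asked for in the question: <a,3> <b,9> <c,2>. Note that Spark's show() does not guarantee any particular row order after a groupBy, whereas the TreeMap used here sorts by key.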
