H2O running slower than data.table in R


How is it possible that storing data in an H2O matrix is slower than in a data.table?

    # packages used: "h2o", "data.table"
    library(h2o)
    library(data.table)

    # start the H2O server before creating any H2O frames
    h2o.init(nthreads = -1)

    # create the matrices
    matrix1 <- data.table(matrix(rnorm(1000 * 1000), ncol = 1000, nrow = 1000))
    matrix2 <- h2o.createFrame(rows = 1000, cols = 1000)

    # data.table store
    for (i in 1:1000) {
      matrix1[i, 1] <- 3
    }

    # H2O frame store
    for (i in 1:1000) {
      matrix2[i, 1] <- 3
    }

Thanks!

H2O is a client/server architecture. (See http://docs.h2o.ai/h2o/latest-stable/h2o-docs/architecture.html)

So what you've shown is an inefficient way to populate an H2O frame in H2O memory. Every write turns into a network call. You don't want this.

For example, since the data isn't large, a reasonable thing to do is make the initial assignment to a local data frame (or data.table) and then use the push method of as.h2o().

    h2o_frame = as.h2o(matrix1)
    head(h2o_frame)

This pushes an R data frame from the R client into an H2O frame in H2O server memory. (And you can use as.data.table() to go in the opposite direction.)
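
As a minimal sketch of that round trip, assuming a local H2O server can be started with h2o.init() (which needs Java): the cell-level writes happen locally, and only one bulk transfer goes over the network in each direction. The names local_dt and round_trip are illustrative, and the conversion back goes through as.data.frame() as a conservative choice.

```r
library(h2o)
library(data.table)

h2o.init(nthreads = -1)

# Do the cell-level writes locally, where they are cheap
local_dt <- data.table(matrix(rnorm(100 * 5), nrow = 100, ncol = 5))
local_dt[, V1 := 3]

# One bulk transfer to the H2O server instead of one call per cell
h2o_frame <- as.h2o(local_dt)

# Pull the frame back to the R client when local access is needed
round_trip <- as.data.table(as.data.frame(h2o_frame))
```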


data.table tips:

For data.table, prefer the in-place := syntax. This avoids copies. So, for example:

matrix1[i, 3 := 42] 
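
A slightly fuller sketch of the same idea, using a small stand-in table (dt is an illustrative name, not from the question). A single vectorized := replaces the whole row-by-row loop; if a loop really is needed, data.table's set() also updates by reference.

```r
library(data.table)

# Small stand-in for matrix1 (10 x 3 instead of 1000 x 1000)
dt <- data.table(matrix(0, nrow = 10, ncol = 3))

# One vectorized in-place assignment covers every row of column V1
dt[, V1 := 3]

# When a loop is unavoidable, set() assigns by reference one cell at a time
for (i in seq_len(nrow(dt))) {
  set(dt, i = i, j = 2L, value = 42)
}
```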

H2O tips:

The fastest way to read data into H2O is to ingest it using the pull method in h2o.importFile(). This is parallel and distributed.

The as.h2o() trick shown above works well for small datasets that fit in the memory of one host.

If you want to watch the network messages between R and H2O, call h2o.startLogging().
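
A rough sketch of the pull approach, assuming the H2O server runs on the same machine as R so that it can see the file path. The temporary CSV is written here only to make the example self-contained; csv_path and imported are illustrative names.

```r
library(h2o)

h2o.init(nthreads = -1)

# Hypothetical input file, created only so the sketch can run on its own
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = rnorm(100), y = rnorm(100)), csv_path, row.names = FALSE)

# The H2O server reads and parses the file itself; the path must be
# visible from the server, not just from the R client
imported <- h2o.importFile(path = csv_path)
dim(imported)
```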

