How to replace nulls with empty string ("") in Apache Spark using Scala



I am working with huge datasets (containing 332 fields) in Apache Spark with Scala, around 10M records, where all fields except one can be null. I want to replace the nulls with blank strings (""). What is the best way to achieve this given the huge number of fields? I want to handle the nulls while importing the data set, so it is safe while performing transformations or exporting the DataFrame. I have created a case class with 332 fields; what is the best way to handle these nulls? I can use Option(field).getOrElse(""), but I guess that's not the best way given the huge number of fields. Thank you!!
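For context, the Option(field).getOrElse("") approach the question mentions would mean touching every field by hand. A minimal sketch with a hypothetical, trimmed-down case class (the real one has 332 fields) shows why that does not scale:

// Hypothetical trimmed-down case class; the real one has 332 fields.
case class Record(id: Int, name: String, city: String)

// Per-field null handling as described in the question --
// one line like this per nullable field, so 331 lines in the real case.
def clean(r: Record): Record =
  r.copy(
    name = Option(r.name).getOrElse(""),
    city = Option(r.city).getOrElse("")
  )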

We can use a UDF to make a column null-safe, like this:

import org.apache.spark.sql.functions.udf
import spark.implicits._

val df = Seq((1, "hello"), (2, "world"), (3, null)).toDF("id", "name")

// Null-safe mapping: null becomes "", everything else passes through.
val safeString: String => String = s => if (s == null) "" else s
val udfSafeString = udf(safeString)

val dfSafe = df.select($"id", udfSafeString($"name").alias("name"))
dfSafe.show
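As an aside, not part of the original answer: for string columns specifically, Spark's built-in DataFrameNaFunctions can do the same replacement without a UDF. A minimal sketch assuming the same df as above:

// na.fill("") replaces nulls in all string-typed columns with "".
val dfFilled = df.na.fill("")

// Or restrict the fill to specific columns.
val dfFilledName = df.na.fill("", Seq("name"))
dfFilled.show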

If you have lots of columns and one of the columns is a key column, you can do this:

import org.apache.spark.sql.functions.col

// Apply the null-safe UDF to every column except the key column "id".
val safeCols = df.columns.map(colName =>
  if (colName == "id") col(colName)
  else udfSafeString(col(colName)).alias(colName))

val dfSafe = df.select(safeCols: _*)
dfSafe.show
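If more than one column needs to keep its nulls, the same mapping generalizes. A sketch with a hypothetical keyCols set (the column names here are illustrative, not from the original answer):

// Hypothetical: columns whose values should pass through unchanged.
val keyCols = Set("id", "created_at")

val safeCols2 = df.columns.map { c =>
  if (keyCols.contains(c)) col(c)
  else udfSafeString(col(c)).alias(c)
}
val dfSafe2 = df.select(safeCols2: _*)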

