How to split a list to multiple columns in PySpark?
I have:
    key   value
    a     [1,2,3]
    b     [2,3,4]
I want:
    key   value1   value2   value3
    a     1        2        3
    b     2        3        4
It seems that in Scala you can write:

    df.select($"value._1", $"value._2", $"value._3")

but this does not seem possible in Python. So is there a way to do this?
It depends on the type of your "list":
If it is of type ArrayType():
    df = hc.createDataFrame(sc.parallelize([['a', [1,2,3]], ['b', [2,3,4]]]), ["key", "value"])
    df.printSchema()
    df.show()

    root
     |-- key: string (nullable = true)
     |-- value: array (nullable = true)
     |    |-- element: long (containsNull = true)

    +---+-------+
    |key|  value|
    +---+-------+
    |  a|[1,2,3]|
    |  b|[2,3,4]|
    +---+-------+
You can access the values like you would in Python, using []:

    df.select("key", df.value[0], df.value[1], df.value[2]).show()

    +---+--------+--------+--------+
    |key|value[0]|value[1]|value[2]|
    +---+--------+--------+--------+
    |  a|       1|       2|       3|
    |  b|       2|       3|       4|
    +---+--------+--------+--------+
If it is of type StructType() (for example, if you built your DataFrame by reading a JSON file):
    import pyspark.sql.functions as psf

    df2 = df.select("key", psf.struct(
            df.value[0].alias("value1"),
            df.value[1].alias("value2"),
            df.value[2].alias("value3")
        ).alias("value"))
    df2.printSchema()
    df2.show()

    root
     |-- key: string (nullable = true)
     |-- value: struct (nullable = false)
     |    |-- value1: long (nullable = true)
     |    |-- value2: long (nullable = true)
     |    |-- value3: long (nullable = true)

    +---+-------+
    |key|  value|
    +---+-------+
    |  a|[1,2,3]|
    |  b|[2,3,4]|
    +---+-------+
You can directly 'split' the column using *:

    df2.select('key', 'value.*').show()

    +---+------+------+------+
    |key|value1|value2|value3|
    +---+------+------+------+
    |  a|     1|     2|     3|
    |  b|     2|     3|     4|
    +---+------+------+------+
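The same 'value.*' expansion works on a struct column that comes straight from nested JSON. A small sketch, assuming a SparkSession named spark and using inline JSON strings with field names value1..value3 purely for illustration:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Nested JSON produces a struct column automatically
    data = ['{"key": "a", "value": {"value1": 1, "value2": 2, "value3": 3}}',
            '{"key": "b", "value": {"value1": 2, "value2": 3, "value3": 4}}']
    df_json = spark.read.json(spark.sparkContext.parallelize(data))

    # Expand the struct into one top-level column per field
    df_json.select("key", "value.*").show()

    +---+------+------+------+
    |key|value1|value2|value3|
    +---+------+------+------+
    |  a|     1|     2|     3|
    |  b|     2|     3|     4|
    +---+------+------+------+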