python - Combine multiple columns in Pandas excluding NaNs


My sample df has four columns containing NaN values. The goal is to concatenate the values in each row while excluding the NaN values.

import pandas as pd
import numpy as np

df = pd.DataFrame({'keywords_0': ["a", np.nan, "c"],
                   'keywords_1': ["d", "e", np.nan],
                   'keywords_2': [np.nan, np.nan, "b"],
                   'keywords_3': ["f", np.nan, "g"]})

  keywords_0 keywords_1 keywords_2 keywords_3
0          a          d        NaN          f
1        NaN          e        NaN        NaN
2          c        NaN          b          g

I want to accomplish the following:

  keywords_0 keywords_1 keywords_2 keywords_3 keywords_all
0          a          d        NaN          f        a,d,f
1        NaN          e        NaN        NaN            e
2          c        NaN          b          g        c,b,g

Pseudo code (not working):

cols = [df.keywords_0, df.keywords_1, df.keywords_2, df.keywords_3]
df["keywords_all"] = df["keywords_all"].apply(lambda cols: ",".join(cols), axis=1)

I know I can use ",".join() to get the exact result, but I am unsure how to pass the column names into the function.

You can apply ",".join() to each row by passing axis=1 to the apply method. You first need to drop the NaNs, though; otherwise you will get a TypeError, because NaN is a float and join() only accepts strings.

df.apply(lambda x: ','.join(x.dropna()), axis=1)
Out:
0    a,d,f
1        e
2    c,b,g
dtype: object
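To make the TypeError concrete, here is a minimal sketch that assumes the sample frame from the question. The flag name `join_failed` is only illustrative; it records whether the join without dropna() raised.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'keywords_0': ["a", np.nan, "c"],
                   'keywords_1': ["d", "e", np.nan],
                   'keywords_2': [np.nan, np.nan, "b"],
                   'keywords_3': ["f", np.nan, "g"]})

# Without dropna(), the first row still contains a float NaN,
# and str.join() only accepts strings, so apply() raises.
try:
    df.apply(lambda x: ','.join(x), axis=1)
    join_failed = False
except TypeError:
    join_failed = True

# With dropna(), only the string values of each row reach join().
joined = df.apply(lambda x: ','.join(x.dropna()), axis=1)
print(join_failed)
print(joined)
```

A row that holds only NaNs would produce an empty string rather than an error, since joining an empty sequence is valid.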

You can assign the result back to the original DataFrame with:

df["keywords_all"] = df.apply(lambda x: ','.join(x.dropna()), axis=1) 

Or, if you want to specify the columns as you did in the question:

cols = ['keywords_0', 'keywords_1', 'keywords_2', 'keywords_3']
df["keywords_all"] = df[cols].apply(lambda x: ','.join(x.dropna()), axis=1)
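If you need this in more than one place, the column-list version can be wrapped in a small function. This is only a sketch: `combine_columns` and its `sep` parameter are hypothetical names, not part of pandas, and it assumes every selected column holds strings or NaN.

```python
import numpy as np
import pandas as pd

def combine_columns(frame, cols, sep=","):
    """Join the non-NaN values of `cols` row by row (hypothetical helper)."""
    return frame[cols].apply(lambda row: sep.join(row.dropna()), axis=1)

df = pd.DataFrame({'keywords_0': ["a", np.nan, "c"],
                   'keywords_1': ["d", "e", np.nan],
                   'keywords_2': [np.nan, np.nan, "b"],
                   'keywords_3': ["f", np.nan, "g"]})

cols = ['keywords_0', 'keywords_1', 'keywords_2', 'keywords_3']
df["keywords_all"] = combine_columns(df, cols)

# Running it again gives the same result, because keywords_all
# is not listed in cols and so is never joined into itself.
df["keywords_all"] = combine_columns(df, cols)
print(df)
```

Selecting the columns explicitly also keeps the result stable once keywords_all exists, whereas applying the join to the whole frame a second time would fold the new column into the output.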