python - Convert pandas df to multi-dimensional numpy array


I have sparse data in a pandas DataFrame with 25 million+ records. It has to be converted into a multi-dimensional numpy array. I have written it the straightforward way using a for loop, and I am wondering if there is a more efficient way.

    import numpy as np
    import pandas as pd

    facts_pd = pd.DataFrame.from_records(columns=['name','offset','code'],
        data=[('john', -928, 'dx_434'), ('steve',-757,'dx_5859'), ('jack',-800,'dx_250'),
              ('john',-919,'dx_401'),('john',-956,'dx_5859')])

    # Lookup tables: each distinct value gets an integer id (its row index).
    name_lu = pd.DataFrame(sorted(facts_pd['name'].unique()), columns=['name'])
    name_lu["nameid"] = name_lu.index

    # Offsets are sorted in descending order before ids are assigned.
    offset_lu = pd.DataFrame(sorted(facts_pd['offset'].unique(), reverse=True), columns=['offset'])
    offset_lu["offsetid"] = offset_lu.index

    code_lu = pd.DataFrame(sorted(facts_pd['code'].unique()), columns=['code'])
    code_lu["codeid"] = code_lu.index

    # Replace the original columns with their integer ids.
    facts_pd = pd.merge(pd.merge(pd.merge(facts_pd, name_lu, how="left", on="name")
        , offset_lu, how="left", on="offset"), code_lu, how="left", on="code")
    facts_pd.drop(["name","offset","code"], inplace=True, axis=1)

    # Fill the 3-D array one record at a time (the slow part).
    facts_np = np.zeros((len(name_lu),len(offset_lu),len(code_lu)))
    for row in facts_pd.iterrows():
        i,j,k = row[1]
        facts_np[i][j][k] = 1

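The command you are looking for is DataFrame.as_matrix(). It returns a numpy array, not a matrix, despite what the name says; the pandas documentation covers it. Note that as_matrix() was deprecated in pandas 0.23 and removed in 1.0, so on current versions use DataFrame.to_numpy() (or the .values attribute) instead.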

There is also a Stack Overflow topic on its use.
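
For the loop itself, one possible way to avoid iterating row by row is sketched below. It is not from the original post: it assumes `pd.factorize(..., sort=True)` as a substitute for the merge-based lookup tables, and it fills the array with a single NumPy integer-array assignment. The variable names (`facts`, `i`, `j`, `k`, `names`, `offsets`, `codes`) are chosen for illustration only.

```python
import numpy as np
import pandas as pd

facts = pd.DataFrame.from_records(
    columns=["name", "offset", "code"],
    data=[("john", -928, "dx_434"), ("steve", -757, "dx_5859"),
          ("jack", -800, "dx_250"), ("john", -919, "dx_401"),
          ("john", -956, "dx_5859")],
)

# factorize returns one integer id per row plus the unique values;
# sort=True makes the ids follow ascending sorted order.
i, names = pd.factorize(facts["name"], sort=True)
j, offsets = pd.factorize(facts["offset"], sort=True)
k, codes = pd.factorize(facts["code"], sort=True)

# The question numbers offsets in descending order, so flip those ids
# and the matching lookup values.
j = len(offsets) - 1 - j
offsets = offsets[::-1]

# One vectorized assignment sets every (i, j, k) cell at once.
facts_np = np.zeros((len(names), len(offsets), len(codes)))
facts_np[i, j, k] = 1
```

Here `names`, `offsets` and `codes` play the role of the lookup tables: `names[i[0]]` gives back the name for the first record. The dense array still grows with the product of the three dimensions, so whether it fits in memory for 25 million+ records depends on how many distinct values each column has.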

