Proper way of writing and reading a DataFrame to file in Python


I write, and later read back, a DataFrame in Python:

df_final.to_csv(self.get_local_file_path(hash, dataset_name), sep='\t', encoding='utf8')
...
df_final = pd.read_table(self.get_local_file_path(hash, dataset_name), encoding='utf8', index_col=[0,1])

but I get:

sys:1: DtypeWarning: Columns (7,17,28) have mixed types. Specify dtype option on import or set low_memory=False.

I found this question. The bottom line there is that you should specify the field types when reading the file, because "low_memory" is deprecated... which I find inefficient.
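For reference, the approach that question suggests looks like this. A minimal sketch with made-up column names and inline data, assuming a tab-separated file where one column mixes numbers and strings (the situation that triggers the DtypeWarning):

```python
import io

import pandas as pd

# Hypothetical data: column "b" mixes a string and a number, which is
# what causes pandas to infer mixed types on large files read in chunks.
tsv = "a\tb\n1\tx\n2\t3\n"

# Declaring the dtypes up front avoids mixed-type inference entirely,
# so no DtypeWarning is raised.
df = pd.read_csv(io.StringIO(tsv), sep="\t", dtype={"a": "int64", "b": "string"})
print(df["b"].dtype)  # string
```

This works, but it means maintaining a dtype mapping by hand for every column, which is exactly the inefficiency complained about above.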

Isn't there a simple way to write and later read back a DataFrame? I don't care about the human-readability of the file.

You can pickle the DataFrame:

df_final.to_pickle(self.get_local_file_path(hash, dataset_name))

And read it back later:

df_final = pd.read_pickle(self.get_local_file_path(hash, dataset_name))

If the DataFrame is big and this gets slow, you might have more luck using the HDF5 format:

df_final.to_hdf(self.get_local_file_path(hash, dataset_name), key='df_final')

And read it back later:

df_final = pd.read_hdf(self.get_local_file_path(hash, dataset_name))

You might need to install PyTables first (pip install tables).

Both formats store the data along with its types, so this should solve your problem.
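As a quick check, a pickle round trip brings back both the dtypes and a MultiIndex unchanged, which is exactly what the CSV round trip above loses. A small sketch with made-up data and a temporary file path:

```python
import os
import tempfile

import pandas as pd

# Made-up frame with a datetime column and a two-level index,
# i.e. the things a CSV round trip tends to mangle.
df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2020-01-01", "2020-06-15"]),
        "value": [1.5, 2.5],
    },
    index=pd.MultiIndex.from_tuples([("a", 1), ("b", 2)]),
)

path = os.path.join(tempfile.mkdtemp(), "df.pkl")
df.to_pickle(path)
restored = pd.read_pickle(path)

# Values, dtypes, and the MultiIndex all survive unchanged.
assert restored.equals(df)
assert restored["when"].dtype == df["when"].dtype
```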

