Proper way of writing and reading a DataFrame to file in Python


I write, and later read back, a DataFrame in Python:

df_final.to_csv(self.get_local_file_path(hash, dataset_name), sep='\t', encoding='utf8')
...
df_final = pd.read_table(self.get_local_file_path(hash, dataset_name), encoding='utf8', index_col=[0,1])

but I get:

sys:1: DtypeWarning: Columns (7,17,28) have mixed types. Specify dtype option on import or set low_memory=False.

I found this question. The bottom line there is that you should specify the field types when reading the file, because "low_memory" is deprecated... which I find inefficient.
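For reference, the approach that question suggests looks like this. A minimal sketch with made-up column names and inline data, assuming a tab-separated file where one column mixes numbers and strings (the situation that triggers the DtypeWarning):

```python
import io

import pandas as pd

# Hypothetical data: column "b" mixes a string and a number, which is
# what causes pandas to infer mixed types on large files read in chunks.
tsv = "a\tb\n1\tx\n2\t3\n"

# Declaring the dtypes up front avoids mixed-type inference entirely,
# so no DtypeWarning is raised.
df = pd.read_csv(io.StringIO(tsv), sep="\t", dtype={"a": "int64", "b": "string"})
print(df["b"].dtype)  # string
```

This works, but it means maintaining a dtype mapping by hand for every column, which is exactly the inefficiency complained about above.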

Isn't there a simple way to write and later read back a DataFrame? I don't care about the human-readability of the file.

You can pickle the DataFrame:

df_final.to_pickle(self.get_local_file_path(hash, dataset_name))

And read it back later:

df_final = pd.read_pickle(self.get_local_file_path(hash, dataset_name))

If the DataFrame is big and this gets slow, you might have more luck using the HDF5 format:

df_final.to_hdf(self.get_local_file_path(hash, dataset_name), key='df_final')

And read it back later:

df_final = pd.read_hdf(self.get_local_file_path(hash, dataset_name))

You might need to install PyTables first (pip install tables).

Both formats store the data along with its types, so this should solve your problem.
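As a quick check, a pickle round trip brings back both the dtypes and a MultiIndex unchanged, which is exactly what the CSV round trip above loses. A small sketch with made-up data and a temporary file path:

```python
import os
import tempfile

import pandas as pd

# Made-up frame with a datetime column and a two-level index,
# i.e. the things a CSV round trip tends to mangle.
df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2020-01-01", "2020-06-15"]),
        "value": [1.5, 2.5],
    },
    index=pd.MultiIndex.from_tuples([("a", 1), ("b", 2)]),
)

path = os.path.join(tempfile.mkdtemp(), "df.pkl")
df.to_pickle(path)
restored = pd.read_pickle(path)

# Values, dtypes, and the MultiIndex all survive unchanged.
assert restored.equals(df)
assert restored["when"].dtype == df["when"].dtype
```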

