Proper way of writing and reading a DataFrame to a file in Python -
I write, and later read, a DataFrame in Python:
df_final.to_csv(self.get_local_file_path(hash, dataset_name), sep='\t', encoding='utf8')
...
df_final = pd.read_table(self.get_local_file_path(hash, dataset_name), encoding='utf8', index_col=[0, 1])
but I get:
sys:1: DtypeWarning: Columns (7,17,28) have mixed types. Specify dtype option on import or set low_memory=False.
I found a related question whose bottom answer says I should specify the field types when reading the file because "low_memory" is deprecated... I find that inefficient.
Isn't there a simple way to write and later read a DataFrame? I don't care about the human-readability of the file.
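For context, the fix the warning suggests is an explicit dtype map on read, which means maintaining a type for every affected column. A minimal sketch (the in-memory CSV and column names are made up for illustration):

```python
import io
import pandas as pd

# A column that mixes ints and strings triggers the DtypeWarning
# when pandas infers types chunk by chunk.
csv = io.StringIO("id\tvalue\n1\t7\n2\tfoo\n")

# Forcing a dtype per column silences the warning, but the map
# has to be kept in sync with the file's schema.
df = pd.read_table(csv, dtype={"id": "int64", "value": "object"})
```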
You can pickle the DataFrame:
df_final.to_pickle(self.get_local_file_path(hash,dataset_name))
And read it back later:
df_final = pd.read_pickle(self.get_local_file_path(hash,dataset_name))
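A minimal round-trip sketch (the path and toy data are hypothetical, standing in for `self.get_local_file_path(hash, dataset_name)`):

```python
import pandas as pd

path = "df_final.pkl"  # hypothetical path
df_final = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Pickle preserves dtypes and the index exactly, so no dtype
# options are needed when reading back.
df_final.to_pickle(path)
restored = pd.read_pickle(path)
```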
If your DataFrame is big and this gets slow, you might have more luck using the HDF5 format (note that `to_hdf` requires a `key` naming the object inside the store):
df_final.to_hdf(self.get_local_file_path(hash, dataset_name), key='df')
And read it back later:
df_final = pd.read_hdf(self.get_local_file_path(hash,dataset_name))
You might need to install PyTables (`pip install tables`) first.
Both formats store the data along with its dtypes, so either should solve your problem.
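An HDF5 round-trip sketch under the same assumptions (hypothetical path and toy data; requires PyTables to be installed):

```python
import pandas as pd

path = "df_final.h5"  # hypothetical path
df_final = pd.DataFrame({"a": [1, 2, 3], "b": [0.1, 0.2, 0.3]})

# mode="w" overwrites any existing store at this path.
df_final.to_hdf(path, key="df", mode="w")

# With a single object in the store, read_hdf can omit the key.
restored = pd.read_hdf(path)
```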