pandas - How to make a Python loop faster when running pairwise association tests

July 15, 2013

I have a list of patient IDs with drug names and a list of patient IDs with disease names. I want to find the most indicative drug for each disease.

To find it, I want to compute the Fisher exact test p-value for each disease/drug pair. But the loop runs very slowly, taking more than 10 hours. Is there a way to make the loop more efficient, or a better way to solve this association problem?

My loop:

import numpy as np
import pandas as pd
from scipy.stats import fisher_exact

# `base`, `meps_meds` and `meps_base_data` are DataFrames defined earlier (not shown).
most_indicative_medication = {}
rx_list = list(meps_meds.rxname.unique())
disease_list = list(meps_base_data.columns.values)[8:]

for i in disease_list:
    print(i)
    rx_dict = {}
    for j in rx_list:
        # One row per (patient, disease status, drug) combination.
        subset = base[['id', i, 'rxname']].drop_duplicates()
        subset[j] = subset['rxname'] == j
        subset = subset.loc[subset[i].isin(['yes', 'no'])]
        subset = subset[[i, j]]
        tab = pd.crosstab(subset[i], subset[j])
        if len(tab.columns) == 2:
            rx_dict[j] = fisher_exact(tab)[1]  # keep the p-value only
        else:
            rx_dict[j] = np.nan
    most_indicative_medication[i] = min(rx_dict, key=rx_dict.get)

You can parallelize the outer loop over diseases with multiprocessing or multithreading. I have added a thread pool to your code:

import numpy as np
import pandas as pd
from scipy.stats import fisher_exact
from multiprocessing.dummy import Pool as ThreadPool

# `base`, `meps_meds` and `meps_base_data` are DataFrames defined earlier (not shown).
rx_list = list(meps_meds.rxname.unique())
disease_list = list(meps_base_data.columns.values)[8:]

def run_pairwise(i):
    print(i)
    rx_dict = {}
    for j in rx_list:
        subset = base[['id', i, 'rxname']].drop_duplicates()
        subset[j] = subset['rxname'] == j
        subset = subset.loc[subset[i].isin(['yes', 'no'])]
        subset = subset[[i, j]]
        tab = pd.crosstab(subset[i], subset[j])
        if len(tab.columns) == 2:
            rx_dict[j] = fisher_exact(tab)[1]
        else:
            rx_dict[j] = np.nan
    # Return the result so that pool.map collects it, instead of writing to a shared dict.
    return i, min(rx_dict, key=rx_dict.get)

pool = ThreadPool(3)
pairwise_test_results = pool.map(run_pairwise, disease_list)
pool.close()
pool.join()

# Each result is a (disease, drug) pair.
most_indicative_medication = dict(pairwise_test_results)
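
For reference, here is a minimal, self-contained sketch of the `pool.map` pattern on its own. `multiprocessing.dummy` exposes the `multiprocessing` API on top of threads, and `map` returns one result per input, in input order. The worker `slow_square` and the input list are made up for illustration; they stand in for the per-disease work. Since CPython threads share the GIL, CPU-bound work may gain little from a thread pool, and `multiprocessing.Pool` (processes) is the usual alternative to try.

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed Pool


def slow_square(x):
    # Hypothetical stand-in for an expensive per-item computation.
    return x * x


inputs = [1, 2, 3, 4]

# The context manager shuts the pool down when the block exits.
with ThreadPool(3) as pool:
    results = pool.map(slow_square, inputs)

# map() preserves input order, so inputs and results can be zipped.
squares_by_input = dict(zip(inputs, results))
print(squares_by_input)
```

Returning a value from the worker and collecting it through `map` avoids relying on shared state between workers.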

Notes: http://chriskiehl.com/article/parallelism-in-one-line/
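
Independently of parallelism, a lot of the time likely goes into repeating `drop_duplicates` and `crosstab` for every single drug. Below is a rough sketch of one way to cut that down, assuming the same column layout as in the question (`id`, `rxname`, and disease columns holding 'yes'/'no'): de-duplicate once per disease, build a single drug-by-disease count table, and derive each 2x2 table from it. The function name `most_indicative_drug` and the demo data are made up for illustration. Unlike the loop in the question, it only looks at drugs that actually occur in the filtered rows, so treat it as a starting point rather than a drop-in replacement.

```python
import pandas as pd
from scipy.stats import fisher_exact


def most_indicative_drug(base, disease, id_col="id", drug_col="rxname"):
    """Return (drug, p_value) with the smallest Fisher exact p-value."""
    # De-duplicate and filter once per disease, not once per drug.
    sub = base[[id_col, disease, drug_col]].drop_duplicates()
    sub = sub[sub[disease].isin(["yes", "no"])]

    # One crosstab: for every drug, the number of 'yes' and 'no' rows.
    counts = pd.crosstab(sub[drug_col], sub[disease])
    counts = counts.reindex(columns=["yes", "no"], fill_value=0)
    total_yes = int(counts["yes"].sum())
    total_no = int(counts["no"].sum())

    p_values = {}
    for drug, row in counts.iterrows():
        a, b = int(row["yes"]), int(row["no"])
        # Rows: this drug vs. every other drug; columns: yes vs. no.
        table = [[a, b], [total_yes - a, total_no - b]]
        p_values[drug] = fisher_exact(table)[1]

    best = min(p_values, key=p_values.get)
    return best, p_values[best]


# Made-up demo data: drug A is taken only by patients with the disease.
demo = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7, 1, 6],
    "rxname": ["A"] * 5 + ["B"] * 2 + ["C"] * 2,
    "diabetes": ["yes"] * 5 + ["no"] * 2 + ["yes", "no"],
})

best_drug, best_p = most_indicative_drug(demo, "diabetes")
print(best_drug, best_p)
```

The two-sided Fisher p-value does not change when the rows or columns of the 2x2 table are swapped, so the orientation of the table here should not matter for the ranking.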
