python - Creating a heatmap by sampling and bucketing from a 3D array
I have experimental data that exists like so:
x = array([1, 1.12, 1.109, 2.1, 3, 4.104, 3.1, ...])
y = array([-9, -0.1, -9.2, -8.7, -5, -4, -8.75, ...])
z = array([10, 4, 1, 4, 5, 0, 1, ...])
If it's convenient, we can assume that the data exists as a 3D array or as a pandas DataFrame:

df = pd.DataFrame({'x': x, 'y': y, 'z': z})
The interpretation being that at every position x[i], y[i], the value of some variable is z[i]. These are NOT evenly sampled: there are some parts that are "densely sampled" (e.g. between 1 and 1.2 in x) and others that are sparse (e.g. between 2 and 3 in x). Because of this, I can't just chuck them into pcolormesh or contourf.
What I would like to do instead is to resample x and y evenly at some fixed interval and then aggregate the values of z. For my needs, z can be summed or averaged and still give meaningful values, so this is not a problem. My naïve attempt was this:
x = np.arange(min(x), max(x), 0.1)
y = np.arange(min(y), max(y), 0.1)
x_g, y_g = np.meshgrid(x, y)

nx, ny = x_g.shape
z_g = np.full(x_g.shape, np.nan)

for ix in range(nx - 1):
    for jx in range(ny - 1):
        x_min = x_g[ix, jx]
        x_max = x_g[ix + 1, jx + 1]
        y_min = y_g[ix, jx]
        y_max = y_g[ix + 1, jx + 1]
        vals = df[(df.x >= x_min) & (df.x < x_max) &
                  (df.y >= y_min) & (df.y < y_max)].z.values
        if vals.any():
            z_g[ix, jx] = sum(vals)
This works and gives me the output I desire when plotted with plt.contourf(x_g, y_g, z_g), but it is slow! I have ~20k samples, which I subsample into ~800 samples in x and ~500 in y, meaning the loop is 400k iterations long.

Is there a way to vectorize/optimize this? Even better if there is already a function that does this!

(Also tagging MATLAB because the syntax between numpy and MATLAB is similar and I have access to both pieces of software.)
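For reference, I suspect something like np.histogram2d with weights can compute the kind of per-bucket sum I'm describing, though I haven't verified that its bin-edge handling (the last bin is closed rather than half-open) matches my loop exactly:

import numpy as np

# Rough sketch only: np.histogram2d with weights gives per-bin sums of z.
# Bin edges use the same 0.1 spacing as my loop above.
x_edges = np.arange(min(x), max(x), 0.1)
y_edges = np.arange(min(y), max(y), 0.1)

z_sum, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges], weights=z)
counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])

z_sum[counts == 0] = np.nan   # mark empty buckets, as in my loop
# note: z_sum is indexed [x_bin, y_bin], so it needs a transpose before contourf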
Here's a vectorized Python solution employing NumPy broadcasting and matrix multiplication with np.dot for the sum-reduction part -
X = np.arange(min(x), max(x), 0.1)   # x bin edges
Y = np.arange(min(y), max(y), 0.1)   # y bin edges

x_mask = ((x >= X[:-1,None]) & (x < X[1:,None]))
y_mask = ((y >= Y[:-1,None]) & (y < Y[1:,None]))
z_g_out = np.dot(y_mask*z[None].astype(np.float32), x_mask.T)

# If needed, fill the invalid places with NaNs
z_g_out[y_mask.dot(x_mask.T.astype(np.float32)) == 0] = np.nan
Please note that we are avoiding the use of meshgrid there. Thus, we are saving memory, as the meshes created by meshgrid would be huge, and in the process hopefully gaining a performance improvement.
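To make the broadcasting step concrete, here is a tiny standalone sketch with made-up sample positions and bin edges (not the real data). Each row of the resulting mask is one bin and each column is one sample:

import numpy as np

# Made-up demo values: 4 sample positions and 4 bin edges
x_demo = np.array([1.0, 1.12, 2.1, 3.0])
X_demo = np.array([1.0, 2.0, 3.0, 4.0])

# (num_bins, num_samples) boolean matrix:
# row i is True where X_demo[i] <= x_demo < X_demo[i+1]
x_mask = (x_demo >= X_demo[:-1, None]) & (x_demo < X_demo[1:, None])
print(x_mask.astype(int))
# [[1 1 0 0]
#  [0 0 1 0]
#  [0 0 0 1]]

The np.dot of y_mask*z with x_mask.T then sums z over all samples that fall in both the row's y-bin and the column's x-bin, which is exactly the bucketed sum.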
Benchmarking
# Original app
def org_app(x,y,z):
    X = np.arange(min(x), max(x), 0.1)
    Y = np.arange(min(y), max(y), 0.1)
    x_g, y_g = np.meshgrid(X, Y)
    nx, ny = x_g.shape
    z_g = np.full(np.asarray(x_g.shape)-1, np.nan)
    for ix in range(nx - 1):
        for jx in range(ny - 1):
            x_min = x_g[ix, jx]
            x_max = x_g[ix + 1, jx + 1]
            y_min = y_g[ix, jx]
            y_max = y_g[ix + 1, jx + 1]
            vals = z[(x >= x_min) & (x < x_max) &
                     (y >= y_min) & (y < y_max)]
            if vals.any():
                z_g[ix, jx] = sum(vals)
    return z_g

# Proposed app
def app1(x,y,z):
    X = np.arange(min(x), max(x), 0.1)
    Y = np.arange(min(y), max(y), 0.1)
    x_mask = ((x >= X[:-1,None]) & (x < X[1:,None]))
    y_mask = ((y >= Y[:-1,None]) & (y < Y[1:,None]))
    z_g_out = np.dot(y_mask*z[None].astype(np.float32), x_mask.T)
    # If needed, fill the invalid places with NaNs
    z_g_out[y_mask.dot(x_mask.T.astype(np.float32)) == 0] = np.nan
    return z_g_out
As seen, for fair benchmarking I am using array values with the original approach, as fetching values from a dataframe would slow things down.
Timings and verification -
In [143]: x = np.array([1, 1.12, 1.109, 2.1, 3, 4.104, 3.1])
     ...: y = np.array([-9, -0.1, -9.2, -8.7, -5, -4, -8.75])
     ...: z = np.array([10, 4, 1, 4, 5, 0, 1])

# Verify outputs
In [150]: np.nansum(np.abs(org_app(x,y,z) - app1(x,y,z)))
Out[150]: 0.0

In [145]: %timeit org_app(x,y,z)
10 loops, best of 3: 19.9 ms per loop

In [146]: %timeit app1(x,y,z)
10000 loops, best of 3: 39.1 µs per loop

In [147]: 19900/39.1  # speedup figure
Out[147]: 508.95140664961633
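To get back to the original plotting goal, the bucketed output should drop straight into pcolormesh, since the bin edges are known. A minimal sketch, assuming the X and Y edge arrays from app1 are kept around:

import numpy as np
import matplotlib.pyplot as plt

X = np.arange(min(x), max(x), 0.1)   # x bin edges, as in app1
Y = np.arange(min(y), max(y), 0.1)   # y bin edges, as in app1
z_g_out = app1(x, y, z)              # shape (len(Y)-1, len(X)-1)

# pcolormesh takes the edges along each axis plus the per-cell values
plt.pcolormesh(X, Y, z_g_out)
plt.colorbar(label='sum of z per bucket')
plt.show()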