get unique combination values of a correlation matrix - pandas

Let's suppose I have a correlation matrix that looks like this:

    import pandas as pd

    df = pd.DataFrame(data={'a': [1, 0.2, 0.3, 0.4],
                            'b': [0.2, 1, 0.5, 0.6],
                            'c': [0.3, 0.5, 1, 0.7],
                            'd': [0.4, 0.6, 0.7, 1]},
                      index=['a', 'b', 'c', 'd'])

What is the best way to extract the unique value of each pairwise combination (a-b, a-c, etc.)?

df2 =

    a_b  a_c  a_d  b_c  b_d  c_d
    0.2  0.3  0.4  0.5  0.6  0.7

The only way I can see to do this is to write my own function, but I was wondering whether someone knows a shortcut.

2 Answers

IIUC:

    df_out = df.stack()
    df_out.index = df_out.index.map('_'.join)
    df_out = df_out.to_frame().T

Output:

       a_a  a_b  a_c  a_d  b_a  b_b  b_c  b_d  c_a  c_b  c_c  c_d  d_a  d_b  d_c  d_d
    0  1.0  0.2  0.3  0.4  0.2  1.0  0.5  0.6  0.3  0.5  1.0  0.7  0.4  0.6  0.7  1.0
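In case it helps to see why `'_'.join` works on the index: `stack()` returns a Series whose index is a two-level MultiIndex of (row label, column label) tuples, so mapping `'_'.join` over it just glues each pair together. A small sketch (the names `stacked`, `first_keys` and `joined` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 0.2, 0.3, 0.4],
     'b': [0.2, 1, 0.5, 0.6],
     'c': [0.3, 0.5, 1, 0.7],
     'd': [0.4, 0.6, 0.7, 1]},
    index=['a', 'b', 'c', 'd'],
)

# One entry per (row label, column label) cell of the matrix.
stacked = df.stack()

# Each index entry is a tuple of two strings, e.g. ('a', 'b').
first_keys = list(stacked.index[:3])

# Joining each tuple with '_' gives the flat names used in the answer.
joined = stacked.index.map('_'.join)
```

Because every key is a tuple of strings, this only works as written when both the row and column labels are strings.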

And, if you want to get rid of a_a, b_b, etc.:

    df_out = df.stack()
    df_out = df_out[df_out.index.get_level_values(0) != df_out.index.get_level_values(1)]
    df_out.index = df_out.index.map('_'.join)
    df_out = df_out.to_frame().T

Output:

       a_b  a_c  a_d  b_a  b_c  b_d  c_a  c_b  c_d  d_a  d_b  d_c
    0  0.2  0.3  0.4  0.2  0.5  0.6  0.3  0.5  0.7  0.4  0.6  0.7

Or to get rid of b_a and keep a_b:

    df_out = df.stack()
    df_out = df_out[df_out.index.get_level_values(0) < df_out.index.get_level_values(1)]
    df_out.index = df_out.index.map('_'.join)
    df_out = df_out.to_frame().T

Or, combining a few lines by using a lambda function in .loc:

    df_out = df.stack().loc[lambda x: x.index.get_level_values(0) < x.index.get_level_values(1)]
    df_out.index = df_out.index.map('_'.join)
    df_out = df_out.to_frame().T

Output:

       a_b  a_c  a_d  b_c  b_d  c_d
    0  0.2  0.3  0.4  0.5  0.6  0.7
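One thing to keep in mind: the `<` filter compares the labels themselves, so which of a_b / b_a survives depends on how the labels sort, not on where they sit in the matrix. If you would rather pick the cells by position, a possible alternative (not from the answer above, just a sketch using NumPy's `np.triu_indices`; the names `rows`, `cols` and `pairs` are made up here) is:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 0.2, 0.3, 0.4],
     'b': [0.2, 1, 0.5, 0.6],
     'c': [0.3, 0.5, 1, 0.7],
     'd': [0.4, 0.6, 0.7, 1]},
    index=['a', 'b', 'c', 'd'],
)

# Positions of the cells strictly above the diagonal (k=1 skips the diagonal).
rows, cols = np.triu_indices(len(df), k=1)

# Pull those cells out and name each one "<row label>_<column label>".
pairs = pd.Series(
    df.to_numpy()[rows, cols],
    index=[f"{df.index[r]}_{df.columns[c]}" for r, c in zip(rows, cols)],
)

# Same one-row layout as in the answer.
df_out = pairs.to_frame().T
```

This assumes the matrix is square and that rows and columns are in the same order, which is the case for the output of `DataFrame.corr()`.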

Thanks, that works great! I knew that there would be a more elegant way to do this than writing my own function. Thanks a lot!
– HappyPy
Aug 9 at 19:44

@HappyPy Thank you. You're welcome. Happy coding!
– Scott Boston
Aug 9 at 19:46

IIUC, you can play with the indexes:

    df2 = df.unstack().reset_index()
    s = df2[['level_0', 'level_1']].agg(frozenset, axis=1).drop_duplicates()
    df2 = df2.loc[s.index]
    ind = df2.agg(lambda k: (k['level_0'] + '_' + k['level_1']), axis=1)
    df2.set_index(ind)[0].to_frame().T

       a_a  a_b  a_c  a_d  b_b  b_c  b_d  c_c  c_d  d_d
    0  1.0  0.2  0.3  0.4  1.0  0.5  0.6  1.0  0.7  1.0
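The frozenset trick treats (a, b) and (b, a) as the same unordered pair. If that is all you need, the same idea can probably be written more directly with `itertools.combinations` from the standard library. This is a different technique from the answer above, shown only as a sketch; `pairs` is an illustrative name:

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame(
    {'a': [1, 0.2, 0.3, 0.4],
     'b': [0.2, 1, 0.5, 0.6],
     'c': [0.3, 0.5, 1, 0.7],
     'd': [0.4, 0.6, 0.7, 1]},
    index=['a', 'b', 'c', 'd'],
)

# combinations() yields each unordered pair of labels exactly once,
# in the order the labels appear in the index.
pairs = {f"{r}_{c}": df.loc[r, c] for r, c in combinations(df.index, 2)}

# A list holding one dict becomes a one-row DataFrame.
df_out = pd.DataFrame([pairs])
```

Unlike the frozenset version, this drops the diagonal (a_a, b_b, ...). Swapping in `itertools.combinations_with_replacement` should keep it, if that is what you want.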
