pandas: Grouping by two columns and then sorting it by the values of a third column







I have the following line:



genre_df.groupby(['release_year', 'genres']).vote_average.mean()





This gives me the following:


release_year  genres
1960          Action             6.950000
              Adventure          7.150000
              Comedy             7.900000
              Drama              7.600000
              Fantasy            7.300000
              History            6.900000
              Horror             8.000000
              Romance            7.600000
              Science Fiction    7.300000
              Thriller           7.650000
              Western            7.000000
1961          Action             7.000000
              Adventure          6.800000
              Animation          6.600000
              Comedy             7.000000
              Crime              6.600000
              Drama              7.000000
              Family             6.600000
              History            6.700000
              Music              6.600000
              Romance            7.400000
              War                7.000000
...



What I'd like to see is the result still grouped by release year and genre, but with the genres inside each year sorted by vote average, highest first.



AKA:


release_year  genres
1960          Horror             8.000000
              Comedy             7.900000
              Thriller           7.650000
              Drama              7.600000
              Romance            7.600000
              Fantasy            7.300000
              Science Fiction    7.300000
              Adventure          7.150000
              Western            7.000000
              Action             6.950000
              History            6.900000



How can this be achieved?




2 Answers



Solution for pandas 0.23.0+: first convert the Series to a one-column DataFrame with to_frame, then call sort_values, which can sort by an index level name and a column name together:




# df is the Series produced by the groupby(...).vote_average.mean() call in the question
df = df.to_frame().sort_values(['release_year', 'vote_average'], ascending=[True, False])
print(df)
                              vote_average
release_year genres
1960         Horror                   8.00
             Comedy                   7.90
             Thriller                 7.65
             Drama                    7.60
             Romance                  7.60
             Fantasy                  7.30
             Science Fiction          7.30
             Adventure                7.15
             Western                  7.00
             Action                   6.95
             History                  6.90
1961         Romance                  7.40
             Action                   7.00
             Comedy                   7.00
             Drama                    7.00
             War                      7.00
             Adventure                6.80
             History                  6.70
             Animation                6.60
             Crime                    6.60
             Family                   6.60
             Music                    6.60
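For anyone who wants to try this without the original data, here is a minimal, self-contained sketch of the same approach. The small genre_df below is made-up sample data (an assumption, not the asker's dataset), and the names means and result are only illustrative. It needs pandas 0.23.0 or newer, since sort_values mixes an index level name with a column name.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's genre_df
genre_df = pd.DataFrame({
    "release_year": [1960, 1960, 1960, 1960, 1961, 1961, 1961],
    "genres": ["Action", "Horror", "Comedy", "Action", "Romance", "Action", "Music"],
    "vote_average": [6.9, 8.0, 7.9, 7.0, 7.4, 7.0, 6.6],
})

# Mean vote per (release_year, genres); the result is a Series with a MultiIndex
means = genre_df.groupby(["release_year", "genres"]).vote_average.mean()

# Turn the Series into a one-column DataFrame, then sort by the
# 'release_year' index level (ascending) and the 'vote_average' column (descending)
result = means.to_frame().sort_values(
    ["release_year", "vote_average"], ascending=[True, False]
)
print(result)
```

The years stay in ascending order, and within each year the genres are ordered from the highest mean vote to the lowest.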



For older versions of pandas, it is necessary to use reset_index and set_index:




df = (df.reset_index()
        .sort_values(['release_year', 'vote_average'], ascending=[True, False])
        .set_index(['release_year', 'genres']))
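The reset_index / set_index route can be sketched end to end in the same way. Again, the sample genre_df is invented for illustration and the variable names are assumptions. The idea is to move the index levels into ordinary columns, sort on columns only, and then rebuild the MultiIndex.

```python
import pandas as pd

# Hypothetical sample data, not the asker's dataset
genre_df = pd.DataFrame({
    "release_year": [1960, 1960, 1960, 1961, 1961],
    "genres": ["Drama", "Western", "Horror", "War", "Crime"],
    "vote_average": [7.6, 7.0, 8.0, 7.0, 6.6],
})

means = genre_df.groupby(["release_year", "genres"]).vote_average.mean()

result = (
    means.reset_index()  # index levels become the columns release_year and genres
    .sort_values(["release_year", "vote_average"], ascending=[True, False])
    .set_index(["release_year", "genres"])  # restore the two-level index
)
print(result)
```

Because the sort runs on plain columns, this variant does not depend on sort_values accepting index level names.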



Try this:


genre_df = genre_df.reset_index()
genre_df.sort_values(['vote_average'],ascending=False)





