Filter, group by and count in pandas?







A TSV file contains some user event data:


user_uid category event_type
"11" "like" "post"
"33" "share" "status"
"11" "like" "post"
"42" "share" "post"



What is the best way to get the number of post events for each category and each user_uid?





The result should look like this:


user_uid category count
"11" "like" 2
"42" "share" 1




1 Answer



Strip any leading or trailing whitespace so that the values group properly. Then filter your DataFrame and apply groupby + size.




df['category'] = df.category.str.strip()
df['user_uid'] = df.user_uid.str.strip()
df[df.event_type == 'post'].groupby(['user_uid', 'category']).size()



Output:


user_uid category
11 like 2
42 share 1
dtype: int64
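For reference, the steps above can be put together into a self-contained sketch. It assumes the file is tab-separated with quoted values, as in the question; the in-memory string `raw` stands in for the real file, and `reset_index(name="count")` is one possible way to turn the resulting Series into a table with a `count` column like the one the question asks for.

```python
import io

import pandas as pd

# Sample rows from the question, with explicit tab separators.
# `raw` is a stand-in for the actual TSV file (an assumption for illustration).
raw = (
    'user_uid\tcategory\tevent_type\n'
    '"11"\t"like"\t"post"\n'
    '"33"\t"share"\t"status"\n'
    '"11"\t"like"\t"post"\n'
    '"42"\t"share"\t"post"\n'
)

# dtype=str keeps user_uid as text, so the .str accessor works on every column.
df = pd.read_csv(io.StringIO(raw), sep="\t", dtype=str)

# Strip surrounding whitespace so identical values fall into the same group.
for col in ["user_uid", "category", "event_type"]:
    df[col] = df[col].str.strip()

# Keep only "post" events, count rows per (user_uid, category),
# and name the resulting column "count".
counts = (
    df[df["event_type"] == "post"]
    .groupby(["user_uid", "category"])
    .size()
    .reset_index(name="count")
)

print(counts)
```

With a real file, passing its path to `pd.read_csv` in place of the `io.StringIO` object should work the same way.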





Did you try it? The result is not what you wrote in your answer.
– Ahmed Gamal
Aug 6 at 15:57






@AhmedGamal Yes, df = pd.read_clipboard() followed by my code gave me that answer. How is your answer different?
– ALollz
Aug 6 at 16:00








The solution should work as expected. I was curious, so I tried it myself, and it does give the expected result: df = pd.DataFrame({'user_uid': ['11', '22', '11', '42'], 'category': ["like", "share", "like", "share"], 'event_type': ["post", "status", "post", "post"]}) followed by df[df.event_type == 'post'].groupby(['user_uid', 'category']).size()
– mad_
Aug 6 at 16:12







@AhmedGamal I'm not sure how that's the output of unique, because that by definition has to de-duplicate the array, and those values look duplicated. But it seems like you have issues with surrounding whitespace. Try df['category'] = df.category.str.strip() for both the category and user_uid columns.
– ALollz
Aug 6 at 16:13
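A small sketch of why surrounding whitespace matters here. The padded values below are made up for illustration and are not the asker's actual data; they only show that `groupby` treats "like" and "like " as different keys until they are stripped.

```python
import pandas as pd

# Hypothetical data: the second row carries trailing spaces.
df = pd.DataFrame({
    "user_uid": ["11", "11 ", "42"],
    "category": ["like", "like ", "share"],
    "event_type": ["post", "post", "post"],
})

# Before stripping, the padded row forms its own group.
before = df.groupby(["user_uid", "category"]).size()

# Strip both grouping columns, then group again.
for col in ["user_uid", "category"]:
    df[col] = df[col].str.strip()
after = df.groupby(["user_uid", "category"]).size()

print(len(before), len(after))
```

Comparing `df.category.unique()` before and after stripping is a quick way to spot this kind of problem.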







I got it. I think the categories have some spaces; now it works.
– Ahmed Gamal
Aug 6 at 16:14





