Python Pandas Series.isin doesn't work



I have a Series


8 [11820]
9 [11820]
10 [11820]
11 [11820]
12 [11820]
27 [10599]
28 [10599]
29 [10599]
31 [661, 10599]
32 [661, 10599]
33 [7322]
34 [0]
37 [661]
39 [661]
40 [661]
49 [0, 661, 662, 663]



I want to filter this Series with something like points[points.isin([0])] to get




34 [0]
49 [0, 661, 662, 663]



but the result is empty (0 features).





You are searching for lists. .isin() looks for values
– lhay86
Aug 10 at 9:12





@lhay86 Yes, I want to find all features whose lists contain my value, not only exact matches.
– sailestim
Aug 10 at 9:15






Possible duplicate of Python & Pandas: How to query if a list-type column contains something?
– Shaido
Aug 10 at 9:18




2 Answers



A simple way to check whether your value (0) is in each list is to use apply on your Series:


s = s[s.apply(lambda x: 0 in x)]



Some explanation:

For every row, the lambda checks whether 0 is in the list.



apply returns a Boolean (True/False) Series, where True means that 0 is in the list for that row.



The original Series (s) is then filtered by this Boolean Series via boolean indexing, s[...].



Sample code:


import pandas as pd

# This is your series
s = pd.Series([[0],
               [11820],
               [11820],
               [10599],
               [0, 661, 662, 663]])

# This is the solution: keep only the rows whose list contains 0
s = s[s.apply(lambda x: 0 in x)]

# Print the result
print(s)

0 [0]
4 [0, 661, 662, 663]
dtype: object



pd.Series.isin works by hashing and works on the entire element, i.e. it won't consider a partial match. Even for an exact match, since a list cannot be hashed, pd.Series.isin won't work with a series of lists.
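The hashing point can be checked in plain Python, independently of pandas. The following is a minimal sketch; the helper name is_hashable is hypothetical and introduced only for illustration.

```python
def is_hashable(obj):
    """Return True if obj can be hashed, False otherwise."""
    try:
        hash(obj)
    except TypeError:
        # Raised for unhashable objects such as lists
        return False
    return True


print(is_hashable([0]))   # a list is mutable, so it cannot be hashed
print(is_hashable((0,)))  # a tuple of hashable items can be hashed
print(is_hashable(0))     # a plain integer can be hashed
```

This is why converting each list to a tuple makes an exact-match lookup possible, while the lists themselves cannot be used as lookup keys.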





You can use a custom function with pd.Series.apply:




import pandas as pd

df = pd.DataFrame({'A': [[1, 2], [0], [0, 2, 3]]})

search_list = [0] # list of scalars

mask = df['A'].apply(lambda x: any(i in x for i in search_list))
res = df[mask]

print(res)

A
1 [0]
2 [0, 2, 3]



For an exact match, you can convert each list in your series to a tuple, which is hashable, before any comparison. Then compare your series of tuples with a list of tuples.


search_list = [[0]] # list of lists

mask = df['A'].map(tuple).isin(list(map(tuple, search_list)))
res = df[mask]

print(res)

A
1 [0]



Note that operations on object dtype series will necessarily be inefficient. If possible, you should split your series of lists into multiple series of integers. In this case, though, that may be cumbersome given the inconsistent list lengths.
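One alternative, which is not part of either answer, is to flatten the lists into a single long series of scalars and run isin on that. The sketch below assumes pandas 0.25 or later (where Series.explode is available) and a unique index; the sample rows are taken from the question, and the variable names are illustrative.

```python
import pandas as pd

# A few rows from the question's Series, keeping the original index labels
points = pd.Series([[11820], [661, 10599], [0], [0, 661, 662, 663]],
                   index=[8, 31, 34, 49])

# One row per list element; the index label is repeated for each element
flat = points.explode()

# isin now compares scalars, so the lookup behaves as expected
hit = flat.isin([0])

# Collapse back to one Boolean per original row: True if any element matched
mask = hit.groupby(level=0).any()

# Align the mask with the original index order before filtering
result = points[mask.reindex(points.index)]
print(result)
```

This keeps the rows labelled 34 and 49. Whether it is faster than apply depends on the data, so it is worth timing both on a realistic sample.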







