Set values outside defined interval limits to a given value (e.g. NaN) for a column in a pandas data frame
Given interval limits that define the valid values, every value in a pandas data frame column that falls outside them should be set to a given value, e.g. NaN. The limits and the data frame contents can be assumed to be numeric.
Given the following limits and data frame:
import pandas as pd
min = 2
max = 7
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
    a   b
0   5  12
1   1   3
2   7  10
3  22   9
Setting the limit on column a would result in:
     a   b
0    5  12
1  NaN   3
2    7  10
3  NaN   9
2 Answers
Using where with between:
df.a = df.a.where(df.a.between(min, max), np.nan)
df
Out[146]:
     a   b
0  5.0  12
1  NaN   3
2  7.0  10
3  NaN   9
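The where/between approach can be wrapped in a small reusable function. The following is only a sketch: the function name `replace_outside` and the parameter names `lower`, `upper` and `fill_value` are hypothetical and chosen here to avoid shadowing the built-in `min` and `max`.

```python
import numpy as np
import pandas as pd


def replace_outside(series, lower, upper, fill_value=np.nan):
    """Return a new Series with values outside [lower, upper] replaced."""
    # between() is inclusive on both ends by default, so the limits are kept.
    return series.where(series.between(lower, upper), fill_value)


df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
df['a'] = replace_outside(df['a'], 2, 7)
print(df)
```

Because where returns a new Series, the original column is only changed by the explicit assignment. With NaN as the fill value, the integer column is converted to float.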
Or clip, keeping only the values that clipping leaves unchanged:
df.a.where(df.a.eq(df.a.clip(min, max)))
Out[147]:
0    5.0
1    NaN
2    7.0
3    NaN
Name: a, dtype: float64
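Note that clip on its own clamps out-of-range values to the nearest limit instead of replacing them, so getting NaN requires comparing the clipped result with the original. A minimal sketch of the difference, using the hypothetical names `lower`, `upper`, `clamped` and `masked`:

```python
import pandas as pd

s = pd.Series([5, 1, 7, 22], name='a')
lower, upper = 2, 7

# clip() alone clamps out-of-range values to the nearest limit.
clamped = s.clip(lower, upper)
print(clamped.tolist())  # [5, 2, 7, 7]

# Keeping only the values that clip() left unchanged yields NaN elsewhere.
masked = s.where(s == clamped)
print(masked.tolist())  # [5.0, nan, 7.0, nan]
```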
@DSM Yep, using mask is a little bit redundant, since we already have where :-)
– Wen
Aug 6 at 14:54
Note: if min is greater than max, then all the values will be converted to NaN when using df.a.between(min,max)
– Krzysztof Słowiński
Aug 7 at 13:01
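The reversed-limits case from the comment above can be checked directly, and guarded against if it should count as an error. This is only a sketch: `replace_outside_checked` is a hypothetical helper, and raising a ValueError is one possible design choice, not something the answers prescribe.

```python
import numpy as np
import pandas as pd

s = pd.Series([5, 1, 7, 22])

# With lower > upper, no value can satisfy lower <= x <= upper.
print(s.between(7, 2).any())  # False


def replace_outside_checked(series, lower, upper, fill_value=np.nan):
    """Like where/between, but reject reversed limits explicitly."""
    if lower > upper:
        raise ValueError(f"lower limit {lower} is greater than upper limit {upper}")
    return series.where(series.between(lower, upper), fill_value)


print(replace_outside_checked(s, 2, 7).tolist())  # [5.0, nan, 7.0, nan]
```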
You can also use .loc with between:
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
min = 2
max = 7
df.loc[~df.a.between(min,max), 'a'] = np.nan
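One caveat with in-place assignment through .loc: the column starts out as integers, which cannot hold NaN. As far as I know, recent pandas versions warn about (or may reject) assigning NaN into an integer column this way, so casting to float first is a cautious variant. The sketch below uses the hypothetical names `lower` and `upper` instead of `min` and `max`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
lower, upper = 2, 7

# Cast to float up front so the column can hold NaN without an implicit upcast.
df['a'] = df['a'].astype(float)

# Select the rows whose 'a' value is outside [lower, upper] and overwrite them.
df.loc[~df['a'].between(lower, upper), 'a'] = np.nan
print(df)
```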
I'd forgotten about clip! It's better than my where/between, but I think your mask is a little ugly.
– DSM
Aug 6 at 14:48