Pandas: update column values from another column if criteria [duplicate]
I have a DataFrame:
A B
1: 0 1
2: 0 0
3: 1 1
4: 0 1
5: 1 0
I want to replace each value in column A with the value from column B in the same row wherever the value in column A equals 0.
The DataFrame I want to get:
A B
1: 1 1
2: 0 0
3: 1 1
4: 1 1
5: 1 0
I've already tried this code:
df['A'] = df['B'].apply(lambda x: x if df['A'] == 0 else df['A'])
It raises an error:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
3 Answers
Use np.where:
In [348]: df.A = np.where(df.A.eq(0), df.B, df.A)
In [349]: df
Out[349]:
A B
1: 1 1
2: 0 0
3: 1 1
4: 1 1
5: 1 0
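For readers who want to run this outside an IPython session, here is a minimal self-contained sketch of the same np.where approach. It assumes the sample data from the question and rebuilds the 1-based index labels shown there; the DataFrame construction is an assumption, since the original answer does not show it.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with its 1-based index labels (assumed construction).
df = pd.DataFrame({"A": [0, 0, 1, 0, 1], "B": [1, 0, 1, 1, 0]}, index=[1, 2, 3, 4, 5])

# Element-wise choice: where A equals 0 take the value from B, otherwise keep A.
df["A"] = np.where(df["A"].eq(0), df["B"], df["A"])

print(df)
```

np.where evaluates the condition for the whole column at once and returns a NumPy array, which is then assigned back to column A.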
df['A'] = df.apply(lambda x: x['B'] if x['A']==0 else x['A'], axis=1)
Output
A B
1: 1 1
2: 0 0
3: 1 1
4: 1 1
5: 1 0
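The difference from the attempt in the question is that with axis=1 the lambda receives one row at a time, so x['A'] == 0 is a single boolean rather than a whole Series. A small sketch of both behaviours, assuming the sample data from the question (the names raised and result are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 0, 1, 0, 1], "B": [1, 0, 1, 1, 0]})

# Attempt from the question: df["A"] == 0 is a whole boolean Series,
# so using it in a Python `if` raises ValueError.
try:
    df["B"].apply(lambda x: x if df["A"] == 0 else df["A"])
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Row-wise version: each call sees one row, so the comparison yields a scalar.
result = df.apply(lambda row: row["B"] if row["A"] == 0 else row["A"], axis=1)
print(result.tolist())
```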
You can perform this by using a mask:
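import pandas as pd

df = pd.DataFrame()
df['A'] = [0,0,1,0,1]
df['B'] = [1,0,1,1,0]
# Boolean mask selecting the rows where A is 0
mask = (df.A == 0)
# Copy B into A for the masked rows only
df.loc[mask,'A'] = df.loc[mask,'B']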
A B
0 1 1
1 0 0
2 1 1
3 1 1
4 1 0
EDIT:
OK, this is actually an inefficient solution:
%timeit df.loc[mask,'A'] = df.loc[mask,'B']
%timeit df.apply(lambda x: x['B'] if x['A']==0 else x['A'], axis=1)
%timeit np.where(df.A.eq(0), df.B, df.A)
5.52 ms ± 556 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.27 ms ± 167 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
796 µs ± 89.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
So thanks to zero for this efficient solution with np.where!
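The %timeit lines above only work inside IPython. A rough way to repeat the comparison in a plain script is sketched below with the standard timeit module. The frame size n, the random data, and the helper names with_loc, with_apply and with_where are assumptions made for illustration; timings will vary by machine and pandas version, and the loc variant also pays for a copy because it modifies the frame in place.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2_000  # hypothetical size, larger than the 5-row example
base = pd.DataFrame({"A": rng.integers(0, 2, n), "B": rng.integers(0, 2, n)})


def with_loc():
    # Works on a copy because the assignment modifies the frame in place.
    df = base.copy()
    mask = df["A"] == 0
    df.loc[mask, "A"] = df.loc[mask, "B"]
    return df["A"]


def with_apply():
    # Row-wise Python-level loop over the frame.
    return base.apply(lambda row: row["B"] if row["A"] == 0 else row["A"], axis=1)


def with_where():
    # Vectorised element-wise choice.
    return np.where(base["A"].eq(0), base["B"], base["A"])


for fn in (with_loc, with_apply, with_where):
    seconds = timeit.timeit(fn, number=3)
    print(f"{fn.__name__}: {seconds / 3:.6f} s per call")
```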
Which solution is faster, yours or Rusabh's?
– sailestim
Aug 10 at 13:12