How to do 2 tests when filtering a RDD in pyspark?
I have 2 parameters:

NB_line = 10
NB2_line = 11

I have a Python function that filters out the rows of my RDD whose number of fields is not OK. A row is valid if its length is either NB_line = 10 or NB2_line = 11.
In the beginning, my filter looked like this:

rddLignesErreur = rddstats.filter(lambda x: len(x) != NB_line)

After the use case evolved, I modified it like this:

rddLignesErreur = rddstats.filter(lambda x: len(x) != NB_line or len(x) != NB2_line)
Is this correct or not? I'm a beginner in Python.

Thank you
1 Answer
Why not just use not in?

lambda x: len(x) not in (NB_line, NB2_line)
Thank you, I will try it. – vero, Aug 6 at 10:25
The or is syntactically correct: within a lambda expression you write plain Python code. Note, however, that since NB_line and NB2_line are different, your condition will always be true. – Oli, Aug 6 at 9:58