remove multiple consecutive occurrences of .* in a string with a single .* python
Below is my code to check for multiple consecutive occurrences of ".*" in a string. If there are multiple consecutive occurrences of ".*", I want to replace them with a single ".*". For example:
import re
# Matches two or more consecutive literal ".*" sequences
dot_star_check = re.compile(r'(\.\*){2,}')
k = ".*.*.*.*.*foo.*"
k = k.replace(?, ".*") if dot_star_check.search(k) else k  # ? is the part I am asking about
print k
What should I write instead of ? to replace multiple consecutive occurrences of .* with a single .*?
So the expected output is .*foo.*
Other examples:
1.) foo.*.*.*.*bar.* -> foo.*bar.*
2.) .*foobar.*.*.*.*.* -> .*foobar.*
2 Answers
You can use re.sub with (?:\.\*)+ as your pattern:
import re
# One or more consecutive literal ".*" sequences (dot and star are escaped)
dot_star_check = re.compile(r'(?:\.\*)+')
k = ".*.*.*.*.*foo.*"
k = re.sub(dot_star_check, '.*', k)
print(k)
Prints:
.*foo.*
You could additionally improve the efficiency of the pattern so that a substitution is performed only when there are 2 or more consecutive occurrences, using (?:\.\*){2,}:
import re
# Two or more consecutive literal ".*" sequences; a lone ".*" is left untouched
dot_star_check = re.compile(r'(?:\.\*){2,}')
k = ".*.*.*.*.*foo.*"
k = re.sub(dot_star_check, '.*', k)
print(k)
Prints:
.*foo.*
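As a quick check against all three inputs from the question, the substitution can be wrapped in a small helper. This is only a sketch: the names collapse_dot_star and REPEATED_DOT_STAR are made up for illustration and are not part of the original answer.

```python
import re

# Two or more consecutive literal ".*" sequences.
REPEATED_DOT_STAR = re.compile(r'(?:\.\*){2,}')


def collapse_dot_star(pattern):
    """Replace every run of repeated ".*" in pattern with a single ".*"."""
    return REPEATED_DOT_STAR.sub('.*', pattern)


# The three inputs taken from the question.
samples = ['.*.*.*.*.*foo.*', 'foo.*.*.*.*bar.*', '.*foobar.*.*.*.*.*']
for sample in samples:
    print(sample, '->', collapse_dot_star(sample))
```

A string that contains no repeated ".*" is returned unchanged, so no separate search() check is needed before substituting.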
The ?: at the start of the parentheses makes it a non-capturing group. Using a non-capturing group is generally more efficient than using a capturing group.
– UnbearableLightness
Aug 10 at 22:25
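To illustrate the difference between the two kinds of group, here is a small sketch (the variable names are made up for the example). Both patterns match the same text; only the capturing version records a group. It does not measure speed, it only shows what each match object stores.

```python
import re

text = '.*.*.*foo'

capturing = re.match(r'(\.\*)+', text)
non_capturing = re.match(r'(?:\.\*)+', text)

# Both patterns consume the same leading run of ".*" sequences.
print(capturing.group(0))
print(non_capturing.group(0))

# Only the capturing group stores a submatch (the last repetition).
print(capturing.groups())      # ('.*',)
print(non_capturing.groups())  # ()
```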
Multiple consecutive occurrences of .* are equivalent to a single .*, right?
– zubug55
Aug 10 at 22:28
I am not sure I follow, could you please clarify? What do you mean by your comment?
– UnbearableLightness
Aug 10 at 22:30
I am trying to look for data matching patterns such as .*.*.*.*foo.*.* or bar.*.*.*foo.*.*.* or anything else. Instead of searching with those regex strings as they are, I am replacing multiple consecutive .* with a single .* first. So I am asking whether there is any difference between .*.*.*.*foo.*.* and .*foo.* or not. I am doing this to make the search efficient.
– zubug55
Aug 10 at 22:35
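One way to gain some confidence that the collapsed pattern behaves the same as the original is to run both over a few inputs and compare the results. The sketch below is a spot check on hand-picked sample strings (all of them invented for this example), not a proof of equivalence.

```python
import re

long_pattern = re.compile(r'.*.*.*.*foo.*.*')
short_pattern = re.compile(r'.*foo.*')

# Hypothetical sample inputs, chosen only for illustration.
samples = ['foo', 'xxfooyy', 'bar', '', 'line one\nfoo on line two']

results = []
for text in samples:
    long_match = long_pattern.search(text)
    short_match = short_pattern.search(text)
    if long_match is None or short_match is None:
        # Both patterns must agree that there is no match.
        same = long_match is None and short_match is None
    else:
        # Both patterns must match exactly the same span.
        same = long_match.span() == short_match.span()
    results.append(same)
    print(repr(text), same)
```

On these inputs both patterns either fail together or match the same span, which is consistent with treating repeated .* as a single .* when searching.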
I think you can just use re.sub with (\.\*)+ as your regex:
>>> import re
>>> s = 'foo.*.*.*.*bar.*'
>>> s2 = '.*foobar.*.*.*.*.* '
>>> k = ".*.*.*.*.*foo.*"
>>> re.sub(r'(\.\*)+', '.*', s)
'foo.*bar.*'
>>> re.sub(r'(\.\*)+', '.*', s2)
'.*foobar.* '
>>> re.sub(r'(\.\*)+', '.*', k)
'.*foo.*'
Using a character set in this case is not correct, since it would also match *. for instance, rather than only .*
– UnbearableLightness
Aug 10 at 22:21
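The point about character sets can be seen with a short comparison. The earlier, pre-edit version of the answer is not shown on this page, so the character-set pattern [.*]+ below is an assumption about what it looked like; the variable names are also made up.

```python
import re

# Assumed character-set pattern: any run of "." and "*" in any order.
char_set = re.compile(r'[.*]+')
# Literal-sequence pattern: "." followed by "*", repeated.
literal = re.compile(r'(?:\.\*)+')

text = 'foo*.bar'  # contains "*." but never ".*"

print(char_set.sub('.*', text))  # foo.*bar  (the "*." was rewritten by mistake)
print(literal.sub('.*', text))   # foo*.bar  (left untouched)
```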
You're right, my mistake, edited
– sacul
Aug 10 at 22:22
Why do you have ? and : at the start?
– zubug55
Aug 10 at 22:24