Replace multiple consecutive occurrences of .* in a string with a single .* (Python)




Below is my code to check for multiple consecutive occurrences of ".*" in a string. If there are multiple consecutive occurrences of ".*", I want to replace them with a single ".*". For example:


import re

dot_star_check = re.compile(r'(\.\*){2,}')

k = ".*.*.*.*.*foo.*"

k = k.replace(?,".*") if dot_star_check.search(k) else k

print(k)



What should I write instead of ? to replace multiple consecutive occurrences of .* with a single .*?

So, the expected output is .*foo.*



Other examples:

1.) foo.*.*.*.*bar.* -> foo.*bar.*

2.) .*foobar.*.*.*.*.* -> .*foobar.*




2 Answers



You can use re.sub with (?:\.\*)+ as your pattern. The dot and the star are escaped so that they match the literal characters:


import re
dot_star_check = re.compile(r'(?:\.\*)+')
k = ".*.*.*.*.*foo.*"
k = re.sub(dot_star_check,'.*',k)

print(k)



Prints:


.*foo.*



You could additionally tighten the pattern so that a substitution is performed only when there are 2 or more occurrences, using (?:\.\*){2,} :


import re
dot_star_check = re.compile(r'(?:\.\*){2,}')
k = ".*.*.*.*.*foo.*"
k = re.sub(dot_star_check,'.*',k)

print(k)



Prints:


.*foo.*
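To illustrate how the two quantifiers differ, here is a minimal sketch that uses re.subn, which returns the number of substitutions alongside the new string. The variable names are illustrative only and do not come from the answer.

```python
import re

k = ".*.*.*.*.*foo.*"

# '+' matches every run of one or more '.*', so the lone trailing '.*'
# is also "replaced" (by an identical '.*').
plus_result, plus_count = re.subn(r"(?:\.\*)+", ".*", k)

# '{2,}' matches only runs of two or more '.*', so the trailing '.*'
# is left untouched.
range_result, range_count = re.subn(r"(?:\.\*){2,}", ".*", k)

print(plus_result, plus_count)    # .*foo.* 2
print(range_result, range_count)  # .*foo.* 1
```

Both patterns give the same final string; the {2,} form simply skips runs that are already a single .*.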





Why do you have ? and : at the start?
– zubug55
Aug 10 at 22:24





It is a non-capturing group. Using a non-capturing group is generally more efficient than using a capturing group.
– UnbearableLightness
Aug 10 at 22:25
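As a small sketch of what "non-capturing" means in practice (the names below are illustrative, not from the answer): both forms match the same text, but only the capturing form stores a group on the match object.

```python
import re

text = ".*.*.*foo.*"

capturing = re.compile(r"(\.\*)+")        # group 1 is recorded
non_capturing = re.compile(r"(?:\.\*)+")  # grouping only, nothing recorded

m1 = capturing.match(text)
m2 = non_capturing.match(text)

print(m1.group(0), m1.groups())  # .*.*.* ('.*',)
print(m2.group(0), m2.groups())  # .*.*.* ()

# For this substitution the result is the same either way.
print(capturing.sub(".*", text))      # .*foo.*
print(non_capturing.sub(".*", text))  # .*foo.*
```

Since the replacement never refers back to the group, nothing is lost by not capturing it.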





Multiple consecutive occurrences of .* are equivalent to a single .*, right?
– zubug55
Aug 10 at 22:28





I am not sure I follow, could you please clarify? What do you mean by your comment?
– UnbearableLightness
Aug 10 at 22:30





I am trying to search for data matching patterns such as .*.*.*.*foo.*.* or bar.*.*.*foo.*.*.*, or anything else. Instead of searching with those regex strings as they are, I replace multiple consecutive .* with a single .*. So I am asking whether there is any difference between .*.*.*.*foo.*.* and .*foo.*. I am doing this to make the search efficient.
– zubug55
Aug 10 at 22:35
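A rough way to check this yourself is to collapse the pattern and compare both versions against some sample inputs. The sketch below is only a spot check, not a proof; collapse_dot_star and the sample strings are hypothetical names and data chosen for illustration. It also assumes the pattern never contains an escaped dot directly before a star (such as \.*), because the plain text search cannot tell that apart from a real .* token.

```python
import re

def collapse_dot_star(pattern: str) -> str:
    """Replace every run of two or more literal '.*' with a single '.*'."""
    return re.sub(r"(?:\.\*){2,}", ".*", pattern)

original = ".*.*.*.*foo.*.*"
collapsed = collapse_dot_star(original)
print(collapsed)  # .*foo.*

# Spot check: both patterns should agree on whether each sample matches.
samples = ["foo", "xxfooyy", "bar", "", "fo\no"]
for s in samples:
    assert bool(re.search(original, s)) == bool(re.search(collapsed, s))
```

On these samples the two patterns agree, which is consistent with consecutive .* tokens being redundant.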



I think you can just use re.sub with (\.\*)+ as your regex:


import re

s = 'foo.*.*.*.*bar.*'
s2 = '.*foobar.*.*.*.*.* '
k = ".*.*.*.*.*foo.*"

>>> re.sub(r'(\.\*)+','.*',s)
'foo.*bar.*'
>>> re.sub(r'(\.\*)+','.*',s2)
'.*foobar.* '
>>> re.sub(r'(\.\*)+','.*',k)
'.*foo.*'





Using a character set in this case is not correct, since it would also match *. for instance rather than .*.
– UnbearableLightness
Aug 10 at 22:21
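A short sketch of the problem this comment describes, assuming the character-set version looked something like [.*]+ (the earlier revision of the answer is not shown here, so that exact pattern is an assumption):

```python
import re

text = "foo*.bar.*.*"

# A character set matches any mix of '.' and '*', so '*.' is rewritten too.
char_set_result = re.sub(r"[.*]+", ".*", text)

# An escaped group matches only the exact two-character sequence '.*'.
group_result = re.sub(r"(?:\.\*)+", ".*", text)

print(char_set_result)  # foo.*bar.*
print(group_result)     # foo*.bar.*
```

The character set changes the meaning of the *. part, while the grouped pattern leaves it alone.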







You're right, my mistake, edited
– sacul
Aug 10 at 22:22





