remove multiple consecutive occurrences of .* in a string with a single .* python

Below is my code to check for multiple consecutive occurrences of ".*" in a string. If there are multiple consecutive occurrences of ".*", I want to replace them with a single ".*".

    import re

    dot_star_check = re.compile(r'(\.\*){2,}')
    k = ".*.*.*.*.*foo.*"
    # "?" marks the missing argument this question asks about
    k = k.replace(?, ".*") if dot_star_check.search(k) else k
    print(k)

What should I write instead of ? to replace multiple consecutive occurrences of .* with a single .*?

So, expected output is .*foo.*


Other examples:

1.) foo.*.*.*.*bar.* -> foo.*bar.*

2.) .*foobar.*.*.*.*.* -> .*foobar.*

2 Answers

You can use re.sub with (?:\.\*)+ as your pattern. The dot and the star are escaped so that they match the literal characters:

    import re

    # Escape the dot and the star so the pattern matches the literal text ".*"
    dot_star_check = re.compile(r'(?:\.\*)+')
    k = ".*.*.*.*.*foo.*"
    k = re.sub(dot_star_check, '.*', k)
    print(k)

Prints:

.*foo.*

You could additionally improve the efficiency of the pattern so that the substitution is performed only when there are 2 or more occurrences, using (?:\.\*){2,}:

    import re

    # {2,} matches only runs of two or more literal ".*" sequences
    dot_star_check = re.compile(r'(?:\.\*){2,}')
    k = ".*.*.*.*.*foo.*"
    k = re.sub(dot_star_check, '.*', k)
    print(k)

Prints:

.*foo.*
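For reference, here is a minimal sketch that wraps the substitution in a function and runs it on the three examples from the question. The names `DOT_STAR_RUN` and `collapse_dot_star` are made up for illustration, and the escaped pattern is assumed to be what the answer intends, since an unescaped `.*` would match arbitrary text rather than the literal characters.

```python
import re

# Two or more consecutive literal ".*" sequences (dot and star are escaped).
DOT_STAR_RUN = re.compile(r'(?:\.\*){2,}')

def collapse_dot_star(pattern_text):
    """Replace each run of repeated ".*" with a single ".*" (hypothetical helper)."""
    return DOT_STAR_RUN.sub('.*', pattern_text)

# The three inputs given in the question.
for text in ['.*.*.*.*.*foo.*', 'foo.*.*.*.*bar.*', '.*foobar.*.*.*.*.*']:
    print(text, '->', collapse_dot_star(text))
```

A single ".*" is left alone by this pattern, so strings that are already collapsed pass through unchanged.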

Why do you have ? and : at the start?
– zubug55
Aug 10 at 22:24

It is a non capturing group. Using a non capturing group is generally more efficient than using a capturing group.
– UnbearableLightness
Aug 10 at 22:25
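To make the difference concrete, here is a small sketch comparing the two kinds of group on one of the question's inputs. It only shows what each pattern reports back; it does not measure speed, and the variable names are illustrative.

```python
import re

text = 'foo.*.*bar.*'

capturing = re.compile(r'(\.\*)+')        # remembers the last ".*" it matched
non_capturing = re.compile(r'(?:\.\*)+')  # groups for repetition only

# A capturing group stores a submatch; a non-capturing group stores nothing.
print(capturing.search(text).groups())      # ('.*',)
print(non_capturing.search(text).groups())  # ()

# findall returns the group contents when the pattern has a capturing group,
# and the whole match otherwise.
print(capturing.findall(text))      # ['.*', '.*']
print(non_capturing.findall(text))  # ['.*.*', '.*']

# For re.sub with a fixed replacement, both give the same result.
print(capturing.sub('.*', text) == non_capturing.sub('.*', text))  # True
```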

Multiple consecutive occurrences of .* are equivalent to a single .*, right?
– zubug55
Aug 10 at 22:28

I am not sure I follow, could you please clarify? What do you mean by your comment?
– UnbearableLightness
Aug 10 at 22:30

I am trying to search for data matching patterns such as .*.*.*.*foo.*.* or bar.*.*.*foo.*.*.* or anything else. Instead of searching with those regex strings as they are, I am replacing multiple consecutive .* with a single .* first. So I am asking whether there is any difference between .*.*.*.*foo.*.* and .*foo.* or not. I am doing this to make the search efficient.
– zubug55
Aug 10 at 22:35
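On the equivalence question: a run of `.*.*` can match exactly what a single `.*` can match, so the two forms should find the same matches. The sketch below spot-checks this with `re.search` on a handful of made-up sample strings; it is a sanity check under those assumptions, not a proof, and `span_or_none` is a hypothetical helper.

```python
import re

long_pat = re.compile(r'.*.*.*.*foo.*.*')
short_pat = re.compile(r'.*foo.*')

def span_or_none(pattern, text):
    """Return the (start, end) of the first match, or None if there is none."""
    m = pattern.search(text)
    return m.span() if m else None

# Illustrative inputs only; they are not taken from the question.
samples = ['foo', 'xxfooyy', 'bar', '', 'fo o', 'line one\nfoo here']

for text in samples:
    print(repr(text), span_or_none(long_pat, text), span_or_none(short_pat, text))
```

The collapsed form is still the safer choice, since the regex engine has fewer redundant ways to split the text between adjacent `.*` pieces when a match fails.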

I think you can just use re.sub and (\.\*)+ as your regex:

    >>> import re
    >>> s = 'foo.*.*.*.*bar.*'
    >>> s2 = '.*foobar.*.*.*.*.* '
    >>> k = ".*.*.*.*.*foo.*"
    >>> re.sub(r'(\.\*)+', '.*', s)
    'foo.*bar.*'
    >>> re.sub(r'(\.\*)+', '.*', s2)
    '.*foobar.* '
    >>> re.sub(r'(\.\*)+', '.*', k)
    '.*foo.*'

Using a character set in this case is not correct, since it would also match *. for instance rather than .*.
– UnbearableLightness
Aug 10 at 22:21


You're right, my mistake, edited
– sacul
Aug 10 at 22:22
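To illustrate the character-set point from the comments: assuming the earlier version of this answer used something like `[.*]+` (the original pattern is not shown here, so that is a guess), this sketch compares it with the grouped pattern on a few made-up inputs.

```python
import re

char_set = re.compile(r'[.*]+')   # any run of "." and "*" characters, in any order
group = re.compile(r'(?:\.\*)+')  # one or more repeats of the exact pair ".*"

# Illustrative inputs: only the first contains a real run of ".*".
for text in ['foo.*.*bar', 'foo*.bar', 'a..b', 'x**y']:
    print(text, '|', char_set.sub('.*', text), '|', group.sub('.*', text))
```

The character set rewrites "*.", ".." and "**" as ".*", which changes the meaning of the pattern being cleaned up, while the grouped version leaves them untouched.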
