Storing Python RegEx multiple groups




I'm web scraping a site using Python. The returned results have the following format (https://regex101.com/r/irr14u/10). Everything works fine apart from the last case, where I get two matches for the dates (first match: Thur.-Sun., Tue., Wed.; second match: Mon.).



I'm using the following code to get the values that I want. I use BeautifulSoup to get the movieDate string, but here I have hardcoded it.


import re

movieDate = "Thur.-Sun., Tue., Wed.: 20.50/ 23.00, Mon. 23.00"

# re.match returns a single match object, so only the first group pair is available
match = re.match(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)', movieDate)
weekDays = match.groupdict()['weekDays']
startTime = match.groupdict()['startTime']



I want to create a dictionary as follows (it has two keys because there are two startTime values):
The first key will be Thur.-Sun., Tue., Wed. with value 20.50/ 23.00,
and the second key will be Mon. with value 23.00.
There might be cases with one key or with more than two keys. So the dictionary will be as follows:


dictionary = {'Thur.-Sun., Tue., Wed.': '20.50/ 23.00', 'Mon.': '23.00'}



Any suggestions for achieving that in a non-buggy way?





Can you provide details on where you are stuck and how things work?
– Mike Tung
Aug 8 at 17:37





I don't know how to create the dictionary, and especially how to get the multiple regex matches.
– sotokan80
Aug 8 at 17:56





Can you provide a minimal complete verifiable example? Please read this
– Mike Tung
Aug 8 at 18:23




1 Answer



You can achieve the desired output using the finditer function, adding the captured groups of each match to a dict as you iterate.



Python snippet:


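import re

movieDate = """
Thur.-Sun., Tue., Wed.: 20.50/ 23.00, Mon. 23.00
"""

d = {}
# Raw string keeps the \d and \n escapes intact for the regex engine
r = re.compile(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)')
# finditer yields one match object per non-overlapping match
for m in r.finditer(movieDate):
    d[m.group(1)] = m.group(2)

print(d)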



Prints:


{'Thur.-Sun., Tue., Wed.': '20.50/ 23.00', 'Mon. ': '23.00'}
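
Note that the second key in the printed result carries a trailing space ('Mon. '), because the weekDays group consumes the blank before the time. If that matters, one possible variation is to strip both groups and address them by name. The following is only a sketch under that assumption; the helper name parse_showtimes is hypothetical and not part of the original answer.

```python
import re

# Same pattern as in the answer, written as a raw string.
PATTERN = re.compile(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)')


def parse_showtimes(text):
    """Map each weekday group to its start times (hypothetical helper)."""
    return {
        # strip() removes the blank that the weekDays group picks up before a time
        m.group('weekDays').strip(): m.group('startTime').strip()
        for m in PATTERN.finditer(text)
    }


movieDate = "Thur.-Sun., Tue., Wed.: 20.50/ 23.00, Mon. 23.00"
showtimes = parse_showtimes(movieDate)
print(showtimes)
```

Because the dict is built from however many matches finditer yields, the same helper should cover strings with one weekday group or with more than two.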





Thank you very much, it worked fine. Another thing I want to ask: weekDays=re.match(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)', movieDate).groupdict()['weekDays'] prints only the first match. How do I have to modify it to print all the matches?
– sotokan80
Aug 8 at 19:09






You are welcome. I believe that is because re.match is anchored at the beginning of the string. An alternative is to use re.search instead. You may refer to this answer for more details.
– UnbearableLightness
Aug 8 at 19:15
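
For reference, both re.match and re.search return at most one match object, so neither lists every match on its own; re.finditer (or re.findall) is what iterates over all of them. A minimal sketch of the difference, reusing the pattern from the answer (the variable names first_by_match, first_by_search and all_week_days are illustrative only):

```python
import re

pattern = re.compile(r',? *(?P<weekDays>[^\d:\n]+):? *(?P<startTime>[^,\n]+)')
movieDate = "Thur.-Sun., Tue., Wed.: 20.50/ 23.00, Mon. 23.00"

# match: a single match, which must begin at the start of the string
first_by_match = pattern.match(movieDate).group('weekDays')

# search: a single match, found anywhere in the string
first_by_search = pattern.search(movieDate).group('weekDays')

# finditer: one match object per non-overlapping match
all_week_days = [m.group('weekDays').strip() for m in pattern.finditer(movieDate)]

print(first_by_match)
print(first_by_search)
print(all_week_days)
```

With this input, match and search both return the first weekday group, while the finditer loop collects both groups.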


