Aggregating JSON elements by sub-string




I have the following structure:


[
  {
    "name": "a-v1",
    "date": "2018-05-08T08:40:35.000Z"
  },
  {
    "name": "a-v2",
    "date": "2018-05-20T08:40:35.000Z"
  },
  {
    "name": "a-v3",
    "date": "2018-05-22T08:40:35.000Z"
  },
  {
    "name": "b-v1",
    "date": "2018-02-08T08:40:35.000Z"
  },
  {
    "name": "b-v2",
    "date": "2018-05-08T08:40:35.000Z"
  },
  {
    "name": "b-v3",
    "date": "2018-05-10T08:40:35.000Z"
  },
  {
    "name": "c-v1",
    "date": "2018-10-08T08:40:35.000Z"
  },
  {
    "name": "c-v2",
    "date": "2018-11-08T08:40:35.000Z"
  },
  {
    "name": "d-v1",
    "date": "2018-08-08T08:40:35.000Z"
  }
]



Each name is made up of a type and a version (in a-v1, for example, a is the type and v1 is the version).




How can I create a list of all the names that are not among the 2 latest versions of their type?
In our case, the output would be:



a-v1
b-v1



Any idea how to do that in Python? I've been thinking about counting sub-strings: for example, using - as a delimiter and counting how many times the left side of the string (a, b, c) occurs. Is it possible to implement such a thing in Python? Any better ideas?






I don't see any problem with the approach you proposed.
– apple apple
12 mins ago






Should the output also contain a-v3, d-v1, ...? Why only a-v1 and b-v1?
– newbie
12 mins ago





Or you could use something like a priority queue with a size limit, though that may be overkill.
– apple apple
7 mins ago






@newbie I have 3 versions of a, and I want to keep only the 2 latest versions, so the output would be a-v1 (which is the oldest version). The same goes for b. As for c and d, I don't have more than 2 versions of each, so the output would be empty for them.
– Omri
5 mins ago







Do you sort by a postfix like v1, or do you account for dates as well? Do you need to check that the order of the v-something suffixes matches the date order?
– EPo
4 mins ago





2 Answers



The problem would be easier with a slightly different data format.



You didn't write any code so I won't give you a complete answer:


data = [{'name': 'a-v1', 'date': '2018-05-08T08:40:35.000Z'}, {'name': 'a-v2', 'date': '2018-05-20T08:40:35.000Z'}, {'name': 'a-v3', 'date': '2018-05-22T08:40:35.000Z'}, {'name': 'b-v1', 'date': '2018-02-08T08:40:35.000Z'}, {'name': 'b-v2', 'date': '2018-05-08T08:40:35.000Z'}, {'name': 'b-v3', 'date': '2018-05-10T08:40:35.000Z'}, {'name': 'c-v1', 'date': '2018-10-08T08:40:35.000Z'}, {'name': 'c-v2', 'date': '2018-11-08T08:40:35.000Z'}, {'name': 'd-v1', 'date': '2018-08-08T08:40:35.000Z'}]
temp = [d['name'].split('-') for d in data]
# [['a', 'v1'], ['a', 'v2'], ['a', 'v3'], ['b', 'v1'], ['b', 'v2'], ['b', 'v3'], ['c', 'v1'], ['c', 'v2'], ['d', 'v1']]
versions = [(letter, int(v[1:])) for letter, v in temp]
sorted(versions)



It outputs:


[('a', 1),
('a', 2),
('a', 3),
('b', 1),
('b', 2),
('b', 3),
('c', 1),
('c', 2),
('d', 1)]



You could now try to use itertools.groupby to group the versions by letter and, for each group, select every version except the last two.
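One possible way to finish this off is sketched below. It starts from the sorted (letter, version) pairs shown in the output above; the constant KEEP and the list name outdated are hypothetical names introduced only for this illustration, and the code assumes every name follows the letter-vN pattern.

```python
from itertools import groupby

# Sorted (letter, version) pairs, as printed by sorted(versions) above.
versions = [('a', 1), ('a', 2), ('a', 3),
            ('b', 1), ('b', 2), ('b', 3),
            ('c', 1), ('c', 2), ('d', 1)]

KEEP = 2  # hypothetical constant: number of newest versions to keep per letter

outdated = []
# groupby only merges adjacent items, so the input must already be sorted by letter.
for letter, group in groupby(versions, key=lambda pair: pair[0]):
    numbers = [number for _, number in group]
    # Everything except the last KEEP version numbers counts as outdated.
    for number in numbers[:-KEEP]:
        outdated.append(f"{letter}-v{number}")

print(outdated)  # ['a-v1', 'b-v1']
```

Groups with two or fewer versions (c and d here) contribute nothing, because slicing with [:-2] yields an empty list for them.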




Assuming your list L is pre-sorted, you can use itertools.groupby:



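from itertools import chain, groupby
from operator import itemgetter

# Group the names by the part before '-' and drop the last two names of each group.
groups = [list(vals)[:-2] for _, vals in groupby(map(itemgetter('name'), L),
                                                 key=lambda x: x.split('-')[0])]
# Discard the empty groups and flatten the rest into a single list.
res = list(chain.from_iterable(filter(None, groups)))

# ['a-v1', 'b-v1']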
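If L is not guaranteed to be pre-sorted, a variant along the following lines might help. It is only a sketch under two assumptions that the question does not confirm: that "latest" means the most recent date, and that every date is an ISO 8601 UTC timestamp in the same format (so plain string comparison orders them chronologically). The function name older_than_latest and its keep parameter are hypothetical.

```python
from collections import defaultdict

# The question's data, deliberately shuffled to show that no pre-sorting is needed.
L = [
    {'name': 'b-v3', 'date': '2018-05-10T08:40:35.000Z'},
    {'name': 'a-v2', 'date': '2018-05-20T08:40:35.000Z'},
    {'name': 'd-v1', 'date': '2018-08-08T08:40:35.000Z'},
    {'name': 'a-v1', 'date': '2018-05-08T08:40:35.000Z'},
    {'name': 'c-v2', 'date': '2018-11-08T08:40:35.000Z'},
    {'name': 'b-v1', 'date': '2018-02-08T08:40:35.000Z'},
    {'name': 'a-v3', 'date': '2018-05-22T08:40:35.000Z'},
    {'name': 'c-v1', 'date': '2018-10-08T08:40:35.000Z'},
    {'name': 'b-v2', 'date': '2018-05-08T08:40:35.000Z'},
]


def older_than_latest(records, keep=2):
    """Return the names that are not among the `keep` newest records of their type."""
    by_type = defaultdict(list)
    for record in records:
        # The type is the part of the name before the first '-'.
        type_ = record['name'].partition('-')[0]
        by_type[type_].append(record)

    result = []
    for type_ in sorted(by_type):
        # Assumption: identically formatted ISO 8601 strings sort chronologically.
        ordered = sorted(by_type[type_], key=lambda r: r['date'])
        cutoff = max(len(ordered) - keep, 0)
        result.extend(r['name'] for r in ordered[:cutoff])
    return result


print(older_than_latest(L))  # ['a-v1', 'b-v1']
```

Grouping with a dictionary avoids the adjacency requirement of groupby, at the cost of sorting each group separately.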





