PowerShell Parsing Name and Version out of a String

I want to parse the name and the version out of a string.
The strings follow this schema:
EntityFramework.6.2.0
EntityFramework.Functions.1.4.1
What I want is an array or an object containing the name of the package and the version.
The version number can have 1, 2, 3 or 4 numeric parts, and the name itself can also contain ".".
$version = @()
$name = @()
"EntityFramework.Functions.1.4.1".Split('.') | % {
    # Purely numeric segments belong to the version; everything else to the name.
    if ($_ -match "^\d+$") {
        $version += $_
    } else {
        $name += $_
    }
}
$name -join "."
$version -join "."
This works, but I think there is a better way to do it.
Any ideas on how to shorten this snippet or make it smarter?
3 Answers
This can be improved by just relying on regex from the start:
$null = 'EntityFramework.Functions.1.4.1' -match '(?<name>[^\d]+)(?<version>\d.+)'
$name, $version = $Matches['name'].TrimEnd('.'), [version]$Matches['version']
$name
>> EntityFramework.Functions
$version
>> Major Minor Build Revision
>> ----- ----- ----- --------
>> 1 4 1 -1
Explained:
( // Capture a group
?<name> // Name it "name"
[^\d]+ // Capture until you find a digit
) // End capture group
( // Capture a group
?<version> // Name it "version"
\d.+ // Start at a digit and wildcard catch everything after
) // End capture group
Shortened (for haxxorz):
if ('EntityFramework.Functions.1.4.1' -match '(.*?(?=\.\d))\.(.+)') {
    $name, [version]$version = $matches[1, 2]
}
(gottagoshort):
$name,$version='EntityFramework.Functions.1.4.1'-split'(?<=[^\d])\.(?=\d)'
Nice, this looks smart. Thank you for the improvement.
– Wimpy
Aug 10 at 19:33
@Wimpy I updated my answer with the if check, but I also suggest looking at mklement0's answer for an even better solution.
– TheIncorrigible1
Aug 10 at 20:19
I found a special case that does not work with your improvement. Could you check this string with your regex solution: System.Security.Cryptography.X509Certificates.4.3.2
– Wimpy
Aug 11 at 8:13
Note: This is an optimized variation of the original answer below, courtesy of TheIncorrigible1.
By using the -split operator with a separator regex that uses lookaround assertions, it is possible to split the string in the desired location with a single operation:
# Stores 'EntityFramework.Functions' in $name
# and '1.4.1' in $version
$name, $version = "EntityFramework.Functions.1.4.1" -split '(?<=[^\d])\.(?=\d)'
(?<=[^\d])\.(?=\d) uses a look-behind assertion ((?<=...)) and a look-ahead assertion ((?=...)) to provide the desired context for matching the literal . (\.):
The regex matches the . only if it is preceded by a character that is not a digit ([^\d]) and followed by a digit (\d), which is where we want to split: between the end of the package name and the start of the version number.
Regex assertions in general do not capture characters, so even though the surrounding characters are looked at, only the . is considered the separator, ensuring that the tokens on either side of it are returned in full.
The result of the -split operation is a 2-element array, whose elements can be assigned to individual variables via a destructuring assignment ($name, $version = ...).
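As a rough check of the look-around split described above, the following sketch runs it over the two strings from the question plus the special case raised in the comments. The variable names ($samples, $results) and the Name/Version property names are illustrative choices, not part of the original answer.

```powershell
# Sample inputs; the third is the special case mentioned in the comments.
$samples = 'EntityFramework.6.2.0',
           'EntityFramework.Functions.1.4.1',
           'System.Security.Cryptography.X509Certificates.4.3.2'

$results = foreach ($s in $samples) {
    # Split at the '.' that has a non-digit before it and a digit after it.
    $name, $version = $s -split '(?<=[^\d])\.(?=\d)'
    [pscustomobject]@{ Name = $name; Version = $version }
}

$results
```

Because the split point requires a non-digit before the dot and a digit after it, the digits inside X509Certificates do not trigger a split. A name segment that itself starts with a digit would produce more than two tokens, so this sketch assumes that case does not occur.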
Original answer:
Note: While the regex used below is slightly shorter than the one above, its interplay with -split is actually conceptually more complex, and the solution requires an additional operation to filter out an empty result element (-ne '').
A more concise solution that uses the -split operator with a regex (regular expression):
# Stores 'EntityFramework.Functions' in $name
# and '1.4.1' in $version
$name, $version = "EntityFramework.Functions.1.4.1" -split '^([^\d]+)\.' -ne ''
^([^\d]+)\. starts matching at the start of the string (^) and matches one or more (+) non-digit characters ([^\d]) followed by a literal . (\.).
This matches EntityFramework.Functions., but, due to enclosing only the part before the trailing . in (...) to form a capture group, only EntityFramework.Functions is returned.
(By default, what the separator regex matches is not returned - after all, you just want the tokens between the separators - but a capture group embedded in the regex can be used to deliberately include part of the separator in the result array.)
The separator regex is by definition not found again in the input string (because it is anchored at the start of the string with ^), so the remainder of the string - 1.4.1 - is considered the 2nd and only remaining token.
-ne '' filters out the empty first element of the resulting array, which is a side effect of the string starting with a match of the separator regex.
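To make the empty first element visible, here is a small sketch that performs the split and the filtering as two separate steps. The intermediate variables $parts and $filtered are introduced only for illustration.

```powershell
# Split with the anchored regex; the capture group keeps the name in the result.
$parts = 'EntityFramework.Functions.1.4.1' -split '^([^\d]+)\.'

# The input starts with a separator match, so the first token is an empty string.
$parts.Count

# With an array on the left-hand side, -ne acts as a filter and drops the empty string.
$filtered = $parts -ne ''
$name, $version = $filtered
$name
$version
```

Here $parts holds three elements (the empty string, the captured name, and the version), and $filtered holds only the last two.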
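By contrast, when the input does not start with a separator match, no empty element is produced: 'foo,bar;baz' -split '[,;]' yields 'foo', 'bar', 'baz'.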
Will -split always generate an empty string?
– TheIncorrigible1
Aug 10 at 19:56
@TheIncorrigible1: No, only if the string starts with something that matches the separator regex.
– mklement0
Aug 10 at 19:57
@TheIncorrigible1: I've added an explanation to your variant, but I've also kept the original answer, because juxtaposing the two approaches may be interesting. The look-around regex, while slightly longer, is preferable not only because you don't need the -ne '', but also because it is conceptually simpler in the context of -split. Thanks for making this answer better.
– mklement0
Aug 10 at 20:19
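@(
    'EntityFramework.6.2.0',
    'EntityFramework.Functions.1.4.1'
) | % {
    [pscustomobject]@{
        # Strip the trailing version (from the first dot followed by a digit).
        name = $_ -replace '\.([0-9]).*([0-9])$'
        # Strip the leading name (up to the last letter followed by a dot).
        version = $_ -replace '^([A-Za-z]).*([A-Za-z])\.'
    }
}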
This separates each item based on a group of character types.
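As a quick check, the same two -replace patterns (with the literal dots escaped) can be tried against the special case mentioned in the comments. This is only a sketch; the variable names $id and $package are hypothetical.

```powershell
$id = 'System.Security.Cryptography.X509Certificates.4.3.2'

$package = [pscustomobject]@{
    # Remove the trailing '.<digit> ... <digit>' run, leaving the name.
    Name    = $id -replace '\.([0-9]).*([0-9])$'
    # Remove everything up to the last letter that is followed by a dot.
    Version = $id -replace '^([A-Za-z]).*([A-Za-z])\.'
}

$package
```

Note that the name pattern needs at least two digits after the dot, so a single-digit, one-part version such as Foo.6 would be left unchanged by it.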
Ok, I added my working solution.
– Wimpy
Aug 10 at 19:19