extract value from the last row in table in shell script [closed]
Clash Royale CLAN TAG#URR8PPP
extract value from the last row in table in shell script [closed]
I have a file (data.txt) with below content. It has multiple rows in it separated by sequence of -
. It looks like a graphical table put in a file. In the below file, first row has all the column names and all other rows are the actual data for all those columns.
-
Connecting to the ControlService endpoint
Found 3 rows.
Requests List:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Client ID | Client Type | Service Type | Status | Trust Domain | Data Instance Name | Data Version | Creation Time | Last Update | Scheduled Time |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866 | ROUTINGTIER_ARTIFACTS | SYSTEM | COMPLETED | RRA Bulk Client | soa_server1 | 18.2.2.0.0 | 2016-06-14 03:49:55 -07:00 | 2016-06-14 03:49:57 -07:00 | --- |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
500333443 | CREATE | [FA_GSI] | COMPLETED | holder | soa_server1 | 18.3.2.0.0 | 2018-08-07 11:59:57 -07:00 | 2018-08-07 12:04:37 -07:00 | --- |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
500333446 | CREATE | [FA_GSI] | COMPLETED | holder-test | soa_server1 | 18.3.2.0.0 | 2018-08-07 12:04:48 -07:00 | 2018-08-07 12:08:52 -07:00 | --- |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Now I want to parse the above file and extract value from the last row. I want to extract value of "Client ID" and "Trust Domain" column in the last row which is:
Client ID: 500333446
Trust Domain: holder-test
Is this possible to do in shell script or perl or python?
Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
If the issue is resolved please mark one of the answers as correct
– mrangry777
Aug 8 at 4:02
Please avoid "Give me the codez" questions. Instead show the script you are working on and state where the problem is. Also see How much research effort is expected of Stack Overflow users?
– jww
Aug 8 at 8:38
4 Answers
4
The solution by @paragbaxi is fine I would only add an condition to filter out the lines containing only "------". Like so:
import csv
lines_to_skip = 4
with open('data.csv', 'r') as f:
reader = csv.reader(f, delimiter='|')
for i in range(lines_to_skip):
next(reader) #Skipping lines
data =
for line in reader:
if line[0].find("---") != 0: #Check what position has symbol "---" if 0 then skip
print(line)
data.append(line)
print("Last row:n".format(data[-1]))
print("Client ID: Domain:".format(data[-1][0].replace(" ",""),data[-1][4].replace(" ",""))) #replace() just removes unnecessary spaces
Output:
[' Client ID ', ' Client Type ', ' Service Type ', ' Status ', ' Trust Domain ', ' Data Instance Name ', ' Data Version ', ' Creation Time ', ' Last Update ', ' Scheduled Time ', ' ']
[' REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866 ', ' ROUTINGTIER_ARTIFACTS ', ' SYSTEM ', ' COMPLETED ', ' RRA Bulk Client ', ' soa_server1 ', ' 18.2.2.0.0 ', ' 2016-06-14 03:49:55 -07:00 ', ' 2016-06-14 03:49:57 -07:00 ', ' --- ', ' ']
[' 500333443 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 11:59:57 -07:00 ', ' 2018-08-07 12:04:37 -07:00 ', ' --- ', ' ']
[' 500333446 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder-test ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 12:04:48 -07:00 ', ' 2018-08-07 12:08:52 -07:00 ', ' --- ', ' ']
Last row:
[' 500333446 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder-test ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 12:04:48 -07:00 ', ' 2018-08-07 12:08:52 -07:00 ', ' --- ', ' ']
Client ID:500333446 Domain:holder-test
Process finished with exit code 0
yeah this make sense as I can see what's the last row. now how can I extract value of
Client ID
and Trust Domain
from last row? Basically it should give me these 500333446
and holder-test
– flash
Aug 8 at 3:48
Client ID
Trust Domain
500333446
holder-test
@flash Yes, check my edited answer, this fragment
.replace(" ","")
replaces spaces so data looks more compacted you could also just use a substring to remove them– mrangry777
Aug 8 at 3:54
.replace(" ","")
yeah this works fine.. thanks for the help.. also if I just need to print domain value without any of its key then will it be like this
print("".format(data[-1][4].replace(" ","")))
. I tried this and it prints nothing. Basically I just want to print holder-test
that's all.– flash
Aug 8 at 4:05
print("".format(data[-1][4].replace(" ","")))
holder-test
You are using print incorrectly it should be like so
print("".format(data[-1][4].replace(" ","")))
the
is a placeholder for where to inject the value in a string– mrangry777
Aug 8 at 4:07
print("".format(data[-1][4].replace(" ","")))
yeah I fixed it after checking that out again.
– flash
Aug 8 at 4:11
Yes, it's possible to do in python. I suggest csv module and customize delimiter to '|'.
import csv
with open('s', 'r') as f:
reader = csv.reader(f, delimiter='|')
for row in reader:
print(row)
Gives the following list:
['Connecting to the ControlService endpoint']
['Found 3 rows.']
['Requests List:']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' Client ID ', ' Client Type ', ' Service Type ', ' Status ', ' Trust Domain ', ' Data Instance Name ', ' Data Version ', ' Creation Time ', ' Last Update ', ' Scheduled Time ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866 ', ' ROUTINGTIER_ARTIFACTS ', ' SYSTEM ', ' COMPLETED ', ' RRA Bulk Client ', ' soa_server1 ', ' 18.2.2.0.0 ', ' 2016-06-14 03:49:55 -07:00 ', ' 2016-06-14 03:49:57 -07:00 ', ' --- ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' 500333443 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 11:59:57 -07:00 ', ' 2018-08-07 12:04:37 -07:00 ', ' --- ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
[' 500333446 ', ' CREATE ', ' [FA_GSI] ', ' COMPLETED ', ' holder-test ', ' soa_server1 ', ' 18.3.2.0.0 ', ' 2018-08-07 12:04:48 -07:00 ', ' 2018-08-07 12:08:52 -07:00 ', ' --- ', ' ']
['-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------']
You could easily skip the first 4 rows in the resulting list.
>>> import csv
... from itertools import islice
...
... with open('infile', 'r') as f:
... reader = csv.DictReader(islice(f, 5, None, 2), delimiter='|')
... result = [
... k.strip(): v.strip() for k, v in line.items() for line in reader
... ]
...
>>> last_row = result[-1]
>>> import json; print(json.dumps(last_row, indent=2))
"Client ID": "500333446",
"Client Type": "CREATE",
"Service Type": "[FA_GSI]",
"Status": "COMPLETED",
"Trust Domain": "holder-test",
"Data Instance Name": "soa_server1",
"Data Version": "18.3.2.0.0",
"Creation Time": "2018-08-07 12:04:48 -07:00",
"Last Update": "2018-08-07 12:08:52 -07:00",
"Scheduled Time": "---",
"": ""
>>> last_row['Client ID']
'500333446'
>>> last_row['Trust Domain']
'holder-test'
One in awk:
awk 'BEGIN"!/^-+/c=$1;t=$5ENDprint "Client ID:" c ORS "Trust Domain:" t' file
Explained:
$ awk '
BEGIN " # pipe-separator
!/^-+/ # process if doesnt start with dashes
c=$1 # client value
t=$5 # trust domain value
END # in the end
print "Client ID:" c ORS "Trust Domain:" t # output the last value pair
' file
Output:
Client ID: 500333446
Trust Domain: holder-test
It's customary to comment when downvoting. Especially since you downvoted everyone.
– James Brown
Aug 8 at 9:00
Yes, it's possible to do in python. I suggest csv module and customize delimiter to '|'.
– paragbaxi
Aug 8 at 3:11