Replace commas except those in quotation marks
Clash Royale CLAN TAG#URR8PPP
Replace commas except those in quotation marks
Date,Time,Ref,Sen,ATN,Flow,PCB temp,Status,Battery,BC
2015/04/23,12:30:00,779581,908043,"-15,254",49,31,0,100,
2015/04/23,12:35:00,778715,907084,"-15,259",49,31,0,100,-127
2015/04/23,12:40:00,778299,906419,"-15,239",49,32,0,100,461
(...)
Hi, I have an ascii file like the one above where I am trying to replace the commas for semicolon. This is the code I am using:
filein = open('Prueba1.txt')
fileout = open('Fin.txt', 'wt')
for line in filein:
if line.startswith('20'):
fileout.write( line.replace(',', ';') )
filein.close()
fileout.close()
The problem is that now I want to maintain the commas for the 5th column and also take of the quotation marks.
Any ideas?
4 Answers
4
I cannot comment because of not having enough reputation points.
Have you considered saving this as a csv file.
after saving it as csv file:
import pandas as pd
v = pd.read_csv("check_delete.csv")
output:
Date Time Ref Sen ATN Flow PCB temp Status Battery BC
0 2015/04/23 12:30:00 779581 908043 -15,254 49 31 0 100 NaN
1 2015/04/23 12:35:00 778715 907084 -15,259 49 31 0 100 -127.0
2 2015/04/23 12:40:00 778299 906419 -15,239 49 32 0 100 461.0
You can read this as a dataframe.
v.to_csv('new.csv',sep=';',index=False)
;
Thanks a lot. This panda library is amazing!
– Kevin Soler Carracedo
Aug 9 at 20:51
This is an answer making use of the split()
method if you do not want to use .csv files.
split()
Let's consider the line to be:
line = '2015/04/23,12:35:00,778715,907084,"-15,259",49,31,0,100,-127'
line = '2015/04/23,12:35:00,778715,907084,"-15,259",49,31,0,100,-127'
First, split the line into three parts. One before the part in the quotes, the quoted part, and the part after the part in quotes.
This can be done by line.split('"')
.
line.split('"')
This will give us a list consisting of these three parts:
alist = ['2015/04/23,12:35:00,778715,907084,', '-15,259', ',49,31,0,100,-127']
alist = ['2015/04/23,12:35:00,778715,907084,', '-15,259', ',49,31,0,100,-127']
Now, split the first and last elements of the list by the comma.
And add all the elements to an empty string.
Write this string to your new file.
Like this:
left_part = alist[0].split(',')
right_part = alist[2].split(',')
middle_part = alist[1]
final_list = left_part + [middle_part] + right_part
new_line = ''
for part in final_list:
# to prevent the empty strings to be added add the if condition.
if part:
new_line += part + ';'
The final result should look like this:
'2015/04/23;12:35:00;778715;907084;-15,259;49;31;0;100;-127;'
'2015/04/23;12:35:00;778715;907084;-15,259;49;31;0;100;-127;'
A solution using the csv
Python standard library:
csv
import csv
with open('example.csv', newline='') as inputfile,
open('parsedcsv.csv', 'w', newline='') as outpufile:
datareader = csv.reader(inputfile, delimiter=',', quotechar='"')
csvwriter = csv.writer(outpufile, delimiter=';',
quotechar="'", quoting=csv.QUOTE_MINIMAL)
csvwriter.writerows(datareader)
Quote from the documentation about the QUOTE_MINIMAL option:
csv.QUOTE_MINIMAL: Instructs writer objects to only quote those fields
which contain special characters such as delimiter, quotechar or any
of the characters in lineterminator.
The output file is:
Date;Time;Ref;Sen;ATN;Flow;PCB temp;Status;Battery;BC
2015/04/23;12:30:00;779581;908043;-15,254;49;31;0;130;
2015/04/23;12:35:00;778715;907084;-15,259;49;31;0;100;-127
2015/04/23;12:40:00;778299;906419;-15,239;49;32;0;100;461
Could you post what you are looking for as a result example? What do you mean by also take of the quotation marks
also take of the quotation marks
This might help
import re
x=
'''
015/04/23,12:30:00,779581,908043,"-15,254",49,31,0,100,
2015/04/23,12:35:00,778715,907084,"-15,259",49,31,0,100,-127
2015/04/23,12:40:00,778299,906419,"-15,239",49,32,0,100,461
'''
print re.sub(r"(,)(?!")", ";", x)
015/04/23;12:30:00;779581;908043,"-15;254";49;31;0;100;
2015/04/23;12:35:00;778715;907084,"-15;259";49;31;0;100;-127
2015/04/23;12:40:00;778299;906419,"-15;239";49;32;0;100;461
Just do an regex replace each line.
, matches the character , literally (case sensitive)
, matches the character , literally (case sensitive)
Negative Lookahead (?!")
Negative Lookahead (?!")
Assert that the Regex below does not match
Assert that the Regex below does not match
" matches the character " literally (case sensitive)
" matches the character " literally (case sensitive)
To me, it is more of a comment rather than an answer. Also, OP does not want to replace all the
,
with ;
– mad_
Aug 8 at 18:41
,
;
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
v.to_csv('new.csv',sep=';',index=False)
will save it as a new csv file with;
as the delimiter.– Pratik Kumar
Aug 8 at 18:49