I'm trying to process a CSV file to make it easier to sort, and I need to remove the time and the dash from each entry. The file has entries like this:
James,07/20/2009-14:40:11
Steve,08/06/2006-02:34:37
John,11/03/2008-12:12:34
and I want to turn them into this:
James,07/20/2009
Steve,08/06/2006
John,11/03/2008
I'm guessing sed is the right tool for this job?
Thanks for your help.
From stackoverflow
-
cut -d '-' -f 1 < /path/to/your/file
Edit after comment:
sed 's/-[0-9][0-9]:[0-9][0-9]:[0-9][0-9]//g' < /path/to/your/file
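The original cut command splits on the first dash in the line, so a dash inside the name truncates the whole row. Here is a minimal Python sketch of the difference, assuming the time is always the last dash-separated piece; the helper names and the sample row are made up for illustration:

```python
def cut_at_first_dash(line):
    # Mimics `cut -d '-' -f 1`: keep everything before the first dash.
    return line.split("-", 1)[0]


def cut_at_last_dash(line):
    # Keep everything before the last dash, so dashes in the name survive.
    head, sep, _ = line.rpartition("-")
    return head if sep else line


row = "Al-Ashrad,07/20/2009-14:40:11"  # hypothetical row, not from the real file
print(cut_at_first_dash(row))  # Al
print(cut_at_last_dash(row))   # Al-Ashrad,07/20/2009
```

The sed edit gets the same protection a different way: it only matches a dash that is followed by an hh:mm:ss pattern.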
nmuntz : what if the name contains a dash?
Alberto Zaccagni : If the name is something like Al-Ashrad then the output will be Al, which is wrong, thank you for pointing that out. I edited accordingly.
-
Python
import csv
import datetime

# Python 3: open CSV files in text mode with newline=""
with open("someFile.csv", newline="") as src:
    rows = list(csv.reader(src))

def byDateTime(aRow):
    # Second column looks like "07/20/2009-14:40:11"
    return datetime.datetime.strptime(aRow[1], "%m/%d/%Y-%H:%M:%S")

rows.sort(key=byDateTime)

with open("sortedFile.csv", "w", newline="") as dst:
    csv.writer(dst).writerows(rows)
-
Just use awk:
awk -F"," '{ split($2,_,"-"); print $1,_[1] }' OFS="," file
-
Yes, I think sed is the right tool for the job:
sed 's/-[:0-9]*$//' file
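The pattern is anchored to the end of the line, so it only removes a trailing dash followed by digits and colons and leaves any dash in the name alone. For anyone who would rather stay in Python, a rough equivalent might look like this; the names strip_time and TIME_SUFFIX and the second sample row are just illustrative:

```python
import re

# Same pattern as the sed command: a dash followed by digits/colons at end of line.
TIME_SUFFIX = re.compile(r"-[:0-9]*$")


def strip_time(line):
    # Remove the trailing "-hh:mm:ss" part, if there is one.
    return TIME_SUFFIX.sub("", line)


samples = ["James,07/20/2009-14:40:11", "Al-Ashrad,08/06/2006-02:34:37"]
for line in samples:
    print(strip_time(line))
# James,07/20/2009
# Al-Ashrad,08/06/2006
```

Like the sed version, this is purely textual and never checks that the stripped part is a valid time.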