[lug] awk question

Joseph McDonald joem at uu.net
Mon Jan 27 14:35:23 MST 2003


Harris, James said:
> 
> Hi all --
> 
> Kinda stumped here.  I have a .csv file that I need to parse (using awk,
> preferably) that uses quotes to protect "real" commas in the data.  How can
> I deal with this in awk?  I've dug around the awk documentation and the
> closest idea that I have is to compose some type of regex for it, but I'm
> also stumped on how I would describe this scenario with a regex?
> 
> How can I tell awk that ',' is the FS except when found within ""?  Is this
> going to be a potentially huge nightmare?  Perl really isn't an option cuz
> I don't know it (and don't really have the time to learn it -- reason #2,345
> that I need to learn perl).  The option to re-export the data with another
> FS is there, but not easy or quick... so if I can tell awk to deal with it
> quickly, then it's worth it, otherwise I'll take the human and compute
> cycles to re-export it.

Hmm..

Perhaps I'm missing something here:

nemo at otho.scare:[~]:% cat test.txt
"one","two","three,four","five"
"one","two","three,four","five"
"one","two","three,four","five"
"one","two","three,four","five"
"one","two","three,four","five"
nemo at otho.scare:[~]:% cat test.txt | awk -F\",\" '{print $3}'
three,four
three,four
three,four
three,four
three,four
nemo at otho.scare:[~]:%
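
One caveat with that separator: the first field keeps its leading quote
and the last field keeps its trailing quote, since neither is part of a
"," sequence.  A rough sketch that strips them, assuming every field is
quoted like in test.txt (strip_outer_quotes is just a name I made up for
illustration):

```shell
# Hypothetical helper: split on the three-character sequence ","
# and remove the quotes left over on the first and last fields.
# Assumes every field on every line is quoted.
strip_outer_quotes() {
    awk -F'","' '{
        sub(/^"/, "", $1)      # leading quote on the first field
        sub(/"$/, "", $NF)     # trailing quote on the last field
        print $1 " | " $3 " | " $NF
    }'
}

# Usage: feed it one line shaped like the sample data.
printf '%s\n' '"one","two","three,four","five"' | strip_outer_quotes
```

That should print the first, third and last fields with no stray quotes.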

Maybe your data isn't delimited like my example above.
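
If only some of the fields are quoted, the -F trick won't work, and you
probably have to walk each line field by field.  Here is a minimal sketch
of that idea in plain awk.  csv_field is a made-up helper name, and the
sketch assumes no newlines inside fields, no doubled ("") quotes inside
quoted fields, and it does not count a trailing empty field:

```shell
# Hypothetical helper: csv_field N prints field N of each CSV line on stdin.
# A field is either "quoted text" (commas allowed inside) or plain text
# up to the next comma.
csv_field() {
    awk -v want="$1" '{
        line = $0
        n = 0
        while (line != "") {
            if (match(line, /^"[^"]*"/)) {
                # quoted field: keep the text between the quotes
                field = substr(line, 2, RLENGTH - 2)
            } else {
                # unquoted field: everything up to the next comma
                match(line, /^[^,]*/)
                field = substr(line, 1, RLENGTH)
            }
            n++
            if (n == want) { print field; break }
            line = substr(line, RLENGTH + 1)   # drop the field just read
            sub(/^,/, "", line)                # drop the separating comma
        }
    }'
}

# Usage: third field of a line that mixes quoted and unquoted fields.
printf '%s\n' 'one,"two","three,four",five' | csv_field 3
```

It's slower than letting awk split on FS, but it doesn't care whether a
given field is quoted or not.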

	--joey


