[lug] Convert JSON to CSV - adding header line

Davide Del Vento davide.del.vento at gmail.com
Fri Jan 15 18:56:38 MST 2021


Not an answer to your question, more of a meta-answer or $0.02 if you wish.

I think for things like this (any text parsing and reformatting) is better
to use a (well written and well documented) python script. IMHO, the best
way to write and document it is with individual functions and unit tests.
So you don't write a comment saying "this function spits out this output
for this input", instead you write a piece of code which asserts that the
expected output is actually spit out from that function when called with
that input.
Somebody may say it's a bloated way to do such little stuff, but then your
future self (or your Win friend) will thank you when they would need to
change one details, will be able to do it in a second, and the unit test
will tell them that yes, that one detail has changed as expected, but also
three other things have changed in unexpected ways.

On Fri, Jan 15, 2021 at 6:43 PM Jed S. Baer <blug at jbaer.cotse.net> wrote:

> Hi Everyone. Oh, so much fun we're having now!
>
> I'm off in the weeds here, looking at taking the ham DMR users database,
> available as a JSON file, and converting it to CSV. "DMR" is "Digital
> Mobile Radio". Yes, it's also available as a CSV, but the people producing
> it aren't enclosing the fields in quotation marks, so when the value
> includes a comma, it's problematic. GIGO! Grabbing it as JSON, then
> converting to CSV makes a better dataset, and it's also a time when I can
> selectively grab fields.
>
> So.... my incantation is:
> jq -r '.users[] | [.radio_id, .callsign, .fname, .surname, .city, .state,
> .country, .remarks] | @csv'
>
> And this works fine, even for 137K rows. Yes, the whole JSON file is one
> big JSON array, which seems odd to me, but whatever.
>
> However, in trying to find how to tack on a header row with column names,
> it's getting murky. The std. answers on StackOverflow, and elsewhere, all
> use the map command, plus a unique function, and assume that the intent is
> to grab all the columns. I don't want that, and in fact, have to slightly
> rename the fields anyway.
>
> Yes, I can make a file with the column headers, and just use cat to append
> that with the output of jq, but it seems to me that there ought to be a way
> to create a literal array that jq will include as the first line in the
> output. I'm thinking that somehow, that's what happens with all the piping
> and such in the examples using the map filter - somehow, this resolves to
> an array that winds up being the first line, before the converted data.
> Also, I'm hoping to hand this solution to another ham who will be using it
> in Windoze, so the more self-contained it is, the better.
>
> The first line is:
> RADIO_ID,CALLSIGN,FIRST_NAME,LAST_NAME,CITY,STATE,COUNTRY,REMARKS
>
> I actually don't know that it has to be all caps - the software used to
> program the radio is very poorly documented. I also don't know whether
> it'll barf if the above field names are quoted. Hmmm, I also don't whether
> it'll barf if any of the field values are quoted. Well, once I have the CSV
> file in a better state, and can deal with that.
>
> Thanks for any advice.
> --
> All operating systems suck, but Linux just sucks less
>  - Linus Torvalds
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: irc.hackingsociety.org port=6667 channel=#hackingsociety
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.lug.boulder.co.us/pipermail/lug/attachments/20210115/201ceaff/attachment.html>


More information about the LUG mailing list