I gave a talk in November to a local high school about computer science as a career field. Aha, I thought – I’ve given this talk before – I’ll just brush up my well-prepared slide deck.
Well, now it’s 2021 and the slide information needs to be updated. Mr. van der Ende has not updated his image, but he was kind enough to make available his source code and a handy README file that walks (loosely) through how to get the data.
Challenges solved so far:
- getting access to BigQuery
- finding new sources of the data, since the dataset van der Ende references doesn’t seem to exist anymore
- making BigQuery convinced that I have permission to run queries
- updating the query to match the new data source, including figuring out how to flatten arrays – which was not part of his original flow
- downloading MySQL to my developer machine and setting up a database and username/password combo
- updating van der Ende’s code to read directly from a CSV, rather than assuming I’m using a JSON file
- getting PHP to work on my developer workstation – this particular box has done lots of things for me lately, but PHP hasn’t been one of them
- figuring out how to populate the languages list the code asked for, given the languages represented in the dataset I downloaded. (For the record, awk, sort, and uniq were the happy combo.)
- uh, figuring out a better way to ingest the CSV, since pulling in the full file at once took up too much memory for my computer
- (more to come undoubtedly to get it working…)
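The last two solved items (building the languages list and not loading the whole CSV at once) share one idea: read the file a row at a time and keep only a running tally. Here is a minimal sketch of that idea in Python, not van der Ende’s PHP code and not necessarily the fix used here. The `language` column name, the `count_languages` helper, and the sample rows are all assumptions made up for illustration.

```python
import csv
import io
from collections import Counter


def count_languages(rows, column="language"):
    """Tally the values of one CSV column, reading one row at a time.

    `rows` is any iterable of text lines (an open file works), so the
    whole file never has to sit in memory. `column` is a hypothetical
    header name; adjust it to match the real export.
    """
    counts = Counter()
    for row in csv.DictReader(rows):
        value = (row.get(column) or "").strip()
        if value:  # skip rows where the column is blank or missing
            counts[value] += 1
    return counts


# Tiny in-memory stand-in for the downloaded CSV (made-up rows).
sample = io.StringIO(
    "repo,language\n"
    "a/one,PHP\n"
    "b/two,Python\n"
    "c/three,PHP\n"
)

counts = count_languages(sample)
languages = sorted(counts)  # distinct names, like `sort | uniq`
print(languages)
```

For a real file, pass an open file handle instead of the `StringIO` object, for example `open("export.csv", newline="")`, where the filename is a placeholder. Memory use then grows with the number of distinct languages rather than with the size of the file.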
Note: I ultimately ran into enough problems with it that I left the original image in place. Bringing this to resolution is still on my to-do list…