ICM Help Session Week 8: Data
Mimi
Today we mostly walked through how to get “live” data from the internet.
- A String is essentially just an array of characters
- Operations on Strings – loadStrings(), join(), split(), match(), matchAll()
- Basic Processing Functions – http://processing.org/reference/
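To make the String operations above concrete, here is a minimal sketch in plain Java (the language Processing is built on). It uses Java's own String.split(), String.join() and regular expressions, which do roughly the same jobs as Processing's split(), join() and match(). The class name, method names and the sample line are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StringBasics {
    // Break one line of text into pieces wherever the delimiter appears.
    static String[] splitWords(String line, String delimiter) {
        // Pattern.quote treats the delimiter as literal text, not as a regex.
        return line.split(Pattern.quote(delimiter));
    }

    // Glue the pieces back together with a separator between them.
    static String joinWords(String[] words, String separator) {
        return String.join(separator, words);
    }

    // Return the first run of digits found in the text, or null if there is none.
    static String firstNumber(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "Beauty and the Beast,1991,G";   // made-up sample line
        String[] parts = splitWords(line, ",");
        System.out.println(Arrays.toString(parts));
        System.out.println(joinWords(parts, " | "));
        System.out.println(firstNumber(line));
        System.out.println(line.charAt(0));            // individual characters by index
    }
}
```

In a Processing sketch you would normally start from loadStrings(), which gives you an array of lines, and then run this kind of splitting and matching on each line.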
Getting Data
Formats
CSV: Comma-Separated Values
- Mostly stuff you download
- Tables, spreadsheets
- Each line break starts a new row of data
- Within each row, column values are separated by commas
- A lot of “static” or periodically released data comes in this format (e.g. government data that’s released once a year)
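The row and column structure described above can be sketched in a few lines of plain Java. This is only a minimal sketch: it assumes no field is wrapped in quotes or contains a comma, which real-world CSV files often violate. The class name and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CsvSketch {
    // Turn CSV text into a list of rows, each row being an array of column values.
    // Assumption: no quoted fields, so a comma always means "next column".
    static List<String[]> parse(String csvText) {
        List<String[]> rows = new ArrayList<>();
        for (String line : csvText.split("\\r?\\n")) {   // one row per line break
            if (line.trim().isEmpty()) continue;         // skip blank lines
            String[] cols = line.split(",", -1);         // -1 keeps empty trailing columns
            for (int i = 0; i < cols.length; i++) {
                cols[i] = cols[i].trim();
            }
            rows.add(cols);
        }
        return rows;
    }

    public static void main(String[] args) {
        String csv = "title,year\nBeauty and the Beast,1991\n";   // made-up sample data
        for (String[] row : parse(csv)) {
            System.out.println(row[0] + " -> " + row[1]);
        }
    }
}
```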
XML: Extensible Markup Language
- Looks like HTML; each piece of data is enclosed between an opening tag and a closing tag
- <TITLE>Beauty and the Beast</TITLE>
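As a rough illustration of reading a value out of tags like the one above, here is a minimal sketch using the XML parser that ships with Java. The wrapping MOVIE element, the class name and the method name are assumptions added for the example; in a Processing sketch you would more likely use its built-in XML helpers.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XmlSketch {
    // Return the text inside the first <tagName>...</tagName>, or null if the tag is absent.
    static String firstTagText(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
        } catch (Exception e) {
            // Malformed XML (e.g. a missing closing tag) ends up here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The outer MOVIE element is made up; XML needs exactly one root element.
        String xml = "<MOVIE><TITLE>Beauty and the Beast</TITLE></MOVIE>";
        System.out.println(firstTagText(xml, "TITLE"));
    }
}
```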
JSON: JavaScript Object Notation
- Data format native to JavaScript
- Very user-friendly in terms of readability
- Data stored as key-value pairs (e.g. Title: Beauty and the Beast)
- Values can be Strings, numbers, true/false, Arrays (lists) or another Object containing its own key-value pairs
- Most “live” constantly updated data is now released in JSON (e.g. Tweets, NYT, etc)
If you’re wondering how to get data, try googling “API and JSON” or “API and XML” along with the name of the service you’re interested in.
We focused mostly on JSON…
Processing doesn’t know about the JSON data format, so you’ll need to use Jer Thorp’s library to parse any JSON you get from the web.
Jer walks through pulling data from the NYT by using the NYT API. At a high-level, here are the main steps:
- DOWNLOAD JSON LIBRARY: Put the entire “json” folder from the zip file into Processing>>libraries.
- Go get yourself an API Key from the NYT. (You’ll need to log in with a NYT account.) The API key is basically a big long string of letters and numbers, and it’s how the NYT associates the data requests you’re making with you (in case you do something bad).
- Try running this example.
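To show where the API key ends up, here is a minimal sketch of building a request URL in plain Java. The endpoint and the parameter names ("q" and "api-key") are hypothetical placeholders, not taken from the NYT documentation, so check the query documentation for the real ones. The point is only that the key travels along as one more parameter on the URL you request.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class ApiRequestSketch {
    // Hypothetical endpoint used for illustration only, NOT the real NYT address.
    static final String BASE_URL = "https://api.example.com/search.json";

    // Build the full URL for a search, attaching the API key as a query parameter.
    // The parameter names "q" and "api-key" are assumptions for this sketch.
    static String buildUrl(String query, String apiKey) {
        return BASE_URL
                + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&api-key=" + URLEncoder.encode(apiKey, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "YOUR_KEY_HERE" stands in for the long string the service gives you.
        System.out.println(buildUrl("beauty and the beast", "YOUR_KEY_HERE"));
    }
}
```

Encoding the search text matters because spaces and punctuation are not allowed in a URL as-is. In Processing, you could pass a URL like this to loadStrings() to fetch the response.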
Resources
- Your JSON data is going to come back as one continuous string of characters. Use this to format it nicely with line breaks and indentation so it’s easier to read.
- Dan’s latest write up on working with text and data, updated for Processing 2.0.
- Documentation on how to construct queries for getting NYT data.
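If you want to see what “formatting JSON nicely” involves, here is a minimal sketch of a pretty-printer in plain Java. It is not the tool linked above and not a real JSON parser: it assumes the input is already valid JSON and simply adds line breaks and two-space indentation around braces, brackets and commas, leaving anything inside quotes untouched. The class and method names are made up.

```java
class JsonPretty {
    // Re-indent a compact JSON string. Assumes the input is valid JSON.
    static String format(String json) {
        StringBuilder out = new StringBuilder();
        int depth = 0;             // how many { or [ we are currently inside
        boolean inString = false;  // true while between a pair of double quotes
        boolean escaped = false;   // true right after a backslash inside a string
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c);     // copy string contents exactly as they are
                if (escaped) escaped = false;
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
                continue;
            }
            switch (c) {
                case '"':
                    inString = true;
                    out.append(c);
                    break;
                case '{': case '[':
                    out.append(c);
                    depth++;
                    newLine(out, depth);
                    break;
                case '}': case ']':
                    depth--;
                    newLine(out, depth);
                    out.append(c);
                    break;
                case ',':
                    out.append(c);
                    newLine(out, depth);
                    break;
                case ':':
                    out.append(": ");
                    break;
                default:
                    // Drop whitespace outside strings; keep numbers, true, false, null.
                    if (!Character.isWhitespace(c)) out.append(c);
            }
        }
        return out.toString();
    }

    // Start a new line indented two spaces per nesting level.
    private static void newLine(StringBuilder out, int depth) {
        out.append('\n');
        for (int i = 0; i < depth; i++) out.append("  ");
    }

    public static void main(String[] args) {
        System.out.println(format("{\"title\":\"Beauty and the Beast\",\"tags\":[\"a\",\"b\"]}"));
    }
}
```

One known rough edge: an empty object or array comes out with a blank line in the middle, which a proper formatter would avoid.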
Where to get Data