The Basics of JSON and Why It Matters

JSON is short for JavaScript Object Notation, and it’s the modern contender to XML for data exchange. JSON is lightweight, minifies well, and is easily parsed. It is built on simple, near-universal data structures and avoids the complexity of other data-interchange formats. Its syntax is derived from JavaScript’s object literal notation, though the format itself is language-independent.

What Uses JSON?

Most REST APIs use JSON, and so do most web applications. Every modern browser can parse and produce JSON natively. It’s supported by basically everything web-based which is even vaguely modern.

A lot of microservice architectures employ REST APIs which typically use JSON to communicate. JSON opens the gate to almost every modern API or web service.

The Theory of JSON

JSON is composed of two basic types of data structures, both of which are present in every modern programming language: objects and arrays. An object arranges data as key-value pairs. It can hold a single key and value, or a collection of keys and values, which is equivalent to a hash table in most languages (learn about hash tables in Lua). An array holds values in order (learn about arrays in Lua).
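
As a rough illustration, here is how the two structures come out when parsed with Python's standard `json` module. Python is only an example language chosen for this sketch, and the sample data is made up.

```python
import json

# A JSON object (key-value pairs) and a JSON array (ordered values).
# Both strings are made-up sample data for this sketch.
obj = json.loads('{"theme": "dark", "time": "UTC"}')
arr = json.loads('["first", "second", "third"]')

print(type(obj).__name__)  # dict
print(type(arr).__name__)  # list
print(obj["theme"], arr[0])
```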

JSON is more rigid than XML, which makes it much easier to parse and understand. There may be more than one way to do something, but two implementations of the same table layout in JSON will differ only trivially. The order of elements can vary, but the ultimate structure will be the same.

Example JSON

{ "inbox":"[email protected]",
	"display":"John Doe",
	"settings":{
		"date":"us",
		"time":"UTC",
		"theme":"dark"
	},
	"emails":[
		{"to":"[email protected]", "from":"[email protected]", "subject":"RE:", "body":"Yes" },
		{"to":"[email protected]", "from":"[email protected]", "subject":"RE: RE:", "body":"How are you?" },
		{"to":"[email protected]", "from":"[email protected]", "subject":"RE: RE: RE:", "body":"I'm running out of ideas for emails." }
	],
	"inboxsize":5000
}

Above is an example of JSON covering all of the basic structural features. JSON uses { and } to denote an object, including the top-level object which wraps the whole document here. An object can be a single key and value, or else an entire collection of data nested as deep as you would like. Keys are always quoted. String values are quoted too, while numbers, booleans, and null are written bare. Key-value pairs are separated by commas.

Our first line has our “inbox” item set to “[email protected]”, then we have a “display” value. The third line is a bit different: we have “settings”:{…}, which places an object under our “settings” key. Our fourth key is “emails”:[…], which holds an array. Our array is composed of smaller objects. This is how JSON builds a tree and complex structure with simple pieces.
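
To make the tree idea concrete, the sketch below parses a trimmed copy of the example with Python's standard `json` module and walks down into it. The address fields are left out for brevity, and the variable names are just illustrative.

```python
import json

# Trimmed copy of the example above; the address fields are omitted.
raw = """
{ "display":"John Doe",
  "settings":{ "date":"us", "time":"UTC", "theme":"dark" },
  "emails":[
    { "subject":"RE:", "body":"Yes" },
    { "subject":"RE: RE:", "body":"How are you?" }
  ],
  "inboxsize":5000
}
"""

inbox = json.loads(raw)

display = inbox["display"]                     # plain string value
theme = inbox["settings"]["theme"]             # value inside a nested object
first_subject = inbox["emails"][0]["subject"]  # object inside an array

print(display, theme, first_subject)
```

Each step down the tree is just another key lookup or array index, which is why simple pieces are enough to build a complex structure.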

Scalars In JSON

{ "inbox":"[email protected]",
	"display":"John Doe",
	…
	"inboxsize":5000
}

We have a couple of keys which have a plain value right after. JSON supports several types of scalars (learn about scalars in Lua), which are also known as primitives in some languages. JSON allows strings, numbers, booleans, and null values. The format has a single number type, so integers and decimals don’t differ as far as JSON itself is concerned, but the distinction matters once the data is loaded into statically typed languages like C or Java.

{ "string":"my string",
	"number":123.456,
	"integer":123,
	"boolean":true,
	"null":null,
	"object":{ … },
	"array":[ … ]
}

The above code shows all of the types available in JSON (the “number” and “integer” values are both the same JSON number type). Also, note the commas: you should not end with a comma if nothing else follows it. Some parsers tolerate a trailing comma, but it’s not in the spec. I have pulled out trailing commas (even in snippets) just to drill this in.
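
As a quick check of the trailing comma rule, the sketch below feeds two strings to Python's standard `json` module, which follows the spec strictly on this point. The helper name `parses` is made up for the example.

```python
import json

valid = '{"a": 1, "b": 2}'
trailing = '{"a": 1, "b": 2,}'  # comma with nothing after it


def parses(text):
    """Return True if Python's json parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(valid))     # True
print(parses(trailing))  # False
```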

Numbers don’t need to be wrapped in quotes, nor do booleans or nulls. Objects and arrays are just collections of other objects, arrays, and scalars. You don’t need to declare what type you’re working with in JSON; the way a value is written determines its type.
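
Here is a small sketch of how those scalars land in one particular language. It uses Python's standard `json` module; other parsers map the types differently, so treat the type names as specific to Python.

```python
import json

doc = json.loads(
    '{"string":"my string", "number":123.456, "integer":123,'
    ' "boolean":true, "null":null}'
)

# Record which Python type the parser picked for each value.
kinds = {key: type(value).__name__ for key, value in doc.items()}
print(kinds)
```

Note that the parser, not the JSON, decides to split numbers into `int` and `float`; nothing in the document declares a type.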

Objects In JSON

"settings":{
	"date":"us",
	"time":"UTC",
	"theme":"dark"
}

Our “settings” key has an object under it, building a branch on our tree. We have a “date”, “time”, and “theme” key and value in this object. JSON does not care what type of value is tied to what key, just that each key has a value. The value can be a scalar (a single value) or a composite type, meaning an array or another object. Objects can contain anything regular JSON can, so you can have multiple levels of different types.

"object":{
	"string":"my string",
	"number":123.4,
	"object":{ "integer":123, "boolean":false },
	"array":[ null, 1, 2, 3, 4 ]
}
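
To show the nesting in practice, this sketch wraps the fragment above in braces so it parses on its own, then counts how many levels of objects and arrays it contains. The `depth` helper is hypothetical, written only for this example.

```python
import json

# The fragment above, wrapped in { } so it is a complete JSON document.
raw = """
{ "object":{
    "string":"my string",
    "number":123.4,
    "object":{ "integer":123, "boolean":false },
    "array":[ null, 1, 2, 3, 4 ]
  }
}
"""


def depth(value):
    """Count how many levels of objects/arrays are nested in a parsed value."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0  # scalars add no nesting


data = json.loads(raw)
print(depth(data))  # 3
```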

Breaking Down Arrays

"array":[ null, 1, 2, 3, 4 ]

This is the most basic type of array in JSON. The [ and ] delimit an array. Our “array” has 5 scalar values in it. The first item in the array is null. The order is preserved for each of these. In pseudocode, you would access the null with something like: “myjson.array[ 0 ]”. Most languages start their arrays with an index of 0, though this isn’t the case with Lua, which starts at 1.
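
The pseudocode access above looks like this in Python, using the standard `json` module. The array is wrapped in an object so the `myjson` name from the pseudocode carries over.

```python
import json

myjson = json.loads('{"array":[ null, 1, 2, 3, 4 ]}')

first = myjson["array"][0]   # JSON null becomes Python's None
count = len(myjson["array"])

print(first)            # None
print(count)            # 5
print(myjson["array"])  # [None, 1, 2, 3, 4]
```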

For another example, let’s look back to our bigger piece of example JSON:

"emails":[
		{"to":"[email protected]", "from":"[email protected]", "subject":"RE:", "body":"Yes" },
		{"to":"[email protected]", "from":"[email protected]", "subject":"RE: RE:", "body":"How are you?" },
		{"to":"[email protected]", "from":"[email protected]", "subject":"RE: RE: RE:", "body":"I'm running out of ideas for emails." }
	]

As mentioned before, JSON uses curly-braces to differentiate an object. We have an array of objects above. Our first object in the array is:

{"to":"[email protected]", "from":"[email protected]", "subject":"RE:", "body":"Yes" }

This object gives us a “to”, a “from”, a “subject”, and a “body” and puts it all in the very first item in the array. The second line does the same but for the second item in the array. This continues until the array is populated. This email data will never be in an arbitrary order, but will always retain its order when parsed.
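
A short sketch of that ordering guarantee, again with Python's standard `json` module. The array is a trimmed copy of the one above with only the subjects kept.

```python
import json

# Trimmed copy of the "emails" array; only the subjects are kept.
emails = json.loads(
    '[{"subject":"RE:"}, {"subject":"RE: RE:"}, {"subject":"RE: RE: RE:"}]'
)

# Iterating visits the items in the same order they appear in the JSON.
subjects = [email["subject"] for email in emails]
for position, subject in enumerate(subjects):
    print(position, subject)
```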

More On Objects

Objects can have an arbitrary order when read. We used our arrays above because we theoretically care what order the emails are in. Objects lack any enforcement of order, though some languages may return the data in the same order as the JSON. Don’t rely on this behavior if you see it though. For instance, with how hashes work in Perl, you may get consistent results iterating over them for a while, until an unrelated change in the code or in the JSON file alters the order.
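
One way to see the difference is to compare parsed values. In this Python sketch, two objects with the same keys in a different order compare as equal, while two arrays with the same items in a different order do not.

```python
import json

a = json.loads('{"date":"us", "time":"UTC"}')
b = json.loads('{"time":"UTC", "date":"us"}')

x = json.loads('[1, 2, 3]')
y = json.loads('[3, 2, 1]')

print(a == b)  # True: key order is not part of an object's meaning
print(x == y)  # False: item order is part of an array's meaning
```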

We made all of the objects in our array match, but you can use arbitrary structures if you want. You could add a “CC” field for an email, or remove a “subject” when there isn’t one (though this doesn’t make much sense). Objects don’t have to be standardized and can be populated however is necessary as long as the code reading it knows what to do. Best practice is to either include a field which identifies what kind of object you’re working with, or to keep every item identically formed.
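
When objects are not identically formed, the reading code has to cope with missing keys. The sketch below is one way to do that in Python; the `summarize` helper, the lowercase `cc` key, and the sample values are all made up for the example.

```python
import json

# Two differently shaped email objects: one has a "cc", one lacks a "subject".
emails = json.loads(
    '[{"subject":"RE:", "body":"Yes", "cc":"team"},'
    ' {"body":"No subject here"}]'
)


def summarize(email):
    """Build a one-line summary, tolerating optional fields."""
    subject = email.get("subject", "(no subject)")  # fallback if missing
    line = f"{subject} | {email['body']}"
    cc = email.get("cc")                            # None if missing
    if cc is not None:
        line += f" (cc {cc})"
    return line


for email in emails:
    print(summarize(email))
```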

What Is XML?

XML stands for eXtensible Markup Language. XML is extremely flexible, but also notoriously hard to parse. HTML is an extremely common close relative: it is not strictly XML (both descend from SGML), though its XHTML variant is.

XML uses tags to define where a section begins and ends, but it allows a flexibility which JSON does not. If you look at some of the example comparisons, you can see that XML is also a bit more verbose than JSON in a lot of situations. XML also allows scalar values inside the tag itself, as attributes, which makes it even more complicated.
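
As a small illustration of that flexibility, the sketch below stores the same settings in XML and in JSON and reads both with Python's standard library. The XML layout is invented for the example: it deliberately puts one value in an attribute and the others in child elements, so the reading code has to know where each value lives.

```python
import json
import xml.etree.ElementTree as ET

# Made-up XML layout: "theme" is an attribute, the rest are child elements.
xml_text = '<settings theme="dark"><date>us</date><time>UTC</time></settings>'
json_text = '{"settings":{"theme":"dark","date":"us","time":"UTC"}}'

root = ET.fromstring(xml_text)
xml_settings = {"theme": root.get("theme")}  # value stored as an attribute
for child in root:                           # values stored as child elements
    xml_settings[child.tag] = child.text

json_settings = json.loads(json_text)["settings"]  # one way to store it

print(xml_settings == json_settings)  # True
```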

What Makes JSON Better Than XML?

Both XML and JSON are equally valid for use, but JSON sacrifices some human readability for much better machine readability. XML is far more flexible, but this flexibility makes it slower and harder to sanely parse in spec-compliant code. Many languages have incomplete XML parsers where features are convoluted, or just plain missing.

XML is complicated to the point I’d rather write a single JSON parser in every single computer language I’ve ever dabbled in instead of a single XML parser in my best language. I’m counting Batch scripting as a programming language for this exercise too.

JSON is simple, portable, and easy for a machine to read. It’s simple enough that it’s easily readable before being minified. XML beats it at readability when well written, but can easily become a horrific mess as well. JSON limits your flexibility which forces consistent habits.
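
To show what minifying means here, this Python sketch writes the same data once with indentation and once with all optional whitespace removed. The data is a trimmed version of the earlier example.

```python
import json

data = {
    "settings": {"date": "us", "time": "UTC", "theme": "dark"},
    "inboxsize": 5000,
}

pretty = json.dumps(data, indent=2)                 # readable layout
minified = json.dumps(data, separators=(",", ":"))  # no optional whitespace

print(pretty)
print(minified)
print(json.loads(pretty) == json.loads(minified))  # True: same data either way
```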

I have moved all of my projects away from XML wherever I can help it. Even though I personally think the best parts of JavaScript (outside of JSON) can be boiled down to this, JSON is a treasure for computationally cheap data transfer in a human readable format.

Applying JSON to Projects

Virtually every modern language has multiple JSON parsing libraries, many of them under permissive licenses such as LGPL or MIT, along with even more under other licenses (GPL for FOSS). With proper planning of the JSON, you can pretty much change out the parser alone to replace XML handling code with JSON code. A lot of parsers return the same basic structures between the two (usually a generic tree object) and can be made to format the structure similarly.

JSON solves the complexity problem of XML and limits how it is written to make more consistent data. JSON is easy to learn and extremely powerful for data handling. It has close to the same readability as XML in a smaller package which is easier for a machine to parse.
