Content negotiation

Dimitri van Hees edited this page Mar 18, 2016 · 3 revisions

As more data is published on the web, more publishing mechanisms are becoming important. While RESTful JSON APIs might be the most popular at the moment, geodata services, Linked Data endpoints, good old XML and even CSV dumps are also used by a large number of consumers.

Accept Header

The Accept header is a request header sent by the (unknown) client that tells the server which formats the client is willing to receive. If a client accepts one or more formats, you should respect this wish. If you can't serve any of them, be honest and don't (HTTP defines the 406 Not Acceptable status code for this case). Imagine the client asks for application/json but you serve text/xml. That's not fair, unless the client lists text/xml as an acceptable fallback with a lower relative quality parameter q (e.g. Accept: application/json,text/xml;q=0.9) and text/xml is the only thing you can serve. If you start serving application/json in the future, this client will automatically receive application/json according to its Accept header, and it's up to the client to be prepared for that because it sent a list of multiple accepted formats. In real life this almost never happens, except if the client is a web browser. And yes, web browsers are prepared for the response formats they prefer according to their Accept headers.
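The selection rule described above can be sketched in a few lines of Python. This is a simplified illustration, not a complete implementation of HTTP content negotiation: the function names parse_accept and negotiate are made up, media type parameters other than q are ignored, and the precedence rules between specific types and wildcards are not handled.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q.

    Entries with equal q keep the order in which the client listed them,
    because Python's sort is stable.
    """
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        q = 1.0  # q defaults to 1 when the client omits it
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_type, q))
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(accept_header, available):
    """Pick the best available media type, or None if nothing is acceptable.

    A caller would typically answer None with 406 Not Acceptable.
    """
    for media_type, q in parse_accept(accept_header):
        if q == 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" -> "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    return None
```

With the header from the text, negotiate("application/json,text/xml;q=0.9", ["text/xml"]) falls back to text/xml, and the same client gets application/json as soon as the server adds it to the list of available types.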

URL based content negotiation

While the Accept header might be the only correct instrument for content negotiation, some clients need other ways to request a specific media type. A nice example is the way most triplestores import RDF: you provide a link to an external RDF file and the triplestore imports it. However, if the triplestore doesn't send an Accept header, the server doesn't know which media type to return. The same goes for references in XML sitemaps. Geographic resources can also be represented as KML, and KML can be indexed by search engines, but you need to point to the KML files in the XML sitemap. While you could retrieve the KML from http://example.com/resource-1 using the Accept header, you can't point directly to it without altering the URL. So you'll need to support http://example.com/resource-1.kml or http://example.com/resource-1?format=kml next to the original URL. A lot of GIS software clients also rely on loading external URLs and expect GeoJSON or GML without sending the correct Accept header.
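One way to support both URL styles is to look for a format hint in the URL first and only fall back to the Accept header when there is none. The sketch below assumes a small, hypothetical FORMATS table and a made-up function name; a real service would list whatever formats it actually serves.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping from URL hints to media types.
FORMATS = {
    "kml": "application/vnd.google-earth.kml+xml",
    "geojson": "application/geo+json",
    "json": "application/json",
    "ttl": "text/turtle",
}


def media_type_from_url(url):
    """Return the media type requested through the URL, or None.

    None means the URL carries no (known) hint, so the caller should
    fall back to the Accept header.
    """
    parts = urlsplit(url)

    # Style 1: http://example.com/resource-1?format=kml
    query = parse_qs(parts.query)
    if "format" in query:
        return FORMATS.get(query["format"][0].lower())

    # Style 2: http://example.com/resource-1.kml
    last_segment = parts.path.rsplit("/", 1)[-1]
    if "." in last_segment:
        extension = last_segment.rsplit(".", 1)[1].lower()
        return FORMATS.get(extension)

    return None
```

Both http://example.com/resource-1.kml and http://example.com/resource-1?format=kml resolve to the KML media type here, while the original URL returns None and leaves the decision to the Accept header.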

Hypermedia

Imagine the client demands application/json and you serve the data with hypermedia controls like HAL (application/hal+json) or JSON API (application/vnd.api+json). Now it becomes interesting: this is no different from the case described above, where we served text/xml in response to a request for application/json. Of course, application/hal+json and application/vnd.api+json are in fact valid application/json documents, but this strategy eliminates the ability to serve multiple hypermedia formats in the future. That would be a real limitation, because some client libraries work out of the box with JSON API while others work out of the box with HAL. Besides that, the client might eventually demand the plain JSON without hypermedia (especially when it comes to collections) for whatever use case is out there. We simply don't know and therefore should not make any assumptions.
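A minimal sketch of "no assumptions" is to register one renderer per exact media type and never substitute one for another. The renderer functions and the RENDERERS table below are hypothetical names for illustration, and the HAL output is reduced to the bare minimum.

```python
def render_plain(companies):
    """Plain JSON: the data and nothing else."""
    return {"companies": companies}


def render_hal(companies):
    """HAL: the same data wrapped in _embedded, with _links added."""
    return {
        "_embedded": {
            "companies": [
                {**company, "_links": {"self": {"href": f"/companies/{company['id']}"}}}
                for company in companies
            ]
        },
        "_links": {"self": {"href": "/companies"}},
    }


# One renderer per exact media type; application/json never maps to HAL.
RENDERERS = {
    "application/json": render_plain,
    "application/hal+json": render_hal,
}


def render(media_type, companies):
    """Return the body for exactly the requested media type, or None."""
    renderer = RENDERERS.get(media_type)
    if renderer is None:
        return None  # not supported: do not silently serve something else
    return renderer(companies)
```

Adding JSON API later then means adding one more entry to the table, without changing what existing application/json clients receive.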

JSON-LD and Hydra

Another example is the JSON-LD (JSON for Linking Data) format. JSON-LD distinguishes itself from JSON with the content type application/ld+json. One of the nice things about JSON-LD is that you can point to an external application/ld+json context file through a Link header on the original application/json response. Even if the client asks for application/json, an 'invisible' link to the JSON-LD context file is then included in the response headers, where JSON-LD parsers can detect it and interpret the document as valid Linked Data. Things change if you embed your JSON-LD context in your JSON body and would like to use JSON-LD features that can't be placed in external files, which means you'll have to edit the structure of your body. This is the case when you start adding identifiers and types to the data, which is quite useful if you want to get the most out of Linked Data (RDF triples or Structured Data using the Schema.org vocabulary).
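As a rough illustration, the Link header can be added on the server side whenever the response is plain application/json. The helper names and the default /context.jsonld path are assumptions for this sketch; only the rel value comes from the JSON-LD specification.

```python
# Link relation defined by the JSON-LD specification.
JSONLD_CONTEXT_REL = "http://www.w3.org/ns/json-ld#context"


def context_link_header(context_url):
    """Build the Link header value that points to an external context."""
    return f'<{context_url}>; rel="{JSONLD_CONTEXT_REL}"; type="application/ld+json"'


def response_headers(media_type, context_url="/context.jsonld"):
    """Return response headers, adding the context link for plain JSON only.

    A body served as application/ld+json is expected to carry its own
    @context, so no Link header is added in that case.
    """
    headers = {"Content-Type": media_type}
    if media_type == "application/json":
        headers["Link"] = context_link_header(context_url)
    return headers
```

Clients that only understand JSON can ignore the extra header, while JSON-LD parsers can follow it to the context file.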

The hypermedia framework for JSON-LD is called Hydra and its content type is application/ld+json... the same as JSON-LD without hypermedia. Whereas you can serve multiple types of hypermedia for the plain JSON version of your data using application/json, application/hal+json or application/vnd.api+json, there is no standard way to use the Accept header to demand 'plain' JSON-LD. But hypermedia controls shouldn't be part of the data itself. Because JSON-LD can be converted to RDF, using Hydra means the RDF will also contain triples about hypermedia, such as pagination. And that's wrong. We want our information (whether it's enriched by JSON-LD or not) to be separated from hypermedia. Take a book for example: when you decrease the font size in another release, the page numbers and the total number of pages might be different, but the information itself doesn't change at all.

RDF

It's very easy to convert JSON-LD to RDF triples, which can be loaded into a triplestore to create Linked Data. However, we just saw that the JSON-LD we created contains Schema.org markup, which triplestores can't work with. The same goes for geographic information: we can have GeoJSON in our JSON-LD representation, but most triplestores work with another format, known as Well-Known Text (WKT). In other words, the Linked Data representation (in either RDF or JSON-LD) differs per target datastore. Geodata needs to be structured as a WKT string if it's intended for triplestores, while it needs to be structured according to the Schema.org vocabulary if it's intended to be embedded in HTML to serve search engines.
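To give an idea of how the two geographic representations differ, here is a minimal sketch that turns a GeoJSON Point into a WKT string. It only handles Point geometries, the function name is made up, and the coordinates in the usage note are illustrative values rather than data from this page.

```python
def geojson_point_to_wkt(geometry):
    """Convert a GeoJSON Point geometry to a WKT string.

    GeoJSON stores a position as [longitude, latitude]; WKT writes the
    same two numbers separated by a space.
    """
    if geometry.get("type") != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    longitude, latitude = geometry["coordinates"][:2]
    return f"POINT ({longitude} {latitude})"
```

For example, {"type": "Point", "coordinates": [5.08, 51.56]} becomes the string POINT (5.08 51.56). Lines and polygons need more work, which is why a real service would normally rely on a GIS library for this conversion.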

Examples

Below are some examples of content types and data structures that seem to conflict.

Example 1: application/json (Plain JSON)

{
	"companies": [{
		"id": 1,
		"companyName": "Apiwise",
		"companyAddress": "Burgemeester Broxklaan 1000, Tilburg, The Netherlands",
		"emailAddress": "info@apiwise.nl",
		"foundingPartners": [
			"Dimitri van Hees",
			"Joost Farla"
		]
	}]
}

Example 2: application/hal+json (HAL)

{
	"_embedded": {
		"companies": [{
			"id": 1,
			"companyName": "Apiwise",
			"companyAddress": "Burgemeester Broxklaan 1000, Tilburg, The Netherlands",
			"emailAddress": "info@apiwise.nl",
			"foundingPartners": [
				"Dimitri van Hees",
				"Joost Farla"
			],
			"_links": {
				"self": {
					"href": "/companies/1"
				}
			}
		}]
	},
	"_links": {
		"self": {
			"href": "/companies"
		},
		"next": {
			"href": "/companies?page=2"
		}
	}
}

Example 3: application/vnd.api+json (JSON API)

{
	"links": {
		"self": "/companies",
		"next": "/companies?page=2",
		"last": "/companies?page=3"
	},
	"data": [{
		"type": "companies",
		"id": "1",
		"attributes": {
			"companyName": "Apiwise",
			"companyAddress": "Burgemeester Broxklaan 1000, Tilburg, The Netherlands",
			"emailAddress": "info@apiwise.nl",
			"foundingPartners": [
				"Dimitri van Hees",
				"Joost Farla"
			]
		}
	}]
}

Example 4: application/ld+json (context.jsonld)

{
	"@context": {
		"schema": "http://schema.org/",
		"companies": "@graph",
		"companyName": "schema:name",
		"companyAddress": "schema:address",
		"emailAddress": "schema:email",
		"foundingPartners": "schema:founder"
	}
}

When the following header is included in the plain JSON application/json response (example 1), a JSON-LD parser will interpret the resource as Linked Data (see below):

Link: </context.jsonld>; rel="http://www.w3.org/ns/json-ld#context"; type="application/ld+json"

{
	"@graph": [{
		"http://schema.org/address": "Burgemeester Broxklaan 1000, Tilburg, The Netherlands",
		"http://schema.org/email": "info@apiwise.nl",
		"http://schema.org/founder": [
			"Dimitri van Hees",
			"Joost Farla"
		],
		"http://schema.org/name": "Apiwise"
	}]
}
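To make the mapping concrete, the following hand-rolled sketch applies the context from example 4 to the plain JSON from example 1. It is not a JSON-LD processor: it only handles flat term and prefix mappings like the ones above, it silently drops unmapped keys such as id, and the function names expand_term and apply_context are made up for illustration.

```python
CONTEXT = {
    "schema": "http://schema.org/",
    "companies": "@graph",
    "companyName": "schema:name",
    "companyAddress": "schema:address",
    "emailAddress": "schema:email",
    "foundingPartners": "schema:founder",
}


def expand_term(term, context):
    """Map a JSON key to a keyword or full IRI, or None if it is unmapped."""
    value = context.get(term)
    if value is None:
        return None
    if value.startswith("@"):
        return value  # keyword alias such as "@graph"
    prefix, separator, suffix = value.partition(":")
    if separator and prefix in context and not suffix.startswith("//"):
        return context[prefix] + suffix  # "schema:name" -> full IRI
    return value


def apply_context(document, context):
    """Rewrite the keys of a JSON object using the context mappings."""
    result = {}
    for key, value in document.items():
        expanded = expand_term(key, context)
        if expanded is None:
            continue  # keys without a mapping carry no Linked Data meaning
        if isinstance(value, list):
            value = [
                apply_context(item, context) if isinstance(item, dict) else item
                for item in value
            ]
        elif isinstance(value, dict):
            value = apply_context(value, context)
        result[expanded] = value
    return result


plain = {
    "companies": [{
        "id": 1,
        "companyName": "Apiwise",
        "companyAddress": "Burgemeester Broxklaan 1000, Tilburg, The Netherlands",
        "emailAddress": "info@apiwise.nl",
        "foundingPartners": ["Dimitri van Hees", "Joost Farla"],
    }]
}
linked = apply_context(plain, CONTEXT)
```

The resulting linked dictionary has the same shape as the interpreted document shown above. In practice you would leave this work to an actual JSON-LD library.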

Example 5: application/ld+json (custom JSON-LD structure using Schema.org vocabulary)

[{
  "@context": "http://schema.org",
  "@id": "http://companies/1",
  "@type": "Organization",
  "name": "Apiwise",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Tilburg, The Netherlands",
    "streetAddress": "Burgemeester Broxklaan 1000"
  },
  "email": "info@apiwise.nl",
  "founder": [{
    "@type": "Person",
    "name": "Dimitri van Hees"
  }, {
    "@type": "Person",
    "name": "Joost Farla"
  }]
}]