API doc.tex

\documentclass[11pt]{article}
\usepackage{geometry}                % See geometry.pdf to learn the layout options. There are lots.
\usepackage{listings}
\usepackage{xcolor}
\usepackage{bera}% optional: just to have a nice mono-spaced font
\usepackage{url}
\PassOptionsToPackage{hyphens}{url}
%\usepackage{hyperref}
\geometry{letterpaper}                   % ... or a4paper or a5paper or ... 

%Taken from https://tex.stackexchange.com/questions/83085/how-to-improve-listings-display-of-json-files
\colorlet{punct}{red!60!black}
\definecolor{background}{HTML}{EEEEEE}
\definecolor{delim}{RGB}{20,105,176}
\colorlet{numb}{magenta!60!black}

\lstdefinelanguage{json}{
    basicstyle=\normalfont\ttfamily,
    numbers=left,
    numberstyle=\scriptsize,
    stepnumber=1,
    numbersep=8pt,
    showstringspaces=false,
    breaklines=true,
    frame=lines,
    backgroundcolor=\color{background},
    literate=
     *{0}{{{\color{numb}0}}}{1}
      {1}{{{\color{numb}1}}}{1}
      {2}{{{\color{numb}2}}}{1}
      {3}{{{\color{numb}3}}}{1}
      {4}{{{\color{numb}4}}}{1}
      {5}{{{\color{numb}5}}}{1}
      {6}{{{\color{numb}6}}}{1}
      {7}{{{\color{numb}7}}}{1}
      {8}{{{\color{numb}8}}}{1}
      {9}{{{\color{numb}9}}}{1}
      {:}{{{\color{punct}{:}}}}{1}
      {,}{{{\color{punct}{,}}}}{1}
      {\{}{{{\color{delim}{\{}}}}{1}
      {\}}{{{\color{delim}{\}}}}}{1}
      {[}{{{\color{delim}{[}}}}{1}
      {]}{{{\color{delim}{]}}}}{1},
}


\title{LMFDB API2 Overview}
\author{C. Brady and H. Ratcliffe}
\date{}                                           % Activate to display a given date or no date

\begin{document}
\maketitle
\section{Overview of API2}
LMFDB's existing API was designed to be a very thin HTTP wrapper over the database core of LMFDB. As such, it requires low level knowledge of the structure of the database, provides no guarantees of stability of the interface and no guarantees or information about what is stored in the database. API2 was designed to provide a middleware layer providing some level of stability without enforcing a rigid design paradigm on the LMFDB developer community.

It works by providing "searcher" objects that describe packaged sets of search operations that are permitted. Each searcher then allows a set of possible query parameters, one or more of which are required to perform a valid search. Human readable descriptions of searchers and search parameters are provided through the API.

\subsection{API2 versioning}
API2 uses a three part version string major.minor.revision. It is expected that breaking changes would only be introduced on a version change. Current version is 1.0.0

\section{API2 resources}
API2 consists of a single resource

\begin{itemize}
\item the \url{/api2} resource is responsible for describing what searches may be performed through the API, detailing the valid parameters for each search and returning the data from a performed search
\end{itemize}

\subsection{API2 requests}
Since API2 is intended for returning data from the server only all requests should be issued as HTTP(S) GET requests. There are six end points

\begin{itemize}
\item \url{/api2/description/searchers} - Describe the list of possible searchers
\item \url{/api2/description/<searcher>} - Describe the list of possible search parameters for a given searcher
\item \url{/api2/inventory/<searcher>} - Describe the list of possible fields that can be returned by a searcher
\item \url{/api2/data/<searcher>} - Perform a search using a given searcher. Query parameters taken from the searcher description are provided as an ampersand delimited query string to this endpoint. Non searching parameters are available and are all prepended by "\_"
\item \url{/api2/pretty/<other_path>} - This takes a valid other API path and produces a human readable (just) HTML page from the data contained. The other path should be the part after "api2" in the URL
\item \url{/api2/singleton/<path>} - This is intended to provide a URL that allows canonical reference to a single element of the database. The format of the singletons is not standardised and should not in general be used for machine readable access
\item \url{/api2/livepg/<database>} - This is intended to replicate the behaviour of the existing API. By specifying a database name and then using a standard search string (as per the /data/<searcher> endpoint) a search is performed on the raw database. This is not intended for normal access, but for debugging and testing purposes only.
\end{itemize}

\subsubsection{API2 search query strings}
The description parts of API2 take all of their parameters through their URL. Searches are performed by passing a query string. The main part of an API2 searcher query is name=value keys representing values for possible search parameters obtained from the \texttt{description} endpoint. These have the simple obvious form for equality, for example \url{/api2/data/elliptic_curves_q?rank=0&conductor=11} is a search for rank=0, conductor = 11 in the elliptic curves over Q. Searching for conditions other than simple equality are performed by prepending an operator immediately after the = symbol. Currently there are 3 operators
\begin{itemize}
\item ">" - Test for greater than a value
\item "<" - Test for less than a value
\item "\%" - Test if a value is contained in a list
\end{itemize}
For example \url{/api2/data/elliptic_curves_q?rank=0&conductor=>11} is a search for rank=0, conductor > 11
There are also other possible query parameters that are not directly related to searching, but to controlling the data returned. These are

\begin{itemize}
\item \_view\_start - Record to start viewing the list of returned fields. Should be given the value returned in the view\_next field of the information returned by a search to get the next "page" of the data
\item \_max\_count - Maximum number of records to return from this search. Will not exceed the maximum number specified on the server but will be respected if lower than that value. For example, set to 1 if you are merely interested in the existence or non-existence of records of the specified type
\item \_fields - The fields that should be returned by the search. The valid lists of fields should be inferred from a call to \url{/api2/inventory/<searcher>} 
\item \_exclude - If set = 1, change the operation of the \_fields key from being the fields that should be returned for found records to being fields that should be excluded from the return
\end{itemize}

\subsection{API2 responses}
All API2 responses are as JSON documents complying with the ECMA-404 standard. Compliance with RFC 8259 is attempted but not guaranteed. The outer layer of the response is the same for all API2 requests

\begin{lstlisting}[language=json,firstnumber=1]
{"api_version":"1.0.0",
"type":"API_SEARCHERS",
"key": "GLOBAL",
"built_at": "2010-04-20T20:08:21.634121",
"data": [...]}
\end{lstlisting}
\begin{itemize}
\item api\_version - major.minor.revision string value showing version of API2 interface that the response is complying with
\item type - String describing the type of response to be parsed from the "data" key. May be
\subitem API\_SEARCHERS - Data should be parsed as a list of searcher objects
\subitem API\_DESCRIPTIONS - Data should be parsed as description of a searcher object
\subitem API\_INVENTORY - Data should be parsed as a description of the fields that will be returned by a searcher
\subitem API\_RECORDS - Data should be parsed as a list of records requested by a search
\subitem API\_ERROR - Data should be parsed as an error object. Request failed
\item built\_at - ISO8601 UTC date time string of when request was processed
\item key - string identifying operation, may be GLOBAL
\item data - Valid ECMA-404 JSON document containing the detailed response
\end{itemize}

\subsubsection{API\_SEARCHERS}
If the \texttt{type} field of the API2 response is "\texttt{API\_SEARCHERS}" then the \texttt{key} field will be \texttt{GLOBAL}. The \texttt{data} field will consist of a JSON object the keys of which correspond to unique named searchers. The names are used when querying or using the searcher. Each value for these keys has the following format

\begin{lstlisting}[language=json,firstnumber=1]
{"elliptic_curves_q":{
	"human_name":"Elliptic curves over q",
	"desc":"Search elliptic curves defined over the rationals"}
}
\end{lstlisting}
\begin{itemize}
\item human\_name - Human readable name that should be presented to a user when selecting search options
\item desc - Human readable description of the purpose of the searcher. Should be presented to a user if the search source has a way of displaying extended text descriptions
\end{itemize}

\subsubsection{API\_DESCRIPTIONS}
If the \texttt{type} field of the API2 response is "\texttt{API\_DESCRIPTIONS}" then the \texttt{key} field will be the name of the searcher requested. The \texttt{data} field will consist of a JSON object with keys corresponding to the names of all available query fields for this searcher. A single key (representing a single query field) has the following structure.

\begin{lstlisting}[language=json,firstnumber=1]
{    "cm": {
      "description": "CM code. 0 (for no CM), or a negative discriminant", 
      "example": "-163", 
      "type": "integer",
      "cname":"CMCODE",
      "codec":"CODEC_TO_COME"
    }
}
\end{lstlisting}
It is important to note that a searcher is valid if it contains any subset of these fields, or is also valid if it is null.
\begin{itemize}
\item description - Human readable description that should be presented to a user from clients that have the ability to display long form text when a search is being selected
\item example - An example of the record that would be a valid search term. This is especially important if there is no \texttt{codec} field because the user will have to provide a valid code manually
\item type - This is a human readable representation of the type of the data that would be valid as a search. It may be auto generated or written by a human, so should not be parsed for type information
\item canonical\_name - The canonical name of the item. This name should be unique and unchanging. If a client wants a given {\it mathematical} property then it should search for this property by canonical name and then use the name of the query field having the correct canonical name to actually search for the data. This allows substantial changes to the underlying representation of the data while retaining API layer compatability
\item codec - The codec information encoding the type of the data that is expected for this field
\end{itemize}

\subsubsection{API\_INVENTORY}
If the \texttt{type} field of the API2 response is "\texttt{API\_INVENTORY}" then the \texttt{key} field will be the name of the searcher requested. The \texttt{data} field will consist of a JSON object with keys corresponding to the names of all available return fields for this searcher. A single key (representing a single query field) has the following structure.

\begin{lstlisting}[language=json,firstnumber=1]
{    "cm": {
      "description": "CM code. 0 (for no CM), or a negative discriminant", 
      "example": "-163", 
      "type": "integer",
      "cname":"CMCODE",
      "codec":"CODEC_TO_COME"
    }
}
\end{lstlisting}
It is important to note that a return field is valid if it contains any subset of these fields, or is also valid if it is null.
\begin{itemize}
\item description - Human readable description that should be presented to a user from clients that have the ability to display long form text if requested by a user
\item example - An example of the record that will be returned. This should be shown to a user as an example of what the field will return. This is especially important if there is no \texttt{codec} field because the user will have to parse the result manually
\item type - This is a human readable representation of the type of the data that will be returned by the search. It may be auto generated or written by a human, so should not be parsed for type information
\item canonical\_name - The canonical name of the item. This name should be unique and unchanging. If a client wants a given {\it mathematical} property then it should search for this property by canonical name and then use the name of the query field having the correct canonical name to actually search for the data. This allows substantial changes to the underlying representation of the data while retaining API layer compatability
\item codec - The codec information encoding the type of the data that is expected for this field
\end{itemize}

You will notice that the returns of API\_DESCRIPTIONS and API\_INVENTORY are often very similar, but this is not guaranteed. In the simple case the list of items that you can search on and fields that you will be returned will be the same but in general both must be parsed by a client.

\subsubsection{API\_RECORDS}
If the \texttt{type} field of the API2 response is "\texttt{API\_RECORDS}" then the \texttt{key} field will be a string containing the "data/" followed by the name of the searcher that generated the records. The \texttt{data} field will be a JSON object containing information about the number of records found in total, the number returned in this request and how to request the next tranche of records. It also contains a list of JSON objects returning the results from the database. The response has the following format

\begin{lstlisting}[language=json,firstnumber=1]
{"record_count":6000,
"view_start":0,
"view_count":100,
"view_next":99,
records:{...}
}
\end{lstlisting}
\begin{itemize}
\item record\_count - Total number of records returned by the search
\item view\_start - First record returned in this "view" of the data. First element is zero
\item view\_count - Number of records in this "view" of the data.
\item view\_next - Offset that should be issued next request to move the view to the next set of records. Will be negative if you have reached the end of the available data
\item records - List of JSON objects containing data from the database. This is guaranteed to be a list of valid JSON object, but the data contained is not guaranteed in any way
\end{itemize}

\subsubsection{Interpreting the records}
The fields returned by an API\_RECORDS object have no general form, but the \url{/api2/inventory/:searcher} result allows a client to match them to a canonical name and description. For speed, this remapping is not handled by the API but must be handled by the client. See the section on accessing the inventory.

\subsubsection{API\_ERROR}
If the \texttt{type} field of the API2 response is "\texttt{API\_ERROR}" then the \texttt{key} field will be a string containing the string "GLOBAL". The \texttt{data} field will be a string containing additional information about the error. A blank string is valid, but should not normally be returned. Your request has failed.

\section{An example API2 client operation}
This is an example of how a valid client would interact with API2

\subsection{Find searchers}
Client would connect to \url{/api2/description/searchers}. Result is

\begin{lstlisting}[language=json,firstnumber=1]
[
{"api_version":"1.0.0",
"type":"API_SEARCHERS",
"key": "GLOBAL",
"built_at": "2010-04-20T20:08:21.634121",
"data": [
	{""elliptic_curves_q":{
	"human_name":"Elliptic curves over q",
	"desc":"Search elliptic curves defined over the rationals"}
	},
	{"classical_modular_forms":{
	"human_name":"Classical Modular Forms",
	"desc":"Search classical modular forms"}
	}]
\end{lstlisting}
(In a fully working implementation there would be many more than this, but a client should cope with what is available). The client decides that it wishes to search in \texttt{elliptic\_curve\_q}.
\subsection{Query searcher}
Client connects to \url{/api2/description/elliptic_curves_q}. Result is
\begin{lstlisting}[language=json,firstnumber=1]
{"api_version":"1.0.0",
"type":"API_DESCRIPTIONS",
"key": "description/elliptic_curves_q",
"built_at": "2010-04-20T20:08:21.634121",
"data": [
	...
	...
	"label":"{
	      "description": "Cremona label", 
	      "example": "100002a1", 
	      "type": "string",
	      "cname":"CREMONA_LABEL"
	  },
	  "lmfdb_label": {
	      "description": "LMFDB label", 
	      "example": "100.a1", 
	      "type": "string"
	      "cname":"LMFDB_LABEL"
	    },
	    ...
	    ...
	]
}
\end{lstlisting}
The client parses this response and knows that it seeks to search on the canonical \texttt{CREMONA\_LABEL} so it checks every returned search query term to find the one with the matching canonical name. It finds that this corresponds to the search term \texttt{label}. Since this client only wants to search on a single item this ends this step. If the client wanted to search on multiple query terms then it would continue to parse until it has found all of them.
\subsection{(Optional) Get inventory information}
If a client is purely recovering data for a human to manage then this step can be skipped, but a typical client will get inventory information about the data returned by a search on the selected searcher.

Client connects to \url{/api2/inventory/elliptic_curves_q}. Result is
\begin{lstlisting}[language=json,firstnumber=1]
{"api_version":"1.0.0",
"type":"API_INVENTORY",
"key": "inventory/elliptic_curves_q",
"built_at": "2010-04-20T20:08:21.634121",
"data": [
    "2adic_index": {
      "description": "index in GL(2,Z2) of the 2-adicrepresentation (or 0 for CM curves)", 
      "example": "1", 
      "type": "integer",
      "cname":"2ADIC_INDEX"
    }, 
    "2adic_label": {
      "description": "Rouse label of the associated modular curve (null for CM curves). Based on Rouse, Zureik-Brown classification", 
      "example": "X1", 
      "type": "string",
      "cname":"2ADIC_LABEL"
    }, 
    ...
    ...
    ]
}
\end{lstlisting}

client is interested in returning the key with canonical name \texttt{2ADIC\_INDEX}, so once again searches through the responses until it finds that the return key will be \texttt{2adic\_index}.

\subsection{Get results}
Knowing what it wants to search on and what it wants returned the client connects to \url{/api2/data/elliptical_curve_q?label="100a3"&_fields="2adic\_index"}. Result is
\begin{lstlisting}[language=json,firstnumber=1]
{"api_version":"1.0.0",
"type":"API_INVENTORY",
"key": "inventory/elliptic_curves_q",
"built_at": "2010-04-20T20:08:21.634121",
"data": [
  "record_count":1,
  "view_start":0,
  "view_count":1,
  "view_next":-1,
  records:[{2adic_index:12}]]
}
\end{lstlisting}

\end{document}