Skip to content

Commit

Permalink
Prinz API chapter (#10)
Browse files Browse the repository at this point in the history
* Added Prinz API chapter

* whitespace fixes

* Review changes
  • Loading branch information
jkukowski authored Jun 4, 2021
1 parent b29e914 commit 1969a78
Show file tree
Hide file tree
Showing 2 changed files with 52 additions and 0 deletions.
51 changes: 51 additions & 0 deletions chapters/prinz-api.tex
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
\chapter{Prinz API}
\label{chap:prinz-api}

This chapter describes the API provided by Prinz.
The main goal of the API is to allow developers to add new integrations in a simple and unified way.
Each consecutive integration consists of customizable classes that extend traits defined in the API.
Traits that have to be implemented in every integration include \texttt{SignatureProvider}, \texttt{Model}, \texttt{ModelInstance}, and \texttt{ModelRepository}.
Moreover, the optional \texttt{ApiIntegrationSpec} trait includes a convenient way of testing the implemented integration and its use is highly recommended.

\section{ModelSignature}
\label{sec:model-signature}

The \texttt{ModelSignature} trait describes the correct format of input and output data for a given model.
Each signature consists of two lists of \texttt{SignatureField}s - one for the input and one for the output.
A \texttt{SignatureField} describes the format of a single column of data.
It consists of a \texttt{SignatureName} - the name of the given column and a \texttt{SignatureType} - the type of data in this column, which can be interpreted by Nussknacker.

\subsection{SignatureProvider}

The \texttt{SignatureProvider} trait was created to separate the logic of extracting the signature from a given machine learning model.
This separation makes the process of accessing the signature customizable for each integration.
The trait consists of a single function \texttt{SignatureProvider.provideSignature}, which takes one argument: an object representing the location metadata of the signature.
If a signature can be provided using this metadata, the function returns a \texttt{ModelSignature}.
In some cases, the integrated tools may use a different typing system than the one available in Nussknacker.
Therefore, the role of a \texttt{SignatureProvider} is also to translate all types present in the signature to Nussknacker-based ones.

\section{Model and ModelInstance}

The \texttt{Model} trait represents a top-level description of the machine learning model.
It allows access to all necessary information about the machine learning model but is not runnable by itself.
Calling the \texttt{Model.toModelInstance} function returns an instance of this model, which can later be scored using provided data.
Moreover, through \texttt{Model.getMetadata}, the trait allows access to additional information about the model.
This information includes the model version, name, and signature.

The \texttt{ModelInstance} trait represents a runnable instance of the machine learning model.
The scoring proccess is started by calling the \texttt{ModelInstance.run} function.
This function takes one argument: a multimap containing input data for the model instance.
The keys of this multimap refer to the column names of the model input signature.
During the scoring process, the input data is first verified against the model input signature.
The \texttt{ModelInstance.verify} function is predefined, and it checks whether the column names in input data exactly match those present in the signature.
If the input is correctly verifed, the \texttt{ModelInstance.runVerified} function is called.
This function directly scores each row of data and is entirely implementation-dependent.
If the whole process finishes without errors, a multimap containing output data is returned.
The keys of the returned multimap refer to the column names of the model output signature.

\section {ModelRepository}

The \texttt{ModelRepository} trait provides a consistent way of listing models for each integration.
Each instance of a \texttt{ModelRepository} contains information about the configuration of a given integration.
Based on this information, the \texttt{ModelRepository.listModels} function returns a list of all \texttt{Model}s available in this repository.
Those models can be accessed in the Nussknacker UI and scored using provided data after compiling the Nussknacker process.
1 change: 1 addition & 0 deletions thesis.tex
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@
\input{chapters/architecture-overview.tex}
\input{chapters/mlflow-integration.tex}
\input{chapters/dev-env.tex}
\input{chapters/prinz-api.tex}
\input{chapters/project-development.tex}
\input{chapters/responsibilities.tex}

Expand Down

0 comments on commit 1969a78

Please # to comment.