This Helm Chart deploys a Large Language Model (LLM)-enabled chat bot application.
This Helm Chart creates the following components:
By default the chatbot-ai-sample supports the llama.cpp inference server and the related ai-lab-recipes model server. However, the usage of vLLM model services or existing model services is also supported (a values sketch follows this list):

- For the vLLM model service case, the `Values.model.vllmSelected` value should be `true`, and `Values.model.vllmModelServiceContainer` and `Values.model.modelName` should be configured too.
- For the existing model service case, the `Values.model.existingModelServer` value should be `true` and `Values.model.modelEndpoint` should be set to the URL of the existing model endpoint we would like to use for this deployment.
- In case the existing model service requires bearer authentication, `Values.model.includeModelEndpointSecret` should be set to `true`, and `Values.model.modelEndpointSecretName` and `Values.model.modelEndpointSecretKey` should be configured.
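As a minimal sketch (not a tested configuration), the relevant values could be set like this; the container image, endpoint URL, and Secret details are placeholders:

```yaml
# Sketch: enable vLLM instead of the default llama.cpp model service.
# The image reference below is an illustrative placeholder.
model:
  vllmSelected: true
  vllmModelServiceContainer: <vllm-model-service-image>
  modelName: instructlab/granite-7b-lab

# Alternative sketch: use an existing model service that requires bearer authentication.
# model:
#   existingModelServer: true
#   modelEndpoint: https://<existing-model-endpoint>
#   modelName: <model-name>
#   includeModelEndpointSecret: true
#   modelEndpointSecretName: <secret-name>
#   modelEndpointSecretKey: <secret-key>
```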
A Streamlit application, based on the related Chatbot Template, to interact with the model service.

A job which takes care of creating the application GitHub repository.

A repository which connects our application with the pipeline-as-code-controller, allowing us to manage all webhooks received from our GitHub Application.
- A GitHub Application with `create repository` permissions for the GitHub Organization where the application will be created.
- Access to an OpenShift 4.x cluster with
  - permissions to deploy an application
  - an existing Namespace where your application will be deployed
  - a correctly installed and configured OpenShift Pipelines Operator which is connected to your GitHub Application's webhook
  - a Secret (of `key/value` type) in the existing Namespace containing a GitHub token with these permissions to the given GitHub Organization (see the sketch after this list)
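As a sketch, assuming the chart defaults `gitops.gitSecretName: github-secrets` and `gitops.gitSecretKeyToken: password`, that Secret could look like this (the namespace and token are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: github-secrets           # matches the default gitops.gitSecretName
  namespace: <release-namespace>
type: Opaque
stringData:
  password: <github-token>       # key matches the default gitops.gitSecretKeyToken
```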
This Helm Chart can be installed from the Developer Catalog using the OpenShift Developer Console.
This Helm Chart can be installed via the command line by running the following command:
```bash
helm upgrade --install <release-name> --namespace <release-namespace> .
```
NOTE: You can create a `private-values.yaml` file that will be ignored by git to pass values to your Helm Chart. Just copy the existing `values.yaml` file in this directory to `private-values.yaml`, make any necessary edits, and then update your installation command as shown below:
```bash
helm upgrade --install <release-name> --namespace <release-namespace> -f ./private-values.yaml .
```
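For example, a trimmed-down `private-values.yaml` sketch that only sets the values marked [REQUIRED] in the table below; the organization and quay.io account names are placeholders:

```yaml
# Minimal private-values.yaml sketch; every other value keeps the chart default.
gitops:
  githubOrgName: <your-github-organization>
  quayAccountName: <your-quay-account>
```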
| Name | Url |
|------|-----|
| Red Hat AI Development Team | https://github.com/redhat-ai-dev |
Kubernetes: >= 1.27.0-0
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| application.appContainer | string | `"quay.io/redhat-ai-dev/chatbot:latest"` | The image used for the initial chatbot application interface |
| application.appPort | int | `8501` | The exposed port of the application |
| gitops.gitDefaultBranch | string | `"main"` | The default branch for the chatbot application GitHub repository |
| gitops.gitSecretKeyToken | string | `"password"` | The name of the Secret's key with the GitHub token value |
| gitops.gitSecretName | string | `"github-secrets"` | The name of the Secret containing the required GitHub token |
| gitops.gitSourceRepo | string | `"redhat-ai-dev/ai-lab-samples"` | The GitHub repository with the contents of the ai-lab sample chatbot application |
| gitops.githubOrgName | string | `""` | [REQUIRED] The GitHub Organization name that the chatbot application repository will be created in |
| gitops.quayAccountName | string | `""` | [REQUIRED] The quay.io account that the application image will be pushed to |
| model.existingModelServer | bool | `false` | Whether an existing model server should be used |
| model.includeModelEndpointSecret | bool | `false` | Whether the existing model server requires bearer token authentication |
| model.initContainer | string | `"quay.io/redhat-ai-dev/granite-7b-lab:latest"` | The image used for the initContainer of the model service deployment |
| model.maxModelLength | int | `4096` | The maximum sequence length of the model. Used only for the vLLM case; the default value is 4096. |
| model.modelEndpoint | string | `""` | The endpoint URL of the model for the existing model service case. Used only if existingModelServer is set to true. |
| model.modelEndpointSecretKey | string | `""` | The name of the Secret's key storing the bearer value for the existing model service if the endpoint requires bearer authentication. Used only if includeModelEndpointSecret is set to true. |
| model.modelEndpointSecretName | string | `""` | The name of the Secret storing the credentials for the existing model service if the endpoint requires bearer authentication. Used only if includeModelEndpointSecret is set to true. |
| model.modelInitCommand | string | `"['/usr/bin/install', '/model/model.file', '/shared/']"` | The model service initContainer command |
| model.modelName | string | `""` | The name of the model. By default it is set to instructlab/granite-7b-lab. Used only for the vLLM and/or existing model service cases. |
| model.modelPath | string | `"/model/model.file"` | The path of the model file inside the model service container |
| model.modelServiceContainer | string | `"quay.io/ai-lab/llamacpp_python:latest"` | The image used for the model service. For the vLLM case see vllmModelServiceContainer. |
| model.modelServicePort | int | `8001` | The exposed port of the model service |
| model.vllmModelServiceContainer | string | `""` | The image used for the model service in the vLLM case |
| model.vllmSelected | bool | `false` | Whether vLLM should be used instead of llama_cpp. Be sure that your system has GPU support for this case. |
NOTE: Your Helm release's name will be used as the name of the application GitHub repository.