Issue/#41/add dev chat server instructions #45

Merged
51 changes: 49 additions & 2 deletions docs/development.md
@@ -69,15 +69,62 @@ npm run pretty
npm run type-check
```

## Local Dev Chat Environment

### 1) Using the ilab command line tool

For the chat functionality to work, you need an `ilab` model chat instance. To run one locally:

`cd server`

Then follow the InstructLab getting-started guide:
[https://github.com/instructlab/instructlab?tab=readme-ov-file#-getting-started](https://github.com/instructlab/instructlab?tab=readme-ov-file#-getting-started)

After you run `ilab serve`, you should have a chat server instance listening on port 8000 by default.
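The linked guide is authoritative; as a rough sketch of the steps involved (assuming Python 3.11 and the `instructlab` release pinned in `server/Containerfile`):

```shell
# Rough sketch of the linked getting-started steps -- see the guide
# for the authoritative instructions. Assumes Python 3.11 and the
# instructlab release pinned in server/Containerfile.
cd server
python3 -m venv venv
source venv/bin/activate
pip install instructlab==0.16.1
ilab init        # writes config.yaml (skip if you keep the checked-in one)
ilab download    # fetches the default merlinite model into models/
ilab serve       # serves it on http://127.0.0.1:8000 by default
```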

### 2) Using Podman

#### Current issues

- The container image that runs the server does not utilise the Mac Metal GPU and is therefore very slow when answering prompts.
- The container image is very large because it contains the model itself. The model could potentially be incorporated via a volume instead, to reduce the size of the actual image.

`podman run -p 8000:8000 aevo987654/instructlab_chat_8000:v2` (or `docker run` with the same arguments)

This should run a server on port 8000
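To confirm the container is up, you can query the models endpoint (`ilab serve` exposes an OpenAI-compatible API; the fallback message keeps the command safe to run even when nothing is listening):

```shell
# Smoke test: the models endpoint should list the bundled gguf model
# once the container is up. Falls back to a message if no server is
# listening on port 8000.
curl -s http://127.0.0.1:8000/v1/models || echo "no chat server on port 8000"
```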

### Configuring the chat environment to use a local ilab model chat instance

Return to the root of the repo (`ui`), run `npm run dev`, and visit [http://localhost:3000/playground/endpoints](http://localhost:3000/playground/endpoints).

Click the `Add Endpoint` button and a popup modal will appear.

![Add Endpoint popup modal](../public/dev-local-chat-server/add-endpoint.png)

- URL: add `http://127.0.0.1:8000`
- Model Name: add `merlinite-7b-lab-Q4_K_M.gguf`
- API Key: add some random characters (a placeholder; the local server is not expected to validate it)

Click the `Save` button

![Endpoint added to the endpoints list](../public/dev-local-chat-server/added-endpoint.png)
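Before switching to the UI chat, the same three values can be exercised directly with `curl` (a sketch, assuming the server speaks the OpenAI chat-completions API; the key is a placeholder):

```shell
# Exercise the endpoint with the same values entered in the modal.
# The Authorization value is a placeholder -- the local server is not
# expected to validate it. The fallback keeps the command safe to run
# when no server is listening.
curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer some-random-characters" \
  -d '{
        "model": "merlinite-7b-lab-Q4_K_M.gguf",
        "messages": [{"role": "user", "content": "Say hello"}]
      }' || echo "no chat server on port 8000"
```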

Go to the chat interface [http://localhost:3000/playground/chat](http://localhost:3000/playground/chat) and select the `merlinite-7b-lab-Q4_K_M.gguf` model.

![Selecting the merlinite model in the chat interface](../public/dev-local-chat-server/select-the-correct-model.png)

The chat interface should now use the server.

![Successful chat with the local server](../public/dev-local-chat-server/successful-chat.png)

## Summary of Server-Side Rendering and Client-Side Data Handling for Jobs and Chat Routes

We are leveraging Next.js's app router to handle
[server-side rendering](https://nextjs.org/docs/pages/building-your-application/rendering/server-side-rendering)
(SSR) and client-side data interactions for both jobs and chat functionalities.
Below is a summary of how we manage server-side rendering and client-side data
handling for these routes.

### Server-Side Rendering (SSR)

**API Routes**:

Binary file added public/dev-local-chat-server/add-endpoint.png
Binary file added public/dev-local-chat-server/added-endpoint.png
Binary file added public/dev-local-chat-server/successful-chat.png
21 changes: 21 additions & 0 deletions server/Containerfile
@@ -0,0 +1,21 @@
FROM python:3.11

# Set working directory
WORKDIR /app

RUN pip install --upgrade pip
RUN pip install --no-cache-dir instructlab==0.16.1

# Copy the ilab config into the working directory
COPY config.yaml .

# Download the merlinite model
RUN ilab download

# Copy project files to the working directory
COPY . .

EXPOSE 8000

# Run the chat server with the specified model family and model file
CMD ["ilab", "serve", "--model-family", "merlinite", "--model-path", "models/merlinite-7b-lab-Q4_K_M.gguf"]
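Built locally, the image runs the same way as the published one (the tag name below is arbitrary):

```shell
# Build and run the image defined above with Podman (the tag name is
# arbitrary); Docker accepts the same flags.
cd server
podman build -t instructlab-chat -f Containerfile .
podman run --rm -p 8000:8000 instructlab-chat
```

Mounting a host `models/` directory as a volume (and dropping the `RUN ilab download` layer) is one way to address the image-size issue noted earlier.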
26 changes: 26 additions & 0 deletions server/config.yaml
@@ -0,0 +1,26 @@
chat:
  context: default
  greedy_mode: false
  logs_dir: data/chatlogs
  max_tokens: null
  model: models/merlinite-7b-lab-Q4_K_M.gguf
  session: null
  vi_mode: false
  visible_overflow: true
general:
  log_level: INFO
generate:
  chunk_word_count: 1000
  model: models/merlinite-7b-lab-Q4_K_M.gguf
  num_cpus: 10
  num_instructions: 100
  output_dir: generated
  prompt_file: prompt.txt
  seed_file: seed_tasks.json
  taxonomy_base: origin/main
  taxonomy_path: taxonomy
serve:
  gpu_layers: -1
  host_port: 0.0.0.0:8000
  max_ctx_size: 4096
  model_path: models/merlinite-7b-lab-Q4_K_M.gguf
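`ilab` is sensitive to this file's structure; a quick way to confirm an edited copy still parses is a small sanity check (a sketch; requires PyYAML):

```shell
# Parse config.yaml (when present) and confirm the serve section still
# points at the expected model and bind address. Requires PyYAML.
if [ -f config.yaml ]; then
python3 - <<'EOF'
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

assert cfg["serve"]["host_port"] == "0.0.0.0:8000"
assert cfg["serve"]["model_path"].endswith("merlinite-7b-lab-Q4_K_M.gguf")
print("config.yaml OK")
EOF
else
  echo "config.yaml not found"
fi
```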