QA: release 1.0.9 #1920

Closed · TC117 opened this issue Feb 4, 2025 · 7 comments


TC117 commented Feb 4, 2025

QA details:

Version: v1.0.9

OS (select one)

  • Windows 11 (online & offline)
  • Ubuntu 24, 22 (online & offline)
  • Mac Silicon OS 14/15 (online & offline)
  • Mac Intel (online & offline)

1. Manual QA (CLI)

Installation

  • it should install with local installer (default; no internet required during installation, all dependencies bundled)
  • it should install with network installer
  • it should install 2 binaries (cortex and cortex-server) [mac: binaries in /usr/local/bin]
  • it should install with correct folder permissions
  • it should install with folders: /engines /logs (no /models folder until model pull)
  • it should install via the Docker image (https://cortex.so/docs/installation/docker/); see the sketch after this list
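
A minimal sketch for the Docker item above. The image name menloresearch/cortex and port 39281 are assumptions, not values confirmed by this checklist; check the linked Docker docs for the actual ones:

```sh
# Image name and port are assumptions; see https://cortex.so/docs/installation/docker/
docker run -d --name cortex -p 39281:39281 menloresearch/cortex
# Verify the API answers (the healthz endpoint is listed in the API QA section below)
curl -s http://localhost:39281/healthz
```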

Data/Folder structures

  • cortex.so models are stored in cortex.so/model_name/variants/, with .gguf and model.yml files
  • huggingface models are stored in huggingface.co/author/model_name with .gguf and model.yml files
  • downloaded models are saved in cortex.db with the right fields: model, author_repo_id, branch_name, path_to_model_yaml (view via SQL; see the sketch after this list)
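
A quick way to eyeball the cortex.db fields named above, assuming cortex.db is a SQLite file in the data folder with a models table (the path and table name are assumptions):

```sh
# Path and table name are assumptions; adjust to your data folder layout
sqlite3 ~/cortex/cortex.db \
  "SELECT model, author_repo_id, branch_name, path_to_model_yaml FROM models;"
```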

Cortex Update

  • cortex -v should output the current version and check for updates
  • cortex update replaces the app, installer, uninstaller and binary file (without installing cortex.llamacpp)
  • cortex update should update from ~3-5 versions ago to latest (+3 to 5 bump)
  • cortex update should update from the previous version to latest (+1 bump)
  • cortex update -v 1.x.x-xxx should update from the previous version to specified version
  • cortex update should update from previous stable version to latest
  • it should gracefully update while the server is actively running
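
The update checks above map to commands like the following (the version number is a placeholder):

```sh
cortex -v                # print current version and check for updates
cortex update            # update to latest
cortex update -v 1.0.8   # update to a specific version (placeholder version)
```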

Overall / App Shell

  • cortex returns helpful text in a timely way (< 5s)
  • cortex or cortex -h displays help commands
  • CLI commands should start the API server, if not running [except …]
  • it should correctly log to cortex-cli.log and cortex.log
  • there should be no stdout from inactive shell sessions

Engines

  • llama.cpp should be installed by default
  • it should run gguf models on llamacpp
  • it should list engines
  • it should get engines
  • it should install engines (latest version if not specified)
  • it should install engines (with specified variant and version)
  • it should get default engine
  • it should set default engine (with specified variant/version)
  • it should load engine
  • it should unload engine
  • it should update engine (to latest version)
  • it should update engine (to specified version)
  • it should uninstall engines
  • it should gracefully continue engine installation if interrupted halfway (partial download)
  • it should gracefully handle when users try to CRUD incompatible engines (No variant found for xxx)
  • it should run trtllm models on trt-llm [WIP, not tested]
  • it should handle engine variants [WIP, not tested]
  • it should update engine versions [WIP, not tested]
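
A sketch of the engine lifecycle above as CLI calls. The cortex engines <action> subcommand grammar and the llama-cpp engine name are assumptions inferred from the items in this list:

```sh
cortex engines list                  # list engines
cortex engines get llama-cpp         # get one engine
cortex engines install llama-cpp     # install (latest version if unspecified)
cortex engines load llama-cpp        # load into the server
cortex engines unload llama-cpp      # unload
cortex engines uninstall llama-cpp   # uninstall
```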

Server

  • cortex start should start server and output localhost URL & port number
  • users can access API Swagger documentation page at localhost URL & port number
  • cortex start can be configured with parameters (port, logLevel [WIP]) https://cortex.so/docs/cli/start/
  • it should correctly log to cortex logs (logs/cortex.log, logs/cortex-cli.log)
  • cortex ps should return server status and running models (or no model loaded)
  • cortex stop should stop server
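
The server items above as a smoke-test sequence. The --port flag spelling and the port value are assumptions; the checklist only says the port is configurable (https://cortex.so/docs/cli/start/):

```sh
cortex start --port 39281   # flag spelling and port value are assumptions
cortex ps                   # server status and running models
cortex stop                 # stop the server
```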

Model Pulling

  • Pulling a model should pull .gguf and model.yml file
  • Model download progress should appear as download bars for each file
  • Model download progress should be accurate (%, total time, download size, speed)

cortex.so

  • it should pull by built-in model_id
  • pull by model_id should recommend the default variant at the top (set in HF model.yml)
  • it should pull by built-in model_id:variant

huggingface.co

  • it should pull by HF repo/model ID
  • it should pull by full HF url (ending in .gguf)
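
Pull-command sketches for the two sources above; the model IDs and the URL are placeholders, not real artifacts:

```sh
cortex pull llama3.2              # built-in model_id (placeholder)
cortex pull llama3.2:1b-gguf      # model_id:variant (placeholder)
cortex pull author/model_name     # HF repo/model ID (placeholder)
cortex pull https://huggingface.co/author/model_name/resolve/main/file.gguf  # full HF URL (placeholder)
```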

Interrupted Download

  • it should allow user to interrupt / stop download
  • pulling again after interruption should accurately calculate the remaining model file size that needs to be downloaded (Found unfinished download! Additional XGB needs to be downloaded)
  • it should allow the user to continue downloading the remainder after interruption

Model Management

  • it should list downloaded models
  • it should get a local model
  • it should update model parameters in model.yaml
  • it should delete a model
  • it should import models with model_id and model_path
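
The management items above as commands. The cortex models <action> grammar follows the cortex models start item later in this checklist; the model ID and the import flag names are assumptions:

```sh
cortex models list                   # list downloaded models
cortex models get my-model           # placeholder model ID
cortex models delete my-model
cortex models import --model_id my-model --model_path /path/to/file.gguf  # flag names are assumptions
```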

Model Running

  • cortex run <cortexso model> - if no local models detected, shows pull model menu
  • cortex run - if local model detected, runs the local model
  • cortex run - if multiple local models detected, shows a list of local models (from multiple model sources, e.g. cortexso, HF authors) for users to select (via regex search)
  • cortex run <invalid model id> should gracefully return Model not found!
  • run should autostart server
  • cortex run <model> starts interactive chat (by default)
  • cortex run <model> -d runs in detached mode
  • cortex models start <model>
  • terminating stdin or calling exit() should exit the interactive chat
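
Run-command sketches for the items above; the model ID is a placeholder:

```sh
cortex run llama3.2            # interactive chat by default (placeholder ID)
cortex run llama3.2 -d         # detached mode
cortex models start llama3.2   # start without entering chat
```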

Hardware Detection / Acceleration [WIP, no need to QA]

  • it should auto offload max ngl
  • it should correctly detect available GPUs
  • it should gracefully detect missing dependencies/drivers
    CPU Extension (e.g. AVX-2, noAVX, AVX-512)
    GPU Acceleration (e.g. CUDA11, CUDA12, Vulkan, sycl, etc)

Uninstallation / Reinstallation

  • it should uninstall 2 binaries (cortex and cortex-server)
  • it should uninstall with 2 options to delete or not delete data folder
  • it should gracefully uninstall when server is still running
  • uninstalling should not leave any dangling files
  • uninstalling should not leave any dangling processes
  • it should reinstall without having conflict issues with existing cortex data folders

--

2. API QA

Checklist for each endpoint

  • Upon cortex start, the API Swagger page is displayed at the localhost:port endpoint
  • Endpoints should support the parameters stated in API reference (towards OpenAI Compatibility)
  • https://cortex.so/api-reference is updated

Endpoints

Chat Completions
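
A representative request for this endpoint, assuming the OpenAI-compatible shape stated in the checklist above; the model name and port are placeholders:

```sh
curl -s http://localhost:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```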

Engines

  • List engines: GET /v1/engines
  • Get engine: GET /v1/engines/{name}
  • Install engine: POST /v1/engines/install/{name}
  • Get default engine variant/version: GET /v1/engines/{name}/default
  • Set default engine variant/version: POST /v1/engines/{name}/default
  • Load engine: POST /v1/engines/{name}/load
  • Unload engine: DELETE /v1/engines/{name}/load
  • Update engine: POST /v1/engines/{name}/update
  • Uninstall engine: DELETE /v1/engines/install/{name}
  • Remote engine: ...
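
curl sketches for a few of the engine endpoints above; the host, port, and the llama-cpp name are placeholders:

```sh
curl -s http://localhost:39281/v1/engines                             # list engines
curl -s http://localhost:39281/v1/engines/llama-cpp                   # get engine
curl -s -X POST http://localhost:39281/v1/engines/install/llama-cpp   # install
curl -s -X DELETE http://localhost:39281/v1/engines/install/llama-cpp # uninstall
```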

Pulling Models

  • Pull model: POST /v1/models/pull starts download (websockets)
  • Pull model: websockets /events emitted
  • Stop model download: DELETE /v1/models/pull (websockets)
  • Stop model download: websockets /events stopped
  • Import model: POST /v1/models/import
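
Pull and abort sketches for the endpoints above; the JSON body field model is an assumption about the request schema:

```sh
curl -s -X POST http://localhost:39281/v1/models/pull \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2"}'     # body field name is an assumption
curl -s -X DELETE http://localhost:39281/v1/models/pull \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2"}'
```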

Running Models

  • List models: GET /v1/models
  • Start model: POST /v1/models/start
  • Stop model: POST /v1/models/stop
  • Get model: GET /v1/models/{id}
  • Delete model: DELETE /v1/models/{id}
  • Update model: PATCH /v1/models/{model} updates model.yaml params
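
Start, stop, and inspect sketches for the endpoints above; the model body field and the model ID are assumptions:

```sh
curl -s -X POST http://localhost:39281/v1/models/start \
  -H "Content-Type: application/json" -d '{"model": "llama3.2"}'
curl -s -X POST http://localhost:39281/v1/models/stop \
  -H "Content-Type: application/json" -d '{"model": "llama3.2"}'
curl -s http://localhost:39281/v1/models/llama3.2
```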

Threads

  • List threads: GET /v1/threads
  • Get thread by ID: GET /v1/threads/{id}
    ....

Server

  • CORS [WIP]
  • health: GET /healthz
  • terminate server: DELETE /processManager/destroy
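
Health and shutdown checks from the list above; the port is a placeholder:

```sh
curl -s http://localhost:39281/healthz                           # health check
curl -s -X DELETE http://localhost:39281/processManager/destroy  # terminate server
```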

Test list for reference:

TC117 added the type: QA checklist label Feb 4, 2025
github-project-automation bot moved this to Investigating in Menlo Feb 4, 2025
TC117 self-assigned this Feb 4, 2025
TC117 moved this from Investigating to QA in Menlo Feb 4, 2025
TC117 added this to the v1.0.9 milestone Feb 4, 2025
TC117 changed the title from QA: [1.0.9] to QA: release 1.0.9 Feb 4, 2025

TC117 commented Feb 4, 2025

@vansangpfiev The server auto-starts but is not shown on the CLI.
cortex-beta - Windows - VM 114
Steps to reproduce:

  • Install with the network installer
  • Run a command that starts the server

(screenshot attached)


TC117 commented Feb 5, 2025

cortex ps does not show loaded models.
1.0.9-rc7 - Windows
Steps:

  • Pull a model
  • Start the model via the API
  • On the CLI, run cortex ps to check the loaded model

(screenshot attached)

vansangpfiev (Contributor) commented:

> @vansangpfiev The server auto-starts but is not shown on the CLI. cortex-beta - Windows - VM 114. Steps to reproduce: install with the network installer, then run a command that starts the server. (screenshot attached)

(screenshot attached)

It seems the server was already running at that time. @TC117 Please help verify.

vansangpfiev self-assigned this Feb 5, 2025

TC117 commented Feb 5, 2025

(screenshot attached)

> It seems the server was already running at that time. @TC117 Please help verify.

Yes, it only happens the first time, so it is just a minor issue.


TC117 commented Feb 5, 2025

@vansangpfiev Can't download new engines; the download stops in the middle.
1.0.9-rc7 - Windows - VM 114

(screenshot attached)


TC117 commented Feb 5, 2025

cURL request fails.
macOS - Linux
1.0.9-rc7
(screenshot attached)


vansangpfiev commented Feb 6, 2025

> cURL request fails. macOS - Linux. 1.0.9-rc7. (screenshot attached)

@TC117 This should be fixed by cortex-beta-rc8 and llama-engine v0.1.49-4a0e548.
The root cause is that llama-server does not have execute permission on macOS/Ubuntu after we pull llama-engine from GitHub; see the sketch below for a manual check.
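
A minimal manual workaround sketch for the root cause described above; the engine path is an assumption about the data folder layout, not a confirmed location:

```sh
# Hypothetical path: adjust to wherever llama-engine was pulled
chmod +x ~/cortex/engines/llama.cpp/llama-server
```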

TC117 closed this as completed Feb 6, 2025
TC117 moved this from QA to Completed in Menlo Feb 6, 2025