We recommend using Conda for environment management.
Execute the following commands to create a conda environment and install the required dependencies:
```bash
conda create -n glm-4-demo python=3.12
conda activate glm-4-demo
pip install -r requirements.txt
```
Please note that this project requires Python 3.10 or higher. In addition, you need to install the Jupyter kernel to use the Code Interpreter:
```bash
ipython kernel install --name glm-4-demo --user
```
You can modify `~/.local/share/jupyter/kernels/glm-4-demo/kernel.json` to change the configuration of the Jupyter kernel, including the kernel startup parameters. For example, if you want to use Matplotlib for plotting with the Python code execution capability of All Tools, add `"--matplotlib=inline"` to the `argv` array.
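One way to apply that change from the shell, as a sketch (this assumes `jq` is available and that the kernel was installed with the command above; editing the file by hand works just as well):

```bash
# Append the flag to the kernel's argv array (adjust the path if your
# Jupyter data directory differs)
KERNEL_JSON=~/.local/share/jupyter/kernels/glm-4-demo/kernel.json
jq '.argv += ["--matplotlib=inline"]' "$KERNEL_JSON" > /tmp/kernel.json && mv /tmp/kernel.json "$KERNEL_JSON"
```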
To use the browser and search functions, you also need to start the browser backend. First, install Node.js according to the instructions on the Node.js official website, then install the PNPM package manager and the browser service dependencies:
```bash
cd browser
npm install -g pnpm
pnpm install
```
- Modify `BING_SEARCH_API_KEY` in `browser/src/config.ts` to configure the Bing Search API Key that the browser service uses:
```typescript
export default {
    BROWSER_TIMEOUT: 10000,
    BING_SEARCH_API_URL: 'https://api.bing.microsoft.com/v7.0',
    BING_SEARCH_API_KEY: '<PUT_YOUR_BING_SEARCH_KEY_HERE>',
    HOST: 'localhost',
    PORT: 3000,
};
```
- The text-to-image (Wensheng Tu) function calls the CogView API. Modify `src/tools/config.py` and provide the Zhipu AI Open Platform API Key required for this function:
```python
BROWSER_SERVER_URL = 'http://localhost:3000'
IPYKERNEL = 'glm4-demo'
ZHIPU_AI_KEY = '<PUT_YOUR_ZHIPU_AI_KEY_HERE>'
COGVIEW_MODEL = 'cogview-3'
```
- Start the browser backend in a separate shell:
```bash
cd browser
pnpm start
```
- Run the following command to load the model locally and start the demo:

```bash
streamlit run src/main.py
```
The demo address will then be printed on the command line; click it to access the demo. The first access requires downloading and loading the model, which may take some time.
If you have already downloaded the model, you can load it from a local path with `export *_MODEL_PATH=/path/to/model`. The models that can be specified include:

- `CHAT_MODEL_PATH`: used for All Tools mode and document interpretation mode; the default is `THUDM/glm-4-9b-chat`.
- `VLM_MODEL_PATH`: used for VLM mode; the default is `THUDM/glm-4v-9b`.
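For example, if the weights are already on disk (the paths below are purely illustrative), you could set both variables before launching:

```bash
# Illustrative local paths; point these at wherever you downloaded the weights
export CHAT_MODEL_PATH=/path/to/glm-4-9b-chat
export VLM_MODEL_PATH=/path/to/glm-4v-9b
streamlit run src/main.py
```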
The Chat model supports inference with vLLM. To use it, install vLLM and set the environment variable `USE_VLLM=1`.
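As a sketch, assuming vLLM installs cleanly from PyPI into the same environment:

```bash
pip install vllm
export USE_VLLM=1
streamlit run src/main.py
```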
The Chat model also supports inference through an OpenAI-compatible API. To use it, run `openai_api_server.py` in `basic_demo` and set the environment variable `USE_API=1`. This option is intended for deploying the inference server and the demo server on different machines.
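A rough sketch of that two-machine split (the exact options accepted by `openai_api_server.py` are not covered here; check the script itself):

```bash
# On the inference machine
cd basic_demo
python openai_api_server.py

# On the demo machine, which must be able to reach the inference server
export USE_API=1
streamlit run src/main.py
```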
If you need to customize the Jupyter kernel, you can specify it with `export IPYKERNEL=<kernel_name>`.
GLM-4 Demo has three modes:
- All Tools mode
- VLM mode
- Document interpretation mode
In All Tools mode, you can enhance the model's capabilities by registering new tools in `tool_registry.py`. Simply decorate a function with `@register_tool` to complete the registration. For tool declarations, the function name is the name of the tool and the function docstring is the description of the tool; for tool parameters, use `Annotated[typ: type, description: str, required: bool]` to annotate the parameter's type, description, and whether it is required.

For example, the registration of the `get_weather` tool is as follows:
```python
@register_tool
def get_weather(
    city_name: Annotated[str, 'The name of the city to be queried', True],
) -> str:
    """
    Get the weather for `city_name` in the following week
    """
    ...
```
- This mode is compatible with the tool registration process of ChatGLM3-6B.
- Code execution, drawing, and web browsing capabilities are already integrated; users only need to configure the corresponding API keys as required.
- System prompts are not supported in this mode; the model builds the prompts automatically.
In document interpretation mode, users can upload documents and use the long-text capability of GLM-4-9B to understand them. The demo can parse pptx, docx, pdf, and other files.

- Tool calls and system prompts are not supported in this mode.
- If the text is very long, the model may require a large amount of GPU memory; please confirm your hardware configuration.
In VLM mode, users can upload images and use the image understanding capability of GLM-4V-9B to understand them.

- This mode must use the glm-4v-9b model.
- Tool calls and system prompts are not supported in this mode.
- The model can only understand and discuss a single image; if you need to change the image, open a new conversation.
- The supported image resolution is 1120 x 1120.