Casual LM

This application showcases inference of a causal language model (LM). It deliberately offers few configuration options, to encourage the reader to explore and modify the source code. A Jupyter notebook that corresponds to this pipeline discusses how to create an LLM-powered chatbot: https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/254-llm-chatbot.

Note

This project is not for production use.

How it works

The program loads a tokenizer, a detokenizer, and a model (.xml and .bin) into OpenVINO. The model is reshaped to batch size 1 and a variable prompt length. A prompt is tokenized and passed to the model, which greedily generates one token at a time until it produces the special end-of-sequence (EOS) token. The predicted tokens are converted to characters and printed as they are generated.
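The greedy loop described above can be sketched without OpenVINO. This is a minimal illustration, not the code in casual_lm.cpp: `NextTokenLogits`, `greedy_generate`, `toy_model`, and the `max_new_tokens` cap are all hypothetical names introduced here, and the toy model merely stands in for a real forward pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical stand-in for one forward pass: receives all tokens so far and
// returns one logit per vocabulary entry for the next position.
using NextTokenLogits =
    std::function<std::vector<float>(const std::vector<int64_t>&)>;

// Greedy choice: the index of the largest logit.
int64_t argmax(const std::vector<float>& logits) {
    assert(!logits.empty());
    return std::distance(logits.begin(),
                         std::max_element(logits.begin(), logits.end()));
}

// Repeatedly append the argmax token until EOS is predicted or the
// (assumed) max_new_tokens cap is reached. Returns only the newly
// generated tokens; the EOS token itself is not included.
std::vector<int64_t> greedy_generate(const NextTokenLogits& model,
                                     std::vector<int64_t> tokens,
                                     int64_t eos_token,
                                     std::size_t max_new_tokens) {
    std::vector<int64_t> generated;
    while (generated.size() < max_new_tokens) {
        const int64_t next = argmax(model(tokens));
        if (next == eos_token) {
            break;
        }
        generated.push_back(next);  // a streaming program would detokenize and print here
        tokens.push_back(next);     // feed the prediction back as input
    }
    return generated;
}

// Toy model over a 4-token vocabulary: always predicts (last token + 1) % 4.
std::vector<float> toy_model(const std::vector<int64_t>& tokens) {
    assert(!tokens.empty());
    std::vector<float> logits(4, 0.0f);
    logits[static_cast<std::size_t>((tokens.back() + 1) % 4)] = 1.0f;
    return logits;
}
```

Calling `greedy_generate(toy_model, {0}, 3, 10)` walks the toy vocabulary upward from token 0 and stops when token 3, treated here as EOS, is predicted. A cap on new tokens is not mentioned in the description above; it is included only so the sketch always terminates.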

Install OpenVINO Runtime

Install OpenVINO Runtime from an archive (Linux). <INSTALL_DIR> below refers to the extraction location.

Build `Casual LM` and `user_ov_extensions`

git submodule update --init
source <INSTALL_DIR>/setupvars.sh
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/ && cmake --build ./build/ --config Release -j

Supported models

This pipeline can also work with similar topologies produced by optimum-intel, provided they have the same model signature.

Download and convert the model and tokenizers

The --upgrade-strategy eager option is needed to ensure optimum-intel is upgraded to the latest version.

source <INSTALL_DIR>/setupvars.sh
python -m pip install --upgrade-strategy eager transformers==4.35.2 "optimum[openvino]>=1.14" ../../../thirdparty/openvino_contrib/modules/custom_operations/[transformers] --extra-index-url https://download.pytorch.org/whl/cpu
optimum-cli export openvino -m meta-llama/Llama-2-7b-hf ./Llama-2-7b-hf/
python ./convert_tokenizers.py ./Llama-2-7b-hf/

Run

Usage: casual_lm <openvino_model.xml> <tokenizer.xml> <detokenizer.xml> "<prompt>"

Example: ./build/casual_lm ./Llama-2-7b-hf/openvino_model.xml ./tokenizer.xml ./detokenizer.xml "Why is the Sun yellow?"
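The usage line implies four positional arguments after the program name. The sketch below shows one way such a check could look; `Args` and `parse_args` are hypothetical names for illustration, and the actual handling in casual_lm.cpp may differ.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical container for the four positional arguments from the usage line.
struct Args {
    std::string model_xml;
    std::string tokenizer_xml;
    std::string detokenizer_xml;
    std::string prompt;
};

// Expects exactly: <openvino_model.xml> <tokenizer.xml> <detokenizer.xml> "<prompt>".
// Throws std::runtime_error carrying the usage message on any other argument count.
Args parse_args(int argc, const char* const argv[]) {
    assert(argv != nullptr);
    if (argc != 5) {
        const std::string program = argc > 0 ? argv[0] : "casual_lm";
        throw std::runtime_error(
            "Usage: " + program +
            " <openvino_model.xml> <tokenizer.xml> <detokenizer.xml> \"<prompt>\"");
    }
    return Args{argv[1], argv[2], argv[3], argv[4]};
}
```

Because the prompt is a single positional argument, it has to be quoted on the command line, as in the example above; otherwise the shell splits it into several arguments and the count check fails.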

To enable Unicode characters in Windows cmd, open Region settings from Control Panel, then select Administrative -> Change system locale -> Beta: Use Unicode UTF-8 for worldwide language support -> OK, and reboot.
