Naked LLaMA

Build llama inference compute from scratch, using only torch/numpy base ops.

Inspired by karpathy's awesome repo nanoGPT, I re-implemented a simple and clear llama model from scratch.
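To illustrate what "only base ops" can look like, here is a minimal sketch of two llama building blocks, RMSNorm and the SwiGLU feed-forward layer, written with plain tensor arithmetic instead of `torch.nn` modules. It is not taken from this repo: the function names (`rms_norm`, `silu`, `swiglu_mlp`), the toy sizes, and the `(in, out)` weight layout are assumptions chosen for readability.

```python
import torch


def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """RMSNorm: scale x by the reciprocal root-mean-square of its last dimension."""
    variance = x.pow(2).mean(dim=-1, keepdim=True)
    return x * torch.rsqrt(variance + eps) * weight


def silu(x: torch.Tensor) -> torch.Tensor:
    """SiLU activation built from sigmoid and an elementwise multiply."""
    return x * torch.sigmoid(x)


def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: (silu(x @ W_gate) * (x @ W_up)) @ W_down.

    Assumption: weights are stored as (in_features, out_features) so that
    a plain matmul works without a transpose.
    """
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down


torch.manual_seed(0)
hidden, inter = 8, 16             # toy sizes, not real llama dimensions
x = torch.randn(2, 4, hidden)     # (batch, sequence, hidden)

normed = rms_norm(x, torch.ones(hidden))
out = swiglu_mlp(
    normed,
    torch.randn(hidden, inter),   # hypothetical gate projection
    torch.randn(hidden, inter),   # hypothetical up projection
    torch.randn(inter, hidden),   # hypothetical down projection
)
print(out.shape)
```

With a weight of all ones, each normalized vector has a mean square close to 1, which is a quick sanity check when comparing against a framework implementation.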

install

pip install "torch>=2.1.0"

# transformers is used to convert model weights and compare results
pip install "transformers>=4.35.2"

execute & result

git clone https://github.com/silencelamb/naked_llama.git

# convert huggingface model to npy file
python convert_hf_to_pkl.py  # default model_size is 7b

# default model_size is 7b
python naked_llama.py

# run 70b
python naked_llama.py --model_size 70b
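
The conversion step can be pictured roughly as follows. This is a hedged sketch, not the contents of convert_hf_to_pkl.py: it assumes the weights end up as a dict of NumPy arrays serialized with pickle, and it uses a tiny `torch.nn.Linear` as a stand-in for a real Hugging Face model so it runs without downloading anything. The helper name `state_dict_to_numpy` is hypothetical.

```python
import io
import pickle

import torch


def state_dict_to_numpy(state_dict):
    """Convert every tensor in a state dict to a float32 NumPy array."""
    return {
        name: tensor.detach().to(torch.float32).cpu().numpy()
        for name, tensor in state_dict.items()
    }


# Stand-in for a real model; a llama checkpoint would expose many more keys.
model = torch.nn.Linear(4, 2)
weights = state_dict_to_numpy(model.state_dict())

# Serialize to an in-memory buffer; a real script would write to a file instead.
buffer = io.BytesIO()
pickle.dump(weights, buffer)
buffer.seek(0)
restored = pickle.load(buffer)

for name, array in restored.items():
    print(name, array.shape, array.dtype)
```

Storing plain arrays keeps the inference script independent of the transformers package once the weights have been converted.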

