I'm an NLP practitioner interested in large language models, and I graduated from SYSU with a master's degree.
In my free time, I like to write technical blog posts on [WeChat Official Account: YeungNLP] and [Zhihu: 红雨瓢泼].
🔭 Experience:
- Shopee: building NLP capabilities for Customer Service (2022-04 to present).
- Tencent: building NLP capabilities for Product Understanding (2021-06 to 2022-04).
- Alibaba: internship (2020-06 to 2020-09).
⚙ Here are some of my public projects:
Project | Description | Code |
---|---|---|
Firefly | One-stop training for LLMs. Some achievements: 1. firefly-llama2-13b ranked 3rd among all 13B models on the Open LLM Leaderboard, only 0.5 points behind 1st place. 2. firefly-llama-30b, trained on a single V100, ranked 10th among all 30B models on the Open LLM Leaderboard. 3. firefly-baichuan-13b has over 1.63 million downloads. 4. firefly-qwen1.5-en-7b-dpo scores 7.21 points higher than the official chat model. 5. firefly-gemma-7b scores 9.37 points higher than the official chat model. | |
GPT2-chitchat | Chinese GPT2 for chitchat | |
Firefly-LLaMA2-Chinese | Chinese Llama2 trained with an efficient and effective method. | |
LongQLoRA | An efficient and effective method for extending the context length of Llama2 to 8192 on a single V100. Technical Report | |
CPM | Chinese composition model based on CPM | |
CLIP-Chinese | Chinese CLIP model trained on 1.4 million image-text pairs | |
ClipCap-Chinese | Chinese image captioning model based on CLIP and Mengzi | |
OFA-Chinese | Chinese multi-modal unified pre-training model | |
LLMPruner | Prunes the vocabulary of LLMs to save memory during training. | |
📁 Here are some of my technical blog posts:
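- 📝 SFT and DPO for Qwen1.5 with Firefly on a single V100, substantially outperforming Qwen1.5 and Gemma
- 📝 An illustrated guide to LLM inference optimization: KV Cache
- 📝 Fine-tuning the Mixtral-8x7B MoE model in practice, outperforming Llama2-65B
- 📝 LongQLoRA: efficiently extending the context length of LLaMA2-13B on a single GPU
- 📝 A detailed look at LLM length-extrapolation methods based on adjusting RoPE rotation angles
- 📝 An illustrated guide to RoPE (rotary position embedding) and its properties
- 📝 A lightweight continued pre-training approach with QLoRA, and adapting Llama2 to Chinese in practice
- 📝 A source-code analysis of the shortcomings of ChatGLM2's multi-turn dialogue training method, and how to improve it
- 📝 A step-by-step tutorial on fine-tuning Baichuan-13B: hands-on training of a ten-billion-parameter model
- 📝 QLoRA paper walkthrough & efficient fine-tuning of bloom-7b1 on a single GPU
- 📝 Firefly (流萤): a Chinese conversational large language model
- 📝 LLMPruner: a pruning tool for large language models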