Method | llava-med-zh-eval Qwen Score |
---|---|
GPT4 Ground Truth | 68.26 |
LLaVA-1.5-7B | 53.13 |
Chinese-LLaVA-Med-7B | 58.78 |
Dataset | Description |
---|---|
llava-med-zh-instruct-60k | 60k instruction tuning data |
llava-med-zh-eval | 115 evaluation data |
# install LLaMA-Factory
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .[torch,metrics]
We recommend using full
finetuning, but you could also use lora
yaml.
# full finetuning
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.run \
--nproc_per_node 2 \
--nnodes 1 \
--standalone \
../LLaMA-Factory/src/train.py config/llava1_5_full_sft.yaml
# export
# modify your own export_hub_model_id and hf_hub_token in the config/llava1_5_full_sft_export.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export config/llava1_5_full_sft_export.yaml
# generate output results
python3 evaluation/generate_eval_content.py --model_name_or_path models/llava1_5-7b-med
# eval by qwen-1.5-14b-chat
python3 evaluation/eval_qwen_score.py --input_path outputs/llava_med_zh_eval_llava1_5-7b-med.json
# with final model
llamafactory-cli webchat config/llava1_5_full_sft_infer.yaml