This repository contains the source code used for fine-tuning the LLM phi-2 on programming tasks. The methodologies used are SFT (supervised fine-tuning) and DPO (direct preference optimization). To run either script, first install the dependencies listed in requirements.txt.

The repository also contains two notebooks for evaluating the fine-tuned or base model on the HumanEval and HumanEval-X benchmarks. For the latter, to limit the computational cost, only Java, C++, and JavaScript were selected as the languages for generating samples; furthermore, for each language, 80 of the 164 programming tasks were sampled from the corresponding JSON file of the HumanEval-X dataset.

Lastly, I implemented a simple memory for the LLM, which leverages ChromaDB to store and retrieve the embeddings inserted into the model's context. Since not all retrieved information is relevant to the information need expressed by the user's query, a threshold hyperparameter was added to filter out the least relevant items.
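The per-language subsampling described above can be sketched as follows. This is an illustrative helper, not the repository's actual code: it assumes each HumanEval-X language split is a file with one JSON object per line, and the function name and fixed seed are my own choices.

```python
import json
import random

def sample_tasks(path, k=80, seed=0):
    """Sample k programming tasks from a HumanEval-X per-language file
    (assumed format: one JSON task object per line)."""
    with open(path, encoding="utf-8") as f:
        tasks = [json.loads(line) for line in f if line.strip()]
    # A fixed seed keeps the sampled subset reproducible across runs.
    rng = random.Random(seed)
    return rng.sample(tasks, k)
```

Running this once per language (Java, C++, JavaScript) yields the 80-task subsets used for generation.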
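The threshold filtering can be sketched as below. This is a minimal illustration, not the repository's implementation: it assumes a ChromaDB-style query result (lists of lists, one inner list per query text) where `distances` holds embedding distances with lower meaning more similar, and the function names are hypothetical.

```python
def filter_relevant(query_result, threshold=0.5):
    """Keep only retrieved documents whose distance to the query is
    below `threshold` (assumed ChromaDB-style result layout)."""
    docs = query_result["documents"][0]
    dists = query_result["distances"][0]
    # Lower distance = closer embedding = more relevant to the query.
    return [doc for doc, dist in zip(docs, dists) if dist < threshold]

def build_context(query_result, threshold=0.5, sep="\n\n"):
    """Concatenate the surviving snippets into extra context for the prompt."""
    return sep.join(filter_relevant(query_result, threshold))
```

With a strict threshold the memory contributes nothing rather than injecting off-topic text, which is the point of the hyperparameter.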
eyess-glitch/phi-2-fine-tuning