Code for the paper "LOLA: LLM-Assisted Online Learning Algorithm for Content Experiments".

Python Environment

We recommend using conda and pip to manage the environment. To set up the environment:

conda create --name lola
conda activate lola
conda install pip
pip install datasets
pip install peft
pip install evaluate
pip install transformers -U
pip install -U scikit-learn
pip install -U matplotlib
pip install progressbar2
pip install openai

# To download the Llama-3 model (only needed for fine-tuning Llama-3), register on Hugging Face for access to the model, then run:
pip install -U "huggingface_hub[cli]"
huggingface-cli login
# Enter your Hugging Face access token when prompted
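
After running the commands above, a quick check that the packages can be found in the active environment may save a failed run later. The following is a minimal sketch, not part of the repository; the function name `missing_modules` is hypothetical, and the module list is simply the import names corresponding to the pip packages listed above.

```python
import importlib.util

# Import names for the packages installed above. Some differ from the pip
# names: scikit-learn imports as "sklearn", progressbar2 as "progressbar".
REQUIRED_MODULES = [
    "datasets", "peft", "evaluate", "transformers",
    "sklearn", "matplotlib", "progressbar", "openai",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported missing, rerun the matching `pip install` line inside the activated `lola` environment.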

Reproducing the Results

We provide saved intermediate results so that the results in the paper can be reproduced directly. Using these intermediate results saves time by skipping the OpenAI API calls and the model fine-tuning.

For Prompt Engineering Method

Run Pure LLM - Prompt/visualize_result.py. This generates mean_differences_heatmap_multiple.pdf and p_values_heatmap_multiple.pdf.

For Embedding Method

Run Pure LLM - Embedding/predict_with_embedding.py.

For Finetuning

Run Finetune CTR Prediction/plot.py.

For LOLA

Run the Jupyter notebook LOLA - Regret Minimize/LOLA_regret_minimize.ipynb.
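
The four reproduction steps above can also be driven from a single script. The sketch below is an illustration rather than something shipped with the repository. It assumes each script should be run from inside its own folder (the README does not say which working directory the scripts expect) and that the notebook is executed headlessly with `jupyter nbconvert`, which requires Jupyter to be installed separately. The names `build_steps` and `run_all` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script and notebook paths as listed above, relative to the repository root.
SCRIPTS = [
    "Pure LLM - Prompt/visualize_result.py",
    "Pure LLM - Embedding/predict_with_embedding.py",
    "Finetune CTR Prediction/plot.py",
]
NOTEBOOK = "LOLA - Regret Minimize/LOLA_regret_minimize.ipynb"


def build_steps(python=sys.executable):
    """Return (working_directory, argument_list) pairs in README order."""
    steps = []
    for script in SCRIPTS:
        path = Path(script)
        # Assumption: each script is run from within its own folder.
        steps.append((str(path.parent), [python, path.name]))
    notebook = Path(NOTEBOOK)
    # Assumption: execute the notebook in place with nbconvert.
    steps.append((
        str(notebook.parent),
        ["jupyter", "nbconvert", "--to", "notebook", "--execute",
         "--inplace", notebook.name],
    ))
    return steps


def run_all(repo_root="."):
    """Run every step, stopping at the first failure."""
    for folder, command in build_steps():
        subprocess.run(command, cwd=Path(repo_root) / folder, check=True)
```

Passing argument lists to `subprocess.run` avoids quoting problems with the spaces in the folder names. Call `run_all()` from the repository root to execute the steps.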

The steps below run all the code end to end, from generating the intermediate results to producing the final results.

Dataset and Code

The original dataset we used is https://osf.io/jd64p/.

The pre-processed dataset can be downloaded from Kaggle, either through the website or with the Kaggle CLI: kaggle datasets download -d shuffleofficial/lola-llm-assisted-online-learning-algorithm
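
For scripted setups, the same Kaggle CLI call can be issued from Python. This is a minimal sketch under a few assumptions: the `kaggle` package is installed and an API token is configured, the target folder `data` is a hypothetical choice, and the `-p` and `--unzip` options are added here for convenience and are not part of the command given above.

```python
import subprocess

# Dataset identifier taken from the Kaggle CLI command above.
DATASET = "shuffleofficial/lola-llm-assisted-online-learning-algorithm"


def kaggle_download_command(dataset=DATASET, dest="data"):
    """Build the kaggle CLI call; `dest` is a hypothetical target folder."""
    return ["kaggle", "datasets", "download", "-d", dataset,
            "-p", dest, "--unzip"]


def download(dest="data"):
    """Download and unzip the pre-processed dataset into `dest`."""
    # Requires the kaggle package and a configured Kaggle API token.
    subprocess.run(kaggle_download_command(dest=dest), check=True)
```

Calling `download()` fetches the archive into `data/` and extracts it there. Where the repository scripts expect the files to live is not stated here, so they may need to be moved afterwards.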

For data processing
- Code path: Upworthy Data Processing.ipynb
- Running this notebook generates a CSV file named ctr-all.csv, along with various data splits.
- Data used: upworthy-archive-holdout-packages-03.12.2020.csv, upworthy-archive-exploratory-packages-03.12.2020.csv and upworthy-archive-confirmatory-packages-03.12.2020.csv (downloaded from https://osf.io/jd64p/)
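
Before running the data processing notebook, it can help to confirm that the three raw Upworthy files are in place. The sketch below is an illustration only. The function name `missing_raw_files` and the `data_dir` parameter are hypothetical, and the folder in which the notebook actually looks for these files is not stated here.

```python
from pathlib import Path

# Raw input files named above, as downloaded from https://osf.io/jd64p/.
RAW_FILES = [
    "upworthy-archive-holdout-packages-03.12.2020.csv",
    "upworthy-archive-exploratory-packages-03.12.2020.csv",
    "upworthy-archive-confirmatory-packages-03.12.2020.csv",
]


def missing_raw_files(data_dir="."):
    """Return the raw file names that are not present in `data_dir`."""
    base = Path(data_dir)
    return [name for name in RAW_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing raw files:")
        for name in missing:
            print(" ", name)
    else:
        print("All raw Upworthy files found.")
```
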
For Prompt Engineering Method
- Code path: Pure LLM Approaches/Pure LLM - Prompt/main.py and Pure LLM Approaches/Pure LLM - Prompt/visualize_result.py
- Data used: winner-all.csv
For CTR prediction using OpenAI and Word2Vec Embedding
- Run Pure LLM - Embedding/get_embedding.py to compute the embeddings for the dataset.
- Run Pure LLM - Embedding/predict_with_embedding.py to get the prediction results.
- Data used: selected_pairs_df_005_256.csv and selected_pairs_df_005_3072.csv
For LOLA
- Code path: LOLA - Regret Minimize/LOLA_regret_minimize.ipynb
- Data used: LoRA CTR.csv and simulation_results_regret_min
Survey Results
- Code and data path: Survey


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Python Environment

Reproducing the Results

For Prompt Engineering Method

For Embedding Method

For Finetuning

For LOLA

Dataset and Code

About

Releases

Packages

Contributors 2

Languages

DDDOH/LLM_News

Folders and files

Latest commit

History

Repository files navigation

Python Environment

Reproducing the Results

For Prompt Engineering Method

For Embedding Method

For Finetuning

For LOLA

Dataset and Code

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages