alter_ego is a library that allows you to run experiments with LLMs. ego is a command-line helper included with this package.
alter_ego allows you to run microexperiments using a simple shorthand. You can also create more advanced experiments. For turn-based interactive experiments, a builder is available.
Read our paper.
You can browse our source code in this repository. Here are autogenerated docs that can make it easier to find what you're looking for.
- Install Python, at least version 3.8. If you are on Windows, make sure to install Python into the PATH.
- Create a virtual environment and activate it. On Linux and macOS, this is very simple. Just open a terminal and execute the following commands:
user@host:~$ python -m venv env
user@host:~$ source env/bin/activate
Note: In this document, you may have to replace python with python3 and pip with pip3. This depends on your system's settings.
On Windows, consider using this tutorial to create and activate a virtual environment.
Some editors also do this for you.
- Install alter_ego using
(env) user@host:~$ pip install -U alter_ego_llm
Note how (env) signals that we are in the virtual environment created earlier.
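If the (env) prefix is missing or you are unsure which interpreter is in use, a quick check from within Python can help. The following is a minimal sketch that relies only on the standard library; the function name in_virtualenv is made up for this illustration and is not part of alter_ego.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```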
For the remainder of this document, we assume that your editor's current directory is also your terminal's present working directory. From within your terminal, you can find out the present working directory using pwd; this should show the very same directory as opened in your editor.
Note: If you do not want to use GPT for now, simply change GPTThread to CLIThread in the examples below and skip this section.
- Obtain an API key from OpenAI. Here is more information. Your API key looks as follows:
sk-***
Copy this to your clipboard.
- Create a new file in your editor.
- Put the content of your clipboard into the file openai_key in your current directory. The file must not have a file extension; it is literally just called openai_key.
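A frequent stumbling block is an editor that silently saves the key as openai_key.txt. The sketch below checks for that; it is a hypothetical helper (check_key_file is not provided by alter_ego) and assumes only what is stated above: the file is called openai_key, sits in the current directory, and the key starts with sk-.

```python
from pathlib import Path


def check_key_file(directory: str = ".") -> str:
    """Return a short status message about the openai_key file in `directory`."""
    folder = Path(directory)
    key_file = folder / "openai_key"
    if not key_file.is_file():
        # Look for files such as openai_key.txt that carry an unwanted extension.
        lookalikes = sorted(p.name for p in folder.glob("openai_key.*"))
        if lookalikes:
            return f"found {lookalikes[0]}, but the file must be named openai_key"
        return "openai_key not found"
    key = key_file.read_text().strip()
    if not key.startswith("sk-"):
        return "openai_key exists, but its content does not start with sk-"
    return "openai_key looks fine"


if __name__ == "__main__":
    print(check_key_file())
```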
New: 📺 WATCH VIDEO TUTORIAL
Let's create a minimal experiment using alter_ego's shorthand feature.
- Create a new file in your editor, first_experiment.py. Here's its code:
import alter_ego.agents
from alter_ego.utils import extract_number
from alter_ego.experiment import factorial

def agent():
    return alter_ego.agents.GPTThread(model="gpt-3.5-turbo", temperature=1.0)

prompt = "Estimate the public approval rating of {{politician}} during the {{time}} of their presidency. Only return a single percentage from 0 to 100."

data = factorial(
    prompt,
    politician=["George W. Bush", "Barack Obama"],
    time=["1st year", "8th year"]
).run(agent, extract_number, times=1)

for row in data:
    print(row)
Note how we use variables within the prompt. The crucial feature of alter_ego is how these variables are automatically replaced based on treatment.
- In the terminal, run
(env) user@host:~$ python first_experiment.py
- This will take a few seconds and give you output similar to this:
{'politician': 'George W. Bush', 'time': '1st year', 'result': None}
{'politician': 'George W. Bush', 'time': '8th year', 'result': None}
{'politician': 'Barack Obama', 'time': '1st year', 'result': 63}
{'politician': 'Barack Obama', 'time': '8th year', 'result': 8}
As you see, GPT did not give a valid response for George W. Bush. Let's debug by changing the line with run to:
).run(agent, extract_number, times=1, keep_retval=True)
Rerunning our script gives:
{'politician': 'George W. Bush', 'time': '1st year', 'result': 62, 'retval': 'Approximately 62%.'}
{'politician': 'George W. Bush', 'time': '8th year', 'result': None, 'retval': "It is difficult to provide an accurate estimate without conducting a specific poll or analysis. However, based on historical data and trends, it is common for a president's approval rating to decline over the course of their second term. Taking into account various factors such as the economic recession and the ongoing Iraq War during George W. Bush's final year in office (2008), it is reasonable to estimate his public approval rating to be around 25-35%. Please note that this estimation is subjective and might not perfectly reflect the actual public sentiment at that time."}
(Here, I have shown only two rows of the output.)
As you see, in the cases where GPT returned only a single number, alter_ego was able to correctly extract it. Unfortunately, GPT 3.5 tends to refuse requests to just return a single number. GPT 4 works better. If you have access to GPT 4 over the API, you can change model="gpt-3.5-turbo" to model="gpt-4". This is the resulting output (where I have omitted the retval once again):
{'politician': 'George W. Bush', 'time': '1st year', 'result': 57}
{'politician': 'George W. Bush', 'time': '8th year', 'result': 34}
{'politician': 'Barack Obama', 'time': '1st year', 'result': 57}
{'politician': 'Barack Obama', 'time': '8th year', 'result': 55}
Here you can view the documentation for run. run allows you to quickly execute an experiment defined by what highfalutin scientists call a “factorial design.” It is called that because the possibilities for politician (George W. Bush, Barack Obama) are “multiplied” by the possibilities for time (1st year, 8th year), giving 2 × 2 = 4 treatment combinations.
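To make the “multiplication” concrete, here is a small standard-library sketch of how a factorial design expands into one filled-in prompt per treatment. It is an illustration only: the helper expand is hypothetical, the shortened prompt is made up, and plain string replacement stands in for whatever templating alter_ego uses internally.

```python
from itertools import product


def expand(prompt: str, **factors):
    """Yield (treatment, filled_prompt) for every combination of factor levels."""
    names = list(factors)
    for levels in product(*(factors[name] for name in names)):
        treatment = dict(zip(names, levels))
        filled = prompt
        for name, value in treatment.items():
            # Plain string replacement; a stand-in for real templating.
            filled = filled.replace("{{" + name + "}}", str(value))
        yield treatment, filled


prompt = "Approval rating of {{politician}} during the {{time}} of their presidency?"
cells = list(expand(
    prompt,
    politician=["George W. Bush", "Barack Obama"],
    time=["1st year", "8th year"],
))

for treatment, filled in cells:
    print(treatment, "->", filled)
```

Two factors with two levels each yield four cells, in the same order as the rows printed by the experiment above.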
The nice thing about these microexperiments is that you can easily carry the output forward to Pandas, Polars, etc.; this is because data is only a “list of dicts,” and as such it is trivial to convert to a DataFrame. This allows you to analyze data received straight from an LLM.
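Because the result is a plain list of dicts, even the standard library is enough for a first look; with pandas installed, pandas.DataFrame(data) accepts the same list directly. The sketch below reuses the gpt-3.5-turbo rows shown earlier as sample input, separates the rows where no number could be extracted, and averages the rest per politician.

```python
from collections import defaultdict
from statistics import mean

# Rows copied from the gpt-3.5-turbo output shown earlier.
data = [
    {"politician": "George W. Bush", "time": "1st year", "result": None},
    {"politician": "George W. Bush", "time": "8th year", "result": None},
    {"politician": "Barack Obama", "time": "1st year", "result": 63},
    {"politician": "Barack Obama", "time": "8th year", "result": 8},
]

# Rows for which no number could be extracted.
failed = [row for row in data if row["result"] is None]

# Mean result per politician, ignoring the failed rows.
by_politician = defaultdict(list)
for row in data:
    if row["result"] is not None:
        by_politician[row["politician"]].append(row["result"])
means = {name: mean(values) for name, values in by_politician.items()}

print(len(failed), "rows without a result")
print(means)
```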
Of course, you will often want to set the temperature to 0.0 or another low value. This depends on the nature of your use case.
New: 📺 WATCH VIDEO TUTORIAL
We offer a web app to build simple experiments between multiple LLMs. The builder can be found here, with its source code being available here.
- For now, just read through the app (it showcases an example of a framed ultimatum game) and scroll down.
- Copy the code shown below “Export or import scenario” on the web app into a new file. Call that file built.json in your current project directory. The file must be called built.json.
- Open a terminal and execute
(env) user@host:~$ ego run built
- This will show “System instructions” for two separate players. Note how they vary: One player (the first one) is the proposer and the second player is the responder.
- The proposer is now asked to put in a proposal in JSON. Let's do it:
{"keep": 4.2}
- As you see, the responder is notified and can now ACCEPT or REJECT. Let's accept:
ACCEPT
- This completes the experiment. You will see something like:
Experiment c6627c4e-f17f-4cdc-ba47-462eced3e489 OK
- Let's look at the data that was generated. We can get it in CSV format by executing:
(env) user@host:~$ ego data built c6627c4e-f17f-4cdc-ba47-462eced3e489 > data.csv
(You need to replace c6627c4e-f17f-4cdc-ba47-462eced3e489 with your actual experiment ID.)
This should tell you that 2 lines were written. If you open data.csv in your preferred spreadsheet program, you will see the following output:
choice | convo | experiment | i | round | tainted | thread | thread_type | treatment |
---|---|---|---|---|---|---|---|---|
{"keep": 4.2} | 03ba9edf-99c0-46d6-8c26-42c26683197c | c6627c4e-f17f-4cdc-ba47-462eced3e489 | 1 | 1 | False | 1ce5faaf-cc1f-438f-8231-8b7e0d96fb07 | CLIThread | take |
"ACCEPT" | 03ba9edf-99c0-46d6-8c26-42c26683197c | c6627c4e-f17f-4cdc-ba47-462eced3e489 | 2 | 1 | False | 9043cd57-3e83-42d2-8d62-c783725e05e7 | CLIThread | take |
This is obviously easy to post-process in whatever statistics software you use.
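As one example of such post-processing, the sketch below reads the export with Python's csv module and decodes the choice column. It makes two assumptions that are not confirmed here: that the export is comma-separated with a header row, and that choice holds JSON (as the table suggests). To stay self-contained, it builds a two-row stand-in with a subset of the columns instead of opening data.csv.

```python
import csv
import io
import json

# Stand-in for data.csv: two rows with a subset of the columns shown above.
# Assumption: comma-separated export with a header row and JSON in "choice".
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["choice", "i", "round", "tainted", "thread_type", "treatment"])
writer.writerow(['{"keep": 4.2}', "1", "1", "False", "CLIThread", "take"])
writer.writerow(['"ACCEPT"', "2", "1", "False", "CLIThread", "take"])
buffer.seek(0)

# For a real file, use: open("data.csv", newline="") instead of buffer.
rows = list(csv.DictReader(buffer))
for row in rows:
    row["choice"] = json.loads(row["choice"])  # decode the JSON-encoded choice

proposal, response = rows[0]["choice"], rows[1]["choice"]
print(proposal["keep"], response)
```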
If you re-run the experiment, enter garbage instead of the expected inputs and re-export the data, you will see that the tainted column becomes True. You can check for tainted to verify that inputs were received and processed as expected. Note that once any Thread responds invalidly, the Conversation will be stopped and all Threads will have tainted set to True. Thus, ego data's output may be partial.
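When analyzing many runs, you will probably want to drop tainted conversations before computing anything. A minimal sketch follows, assuming rows have been loaded as dicts with the convo and tainted columns from the table above and that tainted arrives as the text True or False; the helper name clean_conversations is made up for this example.

```python
def clean_conversations(rows):
    """Keep only rows from conversations in which no row is tainted."""
    # Collect the IDs of all conversations with at least one tainted row.
    tainted = {row["convo"] for row in rows if str(row["tainted"]) == "True"}
    return [row for row in rows if row["convo"] not in tainted]


rows = [
    {"convo": "a", "i": "1", "tainted": "False"},
    {"convo": "a", "i": "2", "tainted": "False"},
    {"convo": "b", "i": "1", "tainted": "True"},
]
kept = clean_conversations(rows)
print(len(kept), "rows kept")
```

Filtering by conversation rather than by row keeps the data consistent even if a partial export contains only some of a stopped conversation's rows.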
You can run your scenario five times by doing
(env) user@host:~$ ego run -n 5 built
Needless to say, you can replace 5 with any positive integer.
Feel free to experiment with our builder.
Note: Experts can set the environment variable BUILT_FILE to have ego use a different file name from built.json.
New: 📺 WATCH VIDEO TUTORIAL
oTree is a relatively popular framework for web-based experiments.
Here is oTree's own guide to installing it on your computer, and here is another one by us.
In general, after installing Python (and having the installer put it in your PATH), run
user@host:~$ python -m venv env
user@host:~$ source env/bin/activate
(env) user@host:~$ pip install -U alter_ego_llm # run this first
(env) user@host:~$ pip install -U otree # then run this
(env) user@host:~$ otree startproject my_project # this creates an oTree “project”, say no to sample games
(env) user@host:~$ cd my_project # this enters the oTree “project”
Put your OpenAI API key into the file openai_key; note how this file has no extension.
Go to otree/ in this repository. Copy either of the apps ego_human or ego_chat into your project directory. The app directory must be on the same level as settings.py. In other words, here is how the folder structure looks if you decide to check out ego_chat and you faithfully followed all previous instructions:
.
├── ego_chat
│   ├── Chat.html
│   ├── __init__.py
│   ├── prompts
│   │   └── system.txt
│   └── Welcome.html
├── openai_key
├── Procfile
├── requirements.txt
├── settings.py
├── _static
│   └── global
│       └── empty.css
└── _templates
    └── global
        └── Page.html
Amend settings.py as follows:
SESSION_CONFIGS = [
    dict(
        name="ego_chat",
        app_sequence=["ego_chat"],
        num_demo_participants=1,
    ),
]
Run otree devserver and open your browser to localhost:8000. From there, you can click ego_chat to invoke the app. Enjoy!
By the way, if you wish to use another model, you can change gpt-4 in line 31 in ego_chat/__init__.py to gpt-3.5-turbo or any other supported value.
Note: alter_ego saves message histories automatically in .ego_output in your oTree project folder. We did this so that nothing ever gets lost.
You can attach Threads (i.e., LLMs) and Conversations (i.e., bundles of LLMs) to oTree objects (participants, players, subsessions or sessions). This basically works as follows (in your app's __init__.py):
from alter_ego.agents import *
from alter_ego.utils import from_file
from alter_ego.exports.otree import link as ai

...

def creating_session(subsession):
    for player in subsession.get_players():
        ai(player).set(GPTThread(model="gpt-3.5-turbo", temperature=1.0))
Here, each player would get their own personal GPT agent. If you want to assign such an agent to a group, just do this:
def creating_session(subsession):
    for group in subsession.get_groups():
        ai(group).set(GPTThread(model="gpt-3.5-turbo", temperature=1.0))
Then, within your code, you can access the agent using a context manager. Here's an example of a simple live_method-based chat if we attached the Thread to the player object:
class Chat(Page):
    def live_method(player, data):
        if isinstance(data, str):
            with ai(player) as llm:
                # this submits the player's message and gets the response
                response = llm.submit(data, max_tokens=500)

            # note: if you put the AI on the "group" object or somewhere other than
            # the player, you may want to change this
            return {player.id_in_group: response}
If you want to set the system prompt, you can do:
def before_next_page(player, timeout_happened):
    with ai(player) as llm:
        # this sets the "system" prompt
        llm.system("You are participating in an experiment.")
You can do whatever you want, but always remember to open the LLM's context (using with ai(...) as llm) before performing any action. (This extra step is necessary because of oTree's ORM, which otherwise couldn't notice changes deep down in the Thread.)
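To see why such a context manager helps, consider a storage layer that only persists a value when it is assigned, not when it is mutated in place. The sketch below is a toy model of that pattern, not alter_ego's or oTree's actual implementation; FakeRecord and linked are hypothetical names invented for the illustration.

```python
from contextlib import contextmanager


class FakeRecord:
    """Stand-in for an ORM-backed object that only persists on assignment."""

    def __init__(self):
        self.saved = {}  # pretend database column


@contextmanager
def linked(record, key="thread"):
    """Load a mutable object, hand it out, and write it back on exit."""
    # Work on a copy, as if the object had just been deserialized.
    obj = list(record.saved.get(key, []))
    try:
        yield obj
    finally:
        # The explicit write-back is what makes in-place changes visible.
        record.saved[key] = obj


record = FakeRecord()
with linked(record) as history:
    history.append("You are participating in an experiment.")

print(record.saved["thread"])
```

Without the write-back on exit, the append would change only the temporary copy and the stored value would stay empty.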
You can also attach a whole Conversation to the aforementioned oTree objects.
We provide a simple Chat in this repository, see the directory otree/ego_chat.
Remember to put your API key into your oTree project folder. alter_ego saves message histories automatically in .ego_output in your oTree project folder.
New: 📺 WATCH VIDEO TUTORIAL
You can use the primitives exposed by this library to develop full-fledged experiments that go beyond the capabilities of our builder. The directory scenarios/ contains several examples, including the code for our paper's machine-machine interaction example (ego_prereg.py). Watch the video tutorial to get a feeling for what's possible.
When using any part of alter_ego in a scientific context, cite the following work:
@article{ego,
title={Integrating Machine Behavior into Human Subject Experiments: A User-friendly Toolkit and Illustrations},
author={Engel, Christoph and Grossmann, Max R. P. and Ockenfels, Axel},
year={2023},
}
alter_ego is © Max R. P. Grossmann et al., 2023. It is licensed under LGPLv3+. Please see LICENSE for details.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU Lesser General Public License for more details.