Additional code and content will be added soon.
You can run experiments using the run_experiments.py script. For example, the following command computes utilities over outcomes for GPT-4o:
python run_experiments.py --experiments compute_utilities --models gpt-4o
For more details about the experiments and available options, please see the Utility Analysis README.
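To repeat the command above for several models, a small wrapper can help. The following is a minimal sketch, not part of the repository: the helper names `build_command` and `run_for_models` are hypothetical, and it uses only the two flags shown above (`--experiments` and `--models`). It assumes run_experiments.py is in the current working directory and launches one process per model, since whether `--models` accepts several values at once is not stated here.

```python
import subprocess
import sys


def build_command(experiment: str, model: str) -> list[str]:
    """Build the argument list for one run_experiments.py invocation.

    Only the flags shown in this README are used; any other option
    should be checked against the Utility Analysis README.
    """
    return [
        sys.executable,
        "run_experiments.py",
        "--experiments", experiment,
        "--models", model,
    ]


def run_for_models(experiment: str, models: list[str], dry_run: bool = True) -> list[list[str]]:
    """Run (or, with dry_run=True, only print) one command per model."""
    commands = [build_command(experiment, model) for model in models]
    for cmd in commands:
        if dry_run:
            # Show the command without executing it.
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the script fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # "gpt-4o" is the only model name taken from this README.
    run_for_models("compute_utilities", ["gpt-4o"], dry_run=True)
```

With `dry_run=True` the commands are only printed, which makes it easy to inspect them before starting a long run. Pass `dry_run=False` to execute them.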
If you find this useful in your research, please consider citing:
@article{mazeika2025utility,
title={Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs},
author={Mazeika, Mantas and Yin, Xuwang and Tamirisa, Rishub and Lim, Jaehyuk and Lee, Bruce W and Ren, Richard and Phan, Long and Mu, Norman and Khoja, Adam and Zhang, Oliver and others},
journal={arXiv preprint arXiv:2502.08640},
year={2025}
}