Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs

Additional code and content will be added soon.

Utility Analysis

You can run experiments using the run_experiments.py script. For example, the following command will compute utilities over outcomes for GPT-4o:

python run_experiments.py --experiments compute_utilities --models gpt-4o

For more details about the experiments and available options, please see the Utility Analysis README.
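
To run the same experiment from Python, the command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_command`, `format_command`, and `run` are hypothetical, and only the two flags shown above (`--experiments` and `--models`) are used. It assumes the script is launched from the directory that contains run_experiments.py, and it passes one model per invocation because the source does not say whether `--models` accepts several values.

```python
import shlex
import subprocess

SCRIPT = "run_experiments.py"  # script name taken from the command above


def build_command(experiment, model, script=SCRIPT):
    """Return the argument list for one run (hypothetical helper)."""
    return ["python", script, "--experiments", experiment, "--models", model]


def format_command(experiment, model):
    """Return the command as a single shell-quoted string for logging."""
    return shlex.join(build_command(experiment, model))


def run(experiment, model):
    """Execute one run; raises CalledProcessError if the script fails.

    Assumes the current working directory contains run_experiments.py.
    """
    subprocess.run(build_command(experiment, model), check=True)


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    print(format_command("compute_utilities", "gpt-4o"))
```

Calling `run("compute_utilities", "gpt-4o")` would execute the command; the script itself only prints it so that it can be checked before any API calls are made.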

Citation

If you find this useful in your research, please consider citing:

@article{mazeika2025utility,
  title={Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs},
  author={Mazeika, Mantas and Yin, Xuwang and Tamirisa, Rishub and Lim, Jaehyuk and Lee, Bruce W and Ren, Richard and Phan, Long and Mu, Norman and Khoja, Adam and Zhang, Oliver and others},
  journal={arXiv preprint arXiv:2502.08640},
  year={2025}
}

