Elicit's debate work

The code used to run the debate experiments for our "GPT-3.5 judges can supervise GPT-4o debaters in a capability asymmetric debate" blog post.