forked from Zeta36/chess-alpha-zero
Notes on losses and overfitting
Michael Pang edited this page Dec 29, 2017
- A network that predicts a uniform policy every time incurs a cross-entropy loss of ln(C), where C is the number of categories. Here C = 1968, giving an "upper bound" of ln(1968) ≈ 7.585 on the policy loss.
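The baseline above can be checked directly; a minimal sketch (C = 1968 is the move-space size used here):

```python
import math

# A uniform policy assigns probability 1/C to each of C moves, so its
# cross-entropy against any one-hot target is -ln(1/C) = ln(C).
C = 1968  # size of the move space
print(round(math.log(C), 3))  # → 7.585
```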
- Conversely, if your loss is L, your model is roughly picking uniformly among the top e^L moves.
| Loss | Top x moves |
|---|---|
| 0 | 1 |
| 0.693 | 2 |
| 1.099 | 3 |
| 1.386 | 4 |
| 1.609 | 5 |
| 3.555 | 35 |
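The table above is just L = ln(x); a quick sketch to regenerate it:

```python
import math

# A loss of L corresponds to effectively choosing uniformly among
# the top e^L moves, so each row is (ln(x), x).
for x in (1, 2, 3, 4, 5, 35):
    print(f"loss {math.log(x):.3f} -> top {x} moves")
```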
- Self-play keeps resigning.
- If you assume most "normal" chess games are pretty even until halfway through — so for the first half of each game the outcome is close to a coin flip, the best value prediction is 0, and the expected squared error is about 1 — you get a lower bound of 0.5 on the asymptotic value MSE. (I think) this bound decreases with the average Elo of the players and with the Elo difference between them. If you assume half of GM games are draws too (z = 0, which a prediction of 0 gets exactly right), the lower bound goes down to 0.25.
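The lower-bound arithmetic can be sketched as follows, assuming outcomes z ∈ {−1, 0, +1} and that half of each game's positions are "even" coin flips (the other half assumed perfectly predictable):

```python
def even_position_mse(p_draw):
    # At an even position, z = 0 with probability p_draw, else z = +/-1
    # with equal probability. The best prediction is v = E[z] = 0,
    # so the MSE is E[z^2] = 1 - p_draw.
    return 1.0 - p_draw

# Half of all positions are even; the rest contribute zero error.
print(0.5 * even_position_mse(0.0))  # no draws   → 0.5
print(0.5 * even_position_mse(0.5))  # half draws → 0.25
```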
- AZ had 5000 TPUs running self-play and only 64 running SGD.