forked from Zeta36/chess-alpha-zero
Home
Michael Pang edited this page Dec 23, 2017 · 22 revisions
Welcome to the chess-alpha-zero wiki!
Model: Diagram
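- Input: 12 planes for pieces, 4 planes for castling rights, 1 plane for the 50-move rule and 1 plane for en passant (no move history; a flip-color transform shows the board from the side to move). This keeps the input simple and, in theory, reduces overfitting.
- Hidden: conv3-256 (one 3x3 convolution with 256 filters) followed by 7 residual blocks, with batch normalization in between (1 + 7 x 2 = 15 conv layers in total)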
- Output: 1968-wide vector for policy, scalar for value
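
To make the input and policy shapes above concrete, here is a minimal pure-Python sketch. It is not taken from the repository: the function names `encode_planes` and `uci_move_labels`, the plane ordering, and the choice to store the raw halfmove counter in the 50-move plane are all assumptions made for illustration. The second function shows one way to arrive at 1968 policy entries: every queen-like or knight-like move between two squares (1792 of them) plus the 176 pawn promotions for both colors.

```python
PIECES = "KQRBNPkqrbnp"   # assumed plane order; uppercase = side to move after flipping
FILES = "abcdefgh"

def encode_planes(fen):
    """Return 18 planes of 8x8 floats for a FEN string (assumed layout)."""
    placement, side, castling, ep, halfmove = fen.split()[:5]
    rows = placement.split("/")                      # rank 8 comes first in FEN
    if side == "b":
        # Flip-color transform: mirror the ranks and swap piece colors so the
        # side to move is always encoded as uppercase at the bottom of the board.
        rows = [row.swapcase() for row in reversed(rows)]
        castling = castling.swapcase()
        if ep != "-":
            ep = ep[0] + str(9 - int(ep[1]))
    planes = [[[0.0] * 8 for _ in range(8)] for _ in range(18)]
    for r, row in enumerate(rows):
        c = 0
        for ch in row:
            if ch.isdigit():
                c += int(ch)                         # run of empty squares
            else:
                planes[PIECES.index(ch)][r][c] = 1.0 # planes 0-11: pieces
                c += 1
    for i, flag in enumerate("KQkq"):                # planes 12-15: castling rights
        if flag in castling:
            planes[12 + i] = [[1.0] * 8 for _ in range(8)]
    # Plane 16: 50-move counter, stored unscaled here (an assumption).
    planes[16] = [[float(halfmove)] * 8 for _ in range(8)]
    if ep != "-":                                    # plane 17: en-passant target
        planes[17][8 - int(ep[1])][FILES.index(ep[0])] = 1.0
    return planes

def uci_move_labels():
    """Enumerate every UCI move string a policy head could index."""
    ranks = "12345678"
    queen_dirs = [(1, 0), (-1, 0), (0, 1), (0, -1),
                  (1, 1), (1, -1), (-1, 1), (-1, -1)]
    knight_jumps = [(1, 2), (2, 1), (2, -1), (1, -2),
                    (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
    labels = []
    for f in range(8):
        for r in range(8):
            targets = [(f + df * s, r + dr * s)
                       for df, dr in queen_dirs for s in range(1, 8)]
            targets += [(f + df, r + dr) for df, dr in knight_jumps]
            for tf, tr in targets:
                if 0 <= tf < 8 and 0 <= tr < 8:      # keep on-board targets only
                    labels.append(FILES[f] + ranks[r] + FILES[tf] + ranks[tr])
    for f in range(8):                               # promotions, both colors
        for tf in (f - 1, f, f + 1):
            if 0 <= tf < 8:
                for piece in "qrbn":
                    labels.append(FILES[f] + "7" + FILES[tf] + "8" + piece)
                    labels.append(FILES[f] + "2" + FILES[tf] + "1" + piece)
    return labels
```

Calling `encode_planes` on a FEN string gives an 18 x 8 x 8 nested list, and `len(uci_move_labels())` gives the width of the policy vector under these assumptions.
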
- All workers are multithreaded/multiprocess
- The SL (supervised learning) and opt (optimizer) workers are especially fast, loading thousands of games in minutes, which is great for collecting more data!
- The self-play, eval and uci workers are also several times faster.
- Weight policy by Elo
- Training on the material value of position
- Extraneous bias removal
- Implement MCTS in C++
- Variable regularization....
- Try 5x5 convs in the first few layers
- Get a model that beats the materialistic MCTS agent
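
The list above mentions the material value of a position without saying how it would be computed. As a rough sketch only, the snippet below sums conventional piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9) from the placement field of a FEN string. The function name `material_balance` and the piece values are assumptions, not something the wiki specifies.

```python
# Conventional piece values in pawns; an assumption, not taken from the wiki.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material_balance(fen):
    """Return white material minus black material, in pawns."""
    placement = fen.split()[0]          # first FEN field: piece placement
    balance = 0
    for ch in placement:
        if ch.isalpha():                # skip digits (empty squares) and "/"
            value = PIECE_VALUES[ch.lower()]
            balance += value if ch.isupper() else -value
    return balance
```

A positive result means White is ahead in material and a negative result means Black is ahead.
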