NNUE training
Code: https://github.com/ianfab/variant-nnue-pytorch
The training data generator prints the required code changes for the training code when you set a variant with setoption name UCI_Variant value yourvariantname. Check out a new branch with git and apply the changes for that variant there. Usually you can simply rely on what the training data generator prints, so you likely won't need to change any values manually; just copy and paste the code fragments into the corresponding places in the code. These are the code fragments that need to be replaced:
- lib/nnue_training_data_formats.h: PIECE_COUNT is the maximum number of pieces on the board. KING_SQUARES needs to be changed to 9 for Xiangqi/Janggi and to 1 for variants without kings. Remember to always recompile the training data loader after updating this file (https://github.com/ianfab/variant-nnue-pytorch#build-the-fast-dataloader).
#define FILES 8
#define RANKS 8
#define PIECE_TYPES 6
#define PIECE_COUNT 32
#define POCKETS false
#define KING_SQUARES FILES * RANKS
- variant.py: Similar updates are required here. In addition, the initial guesses for piece values need to be defined. This file defines the architecture of the input layer for the variant NNUE network that will be trained.
RANKS = 8
FILES = 8
SQUARES = RANKS * FILES
KING_SQUARES = RANKS * FILES
PIECE_TYPES = 6
PIECES = 2 * PIECE_TYPES
USE_POCKETS = False
POCKETS = 2 * FILES if USE_POCKETS else 0
PIECE_VALUES = {
1 : 126,
2 : 781,
3 : 825,
4 : 1276,
5 : 2538,
}
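The derived constants in variant.py follow directly from the board dimensions, so it can help to check them before editing the file. The following is a minimal sketch, not part of variant-nnue-pytorch: VariantDims is a hypothetical helper, and the Xiangqi dimensions (10 ranks, 9 files, 7 piece types) are assumptions used only for illustration. The values printed by the training data generator remain authoritative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantDims:
    """Hypothetical helper mirroring the constants defined in variant.py."""
    ranks: int
    files: int
    piece_types: int
    king_squares: int
    use_pockets: bool = False

    @property
    def squares(self) -> int:
        # SQUARES = RANKS * FILES
        return self.ranks * self.files

    @property
    def pieces(self) -> int:
        # PIECES = 2 * PIECE_TYPES (one set per color)
        return 2 * self.piece_types

    @property
    def pockets(self) -> int:
        # POCKETS = 2 * FILES if USE_POCKETS else 0
        return 2 * self.files if self.use_pockets else 0


# Values from the chess example above.
chess = VariantDims(ranks=8, files=8, piece_types=6, king_squares=64)

# Assumed Xiangqi dimensions; KING_SQUARES = 9 as stated above.
xiangqi = VariantDims(ranks=10, files=9, piece_types=7, king_squares=9)

for name, dims in (("chess", chess), ("xiangqi", xiangqi)):
    print(name, dims.squares, dims.pieces, dims.pockets, dims.king_squares)
```

If the numbers computed this way disagree with what the generator prints for your variant, trust the generator output.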
The training command works the same as for the official trainer, e.g.,
python train.py --threads 1 --num-workers 1 --gpus 1 --max_epochs 10 training_data.bin validation_data.bin
- --max_epochs: number of epochs for training. One epoch is 20M positions, so choose the number of epochs according to the amount of training data. E.g., for 200M positions in the training_data.bin file, --max_epochs should be 10 (or slightly above).
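The rule of thumb for --max_epochs can be written as a short calculation. This is only a sketch: suggested_max_epochs is a hypothetical helper, and the 20M positions per epoch figure is taken from the description above rather than read from the trainer.

```python
POSITIONS_PER_EPOCH = 20_000_000  # epoch size as stated above


def suggested_max_epochs(num_positions: int) -> int:
    """Return the smallest epoch count that covers all positions once."""
    if num_positions <= 0:
        raise ValueError("num_positions must be positive")
    # Integer ceiling division, avoids floating point rounding.
    return -(-num_positions // POSITIONS_PER_EPOCH)


print(suggested_max_epochs(200_000_000))
```

For 200M positions this gives 10, matching the example above; going slightly higher is fine.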
If you want to continue training from an existing network, you need to first serialize it:
python serialize.py --features='HalfKAv2' somevariantnet.nnue startingpointfortraining.pt
Then, when running the training, you need to specify the serialized network as input to resume from:
python train.py --resume-from-model startingpointfortraining.pt ...
After training, convert the resulting checkpoint to a .nnue file:
python serialize.py logs/default/version_0/checkpoints/last.ckpt yourvariant.nnue