
Training data generation

Fabian Fichter edited this page Mar 30, 2022 · 25 revisions

Training data generation using Fairy-Stockfish

Training data generator: https://github.com/ianfab/Fairy-Stockfish/tree/tools

Example

uci
setoption name Use NNUE value false
setoption name Threads value 8
setoption name Hash value 2048
setoption name UCI_Variant value extinction
isready
generate_training_data depth 2 count 10000000 random_multi_pv 4 random_multi_pv_diff 100 random_move_count 8 random_move_max_ply 20 write_min_ply 5 eval_limit 10000 set_recommended_uci_options data_format bin output_file_name extinction.bin
quit
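
The long generate_training_data line above is easier to vary between runs if it is assembled from a parameter table. The following is a minimal Python sketch, not part of Fairy-Stockfish: the helper name build_generate_command and the EXAMPLE_PARAMS table are made up for illustration. The parameter names and values are copied from the example, and treating set_recommended_uci_options as a bare flag without a value is read off that example.

```python
# Parameters copied from the example command above, in the same order.
# A value of None marks a bare flag that takes no value.
EXAMPLE_PARAMS = [
    ("depth", 2),
    ("count", 10_000_000),
    ("random_multi_pv", 4),
    ("random_multi_pv_diff", 100),
    ("random_move_count", 8),
    ("random_move_max_ply", 20),
    ("write_min_ply", 5),
    ("eval_limit", 10000),
    ("set_recommended_uci_options", None),
    ("data_format", "bin"),
    ("output_file_name", "extinction.bin"),
]


def build_generate_command(params):
    """Join (name, value) pairs into one generate_training_data line.

    Hypothetical helper for illustration; it only formats text and does
    not check that the engine accepts the given names or values.
    """
    tokens = ["generate_training_data"]
    for name, value in params:
        tokens.append(name)
        if value is not None:
            tokens.append(str(value))
    return " ".join(tokens)


if __name__ == "__main__":
    print(build_generate_command(EXAMPLE_PARAMS))
```

Changing a single entry in the table, such as depth or count, then yields a new command line without retyping the rest.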

If you want to use an existing NNUE network for training data generation, change Use NNUE to pure and set EvalFile to the network file, for example:

setoption name Use NNUE value pure
setoption name EvalFile value somevariant-1234567890ab.nnue
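
Both configurations (classical evaluation and an existing NNUE network) can be written by one small script generator. The sketch below is an illustration under assumptions: the function names build_uci_script and run_engine, the engine path passed to run_engine, and the position of the EvalFile line within the option block are all choices made here, not something this page prescribes. The setoption lines themselves are taken from the examples above.

```python
import subprocess


def build_uci_script(variant, generate_command, threads=8, hash_mb=2048,
                     eval_file=None):
    """Return the text of a UCI session for one data generation run.

    Without eval_file the script sets 'Use NNUE' to false; with eval_file
    it sets 'Use NNUE' to pure and adds the EvalFile option. Placing
    EvalFile directly after 'Use NNUE' is an assumption of this sketch.
    """
    use_nnue = "pure" if eval_file else "false"
    lines = ["uci", f"setoption name Use NNUE value {use_nnue}"]
    if eval_file:
        lines.append(f"setoption name EvalFile value {eval_file}")
    lines += [
        f"setoption name Threads value {threads}",
        f"setoption name Hash value {hash_mb}",
        f"setoption name UCI_Variant value {variant}",
        "isready",
        generate_command,
        "quit",
    ]
    return "\n".join(lines) + "\n"


def run_engine(engine_path, script):
    """Pipe the script to the engine's standard input.

    engine_path is a placeholder for a locally built binary of the
    tools branch; this function is not called in this sketch.
    """
    subprocess.run([engine_path], input=script, text=True, check=True)


if __name__ == "__main__":
    command = "generate_training_data depth 2 count 10000000 data_format bin output_file_name extinction.bin"
    print(build_uci_script("extinction", command,
                           eval_file="somevariant-1234567890ab.nnue"))
```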

Settings

  • Since only bin format is supported, you need to specify data_format bin.
  • The count and depth of the training data are the main factors influencing the strength of the resulting NNUE net. Usually at least 100M positions should be used to get decent results. A higher depth generally gives better data but takes much longer to generate. Depths 4-5 usually already give quite good results.
  • For variants with a low branching factor like losers/antichess, it is recommended to increase the random_multi_pv_diff in order to increase the variety of positions.
  • You can lower/increase the eval_diff_limit (default: 500) to be more/less restrictive in the definition of quiet positions, since this defines the filter threshold for the (absolute) difference between qsearch and static evaluation.
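
The effect of eval_diff_limit described in the last item can be pictured with a few lines of Python. This is only a sketch of the idea, not the generator's actual code: the function name passes_quiet_filter is made up, and whether a difference exactly equal to the limit still counts as quiet is an assumption here.

```python
def passes_quiet_filter(qsearch_eval, static_eval, eval_diff_limit=500):
    """Return True if a position would count as quiet under the limit.

    A position is treated as quiet when the absolute difference between
    its qsearch score and its static evaluation stays within the limit.
    The inclusive comparison is an assumption of this sketch.
    """
    return abs(qsearch_eval - static_eval) <= eval_diff_limit


if __name__ == "__main__":
    # Small difference: kept with the default limit of 500.
    print(passes_quiet_filter(120, 80))
    # Large difference: rejected by default, kept with a looser limit.
    print(passes_quiet_filter(900, 100))
    print(passes_quiet_filter(900, 100, eval_diff_limit=1000))
```

A lower limit rejects more positions, so reaching the same count takes longer, while a higher limit admits more tactically unstable positions.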

Generating data from old HalfKP networks

If you want to use an old HalfKP NNUE network to start generating training data, you can use the old generator code at https://github.com/ianfab/variant-nnue. However, since the training data format has changed in the meantime, this only works with older versions of the trainer. The latest compatible version should be https://github.com/ianfab/variant-nnue-pytorch/tree/91c302941acb131fbabb441dd6ced992ec04dfcb. The syntax of the training data generation command is also slightly different. An example is:

gensfen depth 2 loop 100000000 random_multi_pv 4 random_multi_pv_diff 100 random_move_count 8 random_move_maxply 20 write_minply 5 write_maxply 200 eval_limit 10000 set_recommended_uci_options sfen_format bin output_file_name extinction.bin
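
Comparing this command with the generate_training_data example further up suggests a handful of renamed keywords. The Python sketch below records only the renames visible in those two examples; the table RENAMES and the function to_legacy_command are hypothetical names, the list may be incomplete, and write_maxply has no counterpart in the newer example, so it is not added automatically.

```python
# Keyword renames inferred solely from the two example commands on this
# page (new generator -> old generator). This is not an official mapping.
RENAMES = {
    "generate_training_data": "gensfen",
    "count": "loop",
    "random_move_max_ply": "random_move_maxply",
    "write_min_ply": "write_minply",
    "data_format": "sfen_format",
}


def to_legacy_command(command):
    """Rewrite a new-style command line using the old keyword names.

    Every whitespace-separated token is looked up in RENAMES, so a value
    that happens to equal a keyword (for example a file named 'count')
    would be renamed as well; this sketch ignores that corner case.
    """
    return " ".join(RENAMES.get(token, token) for token in command.split())


if __name__ == "__main__":
    new_style = (
        "generate_training_data depth 2 count 100000000 "
        "random_move_max_ply 20 write_min_ply 5 data_format bin "
        "output_file_name extinction.bin"
    )
    print(to_legacy_command(new_style))
```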

Training data generation using YaneuraOu (for Shogi)

To generate data compatible with this trainer, you need to use the customized YaneuraOu training data generator from https://github.com/ianfab/YaneuraOu/tree/fairy_bin. Its syntax differs slightly from that of the Fairy-Stockfish data generator, as the example below shows.

Example

usi
setoption name Threads value 8
setoption name USI_Hash value 2048
isready
gensfen loop 20000000 depth 1 write_minply 6 random_multi_pv_diff 200 random_multi_pv 4 random_move_count 8 eval_limit 10000 output_file_name shogi.bin
quit