Fabian Fichter edited this page Apr 1, 2022 · 11 revisions

Training data format

The bin training data format used by this trainer is a modified version of the bin training data format used by the official Stockfish team. The original format stores each position in a block of 256 bits. For most variants with large boards and/or many piece types, however, 256 bits are not sufficient to store a position, or fitting one would at least require a very specialized format for each variant. That is undesirable in this context, where a generic format is more suitable.

In principle, with up to 26 different piece types and up to 120 squares (12x10 board) plus arbitrarily many pieces in hand, a training data format able to store positions of any configurable variant in Fairy-Stockfish would need to be very large. That is not practical, and more than 95% of variants fit into 512 bits in practice. As a pragmatic decision, the training data format therefore uses 512 bits per position. The current code from the training data generator that packs a position into 512 bits is at https://github.com/ianfab/Fairy-Stockfish/blob/a7f2df622b1e5719307265f492786d7709dd8085/src/tools/sfen_packer.cpp#L169-L230. It is structured as follows:

  • 1 bit for the side to move
  • 2 * 7 bits for the king squares (7 bits are required to encode one square on a 12x10 board)
  • 6 bits per non-king piece on the board (5 bits for the piece type and 1 bit for the color) and 1 bit per empty square, using Huffman encoding (https://github.com/ianfab/Fairy-Stockfish/blob/a7f2df622b1e5719307265f492786d7709dd8085/src/tools/sfen_packer.cpp#L144-L163)
  • 5 bits per piece type and color to store the number of pieces in hand, i.e., a number between 0 and 31
  • 4 bits for castling rights
  • 1 or 8 bits for the en passant square
  • 6 bits for the half-move clock
  • 8 bits for the full move counter
  • 9 more bits related to the half-move clock and full move counter
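
The list above can be turned into a rough bit count. The following is a minimal sketch, not the packer's actual code: the function name `packed_position_bits` and its parameters are hypothetical. It assumes that both kings are on the board, that the king squares are not encoded a second time in the board section, and that hand counts are written only for `hand_piece_types` piece types per color. Check these assumptions against sfen_packer.cpp before relying on the numbers.

```python
RECORD_BITS = 512  # size of one packed position in this format


def packed_position_bits(squares, non_king_pieces, hand_piece_types,
                         en_passant_bits=8):
    """Estimate the packed size of a position from the field list above.

    squares          -- number of board squares (e.g. 64, 81, 120)
    non_king_pieces  -- non-king pieces currently on the board
    hand_piece_types -- piece types per color that get a hand counter
                        (assumption: 0 for variants without drops)
    en_passant_bits  -- 1 or 8, as given in the field list
    """
    if en_passant_bits not in (1, 8):
        raise ValueError("en passant field is either 1 or 8 bits")
    # Assumption: the two king squares are skipped in the board section.
    empty_squares = squares - 2 - non_king_pieces
    if empty_squares < 0:
        raise ValueError("more pieces than available squares")

    bits = 1                          # side to move
    bits += 2 * 7                     # king squares
    bits += 6 * non_king_pieces       # 5 bits piece type + 1 bit color
    bits += 1 * empty_squares         # 1 bit per empty square
    bits += 5 * hand_piece_types * 2  # hand counters for both colors
    bits += 4                         # castling rights
    bits += en_passant_bits           # en passant square
    bits += 6 + 8 + 9                 # half-move clock, full move counter, extra bits
    return bits


# Chess start position: 64 squares, 30 non-king pieces, no hands.
print(packed_position_bits(64, 30, 0))   # 262
# Shogi with all 38 non-king pieces on a 9x9 board, 7 droppable types.
print(packed_position_bits(81, 38, 7))   # 389
# The same material on a 12x10 board.
print(packed_position_bits(120, 38, 7))  # 428
```

Under these assumptions all three examples stay below `RECORD_BITS`. Chess already exceeds 256 bits here because every non-king piece costs a flat 6 bits.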

As this breakdown shows, there is no easy rule for when a position fits into 512 bits. You need to consider the board size, the number of piece types, and the number of pieces on the board to calculate whether it does. However, the format is designed so that e.g. shogi, despite its large board, large number of pieces and piece types, and pieces in hand, still fits. The "worst case" example outlined in https://github.com/ianfab/Fairy-Stockfish/blob/a7f2df622b1e5719307265f492786d7709dd8085/src/tools/sfen_packer.cpp#L127-L136 would be shogi on a 12x10 board.
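
To judge whether a variant fits, the calculation can be inverted: given a board size and the number of hand counters, how many non-king pieces may be on the board before 512 bits are exceeded? The sketch below is an illustration only. The helper `max_non_king_pieces` is hypothetical, and it assumes the worst-case 8-bit en passant field, king squares that are not re-encoded in the board section, and hand counters for `hand_piece_types` piece types per color.

```python
from typing import Optional


def max_non_king_pieces(squares: int, hand_piece_types: int,
                        record_bits: int = 512) -> Optional[int]:
    """Largest number of non-king pieces on the board that still fits.

    Returns None if even a board with only the two kings does not fit.
    """
    # Fields that do not depend on the board contents:
    # side to move, king squares, castling, en passant (worst case),
    # half-move clock, full move counter, 9 extra counter bits.
    fixed = 1 + 2 * 7 + 4 + 8 + 6 + 8 + 9
    hand = 5 * hand_piece_types * 2
    # Every non-king square costs at least 1 bit (empty square).
    board_base = squares - 2
    budget = record_bits - fixed - hand - board_base
    if budget < 0:
        return None
    # Each piece costs 6 bits instead of 1, i.e. 5 bits extra.
    return min(board_base, budget // 5)


# 8x8 board without hands: every non-king square may be occupied.
print(max_non_king_pieces(64, 0))   # 62
# 12x10 board with 7 droppable piece types per color.
print(max_non_king_pieces(120, 7))  # 54
```

With these assumptions, a 12x10 board with seven hand counters per color leaves room for 54 non-king pieces, which is more than the 38 that shogi uses. An 8x8 variant without pieces in hand fits regardless of how many pieces it has.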

Variant NNUE HalfKAv2 architecture

TODO
