Madness in computer chess
... some experiments on learning to solve a testsuite while losing general strength.
The testsuite EN-Test_2022.epd was taken from https://solistachess.jimdosite.com/testing/ and contains 120 positions compiled by a guy who persistently argues that the strength of a chess engine should be tested by counting how many positions of a testsuite it can solve. Let's see if RubiChess can learn to solve this testsuite (better).
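For anyone unfamiliar with the EPD format used by such testsuites: each line holds the position fields of a FEN followed by opcodes such as `bm` (best move) and `id`. A minimal parser sketch; the example line is illustrative and not taken from EN-Test_2022.epd:

```python
def parse_epd(line):
    """Split an EPD line into its FEN position part and an opcode dict."""
    fields = line.split()
    fen = " ".join(fields[:4])  # piece placement, side to move, castling, en passant
    ops = {}
    for op in " ".join(fields[4:]).split(";"):
        op = op.strip()
        if op:
            key, _, value = op.partition(" ")
            ops[key] = value.strip('"')
    return fen, ops

# Illustrative example line (not from the actual testsuite):
line = 'r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - bm Bb5; id "example.001";'
fen, ops = parse_epd(line)
print(ops["bm"])  # -> Bb5, the move the engine must find to "solve" the position
```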
Where we come from:
```
RubiChess.exe -bench -epdfile EN-Test_2022.epd -maxtime 5 > nul
```

```
Benchmark results
========================================================================================
RubiChess 20230225 NN-0cea6 (avx2) (Build Feb 27 2023 09:18:06 commit 80fb9ee Clang 9)
UCI compatible chess engine by Andreas Matthies
----------------------------------------------------------------------------------------
System: AMD Ryzen 7 3700X 8-Core Processor Family: 23 Model: 113
CPU-Features of system: sse2 ssse3 popcnt lzcnt bmi1 avx2
CPU-Features of binary: sse2 ssse3 popcnt lzcnt bmi1 avx2
========================================================================================
...
=============================================================================================================
Overall: 54/120 = 45.0% 588.022705 sec. 1050412894 nodes 1786347 nps
```
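As a quick sanity check on the summary line, the reported nps is simply nodes divided by seconds, and the percentage is solved positions over total:

```python
# Verify the fields of the "Overall" line against each other.
nodes = 1_050_412_894
seconds = 588.022705
nps = int(nodes / seconds)
print(nps)  # matches the reported 1786347

solved, total = 54, 120
print(f"{100 * solved / total:.1f}%")  # 45.0%
```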
Commands to generate training positions using the branch trainonepd (wip), which labels positions whose root move is correct as a 'win' and positions whose root move is incorrect as a 'loss':
```
sfen1: gensfen loop 10000000 book C:\Entwicklung\EPD\EN-Test_2022.epd random_book_pos 0 result_on_bm 1 write_minply 0 maxply 20 depth 9 disable_prune 1 random_multi_pv 5 random_multi_pv_depth 7 random_multi_pv_diff 100
sfen2: gensfen loop 10000000 book C:\Entwicklung\EPD\EN-Test_2022.epd random_book_pos 0 result_on_bm 1 write_minply 0 maxply 20 depth 8 disable_prune 1 random_multi_pv 4 random_multi_pv_depth 6 random_multi_pv_diff 100
sfen3: gensfen loop 10000000 book C:\Entwicklung\EPD\EN-Test_2022.epd random_book_pos 0 result_on_bm 1 write_minply 0 maxply 30 depth 7 disable_prune 1 random_multi_pv 4 random_multi_pv_depth 5 random_multi_pv_diff 150
sfen4: gensfen loop 10000000 book C:\Entwicklung\EPD\EN-Test_2022.epd random_book_pos 0 result_on_bm 1 write_minply 0 maxply 30 depth 7 disable_prune 1 random_multi_pv 4 random_multi_pv_depth 5 random_multi_pv_diff 150 bm_factor 5
sfen5: gensfen loop 10000000 book C:\Entwicklung\EPD\EN-Test_2022.epd random_book_pos 0 result_on_bm 1 write_minply 0 maxply 6 depth 7 disable_prune 1 random_multi_pv 4 random_multi_pv_depth 5 random_multi_pv_diff 150 bm_factor 10
```
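The result_on_bm idea can be sketched roughly like this (a simplified illustration with invented names; the actual implementation lives in the trainonepd branch): every position collected along a playout inherits its game-result label from whether the engine's root move at the testsuite position matched the annotated best move, rather than from the real game outcome.

```python
# Simplified sketch of result_on_bm labeling. Names are illustrative only.
WIN, LOSS = 1, -1

def label_positions(positions, root_move, best_move):
    """Assign one win/loss label to every position of a single playout."""
    result = WIN if root_move == best_move else LOSS
    return [(pos, result) for pos in positions]

# Engine found the testsuite's bm -> all generated positions labeled 'win':
print(label_positions(["fen1", "fen2"], root_move="Bb5", best_move="Bb5"))
# Engine played something else -> all labeled 'loss':
print(label_positions(["fen1", "fen2"], root_move="Nc3", best_move="Bb5"))
```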
Training on the resulting concatenated binpack using lambda = 0.5 starting from current master network:
```
python train.py C:\Schach\nnue-work\EN-train.binpack C:\Schach\nnue-work\EN-train.binpack --lambda 0.5 --threads 8 --num-workers 8 --gpus 1 --batch-size 8192 --smart-fen-skipping --random-fen-skipping 4 --features="HalfKAv2_hm^" --network-save-period 1 --resume-from-model master-0cea6.pt
```
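To see why lambda matters here: in the nnue-pytorch trainer, lambda blends the two training targets, the search score and the game result. This is a sketch of the interpolation idea (in WDL space, not the trainer's exact loss): lambda = 1.0 trains purely on the search eval, lambda = 0.0 purely on the result, so lambda = 0.5 already gives the artificial win/loss labels half the weight.

```python
def blended_target(score_wdl, result_wdl, lam):
    """Interpolate between the search-eval target and the game-result target."""
    return lam * score_wdl + (1.0 - lam) * result_wdl

# A position labeled 'win' (1.0) whose search eval says roughly equal (0.5):
print(blended_target(0.5, 1.0, 0.5))   # 0.75  -> pulled halfway toward 'win'
print(blended_target(0.5, 1.0, 0.25))  # 0.875 -> pulled strongly toward 'win'
```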
Result after epoch 0 (network en-ep00-la05.nnue):
```
=============================================================================================================
Overall: 93/120 = 77.5% 588.023376 sec. 967252128 nodes 1644921 nps
```
Same results for epoch 1 and epoch 2.
Okay, already saturated. So let's try something even more extreme: an even lower lambda and no smart fen skipping:
```
python train.py C:\Schach\nnue-work\EN-train.binpack C:\Schach\nnue-work\EN-train.binpack --lambda 0.25 --threads 8 --num-workers 8 --gpus 1 --batch-size 8192 --random-fen-skipping 4 --features="HalfKAv2_hm^" --network-save-period 1 --resume-from-model master-0cea6.pt
```
Result after epoch 0 (network en-ep00-la025-nosfs.nnue):
```
=============================================================================================================
Overall: 95/120 = 79.2% 588.023438 sec. 991273523 nodes 1685772 nps
```
New record. Now let's reintroduce smart fen skipping and reduce random fen skipping to 3, with lambda still at 0.25:
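For context on the knob being turned here, a rough sketch of what --random-fen-skipping N appears to do in the nnue-pytorch data loader (an assumption about its behavior, not a quote of its code): each position is randomly skipped with probability N / (N + 1), so on average only 1 in N + 1 positions is actually used for training.

```python
import random

def keep_position(n_skip, rng):
    # Assumed semantics: skip with probability n_skip / (n_skip + 1).
    return rng.random() >= n_skip / (n_skip + 1)

rng = random.Random(42)
kept = sum(keep_position(3, rng) for _ in range(100_000))
print(kept / 100_000)  # roughly 0.25 of positions survive for N = 3
```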
Result after epoch 0 (network en-ep00-la025.nnue):
```
=============================================================================================================
Overall: 99/120 = 82.5% 588.023376 sec. 990447890 nodes 1684368 nps
```
Result after epoch 6 (network en-ep06-la025.nnue):
```
=============================================================================================================
Overall: 102/120 = 85.0% 588.023560 sec. 931192100 nodes 1583596 nps
```
Okay, this is good enough. We almost doubled the success rate on this testsuite. Now let's test whether this net is also twice as strong in normal play... (the network file is archived here).
Playing an STC match between en-ep06-la025 and master ended in a disaster:

```
Score of EN-ep06-la025 vs Master: 0 - 400 - 0 [0.000] 400
Elo difference: -inf +/- nan, LOS: 0.0 %, DrawRatio: 0.0 %
```
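Why the tool reports "-inf": the Elo difference is derived from the score fraction s via the logistic Elo model, elo = -400 * log10(1/s - 1), which diverges as s approaches 0. A score of 0 out of 400 lands exactly on that pole:

```python
import math

def elo_diff(score_fraction):
    """Elo difference implied by a match score fraction (logistic model)."""
    if score_fraction <= 0.0:
        return float("-inf")
    if score_fraction >= 1.0:
        return float("inf")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

print(elo_diff(0 / 400))  # -inf: zero points out of 400 games
print(elo_diff(0.75))     # roughly +191 Elo for a 75% score
```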
By training on <50MB of data generated from the testsuite positions, we improved the result on this testsuite from 45% to 85% and at the same time decreased playing strength to... well, it still needs to be measured, but it is certainly very low.