
I wanted to see how many games it would take to train up a full-size net (256x20-se) with a combination of supervised and reinforcement learning. I took the games from CCRL, CEGT and Kingbase, folded in 3 million Ender games, added 3.5 million self-play games with 6-man tablebases, and trained with a sliding 600k-game window.
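
As a rough illustration of what a sliding 600k-game window means in practice, here is a minimal sketch (not the actual lc0 training pipeline): the newest games replace the oldest ones in a fixed-size pool, and training batches are sampled from that pool. Names like `SlidingWindow`, `WINDOW_SIZE`, and the placeholder game records are purely illustrative assumptions.

```python
from collections import deque
import random

# Illustrative sketch only: keep the newest WINDOW_SIZE games and sample
# training data from that pool as new games arrive.
WINDOW_SIZE = 600_000  # the "sliding 600k window" from the text


class SlidingWindow:
    def __init__(self, size=WINDOW_SIZE):
        # deque with maxlen drops the oldest game automatically once full
        self.games = deque(maxlen=size)

    def add_games(self, new_games):
        """Fold newly generated or imported games into the window."""
        self.games.extend(new_games)

    def sample_batch(self, batch_size):
        """Draw a random batch of games for the next training step."""
        return random.sample(list(self.games), min(batch_size, len(self.games)))


if __name__ == "__main__":
    window = SlidingWindow()
    # Placeholder records; in practice these would be training chunks built
    # from the CCRL/CEGT/Kingbase, Ender, and self-play games.
    window.add_games(f"game_{i}" for i in range(1_000_000))
    batch = window.sample_batch(4)
    print(len(window.games), batch)  # only the newest 600k games remain
```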

The result is promising, but goosing Maddex to a higher level than t10 and t30 will take a lot more games. So I’m leaving the field to others and focusing on non-Leela-style nets.

The net is named after my late friend Bill Maddex. You can see one of his near misses, where he almost caught up with Tal.

   # PLAYER         :  RATING  ERROR  POINTS  PLAYED   (%)  CFS(%)    W    D    L  D(%)
   1 ID32930        :      72     19   110.5     184  60.1      99   44  133    7  72.3
   2 ID11258        :      31     19   100.0     184  54.3      99   36  128   20  69.6
   3 maddex-1000    :       0     12   157.5     368  42.8     ---   27  261   80  70.95

White advantage = 40.19 +/- 9.59
Draw rate (equal opponents) = 75.96 % +/- 2.41