Mokemokechicken's results were gorgeous: starting from scratch, he got a model that plays at a superhuman level (and beats classic AI engines). I've adapted his code to the game of Connect 4: https://github.com/Zeta36/connect4-alpha-zero
With just a CPU, my model learned in a few hours to play an almost perfect game, defeating every online Connect 4 engine I could find on the Internet.
I also made an adaptation for chess: https://github.com/Zeta36/chess-alpha-zero, but unfortunately I don't have a GPU to train this more complex game. The code is there and functional, though. If somebody has an idle GPU, it would be great to know whether the chess adaptation can learn to play at least at a good amateur level (it would probably take at least a week even with a powerful GPU).
Finally, I'd like to point out how easily the idea behind AlphaGo Zero can be applied to many other problems, almost just by changing the environment model (state, action, reward).
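To illustrate how little the environment needs to expose, here is a minimal Connect 4 environment sketch with just state, legal actions and reward. The class and method names are my own, purely illustrative, and not the ones used in the repositories above:

```python
# Minimal sketch of a Connect 4 environment exposing only state,
# legal actions and reward -- the three things an AlphaZero-style
# loop needs. Names are illustrative, not the repositories' API.

class Connect4Env:
    ROWS, COLS = 6, 7

    def __init__(self):
        # 0 = empty, +1 / -1 = the two players
        self.board = [[0] * self.COLS for _ in range(self.ROWS)]
        self.player = 1      # player to move
        self.winner = None

    def legal_actions(self):
        # a column is playable while its top cell is empty
        return [c for c in range(self.COLS) if self.board[0][c] == 0]

    def step(self, col):
        # drop a piece in `col`; return (state, reward, done),
        # reward from the mover's perspective
        row = max(r for r in range(self.ROWS) if self.board[r][col] == 0)
        self.board[row][col] = self.player
        if self._wins(row, col):
            self.winner = self.player
        self.player = -self.player
        done = self.winner is not None or not self.legal_actions()
        reward = 0 if self.winner is None else 1
        return self.board, reward, done

    def _wins(self, row, col):
        # check the four line directions through the last move
        p = self.board[row][col]
        for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
            count = 1
            for sign in (1, -1):
                r, c = row + sign * dr, col + sign * dc
                while 0 <= r < self.ROWS and 0 <= c < self.COLS \
                        and self.board[r][c] == p:
                    count += 1
                    r, c = r + sign * dr, c + sign * dc
            if count >= 4:
                return True
        return False
```

Swapping this class for a chess or Reversi equivalent is essentially all the "environment model" change amounts to; the self-play and training loop stays the same.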
First of all - wonderful projects. It's really impressive that the idea works in such a general way out of the box. However, to avoid over-hyping the idea of a superhuman-level agent being trained this fast without massively strong hardware, I have to note the following:
@mokemokechicken's model hasn't reached superhuman level yet, let alone surpassed classic AI engines. In fact, even now, @mokemokechicken's best model struggles against a low difficulty setting of a relatively mediocre program.
However, the very fact that substantial progress has been made without a distributed learning environment is definitely impressive.
You are right, @grolich. In fact, I've added a distributed option to the code so we can make use of multiple machines working at the same time.
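A distributed setup like that can be as simple as several self-play workers writing finished games to shared storage that the trainer polls. A rough sketch of that pattern (purely illustrative; the repository's actual mechanism may differ):

```python
# Sketch: file-based sharing between self-play workers and the
# trainer. Each worker dumps finished games as JSON files into a
# shared directory; the trainer loads whatever has accumulated.
# Purely illustrative -- not the repositories' actual code.
import json
import os
import uuid


def save_game(shared_dir, moves, result):
    # unique filename so concurrent workers never collide
    path = os.path.join(shared_dir, f"game-{uuid.uuid4().hex}.json")
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"moves": moves, "result": result}, f)
    os.replace(tmp, path)  # atomic: readers never see a partial file


def load_games(shared_dir):
    # the trainer calls this periodically to pick up new games
    games = []
    for name in os.listdir(shared_dir):
        if name.endswith(".json"):
            with open(os.path.join(shared_dir, name)) as f:
                games.append(json.load(f))
    return games
```

The appeal of this design is that workers need no coordination at all: any machine that can see the shared directory can contribute games.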
Also, I've just added a pre-training step using supervised learning on games (from PGN files), so we can bootstrap the policy before starting the self-play improvement. This is similar to what AlphaGo did in its original version.
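The core of that supervised step is just turning each recorded position into a training pair: the expert's move becomes a one-hot policy target, and the final game result becomes the value target, seen from the side to move. A sketch of that conversion (function name and the move encoding are my own, not the repository's actual API):

```python
# Sketch: convert one recorded game into supervised training
# examples for the policy/value network. `positions` is a list of
# (state, move_index) pairs extracted from a parsed PGN game,
# `result` is +1 / 0 / -1 from the first player's point of view,
# and `num_actions` is the size of the policy output.
# All names are illustrative, not the repository's real API.

def game_to_examples(positions, result, num_actions):
    examples = []
    to_move = 1  # the first player moves first; sign flips every ply
    for state, move_index in positions:
        policy = [0.0] * num_actions
        policy[move_index] = 1.0   # one-hot target on the expert move
        value = result * to_move   # outcome from the mover's perspective
        examples.append((state, policy, value))
        to_move = -to_move
    return examples
```

These (state, policy, value) triples can then be fed to the same training code that consumes self-play data, which is why the pre-training slots in so easily.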
Some related (Zero-style) projects in Python:
Mokemokechicken did a wonderful adaptation of this methodology to the game of Reversi: https://github.com/mokemokechicken/reversi-alpha-zero
I hope you like these projects.