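An AI for three-player mahjong (sanma).
It is built on BERT, a natural language model.
The model is first pretrained with a Masked Language Model objective and then trained as a Policy Value Network.
No reinforcement learning is used.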
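As an illustration of the Masked Language Model pretraining step, here is a minimal sketch of token masking. It assumes the standard BERT recipe (mask 15% of tokens; of those, 80% become `[MASK]`, 10% a random token, 10% unchanged), which this README does not confirm. The special-token IDs and the vocabulary size of 994 are taken from the input-token table in this document; `mask_tokens`, `IGNORE_INDEX`, and `mask_prob` are hypothetical names.

```python
import random

# Special-token IDs 0..5 as listed in the input-token table.
PAD_ID, CLS_ID, SEP_ID, EOS_ID, MASK_ID, UNK_ID = range(6)
VOCAB_SIZE = 994      # input token IDs span 0..993
IGNORE_INDEX = -100   # assumption: label value that the loss ignores


def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """Return (inputs, labels) for MLM training, BERT style (assumed recipe)."""
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in token_ids:
        # Never mask special tokens; skip most ordinary tokens as well.
        if tok <= UNK_ID or rng.random() >= mask_prob:
            inputs.append(tok)
            labels.append(IGNORE_INDEX)
            continue
        labels.append(tok)  # the model must recover the original token
        r = rng.random()
        if r < 0.8:
            inputs.append(MASK_ID)
        elif r < 0.9:
            inputs.append(rng.randrange(UNK_ID + 1, VOCAB_SIZE))
        else:
            inputs.append(tok)
    return inputs, labels
```

The policy/value training stage would then reuse the pretrained encoder with a different head; that part is not sketched here because the README gives no details about it.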
To get it running for now:
wahaha on feat/poetry2 is 📦 v0.1.0 via 🐍 v3.11.5 (wahaha-py3.11) on ☁️
❯ python bot/client.py
❯ python bot/mjai.py
usage: mjai.py [-h] [--host HOST] [--room ROOM] [--port PORT] [--name NAME] model_path
mjai.py: error: the following arguments are required: model_path
The tests can be run as follows:
❯ python -m unittest test/test_mj2vec.py
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
- Training used a Google Colaboratory TPU.
- Training code: train/wahaha_tpu.ipynb
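This program is intended to be used over the mjai protocol, adapted to three-player mahjong rules.
Please use the fork below.
To support the nukidora (North tile) rule used by Tenhou and Mahjong Soul, a `"type":"nukidora"` message has been added.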
(<-) Server to Client, (->) Client to Server
<- {"type":"tsumo","actor":0,"pai":"C"}
-> {"type":"nukidora","actor":0,"pai":"N"}
<- {"type":"nukidora","actor":0,"pai":"N"}
-> {"type":"none"}
<- {"type":"tsumo","actor":0,"pai":"E"}
-> {"type":"dahai","actor":0,"pai":"E","tsumogiri":true}
<- {"type":"dahai","actor":0,"pai":"E","tsumogiri":true}
-> {"type":"none"}
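The exchange above can be reproduced with a toy client. This is only a sketch under stated assumptions: `NukidoraDemoClient` and its rule (declare nukidora whenever an `N` is held, otherwise discard the drawn tile) are hypothetical stand-ins for the real model-driven policy, and only the message fields shown in the example are used.

```python
import json


class NukidoraDemoClient:
    """Toy mjai client (hypothetical): nukidora if holding "N", else tsumogiri."""

    def __init__(self, player_id, tehai):
        self.player_id = player_id
        self.tehai = list(tehai)

    def respond(self, line):
        """Take one server message (JSON text) and return the client reply."""
        msg = json.loads(line)
        if msg.get("type") == "tsumo" and msg.get("actor") == self.player_id:
            self.tehai.append(msg["pai"])
            if "N" in self.tehai:
                # Set the North tile aside; the server then deals a replacement.
                self.tehai.remove("N")
                reply = {"type": "nukidora", "actor": self.player_id, "pai": "N"}
            else:
                # Discard the tile that was just drawn.
                self.tehai.remove(msg["pai"])
                reply = {"type": "dahai", "actor": self.player_id,
                         "pai": msg["pai"], "tsumogiri": True}
        else:
            # Echoes of our own actions and other players' events need no action.
            reply = {"type": "none"}
        return json.dumps(reply, separators=(",", ":"))
```

Feeding the four server lines from the example to a client created as `NukidoraDemoClient(0, ["1p", "N"])` yields the four client lines in the same order.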
Training took 7 days on a Google Colaboratory TPU.
- Training data
  - Tenhou game records, Houou (Phoenix) table, three-player hanchan (2012 - 2020)
  - About 20,000 hanchan
- Test data
  - Tenhou game records, Houou (Phoenix) table, three-player hanchan (2011)
--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'test_accuracy': 0.7972599864006042, 'test_loss': 0.5289502143859863}
--------------------------------------------------------------------------------
0 0 0 0.0
1 924221 1008436 0.7901532670392568
2 0 0 0.0
3 0 0 0.0
4 0 0 0.0
5 0 0 0.0
6 0 0 0.0
7 0 0 0.0
8 0 0 0.0
9 923944 981816 0.8000052963080658
10 31774 30244 0.7062888506811268
11 591399 570972 0.7586834380670155
12 399405 372718 0.7584608202447963
13 271825 247356 0.7592215268681576
14 232758 195329 0.7641824818639321
15 186057 183558 0.7017781845520217
16 231263 231702 0.7009175578976444
17 273533 261344 0.7425615281008938
18 399271 422965 0.7114654876881066
19 593032 579594 0.7522921217265879
20 32571 28184 0.7395685495316492
21 597560 558024 0.7723359568764068
22 406378 383433 0.755300143701768
23 277395 247779 0.7668648271241711
24 235896 210529 0.745768991445359
25 189512 164245 0.7473286858047429
26 236268 202835 0.75676781620529
27 276652 269120 0.7344307372175981
28 403260 372343 0.7646739699685505
29 599142 570094 0.7645616336954958
30 775968 834380 0.7506711570267743
31 797026 806371 0.7792666154908845
32 851573 893626 0.7757674910980656
33 22125 14497 0.730082085948817
34 747669 708730 0.7996246807670058
35 746071 759753 0.7723444329933544
36 750832 823257 0.7449678532949007
37 339661 367860 0.7343581797422932
38 427234 424245 0.8522433970936605
39 3071 923 0.6728060671722643
40 16318 20094 0.6654722802826715
41 31115 34469 0.7157155705126346
42 1178109 1210985 0.9567269619359448
43 390583 393297 0.992595926233864
44 13047 15005 0.8504498500499833
45 1115650 1119056 0.9390620308545774
| Class | Token | Count | Offset | Range | Multiply | Positional embedding | Note |
|---|---|---|---|---|---|---|---|
| Special | [PAD] | 1 | 0 | [0...0] | * | - | - |
| | [CLS] | 1 | 1 | [1...1] | 1 | - | - |
| | [SEP] | 1 | 2 | [2...2] | 1 | - | - |
| | [EOS] | 1 | 3 | [3...3] | 1 | - | - |
| | [MASK] | 1 | 4 | [4...4] | 0..1 | - | - |
| | [UNK] | 1 | 5 | [5...5] | 0 | - | - |
| Category | style | 2 | 6 | [6...7] | 1 | - | tonpu (east-only) [0], hanchan [1] |
| | player_id (absolute) | 3 | 8 | [8...10] | 1 | - | East seat [0], South seat [1], West seat [2] |
| | bakaze | 3 | 11 | [11...13] | 1 | - | East round [0], South round [1], West round [2] |
| | kyoku | 3 | 14 | [14...16] | 1 | - | [0,1,2] |
| | honba | 4 | 17 | [17...20] | 1 | - | min(honba, 4) |
| | kyotaku | 3 | 21 | [21...23] | 1 | - | min(kyotaku, 3) |
| Numeric | delta_score (self - kamicha, the player to the left) | 97 | 24 | [24...120] | 1 | - | clip((delta_score/1000) + 48, 0, 96) |
| | delta_score (self - shimocha, the player to the right) | 97 | 121 | [121...217] | 1 | - | clip((delta_score/1000) + 48, 0, 96) |
| | num_pipais | 12 | 218 | [218...229] | 1 | - | clip(num_pipais, N) |
| Pai | dora_markers | 37 | 230 | [230...266] | 1..5 | - | tile37 multiply=1..5 |
| | tehai | 37 | 267 | [267...303] | 1..14 | - | tile136 (discardable hand tiles, excluding melded tiles; includes the drawn tile) |
| | tsumo (drawn tile) | 37 | 304 | [304...340] | 0..1 | - | tile37 (the tile drawn by the most recent tsumo; empty after dahai) |
| possible | can_dahai | 1 | 341 | [341...341] | 0..1 | - | |
| | can_reach | 1 | 342 | [342...342] | 0..1 | - | |
| | can_hora | 1 | 343 | [343...343] | 0..1 | - | |
| | can_ryukyoku | 1 | 344 | [344...344] | 0..1 | - | |
| | can_pon | 1 | 345 | [345...345] | 0..1 | - | |
| | can_daiminkan | 1 | 346 | [346...346] | 0..1 | - | |
| | can_ankan | 1 | 347 | [347...347] | 0..1 | - | |
| | can_kakan | 1 | 348 | [348...348] | 0..1 | - | |
| Player0 (relative) | (player0)dahai | 74 | 349 | [349...422] | * | ✔ | tile37 * 2 (tsumogiri = False [0..36], tsumogiri = True [37..73]) |
| | reach | 1 | 423 | [423...423] | * | ✔ | - |
| | pon | 37 | 424 | [424...460] | * | ✔ | tile37 |
| | daiminkan | 34 | 461 | [461...494] | * | ✔ | tile34 |
| | ankan | 34 | 495 | [495...528] | * | ✔ | tile34 |
| | kakan | 34 | 529 | [529...562] | * | ✔ | tile34 |
| | nukidora | 1 | 563 | [563...563] | * | ✔ | - |
| Player1 (relative) | dahai | 74 | 564 | [564...637] | * | ✔ | (same as Player0) |
| | reach | 1 | 638 | [638...638] | * | ✔ | |
| | pon | 37 | 639 | [639...675] | * | ✔ | |
| | daiminkan | 34 | 676 | [676...709] | * | ✔ | |
| | ankan | 34 | 710 | [710...743] | * | ✔ | |
| | kakan | 34 | 744 | [744...777] | * | ✔ | |
| | nukidora | 1 | 778 | [778...778] | * | ✔ | |
| Player2 (relative) | dahai | 74 | 779 | [779...852] | * | ✔ | (same as Player0) |
| | reach | 1 | 853 | [853...853] | * | ✔ | |
| | pon | 37 | 854 | [854...890] | * | ✔ | |
| | daiminkan | 34 | 891 | [891...924] | * | ✔ | |
| | ankan | 34 | 925 | [925...958] | * | ✔ | |
| | kakan | 34 | 959 | [959...992] | * | ✔ | |
| | nukidora | 1 | 993 | [993...993] | * | ✔ | |
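To make the offset arithmetic in the input-token table concrete, here is a small sketch that maps two kinds of features to token IDs. The offsets and the clip formula are copied from the table; the function names are hypothetical, and floor division by 1000 is an assumption, since the table only says `delta_score/1000`.

```python
# Offsets copied from the input-token table above.
DELTA_SCORE_KAMICHA_OFFSET = 24
DELTA_SCORE_SHIMOCHA_OFFSET = 121
PLAYER0_DAHAI_OFFSET = 349
PLAYER_BLOCK_SIZE = 215  # 564 - 349: number of tokens in one relative-player block
TILE37_SIZE = 37


def clip(value, low, high):
    """Clamp value into the closed interval [low, high]."""
    return max(low, min(high, value))


def delta_score_token(delta_score, offset):
    """Token ID for a score difference, bucketed per 1000 points."""
    # Assumption: integer floor division; the table does not specify rounding.
    bucket = clip(delta_score // 1000 + 48, 0, 96)
    return offset + bucket


def dahai_token(relative_player, tile37, tsumogiri):
    """Token ID for a discard by relative player 0..2."""
    if not 0 <= relative_player <= 2:
        raise ValueError("relative_player must be 0, 1, or 2")
    if not 0 <= tile37 < TILE37_SIZE:
        raise ValueError("tile37 must be in 0..36")
    # The second half of each 74-token dahai range encodes tsumogiri discards.
    shift = TILE37_SIZE if tsumogiri else 0
    return PLAYER0_DAHAI_OFFSET + PLAYER_BLOCK_SIZE * relative_player + tile37 + shift
```

For example, a score difference of 0 against kamicha lands in the middle of its range (`24 + 48 = 72`), and the last tsumogiri discard token of Player2 is `779 + 73 = 852`, which matches the end of the range listed in the table.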

| Class | Token | Count | Offset | Range | - | Note |
|---|---|---|---|---|---|---|
| Actual action | dahai | 37 | 0 | [0...36] | - | |
| | reach | 1 | 37 | [37...37] | - | - |
| | pon | 1 | 38 | [38...38] | - | |
| | daiminkan | 1 | 39 | [39...39] | - | |
| | ankan | 1 | 40 | [40...40] | - | tile34 |
| | kakan | 1 | 41 | [41...41] | - | tile34 |
| | nukidora | 1 | 42 | [42...42] | - | - |
| | hora | 1 | 43 | [43...43] | - | |
| | ryukyoku | 1 | 44 | [44...44] | - | - |
| | none(skip) | 1 | 45 | [45...45] | - | - |
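Reading the output table as a lookup, a predicted class ID can be decoded back to an action name. The sketch below copies the offsets and counts from the table; `decode_action` is a hypothetical helper, and returning the within-range index for `dahai` as a tile37 index is an assumption based on its count of 37.

```python
# (name, offset, count) triples copied from the output table above.
ACTION_RANGES = [
    ("dahai", 0, 37),
    ("reach", 37, 1),
    ("pon", 38, 1),
    ("daiminkan", 39, 1),
    ("ankan", 40, 1),
    ("kakan", 41, 1),
    ("nukidora", 42, 1),
    ("hora", 43, 1),
    ("ryukyoku", 44, 1),
    ("none", 45, 1),
]
NUM_CLASSES = 46


def decode_action(class_id):
    """Map an output class ID (0..45) to (action_name, index_within_range)."""
    for name, offset, count in ACTION_RANGES:
        if offset <= class_id < offset + count:
            return name, class_id - offset
    raise ValueError(f"class_id out of range: {class_id}")
```

With this mapping, `decode_action(36)` gives `("dahai", 36)` and `decode_action(42)` gives `("nukidora", 0)`.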
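The following transformations are applied as data augmentation:
- Shift the dragon tiles (haku -> hatsu, hatsu -> chun, chun -> haku)
- Swap pinzu and souzu
- Swap 1-man and 9-man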
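The three transformations can be sketched on mjai tile strings (`P`/`F`/`C` for haku/hatsu/chun, `1p`..`9p`, `1s`..`9s`, and a trailing `r` for red fives). `augment_pai` and its keyword flags are hypothetical; whether the project applies the transformations together, independently, or at random is not stated here, and the real implementation may operate on token IDs instead of strings.

```python
# Dragon shift from the list above: haku -> hatsu, hatsu -> chun, chun -> haku.
DRAGON_SHIFT = {"P": "F", "F": "C", "C": "P"}
SUIT_SWAP = {"p": "s", "s": "p"}
MAN_SWAP = {"1m": "9m", "9m": "1m"}


def augment_pai(pai, shift_dragons=True, swap_pin_sou=True, swap_1m_9m=True):
    """Apply the selected augmentations to one mjai tile string."""
    # Keep the red-five marker aside so that "5pr" maps to "5sr".
    red = pai.endswith("r")
    base = pai[:-1] if red else pai
    if shift_dragons and base in DRAGON_SHIFT:
        base = DRAGON_SHIFT[base]
    elif swap_pin_sou and len(base) == 2 and base[1] in SUIT_SWAP:
        base = base[0] + SUIT_SWAP[base[1]]
    elif swap_1m_9m and base in MAN_SWAP:
        base = MAN_SWAP[base]
    return base + ("r" if red else "")


def augment_tehai(tehai, **flags):
    """Apply the same augmentation consistently to every tile in a hand."""
    return [augment_pai(pai, **flags) for pai in tehai]
```

Each transformation is a permutation of the tile set, so it is meant to be applied consistently to every tile in a game record (hand, discards, dora markers, and the target action) rather than to individual tiles in isolation.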