dariogoetz/keyboard_layout_optimizer

Total number of found ngrams changes after splitting modifiers

Issue closed · 1 comment

Splitting of ngrams changes number of found ngrams

For example, this is what made commit 276d0d4 necessary.

Example (in each pair, the first number is the total before splitting and the second is the total after splitting):

Edit: Actually, it might make sense this way. I'm slightly confused.

uni 108783127.16766545
uni 115993432.49160336

tri 108134291.69376425
tri 158193331.2444684

bi 108458004.65033875
bi 140483393.64971808

uni 108783127.16766545
uni 115993432.49160337

tri 108134291.69376425
tri 158195158.18040243

bi 108458004.65033875
bi 140483393.649718

[2022-04-01T20:35:22Z INFO  layout_optimization_sa::optimization] Process   0: Starting layout: .czjöqfsäxwt,lngmdiüyßbuaeoprhkv ( 643.6)
uni 108783127.16766545
uni 115993432.49160343

tri 108134291.69376425
tri 158193343.21547225

bi 108458004.65033875
bi 140483393.64971817

uni 108783127.16766545
uni 115993432.49160343

tri 108134291.69376425
tri 158186184.70580426

bi 108458004.65033875
bi 140482049.10099033

uni 108783127.16766545
uni 115993432.49160337

This is not a bug. Modifier splitting takes an n-gram with "higher-layer" symbols and generates multiple new ones with symbols solely on the base layer. The sum of their weights is not necessarily equal to the "starting weight".
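
To make this concrete, here is a minimal Python sketch of the effect for unigrams. It is not the optimizer's actual Rust implementation. The toy layout, the `split_unigrams` helper, and the rule that every generated base-layer unigram inherits the full weight of the original symbol are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy layout: symbol -> (base-layer key, modifiers that must be held).
TOY_LAYOUT = {
    "a": ("a", []),
    "A": ("a", ["shift"]),
    "e": ("e", []),
    "E": ("e", ["shift"]),
}


def split_unigrams(unigrams, layout):
    """Replace every higher-layer symbol by its base key plus its modifiers.

    Assumption for this sketch: each generated base-layer unigram inherits
    the full weight of the original symbol.
    """
    result = defaultdict(float)
    for symbol, weight in unigrams.items():
        base_key, modifiers = layout[symbol]
        result[base_key] += weight
        for modifier in modifiers:
            # One extra unigram per modifier, so the total weight grows.
            result[modifier] += weight
    return dict(result)


unigrams = {"a": 10.0, "A": 2.0, "e": 5.0, "E": 1.0}
split = split_unigrams(unigrams, TOY_LAYOUT)

total_before = sum(unigrams.values())
total_after = sum(split.values())
print(total_before, total_after)  # prints: 18.0 21.0
```

In this toy case the total grows by exactly the weight of the higher-layer symbols ("A" and "E"), because each of them now also contributes a "shift" press. For bigrams and trigrams a single higher-layer n-gram can expand into several base-layer n-grams, so a larger gap between the two totals is expected there.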