RUB-SysSec/OMEN

Issue with Length data when training

lakiw opened this issue · 1 comment

lakiw commented

It appears that the length data saved in LN.level has an off-by-one error when training: passwords of length 4 are saved as length 5, passwords of length 5 are saved as length 6, and so on, with junk data saved for length 4 (using ngrams = 4).

For example, consider the following training set. Note that it does not produce "junk" data for length 4, but I've seen that appear with larger training sets such as the RockYou list:

test
test1
test1
test12
test12
test12
test123
test123
test123
test123

So there is 1 password of length 4, 2 of length 5, and so on. I used the following command for training:

./createNG -F -v -n 4 --iPwdList test.txt

The following is my LN.count file:
...
0 1
0 2
0 3
0 4
1 5
2 6
3 7
4 8
0 9
0 10
0 11
....
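
The expected per-length counts for the training set above can be checked independently of OMEN. The following is a minimal Python sketch; the helper name `length_histogram` is hypothetical and is not part of OMEN.

```python
from collections import Counter


def length_histogram(passwords):
    """Return {length: number of passwords with that length}, sorted by length.

    Hypothetical helper for illustration only; not part of OMEN.
    """
    return dict(sorted(Counter(len(p) for p in passwords).items()))


# The ten-password training set from the report.
training_set = [
    "test",
    "test1", "test1",
    "test12", "test12", "test12",
    "test123", "test123", "test123", "test123",
]

hist = length_histogram(training_set)
print(hist)  # {4: 1, 5: 2, 6: 3, 7: 4}
```

The counts 1, 2, 3, 4 belong to lengths 4 through 7, while the LN.count excerpt lists them next to 5 through 8.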

lakiw commented

*Smacks head* It looks like the value labeled 4 is the count for length 3 passwords. For example, if I modify the training set to:

tes
test
test1
test1
test12
test12
test12
test123
test123
test123
test123

I get the following in LN.count:

0 1
0 2
0 3
1 4
1 5
2 6
3 7
4 8
0 9
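
The relationship between the second column and the real password length can be tested against the output above. This is a rough Python sketch, assuming each data row is `count label` separated by whitespace, as in the excerpt; the real LN.count file may contain other lines, which the parser skips. The names `parse_count_rows` and `matching_offsets` are hypothetical and not part of OMEN.

```python
from collections import Counter


def parse_count_rows(text):
    """Parse 'count label' rows into {label: count}.

    Assumes two whitespace-separated integers per data row (an assumption
    based on the excerpt in this thread); any other line is ignored.
    """
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and all(p.isdigit() for p in parts):
            rows[int(parts[1])] = int(parts[0])
    return rows


def matching_offsets(rows, passwords, candidates=(-1, 0, 1)):
    """Return every offset d for which rows[length + d] equals the true
    count for each length, with all remaining rows equal to zero."""
    hist = Counter(len(p) for p in passwords)
    matches = []
    for d in candidates:
        shifted = {length + d: n for length, n in hist.items()}
        labels = set(rows) | set(shifted)
        if all(rows.get(k, 0) == shifted.get(k, 0) for k in labels):
            matches.append(d)
    return matches


# Rows copied from the LN.count excerpt for the modified training set.
ln_count = "0 1\n0 2\n0 3\n1 4\n1 5\n2 6\n3 7\n4 8\n0 9"

training_set = (
    ["tes", "test"]
    + ["test1"] * 2
    + ["test12"] * 3
    + ["test123"] * 4
)

rows = parse_count_rows(ln_count)
print(matching_offsets(rows, training_set))  # [1]
```

An offset of 1 is the only candidate that matches, which agrees with the observation that the row labeled 4 holds the count for length 3 passwords. The sketch only describes the excerpt; it does not show whether the labeling is intended.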