kimci86/bkcrack

Chinese encoding error - 中文编码错误

xmexg opened this issue · 9 comments

The -c option cannot find an entry when its name in the zip contains Chinese characters.

For example: names stored as GBK in the .zip, while the computer (Linux and Windows) uses UTF-8.

How to reproduce:
mkdir 这是中文路径
cp one_file 这是中文路径
zip -r archive.zip 这是中文路径 -Ppassword
Then, preferably on an operating system that uses a different default encoding:
bkcrack -L archive.zip
and then try -c 这是中文路径/one_file

Hi, could you provide a minimal dummy zip file that reproduces this issue?
Are you sure your console encoding is set to UTF-8?

Anyway, you can work around the problem by using the --cipher-index option instead of the -c option, passing a numeric index instead of an entry name. The index is the number in the first column of the bkcrack -L output.

This is a ZIP issue: zip encodes the filename with the system locale encoding instead of UTF-8.

There are two solutions: allowing users to manually specify the encoding (unzip-iconv) or automatically guessing the encoding (unarchiver).
But I'm not sure if this is necessary for a zip cracking tool.
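
To illustrate the "manually specify the encoding" option, here is a minimal Python sketch; it is not how bkcrack or unzip-iconv is implemented. It assumes the reader fell back to CP437 for names without the UTF-8 flag, which is what Python's zipfile module does; other tools may fall back differently. The helper name `recover_name` is hypothetical.

```python
# Sketch: recover a file name whose bytes were decoded with the wrong encoding.
# Assumption: the reader fell back to CP437 (as Python's zipfile does when the
# UTF-8 flag is absent). recover_name is a hypothetical helper.

def recover_name(garbled: str, actual_encoding: str = "gbk") -> str:
    """Undo the CP437 decode, then decode with the user-supplied encoding."""
    return garbled.encode("cp437").decode(actual_encoding)


original = "这是中文路径/one_file"
stored_bytes = original.encode("gbk")     # what a GBK-locale archiver writes
garbled = stored_bytes.decode("cp437")    # what a CP437-assuming reader shows
recovered = recover_name(garbled, "gbk")  # correct only if the guess is right

print(ascii(garbled))         # escaped form of the garbled name
print(recovered == original)
```

The round trip works because CP437 maps every byte value to a distinct character, so no information is lost by the wrong decode. The hard part remains knowing, or guessing, the actual encoding.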

ebsite-update-2.29-simple.zip
password is MirlKoi

--cipher-index is useful.

I deleted some files from the zip, but now the -L output is garbled on my Windows 10 machine.

Amazing!
After I deleted some files using Bandizip, the Chinese characters are displayed correctly on Linux.

Thank you @xmexg for providing the file. I can confirm the ZIP file contains names in GBK or a similar encoding. Unfortunately, this ZIP archive contains no additional metadata that could help decode the names correctly and automatically.

Thank you @Aloxaf for your enlightening explanation. You are right, handling encoding correctly in this case would require user input or guessing.

Adding a solution to bkcrack for such cases would be nice, but as there is a workaround with --cipher-index, I don't think I'll attempt to implement it any time soon. This probably deserves more documentation, though.

Thank you for reporting the issue and for the feedback.
Do you have any further comments on this issue? Otherwise, I will close it.

thank you

Why can't I get the password?

I suspect the line endings do not match.
The desktop.ini file contains Windows-style CR+LF line endings (0d 0a in hexadecimal).

$ xxd desktop.ini
00000000: 5b4c 6f63 616c 697a 6564 4669 6c65 4e61  [LocalizedFileNa
00000010: 6d65 735d 0d0a 3131 3534 3330 3039 385f  mes]..115430098_
00000020: 7030 2e6a 7067 3d40 3131 3534 3330 3039  p0.jpg=@11543009
00000030: 385f 7030 2e6a 7067 2c30 0d0a            8_p0.jpg,0..

Maybe your known-plaintext file try2.txt uses LF line endings.
That would not work:

$ xxd try2.txt
00000000: 5b4c 6f63 616c 697a 6564 4669 6c65 4e61  [LocalizedFileNa
00000010: 6d65 735d 0a                             mes].

You need this:

$ echo -en "[LocalizedFileNames]\r\n" > try_crlf.txt
$ xxd try_crlf.txt
00000000: 5b4c 6f63 616c 697a 6564 4669 6c65 4e61  [LocalizedFileNa
00000010: 6d65 735d 0d0a                           mes]..
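
If echo -e is not available (for example on Windows), the same file can be produced with a short Python sketch. The helper name `to_crlf` is hypothetical, and the file is written to a temporary directory only for illustration.

```python
# Sketch: build known plaintext with CR+LF line endings and check its bytes.
# to_crlf is a hypothetical helper, not part of bkcrack.
import tempfile
from pathlib import Path


def to_crlf(text: str) -> bytes:
    """Normalize all line endings to CR+LF and return ASCII bytes."""
    unix = text.replace("\r\n", "\n")
    return unix.replace("\n", "\r\n").encode("ascii")


plain = to_crlf("[LocalizedFileNames]\n")

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "try_crlf.txt"
    path.write_bytes(plain)        # binary write: no newline translation
    written = path.read_bytes()

print(written.hex(" "))            # should end with "0d 0a"
```

Writing in binary mode matters: text mode may translate line endings depending on the platform, which would reintroduce the mismatch.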

Ok, thank you.