kimbely0320/update_privacy_info.py

Exception: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 3080: invalid start byte

Closed this issue · 2 comments

Do you want to search for API usage 是否要搜索API使用情況 (y/n): y
Do you want to exclude certain directories for API search 您是否要為API搜索排除某些目錄 (y/n): n
Do you want to search for dependencies 是否要搜索套件是否有在列表中 (y/n): y
Do you want to exclude certain directories for dependencies search 您是否要為套件搜索排除某些目錄 (y/n): n
Do you want to download privacy_info for dependencies 是否要下載套件的 privacy_info (y/n): n
Progress: 78.44% (7596/9684)
Traceback (most recent call last):
  File "/Users/fri/works/zhipin/sawa/kkk.py", line 454, in <module>
    main()
  File "/Users/fri/works/zhipin/sawa/kkk.py", line 428, in main
    found_patterns, found_deps, search_tracking_auth = search_files(args.directory, excluded_dirs_api, excluded_dirs_deps, search_apis, search_deps)
  File "/Users/fri/works/zhipin/sawa/kkk.py", line 254, in search_files
    found_patterns, found_deps, search_tracking_auth = future.result()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    return self.__get_result()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/fri/works/zhipin/sawa/kkk.py", line 191, in process_file
    lines = f.readlines()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 3080: invalid start byte

I'm running into this problem too.

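Sorry for the trouble. Judging from the error message, this is most likely caused by a file in the scanned directory that uses a different (non-UTF-8) encoding.
I've pushed a version of the script that lists the problematic files and lets the run continue: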

https://github.com/kimbely0320/update_privacy_info.py/blob/main/update_privacy_info_without_UTF8.py

It uses the Python package chardet to detect file encodings,
so chardet needs to be installed before use: https://www.geeksforgeeks.org/how-to-install-python-chardet-on-macos/
Thank you for the report!
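
For anyone who would rather patch their own copy, here is a minimal sketch of the approach described above: try UTF-8 first, record the file if that fails, then fall back to an encoding guessed by chardet. It is not the code from update_privacy_info_without_UTF8.py. The function name `read_lines_tolerant` and the `problem_files` list are hypothetical names used only for illustration, and the final `errors="replace"` fallback is an assumption about how one might keep the scan running.

```python
try:
    import chardet  # third-party package; install with `pip install chardet`
except ImportError:
    chardet = None  # the sketch still runs, it just skips the encoding guess


def read_lines_tolerant(path, problem_files):
    """Read a text file line by line without crashing on non-UTF-8 bytes.

    Hypothetical helper: `path` is the file to read and `problem_files` is a
    list that collects the paths of files that are not valid UTF-8.
    """
    try:
        with open(path, encoding="utf-8") as f:
            return f.readlines()
    except UnicodeDecodeError:
        # Remember the offending file so it can be reported after the scan.
        problem_files.append(str(path))

    # Let chardet guess the encoding from the raw bytes (it may return None).
    guess = None
    if chardet is not None:
        with open(path, "rb") as f:
            guess = chardet.detect(f.read())["encoding"]

    if guess:
        try:
            with open(path, encoding=guess) as f:
                return f.readlines()
        except (UnicodeDecodeError, LookupError):
            pass  # wrong or unknown guess; fall through to the last resort

    # Last resort (an assumption, not the author's method): decode as UTF-8
    # and replace undecodable bytes with U+FFFD so processing can continue.
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.readlines()
```

In a script shaped like the one in the traceback, a helper of this kind would stand in for the `f.readlines()` call in `process_file`, and the collected `problem_files` could be printed once the scan finishes.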