UnicodeEncodeError when writing a commit message to the output file.
IvDinten opened this issue · 0 comments
Issue:
Encoding error: the commit message cannot be written to the output file.
Project causing issues:
"Arduino": {"local": None, "remote": "https://github.com/esp8266/Arduino"}
Date fetched: Jun 21, 2021
Commit hash: b4774edbfb60a969eb89ec5ce8c8938bead4f829
Reproduce:
- Run the repository_commits_mining.py script with the parameter l7 to select the local checkout of the Arduino project.
- The error appears in the terminal output.
Terminal output:
C:\Users\Imara\PycharmProjects\CPS_repo_mining\env\Scripts\python.exe C:/Users/Imara/PycharmProjects/CPS_repo_mining/pd/repository_commits_mining.py l7
Input: ['l7']
Keywords: ['performance', 'memory', 'runtime', 'slow', 'slower', 'slowing', 'fast', 'faster', 'increase', 'decrease', 'memory-heap', 'memory-leak', 'bottleneck', 'overhead', 'deadlock', 'livelock', 'infinite', 'impasse', 'hang']
Traceback (most recent call last):
  File "C:\Users\Imara\PycharmProjects\CPS_repo_mining\pd\repository_commits_mining.py", line 238, in <module>
    main()
  File "C:\Users\Imara\PycharmProjects\CPS_repo_mining\pd\repository_commits_mining.py", line 223, in main
    dig(project, projects[project])
  File "C:\Users\Imara\PycharmProjects\CPS_repo_mining\pd\repository_commits_mining.py", line 126, in dig
    print_commit_header(commit)
  File "C:\Users\Imara\PycharmProjects\CPS_repo_mining\pd\repository_commits_mining.py", line 48, in print_commit_header
    print(f"\nhash: {commit.hash}\ndate: {commit.committer_date}\nmessage: {commit.msg}", file=sourcefile)
  File "C:\Users\Imara\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\uff1a' in position 1170: character maps to <undefined>
Process finished with exit code 1
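Possible fix (untested against the actual script): the traceback shows the write going through the cp1252 codec, which cannot encode '\uff1a' (a fullwidth colon). The sketch below assumes sourcefile is created with the built-in open() without an explicit encoding; the issue does not show that line. The function write_commit_header, its parameters, and the sample values are hypothetical stand-ins for print_commit_header and a real commit.

```python
import os
import tempfile


def write_commit_header(path, commit_hash, committer_date, msg):
    """Append one commit header to the output file, encoded as UTF-8.

    Hypothetical stand-in for print_commit_header in the script.
    """
    # An explicit encoding avoids falling back to the locale default
    # (cp1252 in the traceback above), which cannot represent '\uff1a'.
    with open(path, "a", encoding="utf-8") as sourcefile:
        print(
            f"\nhash: {commit_hash}\ndate: {committer_date}\nmessage: {msg}",
            file=sourcefile,
        )


def demo():
    # Placeholder values; only the fullwidth colon matters here.
    path = os.path.join(tempfile.mkdtemp(), "commits.txt")
    write_commit_header(path, "abc123", "2021-01-01", "fix\uff1a fullwidth colon")
    with open(path, encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    print(repr(demo()))
```

If the output file has to stay in cp1252, open() also accepts an errors argument (for example errors="backslashreplace"), which writes an escape sequence instead of raising.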