'charmap' codec can't decode byte 0x8d in position 13: character maps to <undefined>
gokul427 opened this issue · 6 comments
iwn = pyiwn.IndoWordNet()
2022-11-06:13:21:14,789 INFO [iwn.py:43] Loading hindi language synsets...
UnicodeDecodeError Traceback (most recent call last)
Cell In [5], line 2
1 # language defaults to Hindi
----> 2 iwn = pyiwn.IndoWordNet()
File ~\anaconda3\envs\py38torch\lib\site-packages\pyiwn\iwn.py:45, in IndoWordNet.__init__(self, lang)
43 logger.info(f'Loading {lang.value} language synsets...')
44 self._synset_idx_map = {}
---> 45 self._synset_df = self._load_synset_file(lang.value)
46 self._synset_relations_dict = self._load_synset_relations()
File ~\anaconda3\envs\py38torch\lib\site-packages\pyiwn\iwn.py:51, in IndoWordNet._load_synset_file(self, lang)
49 filename = os.path.join(*[constants.IWN_DATA_PATH, 'synsets', 'all.{}'.format(lang)])
50 f = open(filename)
---> 51 synsets = list(map(lambda line: self._load_synset(line), f.readlines()))
52 synset_df = pd.DataFrame(synsets, columns=['synset_id', 'synsets', 'pos'])
53 synset_df = synset_df.dropna()
File ~\anaconda3\envs\py38torch\lib\encodings\cp1252.py:23, in IncrementalDecoder.decode(self, input, final)
22 def decode(self, input, final=False):
---> 23 return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 13: character maps to <undefined>
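In line 50 of iwn.py, change f = open(filename) to f = open(filename, encoding='utf-8'). Without an explicit encoding, open() on Windows falls back to cp1252 (as the traceback shows), which cannot decode the UTF-8 synset file.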
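For anyone who wants to see the failure in isolation, here is a minimal sketch that does not need pyiwn at all. It assumes the synset file is UTF-8 encoded Devanagari text; the sample string and the temporary file name all.hindi are made up for illustration. The Devanagari virama (U+094D) encodes to bytes ending in 0x8d, which cp1252 has no mapping for.

```python
import os
import tempfile

# Hypothetical sample line; it contains U+094D, whose UTF-8 form ends in byte 0x8d.
sample = "हिन्दी\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "all.hindi")  # stand-in for the real data file
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Reading with cp1252 (the usual Windows default) fails on byte 0x8d.
    try:
        with open(path, encoding="cp1252") as f:
            f.readlines()
        cp1252_failed = False
    except UnicodeDecodeError as exc:
        cp1252_failed = True
        print("cp1252:", exc)

    # Reading with an explicit UTF-8 encoding succeeds.
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

print("utf-8:", lines)
```

Passing encoding='utf-8' explicitly makes the read independent of the machine's locale, which is why the one-line change above fixes the error.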
Hi, I am getting the same issue, is there a way to resolve it?
Open iwn.py in your installed pyiwn package and change line 50 from f = open(filename) to f = open(filename, encoding='utf-8').
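To confirm that the default encoding is the culprit on your machine, a quick check like the one below may help. It only reports what Python would use when open() is called without an encoding. As a possible alternative to editing the installed file, Python 3.7+ has a UTF-8 mode (start the interpreter with the environment variable PYTHONUTF8=1 or with python -X utf8). Whether that is convenient for a notebook kernel depends on how the kernel is launched, so treat it as an option to try rather than a guaranteed fix.

```python
import locale
import sys

# Encoding that open() uses when no encoding argument is given.
default_encoding = locale.getpreferredencoding(False)

# True when the interpreter was started in UTF-8 mode (PYTHONUTF8=1 or -X utf8).
utf8_mode = bool(sys.flags.utf8_mode)

print("default text encoding:", default_encoding)
print("UTF-8 mode enabled:", utf8_mode)
```

If the first line prints cp1252 and UTF-8 mode is off, an open() call with no encoding argument will hit the same error on this data.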
It's not working. How can I solve this?
UnicodeDecodeError Traceback (most recent call last)
Cell In[4], line 3
1 import pyiwn
2 pyiwn.download()
----> 3 iwn = pyiwn.IndoWordNet()
File c:\g\pproject\aimbot\bot\bot\lib\site-packages\pyiwn\iwn.py:45, in IndoWordNet.__init__(self, lang)
43 logger.info(f'Loading {lang.value} language synsets...')
44 self._synset_idx_map = {}
---> 45 self._synset_df = self._load_synset_file(lang.value)
46 self._synset_relations_dict = self._load_synset_relations()
File c:\g\pproject\aimbot\bot\bot\lib\site-packages\pyiwn\iwn.py:51, in IndoWordNet._load_synset_file(self, lang)
49 filename = os.path.join(*[constants.IWN_DATA_PATH, 'synsets', 'all.{}'.format(lang)])
50 f = open(filename)
---> 51 synsets = list(map(lambda line: self._load_synset(line), f.readlines()))
52 synset_df = pd.DataFrame(synsets, columns=['synset_id', 'synsets', 'pos'])
53 synset_df = synset_df.dropna()
File ~\anaconda3\lib\encodings\cp1252.py:23, in IncrementalDecoder.decode(self, input, final)
22 def decode(self, input, final=False):
---> 23 return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 13: character maps to <undefined>
This is my error message.
Can you please help me resolve this?
I also changed line 50 to use encoding='utf-8'.
Sorry, it worked. I forgot to restart my environment.
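A note on why the restart mattered: Python keeps already-imported modules in sys.modules, so editing a .py file on disk has no effect on a running kernel until the module is reloaded or the interpreter is restarted. The sketch below shows this with a throwaway module. The module name demo_patched_mod and its ENCODING variable are invented for the demo and have nothing to do with pyiwn's internals.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

with tempfile.TemporaryDirectory() as tmp:
    mod_path = os.path.join(tmp, "demo_patched_mod.py")  # hypothetical module
    with open(mod_path, "w", encoding="utf-8") as f:
        f.write("ENCODING = None\n")

    sys.path.insert(0, tmp)
    try:
        import demo_patched_mod
        before = demo_patched_mod.ENCODING

        # "Patch" the file on disk, like editing line 50 of an installed package.
        with open(mod_path, "w", encoding="utf-8") as f:
            f.write("ENCODING = 'utf-8'\n")

        import demo_patched_mod  # no effect: the module is already cached
        still_old = demo_patched_mod.ENCODING

        importlib.reload(demo_patched_mod)  # re-executes the edited source
        after = demo_patched_mod.ENCODING
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_patched_mod", None)

print(before, still_old, after)
```

For a real package, restarting the kernel or environment is the simplest way to be sure the edited file is the one in use.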