Weird failure on unicode/windows or bytes/linux build
pombredanne opened this issue · 3 comments
The Windows tests with a unicode build and the Linux tests with a non-unicode (bytes) build are both failing this test:
On bytes/linux:
_____________________________________________________ TestTrieIterators.test_items _____________________________________________________
self = <test_unit.TestTrieIterators testMethod=test_items>
    def test_items(self):
        A = self.A
        I = []
        for i, w in enumerate(self.words):
            A.add_word(conv(w), i + 1)
            I.append((conv(w), i + 1))
        L = [x for x in A.items()]
        self.assertEqual(len(L), len(I))
>       self.assertEqual(set(L), set(I))
E AssertionError: Items in the first set but not the second:
E (b'a\x00h', 3)
E (b'p\x00y\x00t\x00', 2)
E (b'c\x00o\x00r\x00a\x00', 4)
E (b'w\x00o\x00', 1)
E Items in the second set but not the first:
E (b'word', 1)
E (b'python', 2)
E (b'aho', 3)
E (b'corasick', 4)
tests/test_unit.py:431: AssertionError
On windows/unicode:
________________________ TestTrieIterators.test_items _________________________
self = <test_unit.TestTrieIterators testMethod=test_items>
    def test_items(self):
        A = self.A
        I = []
        for i, w in enumerate(self.words):
            A.add_word(conv(w), i + 1)
            I.append((conv(w), i + 1))
        L = [x for x in A.items()]
        self.assertEqual(len(L), len(I))
>       self.assertEqual(set(L), set(I))
E AssertionError: Items in the first set but not the second:
E ('w\x00o\x00', 1)
E ('p\x00y\x00t\x00', 2)
E ('a\x00h', 3)
E ('c\x00o\x00r\x00a\x00', 4)
E Items in the second set but not the first:
E ('corasick', 4)
E ('python', 2)
E ('aho', 3)
E ('word', 1)
D:\a\pyahocorasick\pyahocorasick\tests\test_unit.py:422: AssertionError
I wonder if this is because there are some narrow vs. wide Python unicode builds done on windows?
It feels as if a null was being injected after each letter and as if Windows was built with bytes and not the unicode define.
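For what it's worth, the exact keys in both tracebacks can be reproduced in plain Python. This is only a sketch of the symptom, not a confirmed diagnosis: it assumes the keys are held as 2-byte (UTF-16-LE style) code units but are read back one byte at a time, using the character count as the length. `truncated_utf16` is a hypothetical helper written for this illustration and is not part of pyahocorasick.

```python
def truncated_utf16(word):
    """Encode `word` as UTF-16-LE, then keep only len(word) bytes.

    Assumption for illustration: this mimics reading 2-byte code units
    as single bytes while using the character count as the byte length.
    """
    return word.encode("utf-16-le")[:len(word)]


# The same words and values that appear in the failing test output.
words = ["word", "python", "aho", "corasick"]

for value, w in enumerate(words, start=1):
    # Print the expected key next to the mangled key for comparison.
    print((w, value), "->", (truncated_utf16(w), value))
```

Under that assumption each mangled key comes out half-length with a null after every letter, which matches the items reported in the first set on both platforms (as bytes on Linux, as str on Windows).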
True, it looks as you described. I have no windows machine to check this.
No worries! I am looking into this with tests ... and will push some investigation in my WIP branch for 2.0
The issue is also on Linux FWIW