WojciechMula/aspell-python

Speller.suggest() segfaults under Python 3

Troon opened this issue · 7 comments

System is an Ubuntu 14.04 Linode, with the latest zip file built as per instructions (python3 setup.py build && python3 setup.py install for Python 3). All dependencies are satisfied, I believe (aspell, aspell-en, libaspell-dev, libaspell15).

Python 2:

>>> import aspell
>>> s = aspell.Speller()
>>> s.check('test')
1
>>> s.suggest('test')
['test', 'testy', 'teat', 'tests', 'rest', 'yest', 'Tet', 'EST', 'est', 'Tess', 'text', 'Best', 'TESL', 'West', 'Zest', 'best', 'fest', 'jest', 'lest', 'nest', 'pest', 'tent', 'vest', 'west', 'zest', "Te's", "test's", "Tet's"]

Python 3:

>>> import aspell
>>> s = aspell.Speller()
>>> s.check('test')
True
>>> s.suggest('test')
Segmentation fault

Any suggestions? Note the different result type for Speller.check()...

Please try to run pyaspell.py directly, i.e.: python3 pyaspell.py.

Is you machine running 32-bit or 64-bit system?

Interesting: that works fine. The check() and suggest() tests work as expected.

root@vps:/usr/src/aspell-python-master# python3 pyaspell.py
True
True
["b'when'", "b'wen'", "b'wean'", "b'ween'"]
["b'ween'", "b'when'", "b'wen'", "b'wean'"]
(etc)

It also works when I import pyaspell.py:

root@vps:/usr/src/aspell-python-master# python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyaspell
>>> s = pyaspell.Aspell()
>>> s.check('test')
True
>>> s.suggest('test')
["b'test'", "b'testy'", "b'teat'", "b'tests'", "b'rest'", "b'yest'", "b'Tet'", "b'EST'", "b'est'", "b'Tess'", "b'text'", "b'Best'", "b'TESL'", "b'West'", "b'Zest'", "b'best'", "b'fest'", "b'jest'", "b'lest'", "b'nest'", "b'pest'", "b'tent'", "b'vest'", "b'west'", "b'zest'", 'b"Te\'s"', 'b"test\'s"', 'b"Tet\'s"']

It's a 64-bit VPS running Xen with two Xeon E5-2680 cores. OS is Ubuntu Server 14.04. There's not much else on the box: it's been cleaned out to see if moving my project to Python 3 is feasible. The build process generates a couple of warnings that look like they might be key to this:

root@vps:/usr/src/aspell-python-master# python3 setup.py build
running build
running build_ext
building 'aspell' extension
creating build
creating build/temp.linux-x86_64-3.4
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.4m -c aspell.c -o build/temp.linux-x86_64-3.4/aspell.o
aspell.c: In function ‘m_suggest’:
aspell.c:465:2: warning: passing argument 5 of ‘get_arg_string’ from incompatible pointer type [enabled by default]
  buf = get_arg_string(self, args, 0, &word, &length);
  ^
aspell.c:393:18: note: expected ‘Py_ssize_t *’ but argument is of type ‘int *’
 static PyObject* get_arg_string(
                  ^
aspell.c: In function ‘m_addtoSession’:
aspell.c:526:2: warning: passing argument 5 of ‘get_arg_string’ from incompatible pointer type [enabled by default]
  buf = get_arg_string(self, args, 0, &word, &length);
  ^
aspell.c:393:18: note: expected ‘Py_ssize_t *’ but argument is of type ‘int *’
 static PyObject* get_arg_string(
                  ^
aspell.c: In function ‘m_addReplacement’:
aspell.c:556:2: warning: passing argument 5 of ‘get_arg_string’ from incompatible pointer type [enabled by default]
  Mbuf = get_arg_string(self, args, 0, &mis, &ml);
  ^
aspell.c:393:18: note: expected ‘Py_ssize_t *’ but argument is of type ‘int *’
 static PyObject* get_arg_string(
                  ^
aspell.c:562:2: warning: passing argument 5 of ‘get_arg_string’ from incompatible pointer type [enabled by default]
  Cbuf = get_arg_string(self, args, 1, &cor, &cl);
  ^
aspell.c:393:18: note: expected ‘Py_ssize_t *’ but argument is of type ‘int *’
 static PyObject* get_arg_string(
                  ^
aspell.c: In function ‘PyInit_aspell’:
aspell.c:734:12: warning: variable ‘dict’ set but not used [-Wunused-but-set-variable]
  PyObject *dict;
            ^
creating build/lib.linux-x86_64-3.4
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.4/aspell.o -L/usr/local/lib/ -laspell -o build/lib.linux-x86_64-3.4/aspell.cpython-34m.so

Hope that helps.

Thanks for the details, it seems there are some issues on 64-bit architecture. Unfortunately I have no access to 64-bit machine, and thus can't track down the causes of problem.

So the only solution I could suggest is to use the python module. It's api is the same as a C extension.

OK, thanks. It's not a problem for me right now, as there are several other blockers to moving to Python 3 — but it's good to know there's a solution for this issue. See your personal email.

Fixed it myself. The fix is too simple for a pull request, just change four ints to Py_ssize_t. It then compiles and works fine here.

--- /usr/src/aspell-python-master/aspell.c      2013-09-14 12:18:23.000000000 +0100
+++ aspell.c    2014-05-20 10:32:57.673569112 +0100
@@ -458,7 +458,7 @@
 /* method:suggest ************************************************************/
 static PyObject* m_suggest(PyObject* self, PyObject* args) {
        char* word;
-       int   length;
+       Py_ssize_t   length;
        PyObject* buf;
        PyObject* list;

@@ -520,7 +520,7 @@
 /* method:addtoSession ********************************************************/
 static PyObject* m_addtoSession(PyObject* self, PyObject* args) {
        char *word;
-       int   length;
+       Py_ssize_t   length;
        PyObject* buf;

        buf = get_arg_string(self, args, 0, &word, &length);
@@ -548,8 +548,8 @@

 /* method:addReplacement ******************************************************/
 static PyObject* m_addReplacement(PyObject* self, PyObject* args) {
-       char *mis; int ml;
-       char *cor; int cl;
+       char *mis; Py_ssize_t ml;
+       char *cor; Py_ssize_t cl;
        PyObject* Mbuf;
        PyObject* Cbuf;

Thank you very much for your help and for this fix. I'll update sources in coming days.

I've applied your patch. Thanks again.