endorno/python-tesseract

Installed to custom prefix, can't import tesseract module

GoogleCodeExporter opened this issue · 11 comments

I am running on a Red Hat system where I do not have admin privileges or 
privileges to write to /usr/local.  Instead, I have recreated the /usr 
directory structure under /foo/bar/usr, where /foo/bar is a directory in which 
I have full rwx privileges.  I had to build and install Python 2.7.3 and 
SWIG 2.0.8 from source into /foo/bar/usr.  Then I repeated these steps for 
tesseract and its dependencies.  After getting python-tesseract to build and 
install, I can't successfully import the tesseract module.

The output I get is as follows:

>>> import tesseract
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tesseract.py", line 26, in <module>
    _tesseract = swig_import_helper()
  File "tesseract.py", line 18, in swig_import_helper
    import _tesseract
ImportError: libtesseract.so.3: cannot open shared object file: No such file or 
directory

Here are the steps I followed:
1. mkdir /foo/bar/usr/local
2. set environment variables with
export CFLAGS=-I/foo/bar/usr/local/include
export LDFLAGS=-L/foo/bar/usr/local/lib
export LIBLEPT_HEADERSDIR=/foo/bar/usr/local/include
3. Compile and install jpeg-8d, giflib-4.1.6, libpng-1.5.13, tiff-4.0.0, 
zlib-1.2.7, and leptonica-1.69 using this command for each library:
./configure --prefix=/foo/bar/usr/local; make; make install;
4. install python 2.7.3 from source to /foo/bar/usr/local:
./configure --prefix=/foo/bar/usr/local; make; make install;
5. grab tesseract-ocr-read-only from svn, then compile and install:
./configure --prefix=/foo/bar/usr/local; make; make install;
6. copy tesseract-ocr-read-only/ccutil/tprintf.h to /foo/bar/usr/local/include
7. svn checkout http://python-tesseract.googlecode.com/svn/trunk python-tesseract
8. cd python-tesseract
9. modify lines 99 & 100 of setup.py:
incls = ['/usr/include', '/usr/local/include', '/foo/bar/usr/local/include']
libs=['/usr/lib', '/usr/local/lib', '/foo/bar/usr/local/lib']
10. build and install python-tesseract running:
python config.py;
python setup.py clean;
python setup.py build;
python setup.py install --prefix=/foo/bar/usr/local

11. Enter the python prompt and import tesseract. This fails with the same 
ImportError shown at the top of this report.


My python-tesseract build output is as follows:
os=linux
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
aaaaaaaaaaaaaaaaaaaaaaaaaaa
['.', '/foo/bar/usr/local/include/tesseract', 
'/foo/bar/usr/local/include/leptonica', '/usr/local/include/opencv']
running clean
os=linux
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
aaaaaaaaaaaaaaaaaaaaaaaaaaa
['.', '/foo/bar/usr/local/include/tesseract', 
'/foo/bar/usr/local/include/leptonica', '/usr/local/include/opencv']
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
copying tesseract.py -> build/lib.linux-x86_64-2.7
running build_ext
building '_tesseract' extension
swigging tesseract.i to tesseract_wrap.cpp
swig -python -c++ -I/foo/bar/usr/local/include/tesseract 
-I/foo/bar/usr/local/include/leptonica -o tesseract_wrap.cpp tesseract.i
/foo/bar/usr/local/include/tesseract/publictypes.h:78: Warning 462: Unable to 
set dimensionless array variable
creating build/temp.linux-x86_64-2.7
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes 
-I/foo/bar/usr/local/include -fPIC -I. -I/foo/bar/usr/local/include/tesseract 
-I/foo/bar/usr/local/include/leptonica -I/usr/local/include/opencv 
-I/foo/bar/usr/local/include/python2.7 -c tesseract_wrap.cpp -o 
build/temp.linux-x86_64-2.7/tesseract_wrap.o
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for 
Ada/C/ObjC but not for C++
tesseract_wrap.cpp: In function ‘void SWIG_InitializeModule(void*)’:
tesseract_wrap.cpp:6675: warning: statement has no effect
gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes 
-I/foo/bar/usr/local/include -fPIC -I. -I/foo/bar/usr/local/include/tesseract 
-I/foo/bar/usr/local/include/leptonica -I/usr/local/include/opencv 
-I/foo/bar/usr/local/include/python2.7 -c main_dummy.cpp -o 
build/temp.linux-x86_64-2.7/main_dummy.o
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for 
Ada/C/ObjC but not for C++
main_dummy.cpp: In function ‘char* ProcessPagesRaw(const char*, tesseract::TessBaseAPI*)’:
main_dummy.cpp:129: warning: address of local variable ‘[..]g’ returned
main_dummy.cpp: At global scope:
main_dummy.cpp:196: warning: ‘iplimage_Type’ defined but not used
main_dummy.cpp:203: warning: ‘int is_none(PyObject*)’ defined but not used
g++ -pthread -shared -L/cliphomes/gtg426r/local/lib -L/foo/bar/usr/local/lib 
-I/foo/bar/usr/local/include build/temp.linux-x86_64-2.7/tesseract_wrap.o 
build/temp.linux-x86_64-2.7/main_dummy.o -lstdc++ -ltesseract -llept -o 
build/lib.linux-x86_64-2.7/_tesseract.so
os=linux
Current Version : tesseract
===========['stdc++', 'tesseract', 'lept']===========
aaaaaaaaaaaaaaaaaaaaaaaaaaa
['.', '/foo/bar/usr/local/include/tesseract', 
'/foo/bar/usr/local/include/leptonica', '/usr/local/include/opencv']
running install
running bdist_egg
running egg_info
writing python_tesseract.egg-info/PKG-INFO
writing top-level names to python_tesseract.egg-info/top_level.txt
writing dependency_links to python_tesseract.egg-info/dependency_links.txt
writing manifest file 'python_tesseract.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying tesseract.py -> build/lib.linux-x86_64-2.7
running build_ext
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-2.7/tesseract.py -> build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-2.7/_tesseract.so -> build/bdist.linux-x86_64/egg
byte-compiling build/bdist.linux-x86_64/egg/tesseract.py to tesseract.pyc
creating stub loader for _tesseract.so
byte-compiling build/bdist.linux-x86_64/egg/_tesseract.py to _tesseract.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/PKG-INFO -> 
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/SOURCES.txt -> 
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/dependency_links.txt -> 
build/bdist.linux-x86_64/egg/EGG-INFO
copying python_tesseract.egg-info/top_level.txt -> 
build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
tesseract: module references __file__
creating dist
creating 'dist/python_tesseract-tesseract-py2.7-linux-x86_64.egg' and adding 
'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing python_tesseract-tesseract-py2.7-linux-x86_64.egg
removing '/foo/bar/usr/local/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg' (and everything under it)
creating /foo/bar/usr/local/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg
Extracting python_tesseract-tesseract-py2.7-linux-x86_64.egg to /foo/bar/usr/local/lib/python2.7/site-packages
python-tesseract tesseract is already the active version in easy-install.pth

Installed /foo/bar/usr/local/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg
Processing dependencies for python-tesseract==tesseract
Finished processing dependencies for python-tesseract==tesseract

Original issue reported on code.google.com by stevencd...@gmail.com on 16 Nov 2012 at 5:56

Then you need to manually copy all the .so files into a single directory. I 
guess that will work.

Original comment by FreeT...@gmail.com on 4 Dec 2012 at 2:50

I'm trying to figure out where the module is attempting to load 
libtesseract.so.3 from.  The library is already in my /foo/bar/usr/local/lib, 
and its permissions are set to 777 just in case.
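
One way to approach that question is to check the directories the dynamic loader has been told about. The sketch below is only an illustration and is not part of python-tesseract. It assumes a Linux system where the loader consults the LD_LIBRARY_PATH environment variable, and the helper name find_in_library_path is made up for this example. It inspects LD_LIBRARY_PATH only; the loader also uses its cache and default system directories, which this sketch ignores.

```python
import os


def find_in_library_path(soname, env=None):
    """Return the paths under LD_LIBRARY_PATH entries that contain `soname`.

    Hypothetical helper: only LD_LIBRARY_PATH is inspected, not the
    loader cache or the built-in default directories.
    """
    env = os.environ if env is None else env
    hits = []
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if not directory:
            # Empty entries are skipped here; this sketch does not treat
            # them as the current directory.
            continue
        candidate = os.path.join(directory, soname)
        if os.path.exists(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    matches = find_in_library_path("libtesseract.so.3")
    if matches:
        print("found: " + ", ".join(matches))
    else:
        print("libtesseract.so.3 is not in any LD_LIBRARY_PATH directory")
```

If the list comes back empty even though the file exists in /foo/bar/usr/local/lib, that directory is probably not on the loader's search path for the running process.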

Original comment by stevencd...@gmail.com on 23 Dec 2012 at 8:00

I solved the problem.  I had to modify the PYTHONPATH to include the custom bin 
directory /foo/bar/usr/local/bin

Original comment by stevencd...@gmail.com on 22 Jan 2013 at 6:13

Interesting....
How about installing python-tesseract to the user directory?

python setup.py install --user

Of course, I haven't tried it myself.
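
For reference, --user installs into the per-user site-packages directory instead of the interpreter's prefix. A small sketch using the standard site module shows where that is for the running interpreter. The paths depend on the system, so no expected output is given.

```python
import site

# Base directory for per-user installs (typically ~/.local on Linux).
print("user base: " + site.getuserbase())

# Directory where `setup.py install --user` places modules and eggs.
print("user site-packages: " + site.getusersitepackages())

# True when this interpreter adds the user site directory to sys.path.
print("user site enabled: " + str(site.ENABLE_USER_SITE))
```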

Original comment by FreeT...@gmail.com on 23 Jan 2013 at 2:58

I apologize, I added my lib directory, not my bin directory, so my PYTHONPATH = 
/foo/bar/usr/local/lib, which is where libtesseract.so is located.  

I attempted to install to the user directory, but that does not change 
anything related to the location of libtesseract.so.  If I had compiled 
tesseract and all its dependencies to my user directory, then this might work, 
but I suspect I would still have to add the user directory containing 
libtesseract.so to my PYTHONPATH.
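
A note for readers hitting the same ImportError: the message comes from the dynamic loader, which on Linux reads extra library directories from LD_LIBRARY_PATH, whereas PYTHONPATH only affects where Python looks for modules. The sketch below is an illustration under assumptions, not what was run in this thread. It assumes the library lives in /foo/bar/usr/local/lib as described above, and with_library_dir is a hypothetical helper name. The variable has to be in place before the interpreter starts, which is why the example launches a child process.

```python
import os
import subprocess
import sys


def with_library_dir(libdir, env=None):
    """Return a copy of `env` with `libdir` prepended to LD_LIBRARY_PATH."""
    new_env = dict(os.environ if env is None else env)
    current = new_env.get("LD_LIBRARY_PATH", "")
    new_env["LD_LIBRARY_PATH"] = libdir + (os.pathsep + current if current else "")
    return new_env


if __name__ == "__main__":
    # Assumed location of libtesseract.so.3, taken from this thread.
    env = with_library_dir("/foo/bar/usr/local/lib")
    # Try the import in a child interpreter that starts with the new value.
    status = subprocess.call([sys.executable, "-c", "import tesseract"], env=env)
    print("import succeeded" if status == 0 else "import failed")
```

The same effect can be had by exporting LD_LIBRARY_PATH in the shell before starting python.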

Original comment by stevencd...@gmail.com on 23 Jan 2013 at 8:14

If it is not too much trouble, please share your success story (step by 
step, preferably) on installing python-tesseract without admin rights.

Original comment by FreeT...@gmail.com on 24 Jan 2013 at 5:30

{{{
eli@eli-iMac:~$ python setup.py install --user
eli@eli-iMac:~$ find . |grep -i _tesseract.so
~/.local/lib/python2.7/site-packages/python_tesseract-tesseract-py2.7-linux-x86_64.egg/_tesseract.so
}}}

Original comment by FreeT...@gmail.com on 16 Feb 2013 at 9:22

A user install would work, but in my particular situation my user drive does 
not have much space. I am close to finishing my write-up of what I did. I had 
to confirm my steps because I did this spread out over several months.  I will 
post it soon.

Original comment by stevencd...@gmail.com on 16 Feb 2013 at 9:29

Better late than never... I think these are still incomplete, but I no longer 
have the time to fully retest my instructions.

In order to set up python-tesseract under a custom prefix, I had to set up a 
completely fresh build environment in a directory where I have permissions.  
I also wanted OpenCV support, so I had to build OpenCV too.  I will include 
the steps I followed for each of these separately.  Assume the directory where 
you have full rights is /foo/bar.  I chose to install tesseract and all 
dependent libraries to /foo/bar/usr/local.  I had to build GCC 4.4.7 from 
source as well, because my system gcc was producing errors with the current 
version of OpenCV, 2.4.3.

Building GCC 4.4.7:
mkdir /foo/bar/sandbox;
cd /foo/bar/sandbox;
svn co svn://gcc.gnu.org/svn/gcc/tags/gcc_4_4_7_release gcc;
cd /foo/bar/sandbox/gcc;
wget ftp://ftp.gnu.org/gnu/gmp/gmp-4.3.2.tar.gz
wget http://www.mpfr.org/mpfr-2.4.2/mpfr-2.4.2.tar.gz
wget http://www.multiprecision.org/mpc/download/mpc-0.8.1.tar.gz
tar -xzf gmp-4.3.2.tar.gz
mv gmp-4.3.2 gmp
tar -xzf mpfr-2.4.2.tar.gz
mv mpfr-2.4.2 mpfr
tar -xzf mpc-0.8.1.tar.gz
mv mpc-0.8.1 mpc
mkdir /foo/bar/sandbox/gcc-build;
cd /foo/bar/sandbox/gcc-build;
/foo/bar/sandbox/gcc/configure --prefix=/foo/bar/usr \
--with-local-prefix=/foo/bar/usr/local
make; 
make install;


Building OpenCV 2.4.3:

cd /foo/bar/sandbox;
wget http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.3/OpenCV-2.4.3.tar.bz2/download;
tar -xjf OpenCV-2.4.3.tar.bz2
cd OpenCV-2.4.3
comment out lines 80:104 of CMakeLists.txt
mkdir /foo/bar/sandbox/OpenCV-2.4.3/build
cd /foo/bar/sandbox/OpenCV-2.4.3/build
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/foo/bar/usr/local \
-D BUILD_PYTHON_SUPPORT=ON -D CMAKE_C_COMPILER=/foo/bar/local/bin/gcc \
-D CMAKE_CXX_COMPILER=/foo/bar/local/bin/g++ \
-D CMAKE_LIBRARY_PATH=/foo/bar/usr/local/lib64:/foo/bar/usr/local/lib \
-D CMAKE_INCLUDE_PATH=/foo/bar/usr/local/include \
/foo/bar/sandbox/OpenCV-2.4.3
cmake -D OPENCV_BUILD_3RDPARTY_LIBS=TRUE -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_SYSTEM_PREFIX_PATH=/foo/bar/local \
-D CMAKE_INSTALL_PREFIX=/foo/bar/local -D BUILD_PYTHON_SUPPORT=ON \
-D CMAKE_C_COMPILER=/foo/bar/local/bin/gcc \
-D CMAKE_CXX_COMPILER=/foo/bar/local/bin/g++ \
-D CMAKE_LIBRARY_PATH=/foo/bar/local/lib64:/foo/bar/local/lib:/usr/lib64:/usr/lib \
/foo/bar/src/OpenCV-2.4.3

make;
make install;

Here are the steps I followed:
1. mkdir /foo/bar/usr/local
2. set environment variables with
export CFLAGS=-I/foo/bar/usr/local/include
export LDFLAGS="-L/foo/bar/usr/local/lib -L/foo/bar/usr/local/lib64"
export LIBLEPT_HEADERSDIR=/foo/bar/usr/local/include
3. Compile and install jpeg-8d, giflib-4.1.6, libpng-1.5.13, tiff-4.0.0, 
zlib-1.2.7, and leptonica-1.69 using this command for each library:
./configure --prefix=/foo/bar/usr/local; make; make install;
4. install python 2.7.3 from source to /foo/bar/usr/local:
./configure --prefix=/foo/bar/usr/local; make; make install;
5. grab tesseract-ocr-read-only from svn, then compile and install:
./configure --prefix=/foo/bar/usr/local; make; make install;
6. copy tesseract-ocr-read-only/ccutil/tprintf.h to /foo/bar/usr/local/include
7. svn checkout http://python-tesseract.googlecode.com/svn/trunk python-tesseract
8. cd python-tesseract
9. modify lines 99 & 100 of setup.py:
incls = ['/usr/include', '/usr/local/include', '/foo/bar/usr/local/include']
libs=['/usr/lib', '/usr/local/lib', '/foo/bar/usr/local/lib']
10. build and install python-tesseract running:
python config.py;
python setup.py clean;
python setup.py build;
python setup.py install --prefix=/foo/bar/usr/local
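
The environment from step 2 can also be written as a small helper. This is a sketch under assumptions, not the project's build script: prefix_build_env is a made-up name, and each library directory gets its own -L flag because the linker treats the argument of -L as a single directory.

```python
import os


def prefix_build_env(prefix, libdirs=("lib", "lib64")):
    """Build the compiler-related variables from step 2 for a custom prefix."""
    include_dir = os.path.join(prefix, "include")
    # One -L flag per library directory, separated by spaces.
    ldflags = " ".join("-L" + os.path.join(prefix, d) for d in libdirs)
    return {
        "CFLAGS": "-I" + include_dir,
        "LDFLAGS": ldflags,
        # Exported in step 2 to point at the Leptonica headers.
        "LIBLEPT_HEADERSDIR": include_dir,
    }


if __name__ == "__main__":
    # Print shell export lines for the prefix used in this write-up.
    for name, value in sorted(prefix_build_env("/foo/bar/usr/local").items()):
        print('export {0}="{1}"'.format(name, value))
```

The returned dictionary could be merged into os.environ before the configure and make steps are started from a script, or the printed export lines could be pasted into a shell.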

Original comment by stevencd...@gmail.com on 13 Jun 2013 at 9:21

Original comment by FreeT...@gmail.com on 9 May 2014 at 7:49

  • Changed state: Fixed