Updated replacement for vader-sentiment (vaderSentiment-js) that runs original vaderSentiment natively, using CPython.
vader-sentiment is outdated and doesn't support emoji - vital element of comments nowadays.
Also, spawning python process from nodejs is slow. This module can analyze ~60,000 strings per second by re-using the same python instance, and doesn't leak memory.
- Run
nvm i
- Run
npm ci
- Install dependencies. On Ubuntu, run
./deps.sh
- Build python and boost statically (see
build_boost.sh
andbuild_python.sh
) - Generate prebuild:
npm run prebuild
(might need to update hardcoded libs) - Run it:
node test.js "your awesome text goes here"
Note: target machine should have same python (and glibc?) version, as dev machine
Note: currently is was tested on Ubuntu 20, and on Heroku (with Ubuntu 20). Check mac
branch for Mac OS
Note: if you'd like to use it on another linux distro, at least install Python 3.8.5 and make sure that python3
points to it
Note: requires Python 3.8.5 (installed on Ubuntu 20 by default)
- Run
npm install Maxim-Mazurok/vader-sentiment-cpython
- Add
install_vader.sh
script to your project, and run it before using module. You can include it instart
script, like so:"start": "bash ./node_modules/vader-sentiment-cpython/install_vader.sh && node start.js",
It only has prebuild for Ubuntu 20 and Python 3.8.5.
I have to figure out how to make it more portable and independent from installed Python interpreter.
As far as I understand, I have to create portable Python interpreter, similar to how pyinstaller works.
Disclaimer: this is my (very limited) understanding.
- It bundles copy of the system (
libpython3.8.so.1.0
in my case) - It doesn't bundle system Python interpreter executable. Instead, it builds own Python interpreter (kind of), which is embedded in the final binary executable.
- I think they use
objcopy
on Linux to bundle archives into one binary, see api.py#L645 - Python "starts" in
pyi_pylib_start_python
function - pyinstaller bootloader's entrypoint is based on an Python 2 version of Python interpreter main program
So... Looks like I have to use python's main.c in order to create portable python interpreter that I can ship with this package.
But instead of running in console mode (and requiring to be ran using spawn child process
), it'll be possible to embed NAPI into it, so it'll be a NodeJS package.
The most recent attempt to create such a portable python lives in my cpython fork's [vader-branch]
Currently it will run vader if you put vaderSentiment-master
folder in the cpython
root folder.
PyRun_SimpleString
is the key function under the hood.
There are two options:
-
Migrate vader branch of cpython to gyp building system
- Copy all cpython code into this project (or clone and patch)
- Either separate
node-gyp configure
step from the build step (in theprebuildify
, somehow), so that we can change/merge cpython's Makefile and NodeJS's native module Makefile - Or migrate cpython's Makefile to binding.gyp
-
Move cpython code into this project
Essentially, copy only the required bits of code from cpython to this project. Probably will require copying a whole lot and diving very deep into cpython. Also, will probably require the same migration of cpython's Makefile to binding.gyp.
So, I think that I should try to migrate cpython's Makefile to binding.gyp. Then include and call cpython's main function in binding.cc?