riedlma/SECOS

Conversion to Python package, structure into modules, simplify language support.

ohenrik opened this issue · 4 comments

First off, this looks like a great piece of work! Nicely done!

I’m working on a project where I will need to split Norwegian compound words ("skolebuss" (school bus), for example).

This seems to be the only library that has been used on Norwegian (given that you have provided language files for it).

Would you be willing to let me contribute to this project? I would like to do the following:

  1. Package the code as a Python package (so that we can use pip install secos, for example).
  2. Create a module so the code can be used as part of other Python programs.
  3. Make it easier to install and use the different languages.

Perfect! I will try to get this done within a month, depending on my workload on my current project.

Hello,

First of all, sorry for digging up this old topic.

I have made a fork to patch SECOS into an actual Python package. I cleaned up the example scripts to maximize code reuse. I also used that opportunity to do the following tasks:

  • Port the code to Python 3.
  • Use f-strings instead of percent-style string formatting, plus other clean-ups here and there.
  • Add type annotations so that it can be used with tools like Mypy.
  • Use the standard logging facility to report debug and info statements.
  • Write a setup.py file for installation, using setuptools.
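
To give a rough idea of what the clean-ups listed above look like in practice, here is a small sketch that combines an f-string, type annotations, and the standard logging module. It is not code from the fork: the function `format_split` and the logger name `secos_example` are hypothetical and only serve as an illustration.

```python
import logging
from typing import List

# Hypothetical logger name; a real package would typically use __name__.
logger = logging.getLogger("secos_example")


def format_split(word: str, parts: List[str]) -> str:
    """Return a printable line showing a word and its split parts."""
    # Percent-style equivalent: "%s -> %s" % (word, " ".join(parts))
    line = f"{word} -> {' '.join(parts)}"
    # Logging calls keep lazy %-style arguments, so the message is only
    # built when the DEBUG level is actually enabled.
    logger.debug("formatted %d parts for %r", len(parts), word)
    return line
```

Calling `format_split("skolebuss", ["skole", "buss"])` returns `"skolebuss -> skole buss"`. With annotations in place, a tool like Mypy can flag a call that passes a plain string where a list of parts is expected.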

The minimum required version for this package is Python 3.7 because it uses dataclasses. That could be relaxed by writing the needed boilerplate manually, but the package would still require Python 3.5+ for type annotations.
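
For reference, this is roughly the trade-off involved. The sketch below is an assumption-laden illustration, not code from the fork: the class names `Candidate` and `CandidateManual` and their fields are hypothetical. The first class relies on `dataclasses` (Python 3.7+); the second spells out the same boilerplate by hand.

```python
from dataclasses import dataclass


@dataclass
class Candidate:  # hypothetical name and fields
    parts: tuple
    score: float


class CandidateManual:
    """Hand-written equivalent that does not need the dataclasses module."""

    def __init__(self, parts: tuple, score: float) -> None:
        self.parts = parts
        self.score = score

    def __repr__(self) -> str:
        # str.format is used here so this class does not depend on f-strings.
        return "CandidateManual(parts={!r}, score={!r})".format(
            self.parts, self.score
        )

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, CandidateManual):
            return NotImplemented
        return (self.parts, self.score) == (other.parts, other.score)
```

Both classes compare by field values and print their fields in `repr`; the decorator simply generates `__init__`, `__repr__`, and `__eq__` automatically.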

Are you interested in merging this into your repository? It could benefit from a review to make sure I did not break anything in the process, although I did verify that the decompound_secos.py and decompound_server.py scripts gave the same output as before the conversion. The intended interface is showcased in the other scripts at the root of the repository.

Do note that the decompound_secos.py script makes use of methods that are not intended to be user-facing because I did not want to break its output while migrating the code.

Hi,
Sorry for the late response, and thanks for your efforts. I still need to have a look at your changes, but I think they are very useful.
I am very interested in merging them into the repository. I will review your modifications as soon as possible and then merge them.

Thanks a lot and all the best,
Martin