Provide library
plata opened this issue · 12 comments
It should be possible to put the core functionality into a library which can be reused by other projects.
I'm not the OP, but I needed this to integrate rdrview into another tool I'm building. In the meantime I switched the target platform to Node.js (not only because of rdrview; some other things were missing or not production-ready yet), so now I just use Mozilla's reader instead.
Is this something that you actually need, or just a general idea?
Not really under my control but I would love to see this being used by @TobiasFella in https://github.com/KDE/alligator.
Background info: the Flym feed reader uses Readability4J to show the complete content for RSS feeds that only contain teaser text. I couldn't find a C/C++ library which provides the same or similar functionality.
I am using Haxe, which can target different languages or VMs. I wanted to use the new-ish Haxe VM HashLink for this project, but it's still very rough for sys things (handling processes, etc.); even creating a web server was pre-alpha.
Having rdrview as a lib would have allowed me to write native bindings for the HashLink VM and would have eased the process, but I gave up because of all the other (current) shortcomings of HashLink for sys applications (as opposed to games, which are its primary target). I'm still using Haxe, but now targeting Node.js, which is a much better fit for this project.
By the way, did you actually need rdrview as a library? Were you calling it often enough to get performance issues from spawning processes, or was there another reason?
I made a quick prototype where I call the executable: plata/alligator@30a5bf0
It takes some time when loading many feeds. However, I cannot tell whether this is because it's not a library, because of Internet access, or whether it's even related to rdrview at all.
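One way to narrow this down would be to time the external command on its own. The following is only a rough sketch for a POSIX system, not something taken from the prototype: `time_command` is a made-up helper name, and the command string is whatever invocation the caller already uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: run `cmd` through the shell, discard its output, and
 * return the elapsed wall-clock time in seconds. Returns -1.0 if the command
 * could not be started or waited for. */
double time_command(const char *cmd)
{
    struct timespec start, end;
    char buf[4096];
    FILE *pipe;

    assert(cmd != NULL);

    clock_gettime(CLOCK_MONOTONIC, &start);
    pipe = popen(cmd, "r");
    if (!pipe)
        return -1.0;
    while (fread(buf, 1, sizeof(buf), pipe) > 0)
        ; /* drain the output so the child process can finish */
    if (pclose(pipe) == -1)
        return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Timing the same command once against a locally saved copy of a page and once against the live URL should give a rough idea of how much of the delay is network access and how much is process startup plus parsing.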
This is also something I would like to have - ideally a simple function that takes in an input HTML string, and returns or allocates the output HTML string. The main reason for this is performance, as there is some overhead when running a new process each time I want to get the content of a website.
Working with the strings directly also means that I can do the fetching however I want (for example, fetching hundreds of websites) without relying on the networking capabilities of this project.
C is also super portable, so adding this functionality would allow less popular languages to use this implementation.
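To make the request concrete, the kind of interface I have in mind could look roughly like the sketch below. Everything in it is an assumption: `reader_extract` and its parameters are hypothetical names that do not exist in rdrview, and the body is only a placeholder that copies the input so the ownership convention (the library allocates, the caller calls `free()`) can be tried out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry point; nothing like this exists in rdrview today.
 * Takes `len` bytes of HTML and returns a newly allocated, NUL-terminated
 * HTML string that the caller must release with free(). Returns NULL if the
 * allocation fails. */
char *reader_extract(const char *html, size_t len, const char *base_url)
{
    char *out;

    assert(html != NULL);
    (void)base_url; /* a real version might use this to resolve relative links */

    /* Placeholder body: a real implementation would parse the document and
     * run the readability extraction here. This stub only copies the input. */
    out = malloc(len + 1);
    if (!out)
        return NULL;
    memcpy(out, html, len);
    out[len] = '\0';
    return out;
}
```

With a signature along these lines, bindings for other languages would only need to pass a buffer in, copy the result out, and free it.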
@eafer Could you provide some hints on what would be required to build a C library in case somebody would like to give it a try? Looking at the code, it isn't really obvious to me.
In the meantime, I've been looking around for other readability implementations. While there are several for Python, Java, JavaScript, etc., I haven't been able to find anything for C. Also, calling an executable is not an option in my use case (I'm not allowed to do so, and it raises issues for packaging and delivery).
I don't mind leaving it open in case other people show up asking for this.