chimbori/crux

Unable to write custom plugin since Plugin interface is sealed

evuki opened this issue · 2 comments

evuki commented

I really don't understand how we should write custom plugins if we can't even extend the Plugin interface since it's sealed.

I need an article extractor, but I don't want to do anything with URLs; I just want to pass it a jsoup Document instance and get an Article back. I figured I'd write my own minimal plugin that uses some of Crux's extraction functions, but without any HttpUrl or OkHttpClient params in the constructor. It seems I can't really do this because:

  1. I can't actually extend Plugin, meaning I can't pass it as part of activePlugins to a Crux constructor,
  2. I have to copy/paste the contents of your extractContent method (which is already copy/pasted inside your ArticleExtractor plugin), which is obviously doable but feels unnecessary,
  3. I can't use any of the pre- or post-process helpers since the damn things are internal!

Can you explain the logic behind hiding all of this? What should I do?

Just 5 lines below the sealed Plugin interface are the two extensible sub-interfaces Rewriter and Extractor. You want to implement one of them, not the base interface Plugin.

Crux knows about and handles these two sub-types, so sealing the base interface just prevents the library from being used in an unexpected manner:

val rewrittenUrl = activePlugins
  .filterIsInstance<Rewriter>()
  .fold(originalUrl) { rewrittenUrl, rewriter -> rewriter.rewrite(rewrittenUrl) }
activePlugins
  .filterIsInstance<Extractor>()

There are lots of samples in the plugins package that you can check out for inspiration.
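
To make the sealed-base-plus-open-sub-interfaces idea concrete, here is a minimal, self-contained sketch. It does not use Crux's real signatures: the String-based `rewrite` and `extract` methods, the `TrackingParamRemover` and `TitleExtractor` classes, and the `runPlugins` helper are all hypothetical stand-ins. In Kotlin, a sealed interface can only be implemented directly within its own package and module, but its non-sealed sub-interfaces can be implemented from anywhere, which is what lets outside code plug in.

```kotlin
// Simplified stand-ins for illustration only; NOT Crux's real API.
sealed interface Plugin

// Non-sealed sub-interfaces: code outside the library can implement these.
interface Rewriter : Plugin {
    fun rewrite(url: String): String
}

interface Extractor : Plugin {
    fun extract(html: String): Map<String, String>
}

// Hypothetical custom rewriter: drops everything from the first "?utm_" onwards.
class TrackingParamRemover : Rewriter {
    override fun rewrite(url: String): String = url.substringBefore("?utm_")
}

// Hypothetical custom extractor: reads the <title> element with plain string operations.
class TitleExtractor : Extractor {
    override fun extract(html: String): Map<String, String> {
        val title = html.substringAfter("<title>", "").substringBefore("</title>")
        return if (title.isEmpty()) emptyMap() else mapOf("title" to title)
    }
}

// Dispatches on the two known sub-types, in the same spirit as the excerpt above.
fun runPlugins(
    plugins: List<Plugin>,
    originalUrl: String,
    html: String,
): Pair<String, Map<String, String>> {
    val rewrittenUrl = plugins
        .filterIsInstance<Rewriter>()
        .fold(originalUrl) { url, rewriter -> rewriter.rewrite(url) }
    val metadata = plugins
        .filterIsInstance<Extractor>()
        .fold(emptyMap<String, String>()) { acc, extractor -> acc + extractor.extract(html) }
    return rewrittenUrl to metadata
}
```

Because the dispatcher only looks for `Rewriter` and `Extractor`, a direct implementation of `Plugin` would never be invoked anyway; sealing the base type turns that silent no-op into a compile-time error for code outside the library.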

BTW, the Article API is deprecated and replaced by the much more capable and extensible Resource, but I’m not done with the full refactor so it’s lightly documented (mostly just KDoc). For any future use, I strongly recommend using the Resource object. You can add fields to it, unlike Article.
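
As a rough illustration of why a field-extensible container is easier for plugins to work with than a fixed-shape class, consider the sketch below. `SketchResource`, its `fields` map, and `withReadingTime` are hypothetical names invented for this example; they are not Crux's actual Resource API, which should be checked in the KDoc.

```kotlin
// Hypothetical stand-in, NOT Crux's real Resource class.
// Fields live in a map, so a plugin can add keys the library never anticipated.
data class SketchResource(val fields: Map<String, Any?> = emptyMap()) {
    // Look up a field by key; returns null when the key is absent.
    operator fun get(key: String): Any? = fields[key]

    // Merge two resources; on a key collision the right-hand value wins.
    operator fun plus(other: SketchResource): SketchResource =
        SketchResource(fields + other.fields)
}

// A plugin-style function that contributes a field of its own.
fun withReadingTime(resource: SketchResource, minutes: Int): SketchResource =
    resource + SketchResource(mapOf("reading-time-minutes" to minutes))
```

With a fixed-shape class, adding a reading-time property would mean changing the class itself; here the extra field simply rides along in the map.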

The pre- and post-process helpers are a bit of a pain, to be frank. Crux inherited them from a previous fork, and I’m not too happy with them. They are hard to reason about, and hard to tune. The intent behind keeping them internal is to prevent unexpected external usages, which would make them much harder to refactor later. Consider them an implementation detail, and if you need to reuse some of that logic, copy/pasting (and improving) it is in fact better than trying to reuse the methods as they stand today.

The metadata extractors are all public, under com.chimbori.crux.extractors.MetadataHelpers. Feel free to reuse/remix them as needed.

evuki commented

Thanks for the fast and detailed response, I appreciate it. I'll try to do as you say and reuse some logic for now. Don't be surprised if a PR pops up in the near future, though :)