Clarify and refine the streamer API on mobile platforms
qnga opened this issue · 8 comments
Current uses of the streamer in the Kotlin testapp
- When a
LibraryActivity
is started, it instantiates aServer
. - When a book is either selected for opening (
recyclerViewListClicked
) or added to the library (parseIntent
),- first, it is parsed into a
Publication
by calling the appropriate parser, depending on the file extension; the result is stored in a PubBox containing thePublication
and theContainer
. - if the
Publication
is an Epub, theServer
functionaddEpub
is called (prepareToServe
) for the publication. AFetcher
is instantiated in this call and kept in memory by the router to be used when an HTTP request occurs. - only in case of opening, a new activity is started, depending on the publication type; the
Publication
itself along with its path, filename, id and cover are passed to the new activity through intent extras.
- first, it is parsed into a
Current public API
PublicationParser
interface is implemented by all parsers, but every one is individually imported and used by the testappContainer
interface (though it is not consistently used, see below)Fetcher
class is currently not used directly in the testapp, but instantiated inside aServer
Server
class which is instantiated in the testapp- Each
PublicationParser
publicly exposes mimetype constants.
Current issues
- Selecting a Parser depending on the extension is done multiple times and apps have to be aware of all supported publication formats.
- Java Zip API is directly used rather than
Container
to get the cover in the Epub case. - Neither the container nor a fetcher is passed to the reading activity, so a resource cannot be directly retrieved cleanly. In Kotlin implementation, Cbz images are currently directly retrieved by the navigator using ZeroTurnaround Zip Library and the publication path (Java SDK ZipFile is directly used in the streamer), while in Swift they are served over HTTP.
- Apps cannot use custom
ContentFilter
s (see readium/kotlin-toolkit#201). Some of these content filters (for instance third-party DRM CF) are required to be passed to the PublicationParser (for example to decrypt navigation and SMIL files), some others should be defined later (for example, navigator-specific ones). - Navigators should be able to make custom resources served by the
Server
(see #103). In addition, apps should be able to make r2 use their own HTTP server implementation.
Some ideas
- It would be great if parsing of publications could be achieved independently of publication type, and applications could automatically leverage full streamer capabilities through the
PublicationParser
interface. To achieve this, thePublicationParser
parse
method could include a parser factory based on path extension. Should format-specific parsers still be public or not? - What about publicly exposing the
Fetcher
interface rather thanContainer
? It would allow to directly retrieve encrypted and other on-the-fly modified content, besides raw content. AFetcher
has to be instantiated early in the Epub parser code to decrypt navigation and SMIL files, so it can be returned in thePubBox
rather than aContainer
. This will also prevent apps from accidentally bypassing deciphering while accessing resources. Consequently, some content filters should be passed to the PublicationParser by the app. - With the same goal, of directly retrieving resources, a
Fetcher
should be passed to reading activities (or probably to the navigator) (see readium/kotlin-toolkit#176). So,Fetcher
(at least an interface) should be defined in theshared
module. - Apps and navigators should be able to add content filters after parsing. I suggest a functional way: newFetcher = oldFetcher + newContentFilterChain. In this way, the object returned by the streamer, with resource-access-level content filters, is not modified any more. The
plus
operation could mix content filter chains or give the content filters from oldFetcher a highest priority by chaining the content filter chain of oldFetcher and the new one. - Since native APIs are preferred on mobile platforms, the HTTP Server is just a trick used by the navigator to serve packaged HTML resources. This job might be done by the navigator itself, which would receive a
Server
instance. So, the interface might be moved to Shared, and the R2 implementation to the Navigator (or testapp).
Thanks, with all these new materials we're moving towards drafting a first Streamer API spec!
It would be great if parsing of publications could be achieved independently of publication type, and applications could automatically leverage full streamer capabilities through the PublicationParser interface.
Definitely, the less stuff we have in the reading apps, the easier it is to upgrade them.
This solution is already partially implemented in Swift using an extension on Publication
.
Moreover, I think we also need a way for reading apps to plug their own parser implementations for custom formats into R2.
Like you said in the other issue, parser selection should be done using the file extension and/or media type, as well as some sniffing capabilities (JSON, ZIP entries) that we could provide helpers for.
(On a side-note, I think parse
is quite technical and not very OO. I'm also not sure it applies to every format. Maybe we should call this API open
or something like that).
Should format-specific parsers still be public or not?
Probably, at least for the beginning for backward compatibility. Then it might make sense to expose them anyway if someone really wants to parse an EPUB only, for example.
What about publicly exposing the
Fetcher
interface rather thanContainer
?
I fully agree, we need to make sure that reading the content always go through the content filters. This is also more in line with the original architecture schema which doesn't mention Container
, but which has a Fetcher
.
Apps and navigators should be able to add content filters after parsing. I suggest a functional way: newFetcher = oldFetcher + newContentFilterChain. In this way, the object returned by the streamer, with resource-access-level content filters, is not modified any more.
That's interesting. For R1 we used to have priorities as integers, but that might be better to enforce it at the construction level.
So I guess:
-
At the parsing stage, the streamer would add to the initial
Fetcher
the necessary content filters to be able to read the resources (so decryption, font obfuscation, etc.). The reading app would be able to pass additional content filters to handle, for example, another DRM. -
When presenting the publication with the
Navigator
, it would create its ownFetcher
instance from the initial one, adding its internal private content filters (CSS injection, etc.). The reading app can also provide custom "navigation" content filters to add when creating theNavigator
. -
Any additional process can create a new
Fetcher
that will delegate to the initial one and have its own chain of content filters. For example, we could imagine a search indexing background process that would add a content filter to pre-process the ressources. We only want this content filter to be used for this process, so that makes sense to compose a newFetcher
instance from the initial one.
I like this approach a lot.
One might wonder what's the difference between Fetcher
and a FilteredContainer(container: Container, filters: List<ContentFilter>)
? That's the only thing that Fetcher
is doing.
Do we need Fetcher
at all? Should we rename Container
into Fetcher
to match the original architecture and fit better its purpose (e.g. HTTP requests)? This would have the advantage of being backward compatible by adding a typealias Container = Fetcher
in the current API.
Merging Container
and Fetcher
means that we can apply ContentFilter
to only a selected portion of the exposed resources (e.g. decryption of the EPUB content), but avoid passing resources such as manifest.json
through the decryption filter. If we have a CachedFetcher
, it also means that the ressources can be cached post-filter, or on a case-by-case basis. That's the power of a composite architecture, everything is configured at construction without modifying the classes. 🚀
The HTTP Server is actually a trick used by the navigator to serve packaged HTML resources. This job might be done by the navigator itself, which would receive a Server instance. So, the interface might be moved to Shared, and the R2 implementation to the Navigator (or testapp).
I think it makes sense, the server is actually an implementation detail of the Navigator to serve only the HTML publications. Having the server in the streamer means that we have to resort to small hacks to keep the responsibilities of the streamer and the navigator segregated.
Having the server in the Navigator also means that there's less code in the reading apps, and that we're sure that we start it only if it's really needed. However, I think reading apps should be able to customize which implementation of the server is used by the Navigator. So a Server
interface should be exposed in r2-navigator
, with a default implementation provided by R2.
This also means that we don't need to hack the self
Link of the Publication
to send the baseURL
to the navigator anymore, since the navigator owns the server. Removing this, the server doesn't need to know at all about the Publication
anymore, only the Fetcher
.
protocol Server {
// Returns the baseURL
func serve(fetcher: Fetcher, at endpoint: String) -> String
}
Moreover, I think we also need a way for reading apps to plug their own parser implementations for custom formats into R2.
I suggest a method like Publication.open(path, parsers=defaultparsers, filters=defaultcontentfilters) parsers
would be something like a `List<Pair<(String) -> Boolean, PublicationParser>>. The streamer would consider every list item each after other, and use the first parser for which the lambda would return true. This lambda would be able to check path extension and sniff the file in order to decide if the parser is suitable. This is a simple approach, but I am not sure we need a more complex one. (The function responsible for deciding if a parser is suitable for a publication could also be a part of the parser object)
Do we need Fetcher at all? Should we rename Container into Fetcher to match the original architecture and fit better its purpose (e.g. HTTP requests)? This would have the advantage of being backward compatible by adding a typealias Container = Fetcher in the current API.
This sounds great.
That's interesting. For R1 we used to have priorities as integers, but that might be better to enforce it at the construction level.
After discussion, it seems that two CF provided at the same point (for example to PublicationParser) may need some prioritization (for example, the order between decryption and font deobfuscation matters). So there are three possibilities: using priorities as integers or using builder with categories to construct ordered chain, or both, the latter being convenience method (less flexible).
I suggest a method like Publication.open(path, parsers=defaultparsers, filters=defaultcontentfilters)
parsers
would be something like a `List<Pair<(String) -> Boolean, PublicationParser>>.
In this issue, I initially thought of MediaType
as an enum, but now it seems more useful to have some kind of singleton on which the reading app can register new media types (with custom sniffing lambdas). If we have that, we just need to map a media type set for each parser in Publication.open
, and require the reading app to add their custom types in MediaType
. This has the added advantage that we can recognize the format everywhere uniformly, not only in Publication.open
.
After discussion, it seems that two CF provided at the same point (for example to PublicationParser) may need some prioritization (for example, the order between decryption and font deobfuscation matters). So there are three possibilities: using priorities as integers or using builder with categories to construct ordered chain, or both, the latter being convenience method (less flexible).
I think having simply an int priority (like in R1) is simpler and more flexible. And since the content filter chains would be segregated, we could have different constant priorities defined by each module (e.g. DECRYPTION for streamer, INJECTION for navigator, etc.), instead of a global "registry" of priorities.
In this issue, I initially thought of MediaType as an enum, but now it seems more useful to have some kind of singleton on which the reading app can register new media types (with custom sniffing lambdas). If we have that, we just need to map a media type set for each parser in Publication.open, and require the reading app to add their custom types in MediaType. This has the added advantage that we can recognize the format everywhere uniformly, not only in Publication.open.
There are few uses of format besides parsing. Amongst the reasons you presented, it seems to me that only the last one would be left with the new API: apps may want to store publication format. However your approach doesn't seem to clearly provide them with this information. Either apps are required to call media type detection once again themselves (though Android and iOs are Unix-like, it could theoretically be issues with opened files), or media type (string) should be inserted in PubBox
es. So, I'm not sure MediaType registry is very useful (and I don't like registries very much).
In addition, I think it is interesting to provide an ordering mechanism for media type detection. Someone may want to handle a subset format, or a not clearly identified extension, with a special parser.
There are few uses of format besides parsing. Amongst the reasons you presented, it seems to me that only the last one would be left with the new API: apps may want to store publication format.
There are a few uses left, but not much, here's from Swift:
- Checking the media type of an LCP protected publication (in
r2-lcp
) - Getting the media type of a file upon download (using content-type returned by the server and sniffing), to be able to determine its proper extension (in
r2-lcp
and potentially in reading apps)
However your approach doesn't seem to clearly provide them with this information.
Yes it's missing. We want to avoid determining the media type several times – especially since we might lack useful information such as HTTP Content-Type at different points – so there must be a way to optionally pass the media type directly to the parser, and to get it back from it.
I thought about revamping PubBox
as well to make it more useful. So it could contain the media type, and also the DRM
object which is currently in Container
(but wouldn't if we go ahead with this refactoring). Naming will be tricky, maybe OpenedPublication(Publication, Fetcher, DRM, MediaType)
?
Publication.open(file: String, mediaType: String? = null): OpenedPublication?
So, I'm not sure MediaType registry is very useful (and I don't like registries very much).
We still need to make sure that the media type computing is consistent, wherever it's done. So I still think registering additional media type app-wise is needed. Is it the singleton you don't like? Because we could also have a MediaTypeSniffer(List<MediaType>)
that is configured (or not) by reading apps and given to R2 in the initialization step, or to Publication.open
directly.
A register has the advantage of being independent of the call to open. With this solution, we could indeed have Publication.open(file: String, mediaType: String? = null): OpenedPublication?
But there are also drawbacks: the function depends on the MediaTypeSniffer state and doesn't simply expose the relationship between the open function and the global MediaTypeSniffer.
As you suggested, we could pass MediaTypeSniffer (default or custom) to the open function, but if media type is sometimes known, it would require two different functions. One with a sniffer argument, and the other with a mediaType argument.
Therefore, my favorite solution is to let apps determine media type before calling Publication.open(file: String, mediaType: String): OpenedPublication?
. I think it matches the following use case quite well: an app receive a Path or URL, if it is a path, it uses MediaTypeSniffer to guess media type, if it is an URL, it uses HTTP media-type to name the downloaded file. Then, it is able to call the open function with an explicit media type.
This approach sounds to me very simple and natural. Does it fit possible different use cases?
Therefore, my favorite solution is to let apps determine media type before calling
Publication.open(file: String, mediaType: String): OpenedPublication?
.
I think it sounds reasonable, so we don't need to pass MediaTypeSniffer
to Publication.open
.
However, we want to have the simplest API while allowing a high degree of customization for advanced reading apps. I think the majority of the reading apps will use our default media types and just want to open a Publication
from a file path. So I think we can have mediaType
as optional, and Publication.open
would create a MediaTypeSniffer
on its own with the default media types supported by Readium to compute it.
On a side note, I think we should use a custom MediaType
object instead of String
for media types – it would be what's returned by MediaTypeSniffer
– because we often use media types to resolve a file extension. While for known media types that wouldn't be such a problem, for media types provided by the test app it's good to have every information packed together (so file extension, and maybe UTI for iOS).
@Throws
Publication.open(file: String, mediaType: MediaType? = null): OpenedPublication
I think we can close this issue now that we have a few proposals addressing most of it. To answer the initial problems:
- Selecting a Parser depending on the extension is done multiple times and apps have to be aware of all supported publication formats.
The new Streamer
type will have all default parsers implicitly set up.
- Java Zip API is directly used rather than
Container
to get the cover in the Epub case.- Neither the container nor a fetcher is passed to the reading activity, so a resource cannot be directly retrieved cleanly. In Kotlin implementation, Cbz images are currently directly retrieved by the navigator using ZeroTurnaround Zip Library and the publication path (Java SDK ZipFile is directly used in the streamer), while in Swift they are served over HTTP.
These will be fixed with Publication::get()
using the internal Fetcher
.
- Apps cannot use custom
ContentFilter
s (see readium/kotlin-toolkit#201). Some of these content filters (for instance third-party DRM CF) are required to be passed to the PublicationParser (for example to decrypt navigation and SMIL files), some others should be defined later (for example, navigator-specific ones).
The Streamer
will allow customizing the Fetcher
, and injection of Resource Transformers (formerly Content Filters) can be done with a TransformingFetcher
.
- Navigators should be able to make custom resources served by the
Server
(see readium/r2-navigator-kotlin#103). In addition, apps should be able to make r2 use their own HTTP server implementation.
This is not yet addressed, but we can always open an issue for that later.