readium/architecture

Splitting the Publication model

mickael-menu opened this issue · 7 comments

There are a number of issues with the current Publication model (at least in the mobile toolkit):

  • It is used in at least two very different cases:
    • to parse the metadata of an OPDS entry
    • to parse a publication to be presented by a reading app
  • It is constructed in two different places:
    • shared when deserializing a JSON model (OPDS, RWPM)
    • streamer when parsing a publication (including RWPM-based publications like DiViNa)

The second case needs much more than just a bunch of metadata, such as a position list or a search service, which depend on an associated Fetcher. All of this doesn't make sense for an OPDS entry.

The need to keep the Publication model lightweight for OPDS means that the other associated objects created during the parsing of a publication are not joined in a cohesive object. The parser returns a PubBox(Publication, Fetcher) and a closure to be called later when the DRM is unlocked, to update the Publication. This leads to confusing code in the reading apps.

@qnga suggested a different approach which I think makes a lot of sense:

  • Introducing a Manifest type that is a pure immutable data object representing strictly a RWPM.
  • Making the Publication the owner of all the associated objects (goodbye PubBox):
    • a Manifest
    • a Fetcher
    • the protection (DRM, password)
    • any additional service (position list, search, etc.)
  • Having the OPDS parser return a Manifest instead of a Publication.

We could also add a typealias OPDSPublication = Manifest, to have a nicer API for the OPDS parser.

I believe that we can implement this without breaking the reading apps, by doing some aliases with deprecation warnings in the Publication object.

I understand the idea behind this proposal, but I have mixed feelings:

  • what about Web Publications (RWPM, W3C Audiobook, Readium Audiobook or DiViNa)?
  • using Manifest instead of Publication or OPDSPublication for OPDS 1.x would feel very weird...
  • but OPDS 2.0 Publications can also be real RWPM with a readingOrder and resources, how do we treat them?

Things are easier than they look right now because locally we mostly deal with packaged publications and remote publications are mostly limited to OPDS. As we blur the line between these two, I don't think that the distinction will be that easy.

qnga commented

what about Web Publications (RWPM, W3C Audiobook, Readium Audiobook or DiViNa)?

Actually, the idea is that Publication should model full publications from the point of view of applications and hold all objects that apps and navigators need to use them. In case of Web Publications, the parser would still build a Publication objet, with a Manifest and a Fetcher that allows to retrieve remote resources. This is very natural, because, for example, the Fetcher returned by the EpubParser should provide access to both local and remote resources.

using Manifest instead of Publication or OPDSPublication for OPDS 1.x would feel very weird...

I guess OPDS 1.x entries are parsed into the RWPM model, so it would make sense to use OPDSPublication.

but OPDS 2.0 Publications can also be real RWPM with a readingOrder and resources, how do we treat them?

I was not aware of this possibility. What are resources and readingOrder supposed to be used for? Should an OPDS entry directly be used as a publication in some cases?
Anyway, an OPDS entry is still a Manifest that we could call OPDSPublication to match OPDS terminology. It is not a full Publication in the sense that, before being provided to the streamer, it is not ready for reading (no fetcher is available).

I was not aware of this possibility. What are resources and readingOrder supposed to be used for? Should an OPDS entry directly be used as a publication in some cases?

Yes they can be directly be used as a publication in certain cases.

It is not a full Publication in the sense that, before being provided to the streamer, it is not ready for reading (no fetcher is available).

Which could become problematic for the case explained above.

Sorry I might have been ambiguous by citing the RWPM, but basically the separation we're suggesting is between:

  • a stateless immutable object: the publication metadata, which is basically what the RWPM is describing, including the readingOrder
  • a stateful object holding the things needed to present the publication: the metadata, fetcher, helpers, DRM protection, etc.

Now we might argue about how to name what. We could keep Publication for the manifest, but introduce another object such as OpenedPublication or PresentablePublication.

Any format would be converted into a Manifest, either through direct JSON deserialization for RWPM, or through parsing for the others. But only the one from which we need to read the actual content will be wrapped into a Publication.

Which means that for the "in-between" OPDS entry, it would be first parsed as a Manifest from the feed, to be displayed in the catalog view. If the user triggers the "Read" action, then we can ask the streamer to wrap the Manifest into a Publication to present it to the user.

let opdsEntry: OPDSPublication = opds.parseEntry(entryURL)

// Create the Publication from a Manifest
let publication = streamer.open(opdsEntry)

navigator.present(publication)

But when opening a local EPUB:

// Create the Publication directly from a path
let publication = streamer.open(at: File("path/to/epub"))
  • a stateless immutable object: the publication metadata, which is basically what the RWPM is describing, including the readingOrder
  • a stateful object holding the things needed to present the publication: the metadata, fetcher, helpers, DRM protection, etc.

Why would you consider the metadata to be mutable while the readingOrder is immutable?
In the case of a Web Publication they're potentially both mutable as the manifest could be updated at any time.

We can "freeze" the publication at a specific state, but we'll need the ability to easily update it as well.

Which means that for the "in-between" OPDS entry, it would be first parsed as a Manifest from the feed, to be displayed in the catalog view. If the user triggers the "Read" action, then we can ask the streamer to wrap the Manifest into a Publication to present it to the user.

let opdsEntry: OPDSPublication = opds.parseEntry(entryURL)

// Create the Publication from a Manifest
let publication = streamer.open(opdsEntry)

navigator.present(publication)

So, in this example opdsEntry would return an OPDSPublication while publication would return a Publication or OpenedPublication right?

I don't know how I feel about this though since an OPDSPublication could have useful helpers that are not available for other types of publications.

Why would you consider the metadata to be mutable while the readingOrder is immutable?

No that's not what I meant, maybe this schema will help:

publication

In the case of a Web Publication they're potentially both mutable as the manifest could be updated at any time.
We can "freeze" the publication at a specific state, but we'll need the ability to easily update it as well.

Actually I'm not so sure that Publication should be mutable as well anymore.

Do you have a more precise use case for that?

Nothing prevents from creating a copy of the Publication or a Manifest from an update, and updating the variable storing the Publication in the navigator.

publication.updateFrom(url) -> Publication

So, in this example opdsEntry would return an OPDSPublication while publication would return a Publication or OpenedPublication right?

I don't know how I feel about this though since an OPDSPublication could have useful helpers that are not available for other types of publications.

Yes that's right.

If there's a need for helpers at the Manifest level, maybe we need to rethink the Services.

But for the problem at hand, the question is how do we transition from having just the publication metadata to a full-fledged presentable publication? Loading full Publication with Fetcher for each entry of an OPDS feed sounds a bit overkill to me, and having a Publication with a nullable Fetcher error prone.

Here's a proposal to answer this issue: #137