Accessing a single page of a multi-page document using the Image API

Question

adolski opened this issue 7 years ago · 0 comments

Description

The Image API doesn't recommend a way to access a "resource within a resource". For example, if the identifier URI component identifies a PDF file, the individual pages aren't addressable.

I would like to have a standard and effective way of retrieving a representation of a single page of a multi-page document. Currently, my only option is to provide identifiers for each page, but unlike other identifiers, these wouldn't map directly to source resources (such as document files) without some kind of arbitrary implementation-specific translation.

Variation(s)

Although I'm talking about document pages here, this could really describe a standard way of addressing any "resource within a resource": a page of a document, a frame of a video... and maybe other things.

Proposed Solutions

1. A "subidentifier" path component after the identifier component, with a syntax like:
   a. page:n for documents
   b. frame:n and/or time:nn:nn:nn for videos
   c. some kind of token like default (or whatever) when there is no subresource
2. A recommended syntax for integrating a subresource identifier into an identifier, so that Image API implementers can scan an identifier component to find both an identifier and optional subidentifier.
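
To make the first proposal concrete, here is a rough sketch of how a server might parse a subidentifier path component. Nothing here is part of the Image API: the function names, the 1-based page numbering, and the exact path layout ({identifier}/{subidentifier}/{region}/{size}/{rotation}/{quality}.{format}) are assumptions for illustration only, and the path is assumed to be already percent-decoded.

```python
import re
from typing import NamedTuple, Optional, Tuple


class SubResource(NamedTuple):
    kind: str              # "page", "frame", "time", or "default"
    value: Optional[str]   # raw value after the first colon, None for "default"


# Hypothetical value syntaxes for each proposed subidentifier kind.
_PATTERNS = {
    "page": re.compile(r"^[1-9]\d*$"),           # assumed 1-based page number
    "frame": re.compile(r"^\d+$"),               # frame index
    "time": re.compile(r"^\d{2}:\d{2}:\d{2}$"),  # nn:nn:nn
}


def parse_subidentifier(component: str) -> SubResource:
    """Parse 'page:3', 'frame:120', 'time:00:01:30', or 'default'."""
    if component == "default":
        return SubResource("default", None)
    # Split on the first colon only, so time values keep their own colons.
    kind, sep, value = component.partition(":")
    pattern = _PATTERNS.get(kind)
    if not sep or pattern is None or not pattern.match(value):
        raise ValueError(f"unrecognized subidentifier: {component!r}")
    return SubResource(kind, value)


def split_image_path(path: str) -> Tuple[str, SubResource, Tuple[str, ...]]:
    """Split a request path that carries the extra subidentifier component."""
    parts = path.strip("/").split("/")
    if len(parts) != 6:
        raise ValueError(f"expected 6 path components, got {len(parts)}")
    identifier, sub, *image_params = parts
    return identifier, parse_subidentifier(sub), tuple(image_params)
```

With this layout, a request for /doc.pdf/page:3/full/max/0/default.jpg would resolve to the identifier doc.pdf plus a page subresource, while the identifier itself still maps directly to the source file.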

Additional Background

My main use case is creation of thumbnails for PDF pages, but this feature could also be used for easy integration of PDFs and other documents into existing image viewer clients.

Currently, the Cantaloupe image server provides a page= URI query argument to solve this problem, but I'd like to have a better and more standard way.
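
For comparison, the current workaround for the thumbnail use case might look roughly like the following from a client's side. This is only a sketch: the base URL, the helper name, and the default thumbnail width are made up for illustration, and the only server-specific part is the page= query argument mentioned above.

```python
from urllib.parse import quote, urlencode


def page_thumbnail_url(base: str, identifier: str, page: int, width: int = 200) -> str:
    """Build an Image API URL for one page, using a page= query argument."""
    # Percent-encode the identifier so any slashes in it stay in one path component.
    encoded_id = quote(identifier, safe="")
    # "full" region, width-only size ("w,"), no rotation, default quality, JPEG.
    path = f"{encoded_id}/full/{width},/0/default.jpg"
    return f"{base.rstrip('/')}/{path}?{urlencode({'page': page})}"


# Hypothetical endpoint; substitute the real server's Image API base URL.
print(page_thumbnail_url("https://example.org/iiif/2", "report.pdf", 3))
```

The page number lives outside the Image API path here, which is exactly the non-standard part this issue is about.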