jupyterlab/jupyterlab-data-explorer

Structured URLs

saulshanabrook opened this issue · 0 comments

Currently, we have to rely on manually parsing and creating different types of URLs which represent different dataset locations. For example, the notebook URL looks like this:

// 'file:///{path}.ipynb/#/cells/{cellid}/outputs/{outputid}/data/{mimetype}'

It would be nice if we could just write a format string that looks like that, and get a way to both generate notebook URLs and extract the data from them. Currently, we have to do something like this instead:

const result = decodeURIComponent(url.hash).match(
  /^[#]([/]cells[/]\d+[/]outputs[/]\d+)[/]data[/](.*)$/
);
if (
  url.protocol !== "file:" ||
  !url.pathname.endsWith(".ipynb") ||
  !result
) {
  return null;
}
const [, outputHash, type] = result;

This is error-prone and requires duplicating code.

Luckily, there is a standard for exactly this use case: URI Template (RFC 6570).
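To make the idea concrete, here is a minimal sketch of the simplest part of RFC 6570, Level 1 "simple string" expansion, where each `{name}` is replaced by its percent-encoded value. The `expandSimple` helper is hypothetical and written only for illustration; it is not taken from any library and ignores the operators defined by the higher levels of the spec (such as `{/path}` or `{?query}`).

```typescript
// Hypothetical helper: RFC 6570 Level 1 (simple string) expansion only.
function expandSimple(
  template: string,
  vars: Record<string, string | number>
): string {
  return template.replace(/\{([A-Za-z0-9_]+)\}/g, (_match, name: string) => {
    const value = vars[name];
    // An undefined variable expands to the empty string.
    if (value === undefined) {
      return "";
    }
    // encodeURIComponent leaves !'()* unescaped, but simple expansion only
    // allows unreserved characters, so escape those as well.
    return encodeURIComponent(String(value)).replace(
      /[!'()*]/g,
      c => "%" + c.charCodeAt(0).toString(16).toUpperCase()
    );
  });
}

const exampleURL = expandSimple("file:///{path}.ipynb#/cells/{cellID}", {
  path: "analysis",
  cellID: 3
});
// exampleURL === "file:///analysis.ipynb#/cells/3"
```

Note that simple expansion percent-encodes `/`, so a nested path would need one of the other expansion operators from the spec. That is one reason an existing library may be preferable to a hand-written version.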

We should add support for this, either by using an existing URL template library or by writing our own. Ones that look like they might work are:


Design

This is similar to how we created a type-safe abstraction over different mimetypes, some of which take arguments:

export abstract class DataType<T, U> {
  abstract parseMimeType(mimeType: MimeType_): T | typeof INVALID;
  abstract createMimeType(typeData: T): MimeType_;
  createDataset(data: U, typeData: T) {
    return createDataset(this.createMimeType(typeData), data);
  }
  createDatasets(url: URL_, data: U, typeData: T) {
    return createDatasets(url, this.createMimeType(typeData), data);
  }
  /**
   * Filter dataset for mimetypes of this type.
   */
  filterDataset(dataset: Dataset<any>): Map<T, U> {
    const res = new Map<T, U>();
    for (const [mimeType, [, data]] of dataset) {
      const typeData_ = this.parseMimeType(mimeType);
      if (typeData_ !== INVALID) {
        res.set(typeData_, data as any);
      }
    }
    return res;
  }
}

It lets us define a mimetype once, like this:

const cellModelDataType = new DataTypeNoArgs<Observable<ICellModel>>(
  "application/x.jupyterlab.cell-model"
);

and use it in converters to go to/from that mimetype:

return createConverter(
  { from: resolveDataType, to: cellModelDataType },
  ({ url }) => {
    const result = url.hash.match(/^[#][/]cells[/](\d+)$/);
    if (
      url.protocol !== "file:" ||
      !url.pathname.endsWith(".ipynb") ||
      !result
    ) {
      return null;
    }
    const cellID = Number(result[1]);
    // Create the original notebook URL and get the cells from it
    url.hash = "";
    const notebookURL = url.toString();
    return defer(() =>
      notebookCellsDataType
        .getDataset(registry.getURL(notebookURL))!
        .pipe(map(cells => cells[cellID]))
    );
  }
);

In a similar fashion, we should be able to create an object that refers to a certain URL template once, and then use it in converters. We could add optional fromURL and toURL parameters to createConverter that each take a URL template, so that instead of receiving or returning an actual URL, the converter just receives or returns the parameters extracted from the template.

The URLTemplate type that you pass in would need to hold both the URL template string and some type information that maps each parameter to its type, so it would probably be an object. Possibly something like this:

const notebookTemplate = new TemplateURL<"path" | "cellID">(
  'file://{/path}.ipynb#/cells/{cellID}',
)
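
As a rough illustration of how one object could both generate and parse URLs from a single template string, here is a minimal sketch. Everything in it is an assumption made for illustration: the class name `SketchTemplateURL`, the `create` and `parse` methods, and the restriction to plain `{name}` placeholders (so it does not handle the `{/path}` operator used above, or the rest of RFC 6570). It compiles the template into a regular expression once, so the URL structure is written in only one place.

```typescript
// Hypothetical sketch: supports only plain `{name}` placeholders.
class SketchTemplateURL<K extends string> {
  private readonly names: K[] = [];
  private readonly pattern: RegExp;
  private readonly template: string;

  constructor(template: string) {
    this.template = template;
    // Split into literal text and placeholders; escape the literals for use
    // in a regular expression and turn each placeholder into a capture group.
    const source = template
      .split(/(\{[A-Za-z0-9_]+\})/)
      .map(part => {
        const match = /^\{([A-Za-z0-9_]+)\}$/.exec(part);
        if (!match) {
          return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
        }
        this.names.push(match[1] as K);
        return "(.+?)";
      })
      .join("");
    this.pattern = new RegExp(`^${source}$`);
  }

  // Fill in the template, percent-encoding each parameter value.
  create(params: Record<K, string | number>): string {
    return this.template.replace(/\{([A-Za-z0-9_]+)\}/g, (_m, name: string) =>
      encodeURIComponent(String(params[name as K]))
    );
  }

  // Extract the parameters, or return null if the URL does not match.
  parse(url: string): Record<K, string> | null {
    const match = this.pattern.exec(url);
    if (!match) {
      return null;
    }
    const result = {} as Record<K, string>;
    try {
      this.names.forEach((name, i) => {
        result[name] = decodeURIComponent(match[i + 1]);
      });
    } catch {
      // Malformed percent-encoding: treat as a non-matching URL.
      return null;
    }
    return result;
  }
}

const cellTemplate = new SketchTemplateURL<"path" | "cellID">(
  "file:///{path}.ipynb#/cells/{cellID}"
);
const cellURL = cellTemplate.create({ path: "analysis", cellID: 3 });
const cellParams = cellTemplate.parse(cellURL);
// cellURL === "file:///analysis.ipynb#/cells/3"
// cellParams deep-equals { path: "analysis", cellID: "3" }
```

In this sketch all parsed values come back as strings, so a real design would still need the per-parameter type mapping discussed above (for example, converting cellID to a number).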