Support externally stored images in .ipe files

Question

Opened this issue a month ago · 5 comments

The Ipe XML file format is plain text and works nicely with git, but embedded images break that. Auto-saving is also very slow for documents with images, because the image data is stored as text inside the XML document.

Ipe should support images that remain external. We cannot just store the source path of the image, because images can be brought into Ipe through copy-paste, and different platforms support different file formats for image import.

One simple idea could be:

For a document "figure.ipe", there can be a parallel directory "figure.images" that contains the bitmaps in PNG or JPG format. Ipe copies them there when the image is imported into Ipe, using either the last component of the filename or creating a unique name. The images are read when Ipe loads the document (to display them), but otherwise never touched.
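
A minimal sketch of this scheme in Python, only to make the naming rule concrete. The helper names `images_dir_for` and `import_image` are hypothetical (Ipe has no such functions), and appending a numeric suffix is just one possible way of "creating a unique name".

```python
import shutil
from pathlib import Path


def images_dir_for(ipe_file: Path) -> Path:
    """Return the parallel image directory, e.g. figure.ipe -> figure.images."""
    return ipe_file.with_suffix(".images")


def import_image(ipe_file: Path, source: Path) -> Path:
    """Copy `source` into the parallel directory and return the stored path.

    The last component of the source filename is kept when it is free;
    otherwise a numeric suffix is appended until the name is unique.
    """
    target_dir = images_dir_for(ipe_file)
    target_dir.mkdir(parents=True, exist_ok=True)
    candidate = target_dir / source.name
    counter = 1
    while candidate.exists():
        candidate = target_dir / f"{source.stem}-{counter}{source.suffix}"
        counter += 1
    shutil.copyfile(source, candidate)
    return candidate
```

After the copy, the stored file is only ever read, which matches the "otherwise never touched" rule above.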

This has no effect on PDF documents: they continue to store all images (as they are needed in the PDF output anyway).

Answer 1 · 2024-10-27T11:18:15.000Z

So here are some features I'd love to see here, many of them inspired by the similar functionality in Inkscape (see below).

allow choosing between embedding and (relative or absolute) linking when inserting images
also embed PDF pages and maybe make further \includegraphics options available (like cropping)
convert between embedded and linked representation for (all/individual) already-present images
reload changes from linked images that changed externally
update relative paths of linked images when saving to a different location
tool for resolving all broken links when opening an ipe file in a new location / from a different system
for quicker autosaving, use ephemeral external copies of all images instead of embedding
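
To illustrate the "update relative paths when saving to a different location" item, here is a small Python sketch. The function name `relink` is hypothetical, and it assumes links are stored relative to the directory of the Ipe file, which is only one of the possibilities discussed here.

```python
import os


def relink(link: str, old_doc: str, new_doc: str) -> str:
    """Rewrite a relative image link so it still resolves after the document moves."""
    if os.path.isabs(link):
        return link  # absolute links do not depend on the document location
    # Resolve the link against the old document directory ...
    target = os.path.normpath(os.path.join(os.path.dirname(old_doc), link))
    # ... and express the same file relative to the new document directory.
    return os.path.relpath(target, start=os.path.dirname(new_doc))
```

For example, a link `img/a.png` in `/home/u/talk/figure.ipe` would become `../img/a.png` when the document is saved as `/home/u/talk/final/figure.ipe` (on a POSIX system).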

Answer 2 · 2024-10-27T12:19:16.000Z

"reload changes from linked images that changed externally"

This is actually an argument in favour of \includegraphics. Every time you press "Run Latex", the images are refreshed.

To make such images easier to work with programmatically, we can define a new text style image like this:

<textstyle name="image" type="label" begin="\includegraphics{" end="}"/>

Now the contents of the object is simply the filename:

<text pos="128 768" style="image">Screenshot</text>

One could go a step further and give the TeX code inside a text object access to the custom property of the object. Then we could store the filename in the custom property. In fact I could imagine we allow new attributes parm1, parm2, etc. on text objects, and give the TeX code access to these, so then it could look like this:

<text pos="128 768" style="image" parm1="Screenshot" />

parm2 could then be the optional argument to \includegraphics.
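
As a rough illustration of why this makes such images easy to work with programmatically, the following Python sketch lists the filenames referenced by text objects with the "image" style. The `SAMPLE` document is a simplified stand-in rather than a complete Ipe file, `linked_images` is a hypothetical helper, and `parm1` is the proposed attribute from above, not an existing one.

```python
import xml.etree.ElementTree as ET
from typing import List

# Simplified stand-in for an Ipe document (assumption: real files contain more).
SAMPLE = """<ipe>
<page>
<text pos="128 768" style="image">Screenshot</text>
<text pos="128 700" style="image" parm1="Diagram" />
<text pos="64 64" type="label">plain label</text>
</page>
</ipe>"""


def linked_images(xml_source: str) -> List[str]:
    """Collect filenames from text objects that use the "image" style."""
    names = []
    for text in ET.fromstring(xml_source).iter("text"):
        if text.get("style") != "image":
            continue
        # Prefer the proposed parm1 attribute, else fall back to the contents.
        name = text.get("parm1") or (text.text or "").strip()
        if name:
            names.append(name)
    return names


print(linked_images(SAMPLE))  # ['Screenshot', 'Diagram']
```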

Finding all broken links means simply running Latex, and looking for the error messages when it doesn't find the image 😀

Personally I would set \graphicspath in the preamble and store only the filename itself in the text object, but others might have a different preference. Therefore it would be nice to make the "management" functions (like a "linked bitmap manager") an ipelet, and only have the core functions in Ipe itself:

"Insert image" should offer the option to either embed or use \includegraphics
Image objects should have "Embed" and "Make external" on their context menu (for this one image)

The "linked bitmap manager ipelet" could then offer all the bulk functions that need more option settings - making everything external, adjusting paths, etc.

What do you mean by:

for quicker autosaving, use ephemeral external copies of all images instead of embedding

Wouldn't autosave just save the XML file, and never touch the image files (which are read-only as far as Ipe is concerned)?

Answer 3 · 2024-10-27T12:50:04.000Z

A huge disadvantage of relying on \includegraphics: it's hard to make this work well with online Latex conversion. We would have to send all images with each Latex run (and adjust the path names appropriately).

That seems like a good reason to build our own solution.

Answer 4 · 2024-10-28T14:50:19.000Z

This is actually an argument in favour of \includegraphics. Every time you press "Run Latex", the images are refreshed.

Indeed! Even if we don't implement linked images via LaTeX I'd couple refreshing them (i.e., checking whether the files changed on disk and reloading changed buffers using some logic of chosen complexity) with running LaTeX.
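
One low-complexity version of that check, sketched in Python: remember the modification time and size of each linked file at load time and compare them whenever LaTeX is run. The names `stamp` and `changed_images` are hypothetical, and comparing (mtime, size) is an assumption about what "logic of chosen complexity" could mean; hashing the contents would be a stricter alternative.

```python
import os
from typing import Dict, List, Tuple

Stamp = Tuple[int, int]  # (modification time in ns, size in bytes)


def stamp(path: str) -> Stamp:
    """Return a cheap fingerprint of the file at `path`."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def changed_images(known: Dict[str, Stamp]) -> List[str]:
    """Return linked files whose stamp differs from the recorded one.

    The record in `known` is updated, so a second call reports nothing new.
    """
    changed = []
    for path, old in known.items():
        try:
            new = stamp(path)
        except FileNotFoundError:
            continue  # a missing file is a broken link, not a change
        if new != old:
            known[path] = new
            changed.append(path)
    return changed
```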

One could go a step further and give the TeX code inside a text object access to the custom property of the object. Then we could store the filename in the custom property. In fact I could imagine we allow new attributes parm1, parm2, etc. on text objects, and give the TeX code access to these

This sounds like a very good idea independently of this issue!
This would kind of yield an in-between of templated LaTeX "text" objects and ipe-based symbols.
I guess you could do a lot of nice things with these, for example incorporating tcolorboxes with variable headings and colors as a more flexible alternative to decorations.

Finding all broken links means simply running Latex, and looking for the error messages when it doesn't find the image 😀

Yeah, but that means parsing the semi-structured LaTeX logs, and also only knowing which images are needed by a document after a first (failed) LaTeX run. Moreover, I could imagine that it is harder to recover from this and get LaTeX to ignore only the missing images but compile the rest alright. I think we should have an easy way to know which images are linked by a document that doesn't involve parsing LaTeX commands or logs.

Personally I would set \graphicspath in the preamble and store only the filename itself in the text object, but others might have a different preference.

Having some document-based default base for relative paths sounds like a good idea, as it might also help against links breaking when moving an ipe file. However, some people may not be too happy with "leaking" absolute paths in their documents, and different paths on different systems would make using git harder. So this indeed should be an opt-in thing.

Therefore it would be nice to make the "management" functions (like a "linked bitmap manager") an ipelet, and only have the core functions in Ipe itself... The "linked bitmap manager ipelet" could then offer all the bulk functions that need more option settings - making everything external, adjusting paths, etc.

Agreed! Depending on the solution we choose, I think the main challenge may be knowing which objects/files to manage. Managing everything that might be included from LaTeX is close to infeasible (although the .fls file of a successful LaTeX run would actually tell you which files were read, and we could keep a cache of this similar to latexmk). Managing only some specifically crafted text or image objects is surely easier, especially if they don't involve free-text LaTeX commands. Still, this would leave the user alone with the LaTeX error messages, e.g. on opening a document sent by a colleague. So we may need a slightly tighter integration anyway.
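
For reference, the .fls idea could look roughly like the Python sketch below. It assumes LaTeX was run with the `-recorder` option, which writes one `PWD`, `INPUT` or `OUTPUT` record per line; the sample contents, the helper name `images_read` and the suffix filter are illustrative assumptions.

```python
from typing import List

# Assumption: only these suffixes count as linked images.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".pdf")

# Illustrative .fls contents, not taken from a real run.
SAMPLE_FLS = """PWD /home/u/talk
INPUT /usr/share/texmf/tex/latex/base/article.cls
INPUT ./Screenshot.png
INPUT ./Screenshot.png
OUTPUT ipetemp.pdf
INPUT photos/cat.JPG
"""


def images_read(fls_text: str) -> List[str]:
    """Return the image files recorded as INPUT, in order, without duplicates."""
    seen = []
    for line in fls_text.splitlines():
        kind, _, path = line.partition(" ")
        is_image = path.lower().endswith(IMAGE_SUFFIXES)
        if kind == "INPUT" and is_image and path not in seen:
            seen.append(path)
    return seen


print(images_read(SAMPLE_FLS))  # ['./Screenshot.png', 'photos/cat.JPG']
```

As noted above, this only works after a successful run, so it cannot replace a way of listing linked images directly from the document.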

Wouldn't autosave just save the XML file, and never touch the image files (which are read-only as far as Ipe is concerned)?

The issue I'm targeting is that autosave with many large embedded images might become very slow - for my PhD defense presentation (auto)saving took around 10s. As embedded images change only rarely, we might add some internal optimizations, only for autosaving, that at least make this process quicker, since it randomly interrupts the user with an unresponsive UI. This might also help towards a more Google Docs-like UX on the web. To this end, we could (independently of linked images) store the part of the XML file that contains the serialized image data in a separate cache file, and naïvely but quickly copy this data over into the autosave file when needed, without repeated serialization, at the cost of discarding and regenerating the cache file whenever the user changes images. Alternatively, we could store all embedded images separately in some autosave cache location on disk, such that the autosave process can simply link them (potentially with a hint that they should be re-embedded if the autosave file is opened and properly saved).
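
The first variant (caching the serialized image data) could be sketched like this in Python. Everything here is hypothetical: the class `ImageChunkCache`, the function `autosave`, and the simplified `<bitmap>` element, which omits the attributes a real Ipe file stores. The cache is kept in memory for brevity; a cache file on disk would follow the same invalidation rule.

```python
import base64
from typing import List, Optional


class ImageChunkCache:
    """Caches the serialized form of all embedded images between autosaves."""

    def __init__(self) -> None:
        self._chunk: Optional[str] = None
        self.serializations = 0  # counts the expensive passes, for illustration

    def invalidate(self) -> None:
        """Call whenever the user adds, removes, or replaces an image."""
        self._chunk = None

    def chunk(self, images: List[bytes]) -> str:
        """Return the serialized images, re-encoding only after invalidation."""
        if self._chunk is None:
            self.serializations += 1
            self._chunk = "\n".join(
                f'<bitmap id="{i}">{base64.b64encode(data).decode("ascii")}</bitmap>'
                for i, data in enumerate(images, start=1)
            )
        return self._chunk


def autosave(cache: ImageChunkCache, images: List[bytes], page_xml: str) -> str:
    """Assemble the autosave text, copying the cached image chunk verbatim."""
    return f"<ipe>\n{cache.chunk(images)}\n{page_xml}\n</ipe>"
```

With this structure, repeated autosaves only re-serialize the (small) page data, and the image data is encoded once per change to the set of images.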

it's hard to make this work well with online Latex conversion. We would have to send all images with each Latex run (and adjust the path names appropriately).

Indeed. The same also holds for ipe-overleaf. Moreover, going via LaTeX would mean copying the image data multiple times, which might also be slow with many high-resolution images. That would be the exact opposite of the speed-up I'd like to achieve.

Currently, my "image encoder" based approach from #548 seems to work alright without introducing any new complexity to image encoding (although I haven't thoroughly tested it yet). I guess that most of the user-facing management features for that could now also be implemented in Lua, mostly independent of whether the actual loading works via the image or LaTeX implementation. So should we continue this way?

Answer 5 · 2024-10-28T15:39:28.000Z

Yes, I think we should build a solution that does not depend on Latex.