tmalsburg/helm-bibtex

Support non-PDF files (JPG, PDF, HTML, ePub, directory)

Closed this issue · 26 comments

Note: I was told to ask here, see emacs-citar/citar#188.

I have, for instance, the following .bib entry:

@Misc{monet-1877a,
  author       = {Claude Monet},
  title        = {La Gare Saint-{Lazare}},
  year         = 1877,
  type         = {painting},
  publisher    = {Musée d'Orsay, Paris, France}
}

The .bib entry has a corresponding monet-1877a.jpg file.

When I use bibtex-actions-open, instead of seeing the painting, I am informed that:

Wrong type argument: stringp, nil

I would like to open JPG, PDF, HTML, ePub, or even directories, that match the citation keys.

I was suggesting, @tmalsburg, that ideally you'd support other file types, or if not, provide error handling for wrong file types.

support other file types

Ideally, this would be any kind of file (including directories) that matches the key. For instance, if one cites foo-2020 and there exists foo-2020.mp3, why not open it, or at least reveal it in dired? People cite images, videos, and more; not just PDF documents.

Your use case makes sense, but here's a problem if we allow any file type: many users store their note files along with the PDFs and then have Authors2020.pdf and Authors2020.org in the same directory. If those users select "Open PDF, URL, or DOI", they will be asked every time whether they'd like to open the note file or the PDF, which is unnecessary.

Two possible solutions:

  1. You could list the .jpg in the file field. My memory is that we allow any file type there:
@Misc{monet-1877a,
  author       = {Claude Monet},
  title        = {La Gare Saint-{Lazare}},
  year         = 1877,
  type         = {painting},
  publisher    = {Musée d'Orsay, Paris, France},
  file         = {monet-1877a.jpg}
}
  2. You could define a custom function for opening JPGs. bibtex-completion-open-url-or-doi could be used as a template for that function.
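
The second option could look roughly like the sketch below. It is not part of helm-bibtex: the function name my/bibtex-open-jpg is hypothetical, and it assumes the images sit in one of the directories listed in bibtex-completion-library-path and are named after the citation key. Like bibtex-completion-open-url-or-doi, it takes a list of keys.

```elisp
(require 'seq)
(require 'bibtex-completion)

(defun my/bibtex-open-jpg (keys)
  "Open KEY.jpg for each key in KEYS, if such a file exists.
Hypothetical helper, not part of helm-bibtex.  Files are searched
in `bibtex-completion-library-path'."
  ;; The library path may be a single directory or a list of directories.
  (let ((dirs (if (listp bibtex-completion-library-path)
                  bibtex-completion-library-path
                (list bibtex-completion-library-path))))
    (dolist (key keys)
      ;; Take the first directory that contains KEY.jpg.
      (let ((file (seq-some
                   (lambda (dir)
                     (let ((candidate (expand-file-name (concat key ".jpg") dir)))
                       (and (file-exists-p candidate) candidate)))
                   dirs)))
        (if file
            (find-file file)
          (message "No JPG found for entry: %s" key))))))
```

For a quick check, it can be called directly with (my/bibtex-open-jpg '("monet-1877a")).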

Wow, as a mere end-user of bibtex-actions, I am in the weeds. 😄 For instance, I do not understand why “Open PDF, URL, or DOI” would open Org notes which are not PDF, URL, or DOI. That said, I like the idea of using the explicit file field, and so I focused on the first solution suggested above. Here is how it went:

I added file = {monet-1877a.jpg} to my BibTeX entry. Then, I ran bibtex-actions-open, and it showed [multiple] References: has:link\|has:pdf. I wondered why the prompt is not has:url\|has:file instead, as those are the relevant BibTeX fields. (Right?) Confused, I deleted the whole line, typed monet, and pressed RET twice. Then, Emacs said "No URL or DOI found for this entry: monet-1877a".

I added file = {monet-1877a.jpg} to my BibTeX entry. Then, I ran bibtex-actions-open, and it showed [multiple] References: has:link\|has:pdf. I wondered why the prompt is not has:url\|has:file instead, as those are the relevant BibTeX fields. (Right?)

So this piece is my problem :-)

You can follow up on that at bibtex-actions. Not a bug per se, but worth discussing.

Confused, I deleted the whole line, typed monet, and pressed RET twice. Then, Emacs said "No URL or DOI found for this entry: monet-1877a".

That's bibtex-completion though.

For instance, I do not understand why “Open PDF, URL, or DOI” would open Org notes which are not PDF, URL, or DOI.

The proposal was to extend that action such that it opens any file or directory that has the BibTeX key as its name. That would include notes files.

I added file = {monet-1877a.jpg} to my BibTeX entry.

It works for me, but note that you need to add this in your configuration (otherwise helm-bibtex doesn't even check the file field):

(setq bibtex-completion-pdf-field "file")
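
A minimal way to check that the setting took effect, assuming bibtex-completion-bibliography already points at the .bib file containing the entry: bibtex-completion-find-pdf is the lookup helm-bibtex itself uses for attached files, so evaluating it for the key should return the path(s) taken from the file field, or nil if nothing was found. How a relative path such as monet-1877a.jpg is resolved depends on your setup, so treat this as a diagnostic sketch rather than a guarantee.

```elisp
(require 'bibtex-completion)

;; Tell bibtex-completion to look at the "file" field of each entry.
(setq bibtex-completion-pdf-field "file")

;; Evaluate with M-: or in *scratch*: a non-nil result means the
;; attached file was found for this key.
(bibtex-completion-find-pdf "monet-1877a")
```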

The proposal was to extend that action such that it opens any file or directory that has the BibTeX key as its name. That would include notes files.

I'm not sure about that.

I like the current distinction between opening effectively source files, and opening notes related to those source files.

I'm not sure about that.

I was responding to this:

Ideally, this would be any kind of file (including directories) that matches the key.

(emphasis mine)

I like the current distinction between opening effectively source files, and opening notes related to those source files.

I like that distinction as well, that's why I'm against opening any file type when the name matches the key. But perhaps there's a misunderstanding somewhere.

I like that distinction as well, that's why I'm against opening any file type when the name matches the key. But perhaps there's a misunderstanding somewhere.

If @salutis can put an mp3 or tiff file path in his "file" field, that should go a long way.

If any code changes are needed, maybe it should only be files, at least initially, with a defcustom to define a list of extensions, with a default value of pdf?

If any code changes are needed, maybe it should only be files, at least initially, with a defcustom to define a list of extensions, with a default value of pdf?

That sounds viable, although I somewhat dread the idea of adding more defcustoms. We have too many already :) If the performance impact is acceptable, we could perhaps just add the usual file formats by default. Not very high on my list of priorities, though.

(setq bibtex-completion-pdf-field "file")

Wohoo! Who would have thought that setting a "completion PDF field" to "file" in a Helm BibTeX package will fix JPEG not showing in bibtex-actions. I guess this is why "normal people" use Zotero. Ha-ha! 😄

[Screenshot: Screen Shot 2021-07-25 at 17 19 44]

Nice! Thanks for the screen shot :)

I guess this is why "normal people" use Zotero.

This is open source. So you're invited to help improve the documentation :)

I added file = {monet-1877a.jpg} to my BibTeX entry. Then, I ran bibtex-actions-open, and it showed [multiple] References: has:link\|has:pdf. I wondered why the prompt is not has:url\|has:file instead, as those are the relevant BibTeX fields. (Right?)

So this piece is my problem :-)

I have opened emacs-citar/citar#191 to solve the problem.

This is open source. So you're invited to help improve the documentation :)

Personally, I am not convinced that this is a documentation problem. I see it more as a design problem. Currently, "Helm BibTeX" is tied to the idea that PDF files are what people cite, but that is a somewhat narrow perspective, especially in the age of "fake news," when we want people to cite more, not less. For instance, looking at the APA citation manual, the style is not tied to "PDF" in any way, which is wise. That said, I understand that PDF files are probably the most common use case in academia.

Well, the documentation does say that we support the Mendeley way of referring to files which is not limited to PDFs. And we've been supporting this format for years now. See here. It's just that this piece of information is in a section titled "PDF files". So it's not surprising that you overlooked it. In sum, I do think this is primarily (though perhaps not only) a documentation issue. (Not sure what any of this has to do with fake news and narrow perspectives.)

It's just that this piece of information is in a section titled "PDF files". So it's not surprising that you overlooked it. In sum, I do think this is primarily (though perhaps not only) a documentation issue.

I see. I indeed overlooked that! There is another section, Other file types than PDF, that seems related. I wonder, would changing "bibtex-completion-pdf-extension" to '(".pdf" ".jpg") have helped in my case? Either way, I do not understand why all these variables have pdf in their names. For instance, we fixed my problem by changing bibtex-completion-pdf-field to file, and that made Emacs show JPEG files. How is that a PDF field? This is why I am not sure that this is primarily a documentation problem.

It's just that this piece of information is in a section titled "PDF files". So it's not surprising that you overlooked it. In sum, I do think this is primarily (though perhaps not only) a documentation issue.

I see. I indeed overlooked that! There is another section, Other file types than PDF, that seems related. I wonder, would changing "bibtex-completion-pdf-extension" to '(".pdf" ".jpg") have helped in my case? Either way, I do not understand why all these variables have pdf in their names. For instance, we fixed my problem by changing bibtex-completion-pdf-field to file, and that made Emacs show JPEG files. How is that a PDF field? This is why I am not sure that this is primarily a documentation problem.

As I said here, "open-pdf" is probably really better thought of as "open-local-file".

But given how much is online these days, including historical archives, maybe the better distinction in 2021 is what I suggested earlier: between source document, and other documents about the source.

I see. I indeed overlooked that! There is another section, Other file types than PDF, that seems related. I wonder, would changing "bibtex-completion-pdf-extension" to '(".pdf" ".jpg")

Just tested it and it works. So we have a complete solution for this issue and it's even documented. Nice :)

Either way, I do not understand why all these variables have pdf in their names. For instance, we fixed my problem by changing bibtex-completion-pdf-field to file, and that made Emacs show JPEG files. How is that a PDF field? This is why I am not sure that this is primarily a documentation problem.

This is for historical reasons and due to the fact that this software grew organically from a quick hack to what it is today. Unfortunate, but difficult to change because we'd be breaking almost everyone's configuration if we just changed the variable names. But I think the vast majority of users care about PDFs and only PDFs (the standard in academia for published articles), so it's perhaps not that big a deal.

But I think the vast majority of users care about PDFs and only PDFs (the standard in academia for published articles), so it's perhaps not that big a deal.

It's not such a big deal, in part because UIs and documentation can smooth over the issue, but @salutis is referring to fields that cite primary source documents a lot. Think (art, or musical) historians, for example.

This is for historical reasons and due to the fact that this software grew organically from a quick hack to what it is today. Unfortunate, but difficult to change because we'd be breaking almost everyone's configuration if we just changed the variable names. But I think the vast majority of users care about PDFs and only PDFs (the standard in academia for published articles), so it's perhaps not that big a deal.

At some point it might be good to work on a helm-bibtex v2.0 which allows breaking changes (similar to org-roam v2), but I'd need a sabbatical to make that happen because there are a number of bigger changes that we'd need to make. A better system for dealing with associated documents is at the top of the list and there's a pretty clear plan for that (a flexible plug-in system that allows users to mix and match ways to link documents).

It's not such a big deal, in part because UIs and documentation can smooth over the issue, but @salutis is referring to fields that cite primary source documents a lot. Think historians, for example.

Sure, but BibTeX is a terrible format for these people in the first place. For instance, there are no entry types for paintings or musical performances. This software was written with the primary use case of BibTeX in mind. However, if it can be made more useful for others as well, I'd be excited about that.

(Closing this issue because the original problem has been resolved.)

I see. I indeed overlooked that! There is another section, Other file types than PDF, that seems related. I wonder, would changing "bibtex-completion-pdf-extension" to '(".pdf" ".jpg")

Just tested it and it works. So we have a complete solution for this issue and it's even documented. Nice :)

This is awesome! So, I do not need to retroactively add file everywhere. Thank you!

@tmalsburg @bdarcus Thank you for helping me out here! Once bibtex-actions has a better UI (emacs-citar/citar#191), non-PDF workflows will be smooth as butter (after adding a single line to .emacs).

TL;DR for future readers:

(setq bibtex-completion-pdf-extension '(".pdf" ".jpg"))
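
For context, a slightly fuller configuration sketch. The key-based lookup only finds files that are named after the citation key and stored in a directory listed in bibtex-completion-library-path, so both variables matter. The directory below is a hypothetical placeholder; substitute your own.

```elisp
(require 'bibtex-completion)

(setq
 ;; Hypothetical location: the directory holding monet-1877a.jpg,
 ;; foo-2020.pdf, and other files named after their citation keys.
 bibtex-completion-library-path '("~/references/files/")
 ;; Extensions tried, in order, when looking for KEY + extension.
 bibtex-completion-pdf-extension '(".pdf" ".jpg"))
```

With this in place, entries need no explicit file field as long as the file name matches the key.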

@tmalsburg

This is open source. So you're invited to help improve the documentation :)

And done: PR #384