pprevos/denote-explore

About denote-explore-identify-duplicate-identifiers

First, thank you for your excellent work.

Regarding denote-explore-identify-duplicate-identifiers: when a file is exported from an Org file, the two files have the same name apart from the extension.

For example, 20230308T211425--Tops-in-mind2__selfworth.html was generated from 20230308T211425--Tops-in-mind2__selfworth.org, and denote-explore reports them as having duplicate identifiers.

How about not counting files as duplicates when they have the same name but different extensions?

Great suggestion. When I originally wrote this function, Denote had issues linking to exported notes (which have since been fixed). I also like to keep my folder tidy, so I remove exported files (or force a different export file name).

I cobbled together a new version that compares file names without their extension when called with the universal argument (or as (denote-explore-identify-duplicate-identifiers t)):

(defun denote-explore-identify-duplicate-identifiers (&optional sans-extension)
  "Provide list of duplicate Denote IDs or file names SANS-EXTENSION.
With universal argument, compare file names sans extension instead of IDs."
  ;; "P" passes the universal argument as SANS-EXTENSION.
  (interactive "P")
  (if-let* ((notes (denote-directory-files))
            (candidates (if sans-extension
                            (mapcar (lambda (path)
                                      (file-name-sans-extension
                                       (file-name-nondirectory path)))
                                    notes)
                          (mapcar #'denote-retrieve-filename-identifier notes)))
            ;; Keep each candidate that occurs again later in the list.
            (dups (delete-dups
                   (cl-remove-if-not
                    (lambda (id)
                      (member id (cdr (member id candidates))))
                    candidates))))
      (message "Duplicate identifier(s): %s"
               (mapconcat #'identity dups ", "))
    (message "No duplicate identifiers found")))

If this works for you, then I'll merge it into the next release.

The docstring for denote-directory-files reads:

With optional TEXT-ONLY as a non-nil value, limit the results to
text files that satisfy denote-file-is-note-p.

I think to achieve that functionality we can use the TEXT-ONLY parameter for denote-directory-files. So the body of denote-explore-identify-duplicate-identifiers can be:

(if-let* ((notes (denote-directory-files nil nil t))
          (ids (mapcar #'denote-retrieve-filename-identifier notes))
          (dups (delete-dups
                 (cl-remove-if-not
                  (lambda (id)
                    (member id (cdr (member id ids)))) ids))))
    (message "Duplicate identifier(s): %s"
             (mapconcat #'identity dups ", "))
  (message "No duplicate identifiers found"))

Thanks for the suggestion, @krisbalintona.

My first version did not achieve the desired result. I think this version covers both cases. I use Denote extensively for PDF files and images, so using text-only is not a useful option.

(defun denote-explore-identify-duplicate-notes (&optional filenames)
  "Identify duplicate Denote IDs or FILENAMES.
If FILENAMES is non-nil, check for duplicate file names; if nil, check
for duplicate Denote IDs.
Using the FILENAMES option excludes exported Denote files."
  (interactive "P")
  (let* ((denote-files (denote-directory-files))
         (candidates (if filenames
                         (mapcar #'file-name-nondirectory denote-files)
                       (mapcar #'denote-retrieve-filename-identifier
                               denote-files)))
         (tally (denote-explore--table candidates))
         (duplicates (mapcar #'car
                             (cl-remove-if-not
                              (lambda (note) (> (cdr note) 1))
                              tally))))
    (if duplicates
        (message "Duplicate identifier(s): %s"
                 (mapconcat #'identity duplicates ", "))
      (message "No duplicate identifiers found"))))
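
To make the difference between the two modes concrete, here is a minimal, self-contained sketch that runs without Denote. The names my/tally, my/duplicates and my/example-files are hypothetical; my/tally only stands in for the package's internal denote-explore--table, which is not shown in this thread and may be implemented differently. The identifier is taken as the first 15 characters of the file name, which is an assumption that holds for the two example files from the original report.

```elisp
(require 'cl-lib)

;; Hypothetical stand-in for `denote-explore--table'; the real helper
;; may differ.
(defun my/tally (items)
  "Return an alist of (ITEM . COUNT) for each distinct string in ITEMS."
  (let (table)
    (dolist (item items)
      (let ((cell (assoc item table)))
        (if cell
            (setcdr cell (1+ (cdr cell)))
          (push (cons item 1) table))))
    (nreverse table)))

(defun my/duplicates (items)
  "Return the elements of ITEMS that occur more than once."
  (mapcar #'car
          (cl-remove-if-not (lambda (entry) (> (cdr entry) 1))
                            (my/tally items))))

;; The two files from the original report: same ID, different extension.
(defvar my/example-files
  '("20230308T211425--Tops-in-mind2__selfworth.org"
    "20230308T211425--Tops-in-mind2__selfworth.html"))

;; Comparing identifiers only flags the pair as a duplicate.
(my/duplicates (mapcar (lambda (file) (substring file 0 15))
                       my/example-files))

;; Comparing complete file names does not, because the extensions differ.
(my/duplicates my/example-files)
```

In this sketch the first call returns a list holding the shared identifier and the second returns nil, which is why checking file names leaves exported files out of the report.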

Hi @pprevos. That's a good point. I looked into denote to see if my suggested code could work alongside PDFs.

It seems it can. Adding (pdf :extension ".pdf") to denote-file-types should be sufficient. I think it would identify PDFs as notes without any undesirable consequences, but I'm not 100% sure. At the very least, it lets .pdf extensions be returned by denote-file-type-extensions.

Hi @krisbalintona,

There is no need to adjust denote-file-types; that variable is only used when creating new notes.

Denote recognises any file that conforms with its file-naming convention. So you can find and link PDF files, or any other binary file.
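
As a rough illustration of why the extension does not matter, the sketch below checks only for a timestamp identifier at the start of the file name. The regexp and the function name are hypothetical and merely modelled on the file names in this thread (8 digits, "T", 6 digits); this is not Denote's own variable or predicate, and Denote's actual check may be more involved.

```elisp
;; Hypothetical pattern; not taken from Denote itself.
(defconst my/denote-like-id-regexp
  "\\`[0-9]\\{8\\}T[0-9]\\{6\\}"
  "Illustrative regexp for a timestamp identifier at the start of a file name.")

(defun my/looks-like-denote-file-p (file)
  "Return non-nil if FILE's base name starts with a timestamp identifier.
The extension plays no part in the check."
  (string-match-p my/denote-like-id-regexp
                  (file-name-nondirectory file)))

;; An Org note and a PDF both pass; a file without an identifier does not.
(my/looks-like-denote-file-p "20230308T211425--Tops-in-mind2__selfworth.org")
(my/looks-like-denote-file-p "/tmp/20230308T211425--scan__receipts.pdf")
(my/looks-like-denote-file-p "notes.org")
```

The PDF path in the example is made up for illustration.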

P:)

The new function denote-explore-identify-duplicate-notes replaces the old version. Committed in version 1.2 (f1dbb2d).