sharkdp/bat

Please support .debdiff files

bdrung opened this issue · 13 comments

Debian source diff files are often named .debdiff. They are normal diff files. Please support coloring them by just treating them as diff files.

jacg commented

If nobody is picking this up, I might have a go.

Any hints on how it should be approached?

Hi, I believe the way to go is to add a syntax mapping for it, which is done with TOML files - see https://github.com/sharkdp/bat/tree/master/src/syntax_mapping/builtins

jacg commented

@bdrung Presumably you'd want .nmudiff too?

@jacg I cannot remember seeing .nmudiff in the wild. Looking at a random NMU bug the attached diff is named .debdiff there.

jacg commented

Are there any guidelines on what to use as a test case file?

Curiously enough, if I copy tests/syntax-tests/source/Diff/82e7786e747b1fcfac1f963ae6415a22ec9caae1.diff into a file with the .debdiff extension, then bat applies syntax highlighting.

However, if I use the the diff from the random NMU bug mentioned above, then bat only applies syntax highlighting if the file extension is .diff but not .debdiff.

In other words, for some diff files, bat figures out that it's a diff even if the extension is .debdiff, for others it does not.

jacg commented

Adding a file src/syntax_mapping/builtins/unix-family/50-diff.toml with contents

# .debdiff is the extension used for diffs in Debian packaging
[mappings]
"Diff" = ["*.debdiff"]

seems to do the job.

However:

  1. There are only 30 .toml files in syntax_mapping/builtins. Clearly these only account for a tiny part of the known syntaxes, not including diff.
  2. bat -L shows Diff diff, patch. The proposed solution above changes this to Diff diff, patch, *.debdiff.

This makes me wonder whether it could/should be done using the same mechanism that was used to recognize the diff and patch extensions.

It can be done, yes, you'd need to patch the Diff.sublime-syntax file to add debdiff as a file extension associated with that syntax. See https://github.com/sharkdp/bat/blob/master/assets/patches/XML.sublime-syntax.patch for an example.

However, if I use the the diff from the random NMU bug mentioned above, then bat only applies syntax highlighting if the file extension is .diff but not .debdiff.

Presumably the first_line_match regex pattern doesn't match the first line of that NMU diff, which is why it only gets picked up by a matching file extension.

jacg commented

Is either approach to be preferred over the other? (The toml file approach certainly looks simpler.)

As for testing, I started off by trying to generate a test failure, as a sanity check, by modifying the contents of tests/syntax-tests/source/Diff/*.diff and hoping that a mismatch with the corresponding file in syntax_tests/highlighted would be detected. But all tests pass. What am I missing?

I would think it's generally easier to maintain the TOML mappings than a series of patches. The TOML mappings were something added fairly recently, for reference. Previously mappings were maintained in Rust code, which wasn't so fun to modify.

If we want simple file extension globs to show up differently in bat --list-languages - to match the "standard supported file extensions" then we can certainly discuss that :)

I just tried changing bat/tests/syntax-tests/source/Diff/82e7786e747b1fcfac1f963ae6415a22ec9caae1.diff by adding a word and it was picked up as differing by PATH="$PATH:$PWD/target/release/" ./tests/syntax-tests/regression_test.sh

========== Diff/82e7786e747b1fcfac1f963ae6415a22ec9caae1.diff
--- /home/keith/repos/bat/tests/syntax-tests/highlighted/Diff/82e7786e747b1fcfac1f963ae6415a22ec9caae1.diff
+++ /tmp/tmp.6yGkWoBGns/Diff/82e7786e747b1fcfac1f963ae6415a22ec9caae1.diff
@@ -9,7 +9,7 @@
 +
 +- Add a new `--diff` option that can be used to only show lines surrounding
 +  Git changes, i.e. added, removed or modified lines. The amount of additional
-+  context can be controlled with `--diff-context=N`. See #23 and #940
++  context can be controlled with `--diff-context=N`. See #23 and #940. Blah
 +
  ## Bugfixes
  ## Other
jacg commented

Thanks, I can see the tests picking up changes to the .diff sample source file, using the technique you suggested.

  1. This required running these specific tests by hand. Is there a way of running all the available tests (cargo test, tests/syntax-tests/regression_test.sh, any others there may be) in one go? Alternatively, is there a convenient way of knowing what tests exist and how to run them?

  2. Running these syntax regression tests in a fresh clone of the bat repo reports some failures:

    • cmd-help/test.cmd-help
    • Plaintext/plaintext.txt
    • JQ/sample.jq

    Should I be surprised?

  3. Running cargo test in a fresh clone of the bat repo produces one failure: bat::integration_tests long_help. Should I be surprised?

  1. This required running these specific tests by hand. Is there a way of running all the available tests (cargo test, tests/syntax-tests/regression_test.sh, any others there may be) in one go? Alternatively, is there a convenient way of knowing what tests exist and how to run them?

As with any reasonably complex software, the language tooling's simple test suite often doesn't cover everything. Generally, one can look at the CI steps performed for an exhaustive list of what commands are executed for verification that everything is working as expected. The regression test for example is defined at

run: tests/syntax-tests/regression_test.sh
. You may be able to use a tool designed to run GitHub Actions locally to avoid having to manually execute all the individual steps.

Running these syntax regression tests in a fresh clone of the bat repo reports some failures:

* cmd-help/test.cmd-help

* Plaintext/plaintext.txt

* JQ/sample.jq

Should I be surprised?

I would personally say "yes" - it shouldn't have been possible to merge changes which would fail these tests. Indeed, the latest CI run on master branch passed successfully: https://github.com/sharkdp/bat/actions/runs/8751624872
And the plaintext syntax has never changed, so it seems weird that it would fail for you...

3. Running cargo test in a fresh clone of the bat repo produces one failure: bat::integration_tests long_help. Should I be surprised?

As per the previous answer, "yes". What environment (OS etc.) are you running bat in? I know there used to be problems with tests on Windows, but a Windows test run was added to CI so should be resolved.

jacg commented

As with any reasonably complex software, the language tooling's simple test suite often doesn't cover everything.

FWIW, that's why I try to stick a justfile at the root of my projects, with an obvious and easily-discoverable recipe that runs all the tests. Admittedly, this can diverge from what is executed in CI workflows, so it's not foolproof.

What environment (OS etc.) are you running bat in?

Linux (NixOS). Not sure what other environment details would be useful to mention.

Tried it in a bare bones bash instead of my highly-configured zsh, and the same failures appear.

They seem to be mostly caused by the odd token having an unexpected colour.

Anyway, for the time being I can work with this small number of tests failing locally, and rely on CI (which is working fine in my draft PR (#2947) to catch anything I miss.

jacg commented

Ah, sorry, I see that the link to the draft PR in the previous comment was pointing at the issue that inspired it, rather than the PR itself. Here is the correct link: #2947 (now also fixed in the comment).