tldr-pages/tldr

Maintenance search commands


I'll list here some search commands for the repository that find common mistakes contributors make. Feel free to post your own. Not every result is an actual mistake, but the commands narrow the list enough that going through it manually is feasible. Commands marked "acceptable output" still return matches, but those matches have been verified as intentional and not errors.

Placeholders

grep -r -- "{{\[.*\]}[^}]" No output
grep -r -- "[^{]{\[.*\]}}" No output
grep -r "\]\]" acceptable output
grep -r "\[\[" acceptable output
grep -rE "{{\[[a-z]\|--[a-z]+\]}}" No output
grep -rE "{{\[-[a-z]\|[a-z]+\]}}" acceptable output
grep -r "{{-[a-zA-Z]|-" acceptable output
grep -r "{{-[a-zA-Z][a-zA-Z]|-" No output
grep -r "{{\[[^|]*\]}}" acceptable output
grep -r "{{\[ " No output
grep -r " ]}}" No output
grep -r "{{ "
grep -r " }}"
grep -r ^\` | grep -i " n "
grep -r '{{"'
grep -r "{{'"

Brackets

grep -r {{{ acceptable output
grep -r }}} acceptable output
find . -type f -print0 | xargs -0 awk '{ o=gsub(/{/,"&"); c=gsub(/}/,"&"); if(o!=c) print FILENAME ": " $0 }' acceptable output
find . -type f -print0 | xargs -0 awk '{ o=gsub(/\[/,"&"); c=gsub(/\]/,"&"); if(o!=c) print FILENAME ": " $0 }' acceptable output
find . -type f -print0 | xargs -0 awk '{ o=gsub(/\(/,"&"); c=gsub(/\)/,"&"); if(o!=c) print FILENAME ": " $0 }' acceptable output
find . -type f -print0 | xargs -0 awk '{ q=gsub(/"/,"&"); if(q % 2 != 0) print FILENAME ": " $0 }' No output
grep -r "{{[^}]*{{" No output
grep -r "}}[^{]*}}" acceptable output
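As a rough illustration of how the awk-based balance checks above work, here is a minimal sketch run against a made-up page. The file name and its two example lines are hypothetical. The bracket expressions `[{]` and `[}]` stand in for the bare `{` and `}` used above, purely as a portability precaution across awk implementations.

```shell
# Hypothetical sample page; the name and contents are made up for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/example.md" <<'EOF'
`tar {{[-c|--create]}} {{path/to/file}}`
`tar {{[-x|--extract]} {{path/to/file}}`
EOF

# gsub() returns the number of replacements it made, so replacing every brace
# with itself ("&") counts the braces without changing the line.
unbalanced=$(awk '{ o=gsub(/[{]/,"&"); c=gsub(/[}]/,"&"); if (o != c) print FILENAME ": " $0 }' "$tmpdir/example.md")

# Only the second line (one closing brace short) should be reported.
echo "$unbalanced"
rm -r "$tmpdir"
```

Because the count is taken per line, a brace that is opened on one line and closed on another would also be reported, which is one reason some of these checks have acceptable output.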

Man pages

grep -r manned.org/man/ acceptable output
grep -r www.manned No output
grep -r ubuntu.com/manpages
grep -r linux.die acceptable output
grep -r linux.org/docs No output
grep -r linuxcommandlibrary No output
grep -r /html_node/ | grep -v coreutils | grep -v emacs | grep -v grub No output
grep -r "\.[a-z]*[/]*>" | grep -v "\.[x]*htm[l]*[/]*>" | grep -v "\.php[/]*>" | grep -v "\.md[/]*>" | grep -v "\.adoc[/]*>" | grep -v "\.pdf[/]*>" | grep -v "\.txt[/]*>"
grep -r http:// | grep "More information"
grep -r "#>" No output
grep -r freedesktop.org | grep latest

Wrong wording

grep -ri help | grep -v "Display help:" | grep -v -- --help | grep -v -- help] | grep -v subcommand | grep -v https | grep -v "\`"
grep -ri version | grep -v "Display version:" | grep -v -- --version
grep -ri config | grep -vi configuration | grep -v "More information"
grep -ri info | grep -vi information | grep -v informative
grep -ri stats | grep -vi statistics
grep -ri "See also" | grep -v "> See also:" acceptable output
grep -ri "show.* help"
grep -ri "show.* version"
grep -ri "check.* help" No output
grep -ri "check.* version"
grep -ri WiFi
grep -r wlan

Duplicate files

find . -type f -printf "%f\n" | sort | uniq -c | sort

Command line

grep -ri "command line"
grep -ri command-line
grep -r CLI
grep -ri terminal

GitHub and GitLab useless parts

grep -r "?ref_type=heads" No output
grep -r "?tab=readme-ov-file" No output
grep -r "?utm_source=chatgpt.com" No output

Wrong filepath or url format

grep -r filename
grep -r folder
grep -r file_path No output
grep -r filepath
grep -r "\./"
grep -r http://target No output
grep -r "foo\."
grep -r "bar\."
grep -r path/to | grep -v "{{path/to"
grep -r "directory/}}" No output
grep -r "dir}}"
grep -r "path/to/[a-z]*/[a-z]*"
grep -r "{{.*another.*}}"

More information versioning

grep -r "More information:" | grep "[0-9]\.[0-9]"
grep -r "More information:" | grep "v[0-9]"
grep -r "More information:" | grep "/[0-9]/"
grep -r "> More information:" | grep -E "/[0-9]+/"

Device format

grep -r "/dev/sd[a-z]"
grep -r "eth[0-9]"
grep -r "wlan[0-9]"
grep -r /dev/tty

Standard streams

grep -vr "^\`" | grep -i stdout | grep -v "\`stdout\`"
grep -vr "^\`" | grep -i stdin | grep -v "\`stdin\`" No output
grep -vr "^\`" | grep -i stderr | grep -v "\`stderr\`" No output
grep -vr "^\`" | grep -i "standard out" No output
grep -vr "^\`" | grep -i "standard in" No output
grep -vr "^\`" | grep -i "standard err" No output

Lone short options

grep -r " -[a-zA-Z][^a-zA-Z]"
grep -r " -[a-zA-Z][a-zA-Z][^a-zA-Z]"

Small letter after colon

grep -r ": [a-z]"

Imperative mood

grep -r Generates No output
grep -r Runs
grep -r Resolves No output
grep -r Lists
grep -r Displays No output
grep -r Gets No output
grep -r Uses
grep -r Restarts No output
grep -r Opens
grep -r Executes No output
grep -r Creates
grep -r Initializes No output
grep -r Applies
grep -r Performs
grep -r Deploys No output
grep -r Controls
grep -r Converts No output
grep -r Launches No output
grep -r Updates
grep -r Installs

Wrong capitalization

grep -r bash | grep -v https
grep -r zsh
grep -r Fish | grep -v https
grep -r "[pP]owershell" | grep -v https
grep -r \`git\`
grep -r " N "
grep -r bluetooth

Character mistakes

grep -r … No output
grep -r " " acceptable output
grep -r — No output
grep -r =
grep -r ’ No output
grep -r ” No output

Wrong case

grep -rE "\{\{[a-zA-Z]+(-[a-z]+)+}}" # kebab-case
grep -rE "\{\{[a-z]+([A-Z][a-z]+)+\}\}" # camelCase
grep -rE "\{\{[A-Z][a-z]+([A-Z][a-z]+)+\}\}" # PascalCase
grep -rE "\{\{[A-Z]?[a-z]+([A-Z][a-z]+)+\}\}" # PascalCase + camelCase (previous 2 combined)

Page title doesn't match the filename

grep -r -m 1 '#' | sed "s/-/ /g" | sed "s/ \+/ /g" | grep -vEi '/(.+)\.md:.*\1' acceptable output

Command does not match filename

grep -r '`' | grep 'md:`' | sed "s/-/ /g" | grep -vEi '/([a-zA-Z0-9+_ \.\^,!~%\[]+)\.md:`.*\1' | grep -v tldr | grep -v "<" | grep -v "pacman" | grep -vEi ' ([a-zA-Z0-9+_ \.]+)\.md:`.*\1\]}}'

Page title does not match command

find . -type f | while read -r file
do
  # Title is everything after the leading "# " on the first heading line
  l1=$(grep -m 1 "#" "$file" | cut -d ' ' -f 2-)
  # Last word of the title, used to skip lines ending in "<word>]}}"
  short=$(echo "$l1" | rev | cut -d ' ' -f 1 | rev)
  grep '^`' "$file" | while read -r l2
  do
    echo "$l2" | grep -vF -- "$l1" | grep -v tldr | grep -v "<" | grep -v pacman | grep -vFi -- "$short]}}" | awk -v file="$file" -v title="$l1" 'NR==1 { printf "%-40s %s %s\n", file ":", "#" title ":", $0 }'
  done
done

Title contains uppercase letters

grep -r -m 1 "#" | grep "[A-Z]"

Test if command is a symlink and the page should be an alias

find . -type f | xargs -n1 basename | cut -d . -f 1 | xargs -I _ bash -c "if [ -L /bin/_ ]; then echo _ is a symlink; fi"

Typos

grep -ir initialise No output
grep -ir licence No output

Placeholders in descriptions

grep -r "^-" | grep {{ No output

Mnemonics

grep -r "^-" | grep -E "\[[a-zA-Z]+\]"

Optionality in commands

grep -rv "^-" | grep "\`" | grep "\[" | grep -v "{{\["

Use of archaic foobar

grep -ri foo | grep -v foot | grep -v food
grep -ri bar
grep -ri baz

Usage of ls in pipelines

grep -r "ls[ ]*.* |"

File contains executable permissions

find . -type f -executable No output

Use of apostrophe instead of backtick

grep -vr ^\` | grep "'[a-zA-Z][a-zA-Z]*'" No output

Missing Oxford comma

grep -rE "[^,]+,[^,]+ (and|or) "

Use of personal language

grep -ri your
grep -r "my_"

Sort files by creation date

git log --diff-filter=A --name-only --format='%aI' | awk '/^[0-9T-]/ {date=$0} /^[^0-9T-]/ && NF {print date, $0}' | sort -r

Here are some that can help with fixing issues in translated pages.

Translations

grep -r package
grep -rE "Display (help|version)"
grep -r "More information"
grep -r "Some subcommands"

Placeholders (may not work for all languages)

grep -r \{path/to
grep -rE "(command|name|number|domain)}}"

I ran grep -r "See also" | grep -v pages/ on the whole project and saw that dnsx.md in Korean mistakenly uses See also: @nelsonfigueroa

Instead of running all of these commands manually, would it be an option to run these in any of the scripts?

Why not. I just haven't bothered, since making the script would take more time than doing it manually and the results are not always mistakes.

The script is already there (scripts/test.sh I believe); the greps just need to be added to it. Are there any greps that do not have any false positives?

Some of the ones that don't have "acceptable output" return nothing, so those are fully fixed. I can't say for certain for any of them, so you'll have to run them individually.
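For the commands that are expected to return nothing, a wrapper along these lines might be one way to add them to a script. This is only a sketch under assumptions: the helper name `expect_no_output` and the sample directory are made up, and nothing here reflects what scripts/test.sh actually contains.

```shell
# Hypothetical helper: run a search command and fail if it prints anything.
# grep exits non-zero when nothing matches, so its exit status is ignored
# and only the captured output decides the result.
expect_no_output() {
  output=$("$@" 2>/dev/null) || true
  if [ -n "$output" ]; then
    printf 'FAIL: %s\n%s\n' "$*" "$output"
    return 1
  fi
  printf 'ok: %s\n' "$*"
}

# Made-up sample directory standing in for a pages directory.
pages=$(mktemp -d)
printf '%s\n' '# example' '' '> Example page.' '' '- Display help:' '' '`example --help`' > "$pages/example.md"

# Clean page: the check passes.
clean_status=0
expect_no_output grep -r "{{ " "$pages" || clean_status=$?

# Add a placeholder with a stray leading space: the check now fails.
printf '%s\n' '`example {{ path/to/file}}`' >> "$pages/example.md"
dirty_status=0
expect_no_output grep -r "{{ " "$pages" || dirty_status=$?

rm -r "$pages"
```

Commands that are pipelines cannot be passed this way directly; they would need to be wrapped, for example in `sh -c '...'`. Commands marked "acceptable output" would also need some kind of allowlist, which this sketch does not attempt.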

Agreed, I think these commands would be nice to add to the test script and tldr-maintenance repository.

Here are some greps for finding placeholders that are not in snake_case:

grep -rE "\{\{[a-zA-Z]+(-[a-z]+)+}}"  # kebab-case
grep -rE "\{\{[a-z]+([A-Z][a-z]+)+\}\}"  # camelCase
grep -rE "\{\{[A-Z][a-z]+([A-Z][a-z]+)+\}\}"  # PascalCase
grep -rE "\{\{[A-Z]?[a-z]+([A-Z][a-z]+)+\}\}"  # PascalCase + camelCase (previous 2 combined)

The kebab-case regex generates false positives, but the camelCase and PascalCase regexes do not (at a quick glance). I tried to expand the regexes to cover other edge cases, but this led to more false positives.
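To make it easier to see what each of these patterns catches, here is a small sketch that runs them against a handful of made-up placeholders instead of the repository. The sample strings are hypothetical, and `LC_ALL=C` is an added assumption so that `[a-z]` and `[A-Z]` are matched as plain ASCII ranges.

```shell
# Assumption for this sketch: force the C locale so letter ranges are ASCII-only.
export LC_ALL=C

# Made-up placeholders, one per line.
samples=$(printf '%s\n' '{{file-name}}' '{{fileName}}' '{{FileName}}' '{{file_name}}' '{{path/to/file}}')

kebab=$(printf '%s\n' "$samples" | grep -E "\{\{[a-zA-Z]+(-[a-z]+)+}}")
camel=$(printf '%s\n' "$samples" | grep -E "\{\{[a-z]+([A-Z][a-z]+)+\}\}")
pascal=$(printf '%s\n' "$samples" | grep -E "\{\{[A-Z][a-z]+([A-Z][a-z]+)+\}\}")
combined=$(printf '%s\n' "$samples" | grep -E "\{\{[A-Z]?[a-z]+([A-Z][a-z]+)+\}\}")

printf 'kebab:    %s\n' "$kebab"
printf 'camel:    %s\n' "$camel"
printf 'pascal:   %s\n' "$pascal"
printf 'combined:\n%s\n' "$combined"
```

In this sample, the snake_case and path/to/file placeholders are not matched by any of the four patterns, and the combined pattern matches both the camelCase and the PascalCase placeholder.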

I edited your comment to include cases with more than two words.

Also, keep in mind that some of these might be intentional, for example in manim or gtk-launch.

It is not advisable to use ls in pipelines because its output is not meant to be parsed. I added a search command to find its usage.
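A minimal sketch of why parsing ls output is fragile, using a made-up directory that contains a file name with an embedded newline. The directory and file names are hypothetical, and the glob loop is just one common alternative, not necessarily what a given page should use.

```shell
# Made-up directory with one ordinary file and one whose name contains a newline.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with
newline.txt"

# Counting via ls: wc -l counts lines of output, not files.
ls_count=$(ls "$dir" | wc -l | tr -d ' ')

# Counting via a glob: each match is a single word, whatever characters it contains.
glob_count=0
for file in "$dir"/*; do
  [ -e "$file" ] && glob_count=$((glob_count + 1))
done

printf 'ls: %s, glob: %s\n' "$ls_count" "$glob_count"
rm -r "$dir"
```

With GNU coreutils I would expect the ls-based count to come out as 3 and the glob-based count as 2, since the newline in the file name splits one entry across two lines of ls output.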

Here's a script of all the commands that so far don't return anything. It can also be used to check translations for issues.

checktldr.sh

Can you maybe integrate it in the existing scripts?

Maybe some day.