becheran/mlc

List of features

Could you maybe add a list of features to the README?

Things I'd be interested in knowing:

  • Does it only check links to full URLs, or also links to local files?
  • Does it check references? (the part of a URL after the #)
  • Does it check HTML links in Markdown files?

Surely there is more that would make sense in the list, but these are the most interesting to me.
I need a tool to check the consistency of machine documentation, and local files and references are very important for that.

I updated the feature list in the README; I hope this helps others see what the tool offers.

To answer your questions directly:

  • It checks both URLs and links to local files.
  • It does not :( (yet?...)
  • It does. It also checks plain URLs found in the Markdown text.

wow.. perfect, thank you @becheran! :-)
I think this will help others a lot! :-)

I was thinking about implementing such a tool myself (probably would have used pandoc + filters), but now it looks like.. it might make more sense to improve this one (with reference parsing & checking).
I imagine this tool already parses HTML and Markdown. For the reference part, I would just need to fetch the linked-to files (downloading them if they are not local yet) and run them through the same HTML or Markdown parsing code you already have, but collecting the referenceable items instead of the links. As an optimization, it might make sense to have some kind of cache folder, which might or might not be kept after the tool finishes executing.
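To make the idea concrete, here is a minimal sketch of the reference check for Markdown targets. It is not mlc's code: the function names (`slugify`, `collect_anchors`, `split_fragment`, `fragment_exists`) are hypothetical, the slug rules are a simplified assumption modeled on GitHub-style heading anchors, and only ATX headings outside ``` fences are considered. Fetching the target file and HTML `id` attributes are left out.

```rust
use std::collections::HashSet;

/// Turn a heading into an anchor slug.
/// Assumed, simplified GitHub-style rules: lowercase, spaces become '-',
/// alphanumerics, '-' and '_' are kept, everything else is dropped.
fn slugify(heading: &str) -> String {
    heading
        .trim()
        .to_lowercase()
        .chars()
        .filter_map(|c| match c {
            ' ' => Some('-'),
            c if c.is_alphanumeric() || c == '-' || c == '_' => Some(c),
            _ => None,
        })
        .collect()
}

/// Collect the anchors defined by ATX headings ("# Title") in a Markdown text.
/// Lines inside ``` fenced code blocks are skipped.
fn collect_anchors(markdown: &str) -> HashSet<String> {
    let mut anchors = HashSet::new();
    let mut in_fence = false;
    for line in markdown.lines() {
        let trimmed = line.trim_start();
        if trimmed.starts_with("```") {
            in_fence = !in_fence;
            continue;
        }
        if in_fence {
            continue;
        }
        // Count leading '#' characters; '#' is one byte, so the count is a valid index.
        let level = trimmed.chars().take_while(|&c| c == '#').count();
        if (1..=6).contains(&level) && trimmed[level..].starts_with(' ') {
            anchors.insert(slugify(&trimmed[level..]));
        }
    }
    anchors
}

/// Split a link target into its path and the optional part after '#'.
fn split_fragment(link: &str) -> (&str, Option<&str>) {
    match link.split_once('#') {
        Some((path, fragment)) => (path, Some(fragment)),
        None => (link, None),
    }
}

/// Return true if the link has no fragment, or if its fragment matches
/// an anchor defined in the (already fetched) target Markdown text.
fn fragment_exists(link: &str, target_markdown: &str) -> bool {
    match split_fragment(link) {
        (_, Some(fragment)) if !fragment.is_empty() => {
            collect_anchors(target_markdown).contains(&fragment.to_lowercase())
        }
        _ => true,
    }
}
```

With a target containing the heading `## Getting Started`, `fragment_exists("README.md#getting-started", target)` would be true and `fragment_exists("README.md#missing", target)` false. Duplicate headings, which GitHub disambiguates with numeric suffixes, are not handled in this sketch.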

What do you think... are you interested in a pull request, and could you give some initial pointers into the code for where I should look, or will I easily find it all by myself? (I have not looked at the code at all yet.)
Note that I am quite a newb with Rust, so if I get it running, the quality of the code would probably be quite low; I would not be angry if you did not want to merge it.

@hoijui if you want to contribute and help improve mlc, that would be awesome of course. I hope the tool actually does what you want it to do. I created it with my (and others') GitHub READMEs and docs in mind, to eliminate broken links there.

I would of course be more than happy with a PR. I created another issue for the anchor link support where I already started to write down some thoughts and implementation hints: #31.

For now I do not use any file system caching in mlc. I am also not really sure whether it would help with the use case I have in mind, where people want to check their markup docs in a CI pipeline. There you don't really want caching, because you want to know whether URL references are outdated. But the caching mechanism you describe also seems decoupled from the anchor link problem.

If you want to contribute, I would say let's continue the discussion about anchor link support in issue #31.