piqoni/matcha

support questions

Opened this issue · 2 comments

omega3 commented

May I ask you some questions:

  1. Is it possible to restrict the pulled articles by date, for example only yesterday's, or from 05.08.2023 to 07.08.2023? I was also wondering how many articles it pulls. If I track a Reddit channel as an RSS feed, it can have about 100 news items/articles a day. If I run matcha once a day, will it pull only articles with today's date, up to the hour when I run it (say 13:00)? Is that correct?
    What if I don't run it one day because I am not at my computer? Will all of yesterday's articles be missed?

  2. Can you tell me what encoding is used when it runs in the terminal (I am on Linux)? When I run
    ./matcha-linux-amd64 >01.txt to save the output to a txt file and open it in Kate (the Plasma note app), it shows strange characters that look like squares.

  3. Last issue: would it be possible to have the terminal output without markdown, like this:

title
url

title
url

If you wish, I can open separate issues for these.

piqoni commented

Thanks for the questions!

  1. It does not check the published date, or any date for that matter. It simply fetches as many items as the feed provides (this depends on the feed; it can be 20, 100, ...) regardless of date, and checks its local database to see whether it has already fetched each article. So if you miss a day and the article is still present in the RSS feed, the next day you should get the previous day's articles under today's date. But if the feed only publishes 20 items and there were 100 posts, some of them can be missed.

(If there are too many articles, you can set a limit by adding a space and a number after the feed URL: Example)

  2. Maybe you meant -t >01.txt? Those characters are probably the terminal tricks that the termlink library has to add in front of the text to make links clickable in the terminal. I wonder why you would run it like that, though, since without options it generates markdown files by default rather than txt. I'd say it's safe to say -t is only meant for terminal viewing.

  3. Can you elaborate on this? You don't want to use the terminal, but some text reader instead of markdown, correct?
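
To illustrate the behaviour described in point 1, here is a minimal sketch of "fetch everything the feed offers, keep only what the local database has not seen". It is not matcha's actual code: the `seen` table, the `new_items` function, and the `limit` parameter (standing in for the number after the feed URL) are all hypothetical names chosen for the example.

```python
import sqlite3


def new_items(conn, feed_items, limit=None):
    """Return the (title, url) pairs that have not been stored before.

    feed_items is whatever the feed currently exposes, in feed order.
    limit is a hypothetical stand-in for the per-feed item limit.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS seen (url TEXT PRIMARY KEY)")
    if limit is not None:
        feed_items = feed_items[:limit]

    fresh = []
    for title, url in feed_items:
        already = conn.execute(
            "SELECT 1 FROM seen WHERE url = ?", (url,)
        ).fetchone()
        if already is None:
            # First time this URL shows up: remember it and report it.
            conn.execute("INSERT INTO seen (url) VALUES (?)", (url,))
            fresh.append((title, url))
    conn.commit()
    return fresh
```

Note that no date is consulted anywhere: an article from a skipped day is still reported on the next run, as long as the feed still lists it.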

omega3 commented

Here is the output of ./matcha-linux-amd64 -t >02.txt
https://drive.google.com/file/d/1JEHQ87MOyO_81EgfqRGZ0Y_2uCr7ea8f/view?usp=sharing

and a screenshot from Kate (the default KDE Plasma note app on Linux):
https://i.imgur.com/SvhjJJL.png
Perhaps it is a Kate setting, but when I check the encoding of this output file, it is set to UTF-8.

This is strange: in the terminal I don't see these squares, but the links are not visible either:
https://i.imgur.com/hIxXmhn.png
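
The squares are consistent with terminal escape sequences ending up in the file, which would also explain why the file is still valid UTF-8. A minimal sketch for cleaning such a file follows, assuming the links are wrapped in OSC 8 hyperlink sequences (the usual mechanism for clickable terminal links; this has not been checked against matcha's real output). The name `strip_escapes` is made up for the example.

```python
import re

# OSC sequences: ESC ] ... terminated by BEL or by ESC \ (string terminator).
OSC = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")
# CSI sequences such as colour/reset codes: ESC [ parameters final-byte.
CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def strip_escapes(text):
    """Remove OSC and CSI escape sequences, keeping the visible text."""
    return CSI.sub("", OSC.sub("", text))
```

Be aware that with OSC 8 the URL lives inside the escape sequence itself, so stripping it this way keeps the title and discards the link target.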

I could write a script to run your program daily and save the terminal output as txt, then create either an md file or a csv file to use in a spreadsheet. With thousands of articles, I think it would be good for me to have a database of them in a spreadsheet. Perhaps that is because I am new to Obsidian and don't know everything about it. Besides, regardless of md or txt, I'd prefer the output as two separate lines: the title, with the url below it. That makes it easier to turn the title into a headline and then add comments or copy some content below the url, and creating links and backlinks for a headline works much better in Obsidian than operating at block level. That is why I use a Firefox add-on that also gives me the title and url.

And for Logseq there is an excellent link preview add-on that doesn't need the link in markdown format; title and url are enough. As I haven't decided yet which note-taking app will be my default for processing RSS, I think that keeping the output free of markdown formatting gives me more options for processing the data. I am perfectly aware that I can convert these markdown links into title and url myself manually, but having it ready out of the box as title and url would be a plus for me.

Maybe you could add an option to still create the md file, but with each entry written as a title with the url below it rather than as a markdown link. That is my feature request.
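
Until such an option exists, the conversion can be done as a post-processing step. Below is a rough sketch, assuming the generated file contains ordinary inline markdown links of the form `[title](url)`. The function name `links_to_title_url` is hypothetical and is not part of matcha.

```python
import re

# Inline markdown link: [title](url). Titles containing "]" and URLs
# containing ")" or whitespace are not handled by this simple pattern.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def links_to_title_url(markdown):
    """Turn every inline markdown link into a title line and a url line,
    with a blank line between entries."""
    blocks = [f"{title}\n{url}" for title, url in LINK.findall(markdown)]
    return "\n\n".join(blocks)
```

Text that is not part of a link is dropped, and image links of the form `![alt](src)` would be picked up as well, so the pattern may need tightening for real files.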