Do not let rules sit in memory for too long
sp1ff opened this issue · 6 comments
sp1ff commented
Presently, rules & their state (last match time, # hits, &c) are flushed to disk when:
1. a scoring operation is performed (`elfeed-score-score` or `elfeed-score-search`)
2. it is explicitly requested (`elfeed-score-serde-write-score-file`)
3. elfeed-score is unloaded (`elfeed-score-unload`)
Otherwise, if you simply leave elfeed-score in place, day after day, reading news & relying on the Elfeed new entry hook to score entries, a lot of state accumulates in-memory that won't be written to disk.
Worse, if you update your score file by hand (to add a new rule, say), and then carry out one of the above three operations, your changes will be overwritten.
elfeed-score needs to:
1. at minimum, check to see if the score file has been touched since last read and refuse to overwrite without confirmation; in practice, this shouldn't be too bad, since in the overwhelming number of cases, the edit will be to add new rules and so the user should be able to copy off the changes, accept the write & just re-add the new rules
2. better would be to arrange to have rule state written out to disk more regularly, to preclude the case described in 1. This could be done on a timer. It could be done based on some sort of counter (e.g. write state every _n_ times the new entry hook is invoked). It could also be done every time an `elfeed-search-fetch` operation completes, but since that's asynchronous, it's touchy. We could set up a feed update hook & write every time the work queue goes to zero, but there's no clear "I'm done" signal from that operation.
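The two options above can be sketched together. This is only an illustration, not elfeed-score's code: it is written in Python rather than Emacs Lisp, uses JSON as a stand-in for the score file format, and the `ScoreFile` class, its method names, and the `flush_every` parameter are all hypothetical.

```python
import json
import os


class ScoreFile:
    """Hypothetical in-memory copy of a score file and its rule stats."""

    def __init__(self, path, flush_every=10):
        self.path = path
        self.flush_every = flush_every  # assumed knob: write after this many hook calls
        self.pending = 0                # hook calls since the last successful write
        self.rules = {}
        self.mtime_at_read = None

    def read(self):
        with open(self.path, encoding="utf-8") as f:
            self.rules = json.load(f)
        # Remember the modification time we saw, to detect hand edits later.
        self.mtime_at_read = os.stat(self.path).st_mtime_ns

    def touched_since_read(self):
        return os.stat(self.path).st_mtime_ns != self.mtime_at_read

    def write(self, confirm=lambda: False):
        # Option 1: refuse to overwrite a hand-edited file without confirmation.
        if self.touched_since_read() and not confirm():
            return False
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.rules, f)
        self.mtime_at_read = os.stat(self.path).st_mtime_ns
        self.pending = 0
        return True

    def on_new_entry(self, rule_name):
        # Option 2: count hook invocations and flush every `flush_every` calls.
        self.rules[rule_name]["hits"] += 1
        self.pending += 1
        if self.pending >= self.flush_every:
            return self.write()
        return False
```

With the default `confirm`, a write that finds the file modified on disk is skipped rather than forced, so the in-memory state keeps accumulating until the user confirms. That is exactly the interaction cost option 1 imposes.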
It is not clear to me what the right answer is, here. @firmart -- you first raised this issue... thoughts?
firmart commented
Michael <notifications@github.com> writes:
Presently, rules & their state (last match time, # hits, &c) are flushed to disk when:
1. a scoring operation is performed (`elfeed-score-score` or `elfeed-score-search`)
2. it is explicitly requested (`elfeed-score-serde-write-score-file`)
3. elfeed-score is unloaded (`elfeed-score-unload`)
Otherwise, if you simply leave elfeed-score in place, day after day, reading news & relying on the Elfeed new entry hook to score entries, a lot of state accumulates in-memory that won't be written to disk.
Worse, if you update your score file by hand (to add a new rule, say), and then carry out one of the above three operations, your changes will be overwritten.
I experienced that several times when I started to use elfeed-score.
elfeed-score needs to:
1. at minimum, check to see if the score file has been touched since last read and refuse to overwrite without confirmation; in practice, this shouldn't be too bad, since in the overwhelming number of cases, the edit will be to add new rules and so the user should be able to copy off the changes, accept the write & just re-add the new rules
It would be better than overwriting silently. But in my opinion, that is
still too much interaction for a poor user who merely wants to add a
new rule.
2. better would be to arrange to have rule state written out to disk more regularly, to preclude the case described in 1. This could be done on a timer. It could be done based on some sort of counter (e.g. write state every _n_ times the new entry hook is invoked). It could also be done every time an `elfeed-search-fetch` operation completes, but since that's asynchronous, it's touchy. We could setup a feed update hook & write every time the work queue goes to zero, but there's no clear "I'm done" signal from that operation.
It is not clear to me what the right answer is, here. @firmart -- you first raised this issue... thoughts?
From a user perspective, I haven't cared too much about the :date and
:hits properties, and they are certainly inaccurate on my side since, back
in score version 5, I often copy-pasted a whole entry and just
changed the rule.
How about
3. Make elfeed.score read-only and write stats/state to another file,
elfeed.score.state, that the user won't alter manually.
... it is less risky for the user and the stats/state will be accurate.
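A rough sketch of that split, to make the idea concrete. It is an assumption-laden illustration rather than how elfeed-score does or will do it: it uses Python and JSON instead of Emacs Lisp and the real score file format, and the helper names `state_path_for`, `load`, and `save_stats` are hypothetical.

```python
import json
import os


def state_path_for(score_path):
    # Naming follows the suggestion above: "elfeed.score" -> "elfeed.score.state".
    return score_path + ".state"


def load(score_path):
    """Read rules from the user's file and stats from the program-owned file."""
    with open(score_path, encoding="utf-8") as f:
        rules = json.load(f)  # here simply a list of rule names, edited by hand
    stats = {}
    if os.path.exists(state_path_for(score_path)):
        with open(state_path_for(score_path), encoding="utf-8") as f:
            stats = json.load(f)
    # New rules start with zero hits; stats for deleted rules are dropped.
    return {name: stats.get(name, {"hits": 0, "date": None}) for name in rules}


def save_stats(score_path, stats):
    """Write only the state file; the rules file is never opened for writing."""
    tmp = state_path_for(score_path) + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(stats, f)
    os.replace(tmp, state_path_for(score_path))  # rename over the old state file
```

Because only the state file is ever written, a hand edit to the rules file cannot be overwritten, and a newly added rule simply shows up with zero hits on the next load.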
I believe that :date & :hits are still relevant to see the impact of a
rule. But most of the time I just use
elfeed-score-scoring-explain-entry to make sure that my rules
(e.g. regex) really work, no matter how many entries they are
applied to.
…
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#13
sp1ff commented
Wrote down some thoughts on this. Starting work this week, along the lines of your proposal 3 @firmart.
firmart commented
Hi Michael,
I just read your post and subscribed to your RSS feed through elfeed :)
Thanks for working on this. Being able to write rules interactively is
indeed very welcome. I'll see if I can make some PRs to implement
ideas I have (splitting the source code helped readability).
Michael ***@***.***> writes:
… [Wrote down](https://unwoundstack.com/blog/elfeed-score-state.html) some thoughts on this. Starting work this week, along the lines of your proposal 3 @firmart.
firmart commented
Michael ***@***.***> writes:
Fixed. Give it a whirl @firmart & let me know how it works!
Just tested it out. It upgrades smoothly to version 8, and the stats
file works fine. Thanks for working on this!
…