Filter email list to not show `unsubscribe` emails.
gs0510 opened this issue · 16 comments
On the https://ocaml.org/community/ page, the recent email threads show all emails sent to the list. Filter the list so that unsubscribe
emails are not displayed.
@Ndipbanyan since you were looking for a medium issue, you can go ahead and work on this one.
@gs0510 Alright. Thank you. I will begin working on it and reach out for any help or clarifications that I might need.
@Ndipbanyan Have you been able to make any progress? Do you have any questions? Thanks!
@gs0510 I have found the code that generates this list in rss2html.ml in the script directory, and I am trying to understand the function that does it to see if I can modify it to filter the list. The drawback I currently have is my limited understanding of the OCaml language. However, I am still going through tutorials to catch up.
okay, let me know if you run into any problems! Thanks!
@gs0510 So I came up with a solution and want to be clear about it before creating a PR. Let me try to explain. The API that is consumed to display the emails in the recent email threads returns a result containing items, each of which has a title tag holding the subject of the email and the address of the sender. Below is what I am referring to
generated from https://sympa.inria.fr/sympa/rss/latest_arc/caml-list?count=40
Looking at the above, you will notice that the item with the title `<title>[Caml-list] - ulugbekna@gmail.com</title>` has its email subject as "[Caml-list]", the item with the title `<title>[Caml-list] [CFP] Logical Frameworks and Meta-Languages: Theory and Practice - enrico.tassi@inria.fr</title>` has its email subject as "[Caml-list] [CFP] Logical Frameworks and Meta-Languages: Theory and Practice", and the item with the title `<title>[Caml-list] unsubscribe - jean-denis.eiden@orange.fr</title>` has "[Caml-list] unsubscribe" as its subject.
Now, in the code base, line 595 of `/script/rss2html.ml` contains a regex written to exclude "Re:" and anything between [ ], which is applied to the subject (the text between the `<title>` `</title>` tags). This removes [Caml-list] and [CFP] from the above titles, leaving only the remaining part of each title to be displayed. So in the case of `<title>[Caml-list] - ulugbekna@gmail.com</title>`, there isn't any title left after [Caml-list] has been removed, so only "- ulugbekna@gmail.com" is displayed. Going by all this, my implementation added `unsubscribe` to the regex, which ends up displaying `<title>[Caml-list] unsubscribe - jean-denis.eiden@orange.fr</title>` as "- jean-denis.eiden@orange.fr" in the recent email threads.
This has become rather too long :). However, the point of all my explanations is to check whether my implementation is what you intended or whether you meant something entirely different. Thank you for taking the time to help me with this.
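For readers following along, the effect described in this comment can be reproduced with a small sketch. The two patterns below are hypothetical stand-ins, not the actual regex on line 595 of rss2html.ml, and the helper name `strip` is made up for illustration. The code uses the `Str` library, so it has to be linked against it (for example with `ocamlfind ocamlopt -package str -linkpkg`).

```ocaml
(* Hypothetical pattern: matches "Re:" or any bracketed tag such as
   "[Caml-list]", followed by optional spaces. This is NOT the regex
   from rss2html.ml, only an approximation of what the comment describes. *)
let tag_re = Str.regexp "\\(Re:\\|\\[[^]]*\\]\\) *"

(* The same hypothetical pattern with "unsubscribe" added to it,
   mirroring the first attempt described in the comment. *)
let tag_or_unsub_re =
  Str.regexp "\\(Re:\\|unsubscribe\\|\\[[^]]*\\]\\) *"

(* Remove every match of [re] from [title]. *)
let strip re title = Str.global_replace re "" title

let () =
  let title = "[Caml-list] unsubscribe - jean-denis.eiden@orange.fr" in
  (* Only the bracketed tag is removed; the word "unsubscribe" stays. *)
  print_endline (strip tag_re title);
  (* The word is removed too, but the entry itself is still listed. *)
  print_endline (strip tag_or_unsub_re title)
```

Under these assumptions the second call leaves "- jean-denis.eiden@orange.fr", which shows why changing the normalization regex only shortens the displayed title and does not remove the post from the feed.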
Hi @Ndipbanyan! You are almost right :) We want to stop displaying the threads that say `unsubscribe` in the email feed, not remove the word `unsubscribe` from the title. The function `normalize_title` just normalizes titles (so removing the [CFP] etc.). What we want to do is remove the `unsubscribe` post from the `posts` list, so you can go through the list to see if there's a post with `unsubscribe` in its title and remove that post from the list. Hope this helps!
Let me know if anything is unclear, or if there's anything OCaml related that you don't understand :)
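A minimal sketch of the suggestion above, using only the standard library. The `post` record, `contains_ci`, and `drop_unsubscribe` are hypothetical names chosen for illustration; the real post type and filtering code in rss2html.ml may look different.

```ocaml
(* Hypothetical post record; the real type in rss2html.ml is
   assumed to carry more fields than these two. *)
type post = { title : string; author : string }

(* True when [sub] occurs somewhere in [s], ignoring ASCII case. *)
let contains_ci ~sub s =
  let s = String.lowercase_ascii s
  and sub = String.lowercase_ascii sub in
  let n = String.length s and m = String.length sub in
  (* Try every start position until the substring fits no more. *)
  let rec go i = i + m <= n && (String.sub s i m = sub || go (i + 1)) in
  go 0

(* Keep only the posts whose title does not mention "unsubscribe". *)
let drop_unsubscribe posts =
  List.filter (fun p -> not (contains_ci ~sub:"unsubscribe" p.title)) posts
```

The point of the design is that the filtering happens on the list of posts, before any title is normalized or rendered, so an unsubscribe email never reaches the page at all.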
Thank you @gs0510 for the clarity. I will look into implementing this and let you know when I run into any issue understanding anything. Thanks
@gs0510 I have been having issues trying to run `make` or `make production` since I installed the OCaml Platform extension on VS Code. Below was the error I was getting
I uninstalled the extension, then `cohttp-server-lwt ./ocaml.org` wouldn't start anymore, and running `make` gives the below error
Please can you help me detect what the problem is?
@Ndipbanyan Both errors are related to `omd`. Can you run `opam show omd` to see what version of `omd` you have?
`cohttp-server-lwt ./ocaml.org` will work only if your `make` command is successful.
The website doesn't work with the latest version of OMD (see issue #1321). You need to downgrade omd to 1.3.1, and it should be okay after that :)
Yes! It works now. Thanks. Got me stuck there for a while.
Also, I think I have been able to filter the emails now. My implementation is as follows:
I wrote a regex (for the word unsubscribe) and added an `else if` block in the `must_keep` function to exclude any post whose title matches the regex. Is this implementation okay?
Before:
After:
Code snippet (lines 592 and 614)
This looks good @Ndipbanyan, you can make the regex case agnostic so that all kinds of unsubscribes are filtered out. You should also open a PR. :)
Great! I've opened a PR. I used `Str.regexp_case_fold` as opposed to just `Str.regexp`, so I believe that makes it case agnostic.
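To illustrate the difference between the two constructors mentioned here, a small sketch follows. `Str.regexp`, `Str.regexp_case_fold`, and `Str.search_forward` are real functions from OCaml's `Str` library; the helper `matches` and this simplified `must_keep` (which takes a title string) are assumptions made for the example and are not the code from the PR. The program needs to be linked against `str`.

```ocaml
(* Case-sensitive and case-insensitive versions of the same pattern. *)
let unsub_cs = Str.regexp "unsubscribe"
let unsub_ci = Str.regexp_case_fold "unsubscribe"

(* True when [re] matches anywhere in [title].
   Str.search_forward raises Not_found when there is no match. *)
let matches re title =
  try
    ignore (Str.search_forward re title 0);
    true
  with Not_found -> false

(* Hypothetical, simplified stand-in for the check added to must_keep:
   a post is kept only if its title does not mention "unsubscribe". *)
let must_keep title = not (matches unsub_ci title)
```

With these definitions, `matches unsub_cs "[Caml-list] Unsubscribe"` is false while `matches unsub_ci "[Caml-list] Unsubscribe"` is true, so only the case-folding version filters out capitalized variants.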