morphy2k/rss-forwarder

data not getting pushed to sinks

hdhoang opened this issue · 4 comments

hi, sorry that i can't get started. Here is an example config

# Feed 1
[feeds.github-blog]
url = "https://github.blog/all.atom"
interval = "10m"
retry_limit = 5
sink.type = "discord"
sink.url = "https://discord.com/api/webhooks/84175.../OZdejNBCL1..."

# Feed 2
[feeds.rust-blog]
url = "https://blog.rust-lang.org/feed.xml"
interval = "1m"

[feeds.rust-blog.sink]
type = "slack"
url = "https://slack/"

# testing outside docker image, to have `tee` command
[feeds.xkcd]
url = "https://xkcd.com/atom.xml"
interval = "2m"
sink.type = "custom"
sink.command = "tee"
sink.arguments = ["-a", "/tmp/xkcd.ndjson"]

I tried with 0.5.1 binary, the ghcr images for 0.5.1 and 0.6.0-alpha.0, as well as building git repo. I can see the custom command running:

rss-forwarder-l,179885 config.toml
  |-tee,179900 -a /tmp/xkcd.ndjson
  |-{rss-forwarder-l},179887
  |-{rss-forwarder-l},179888
  |-{rss-forwarder-l},179889
  |-{rss-forwarder-l},179890
  |-{rss-forwarder-l},179891
  |-{rss-forwarder-l},179892
  |-{rss-forwarder-l},179893
  |-{rss-forwarder-l},179894
  |-{rss-forwarder-l},179895
  |-{rss-forwarder-l},179896
  |-{rss-forwarder-l},179897
  `-{rss-forwarder-l},179899

The debug log shows feed fetching happening:

[2022-09-03T06:57:31Z INFO  rss_forwarder] Start watcher for "github-blog"
[2022-09-03T06:57:31Z INFO  rss_forwarder] Start watcher for "rust-blog"
[2022-09-03T06:57:31Z TRACE mio::poll] registering event source with poller: token=Token(1), interests=READABLE | WRITABLE
[2022-09-03T06:57:31Z INFO  rss_forwarder] Start watcher for "xkcd"
[2022-09-03T06:57:31Z DEBUG reqwest::connect] starting new connection: https://github.blog/
[2022-09-03T06:57:31Z DEBUG reqwest::connect] starting new connection: https://xkcd.com/
[2022-09-03T06:57:31Z DEBUG reqwest::connect] starting new connection: https://blog.rust-lang.org/
[2022-09-03T06:57:31Z TRACE mio::poll] registering event source with poller: token=Token(2), interests=READABLE | WRITABLE
[2022-09-03T06:57:31Z TRACE mio::poll] registering event source with poller: token=Token(3), interests=READABLE | WRITABLE
[2022-09-03T06:57:31Z TRACE mio::poll] registering event source with poller: token=Token(4), interests=READABLE | WRITABLE
[2022-09-03T06:57:31Z TRACE want] signal: Want
[2022-09-03T06:57:31Z TRACE want] signal found waiting giver, notifying
[2022-09-03T06:57:31Z TRACE want] poll_want: taker wants!
[2022-09-03T06:57:31Z TRACE want] signal: Want
[2022-09-03T06:57:31Z TRACE want] signal found waiting giver, notifying
[2022-09-03T06:57:31Z TRACE want] poll_want: taker wants!
[2022-09-03T06:57:31Z DEBUG reqwest::async_impl::client] response '200 OK' for https://blog.rust-lang.org/feed.xml
[2022-09-03T06:57:31Z TRACE want] signal: Want
[2022-09-03T06:57:31Z TRACE want] signal found waiting giver, notifying
[2022-09-03T06:57:31Z TRACE want] signal: Want
[2022-09-03T06:57:31Z TRACE want] poll_want: taker wants!
[2022-09-03T06:57:31Z TRACE want] signal: Want
[2022-09-03T06:57:31Z DEBUG reqwest::async_impl::client] response '200 OK' for https://github.blog/all.atom
[2022-09-03T06:57:31Z TRACE want] signal: Want
[2022-09-03T06:57:31Z TRACE want] signal found waiting giver, notifying
[2022-09-03T06:57:31Z TRACE want] poll_want: taker wants!
[2022-09-03T06:57:32Z TRACE want] signal: Want
[2022-09-03T06:57:32Z TRACE want] signal found waiting giver, notifying
[2022-09-03T06:57:32Z TRACE want] signal: Want
[2022-09-03T06:57:32Z TRACE want] poll_want: taker wants!
[2022-09-03T06:57:32Z TRACE want] signal: Want
[2022-09-03T06:57:32Z DEBUG reqwest::async_impl::client] response '200 OK' for https://xkcd.com/atom.xml
[2022-09-03T06:57:32Z TRACE want] signal: Want
[2022-09-03T06:57:32Z TRACE want] signal found waiting giver, notifying
[2022-09-03T06:57:32Z TRACE want] signal: Want
[2022-09-03T06:57:32Z TRACE want] poll_want: taker wants!
[2022-09-03T06:57:32Z TRACE want] signal: Want
[2022-09-03T06:58:31Z DEBUG reqwest::async_impl::client] response '200 OK' for https://blog.rust-lang.org/feed.xml
[2022-09-03T06:58:31Z TRACE want] signal: Want
[2022-09-03T06:58:31Z TRACE want] signal found waiting giver, notifying
[2022-09-03T06:58:31Z TRACE want] signal: Want
[2022-09-03T06:58:31Z TRACE want] poll_want: taker wants!
[2022-09-03T06:58:31Z TRACE want] signal: Want
[2022-09-03T06:58:42Z TRACE mio::poll] deregistering event source from poller
[2022-09-03T06:58:42Z TRACE want] signal: Closed

At shutdown via ctrl+c:

^C[2022-09-03T07:00:15Z TRACE mio::poll] deregistering event source from poller
[2022-09-03T07:00:15Z INFO  rss_forwarder] Watcher for "rust-blog" has stopped
[2022-09-03T07:00:15Z INFO  rss_forwarder] Watcher for "github-blog" has stopped
[2022-09-03T07:00:15Z INFO  rss_forwarder] Watcher for "xkcd" has stopped
[2022-09-03T07:00:15Z TRACE mio::poll] deregistering event source from poller
[2022-09-03T07:00:15Z TRACE mio::poll] deregistering event source from poller
[2022-09-03T07:00:15Z TRACE want] signal: Closed
[2022-09-03T07:00:15Z TRACE want] signal: Closed

Idling section for 0.6.0:

2022-09-03T07:06:45.053670Z TRACE hyper::proto::h1::io: received 4490 bytes
2022-09-03T07:06:45.053682Z DEBUG hyper::proto::h1::conn: incoming body completed
2022-09-03T07:06:45.053718Z TRACE hyper::proto::h1::conn: maybe_notify; read_from_io blocked
2022-09-03T07:06:45.053734Z TRACE want: signal: Want
2022-09-03T07:06:45.053745Z TRACE want: signal found waiting giver, notifying
2022-09-03T07:06:45.053763Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2022-09-03T07:06:45.053775Z TRACE want: signal: Want
2022-09-03T07:06:45.053781Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2022-09-03T07:06:45.053789Z TRACE want: poll_want: taker wants!
2022-09-03T07:06:45.053818Z TRACE hyper::client::pool: put; add idle connection for ("https", github.blog)
2022-09-03T07:06:45.053834Z DEBUG hyper::client::pool: pooling idle connection for ("https", github.blog)
2022-09-03T07:06:45.056467Z TRACE want: signal: Want
2022-09-03T07:06:45.056479Z TRACE hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }

But we can see that no sink are invoked. The wrong discord token, and the bad slack url are not causing error.

I added a tracing::info event in custom sink:

                let data: Vec<u8> = tokio::select! {
                    biased;
                    _ = kill_rx.recv() => break,
                    v = data_rx.recv() => v.unwrap(),
                };
                tracing::info!(data =? data, "flush");

but it doesn't happen.

I tried to use tokio-console (it's necessary to downgrade to tokio@=1.20.1 --features tracing due to tokio-rs/console#373), there's one warning, but i don't understand async enough to see how task notification works yet.

My environment is Fedora 36, container runtimes docker/moby-engine, podman give same result.

What are your environments? Any pointer on how to check more on my side? thanks!

Regarding custom sink, if you use shell commands, the corresponding shell must be set as a command like bash, as in the examples.

[feeds.xkcd]
url = "https://xkcd.com/atom.xml"
interval = "2m"
sink.type = "custom"
- sink.command = "tee"
+ sink.command = "bash"
- sink.arguments = ["-a", "/tmp/xkcd.ndjson"]
+ sink.arguments = ["-c", "tee -a /tmp/xkcd.ndjson"]

thank you for the feedback, I have updated to bash, and can see it spawning commands, but data doesn't arrive still.

Here's a command-only config:

[feeds.smbc]
url = "https://www.smbc-comics.com/comic/rss"
sink.arguments = ["-x","-c","tee -a /tmp/smbc.ndjson"]
sink.command = "/bin/bash"
interval = "30s"
sink.type = "custom"

[feeds.dinosaur]
url = "https://qwantz.com/rssfeed.php"
sink.arguments = ["-x","-c","tee -a /tmp/qwantz.ndjson"]
sink.command = "/bin/bash"
interval = "1d"
sink.type = "custom"

Starting, with bash -x's output early on, and one poll interval passed:

# env RUST_LOG=trace ~/bin/rss-forwarder-linux-x86_64 feeds.toml
[2022-09-07T01:52:28Z TRACE mio::poll] registering event source with poller: token=Token(1), interests=READABLE | WRITABLE
[2022-09-07T01:52:28Z INFO  rss_forwarder] Start watcher for "smbc"
[2022-09-07T01:52:28Z TRACE mio::poll] registering event source with poller: token=Token(2), interests=READABLE | WRITABLE
[2022-09-07T01:52:28Z INFO  rss_forwarder] Start watcher for "dinosaur"
[2022-09-07T01:52:28Z DEBUG reqwest::connect] starting new connection: https://www.smbc-comics.com/
+ tee -a /tmp/smbc.ndjson
+ tee -a /tmp/qwantz.ndjson
[2022-09-07T01:52:28Z DEBUG reqwest::connect] starting new connection: https://qwantz.com/
[2022-09-07T01:52:28Z TRACE mio::poll] registering event source with poller: token=Token(3), interests=READABLE | WRITABLE
[2022-09-07T01:52:28Z TRACE mio::poll] registering event source with poller: token=Token(4), interests=READABLE | WRITABLE
[2022-09-07T01:52:28Z TRACE want] signal: Want
[2022-09-07T01:52:28Z TRACE want] signal found waiting giver, notifying
[2022-09-07T01:52:28Z TRACE want] poll_want: taker wants!
[2022-09-07T01:52:29Z TRACE want] signal: Want
[2022-09-07T01:52:29Z TRACE want] signal found waiting giver, notifying
[2022-09-07T01:52:29Z TRACE want] poll_want: taker wants!
[2022-09-07T01:52:29Z DEBUG reqwest::async_impl::client] response '200 OK' for https://www.smbc-comics.com/comic/rss
[2022-09-07T01:52:29Z TRACE want] signal: Want
[2022-09-07T01:52:29Z TRACE want] signal found waiting giver, notifying
[2022-09-07T01:52:29Z TRACE want] signal: Want
[2022-09-07T01:52:29Z TRACE want] poll_want: taker wants!
[2022-09-07T01:52:29Z TRACE want] signal: Want
[2022-09-07T01:52:29Z DEBUG reqwest::async_impl::client] response '200 OK' for https://qwantz.com/rssfeed.php
[2022-09-07T01:52:29Z TRACE want] signal: Want
[2022-09-07T01:52:29Z TRACE want] signal found waiting giver, notifying
[2022-09-07T01:52:29Z TRACE want] signal: Want
[2022-09-07T01:52:29Z TRACE want] poll_want: taker wants!
[2022-09-07T01:52:29Z TRACE want] signal: Want
[2022-09-07T01:52:30Z TRACE mio::poll] deregistering event source from poller
[2022-09-07T01:52:30Z TRACE want] signal: Closed
[2022-09-07T01:52:34Z TRACE mio::poll] deregistering event source from poller
[2022-09-07T01:52:34Z TRACE want] signal: Closed
[2022-09-07T01:52:58Z DEBUG reqwest::connect] starting new connection: https://www.smbc-comics.com/
[2022-09-07T01:52:58Z TRACE mio::poll] registering event source with poller: token=Token(16777220), interests=READABLE | WRITABLE
[2022-09-07T01:52:58Z TRACE want] signal: Want
[2022-09-07T01:52:58Z TRACE want] signal found waiting giver, notifying
[2022-09-07T01:52:58Z TRACE want] poll_want: taker wants!
[2022-09-07T01:52:59Z DEBUG reqwest::async_impl::client] response '200 OK' for https://www.smbc-comics.com/comic/rss
[2022-09-07T01:52:59Z TRACE want] signal: Want
[2022-09-07T01:52:59Z TRACE want] signal found waiting giver, notifying
[2022-09-07T01:52:59Z TRACE want] signal: Want
[2022-09-07T01:52:59Z TRACE want] poll_want: taker wants!
[2022-09-07T01:52:59Z TRACE want] signal: Want
[2022-09-07T01:53:00Z TRACE mio::poll] deregistering event source from poller
[2022-09-07T01:53:00Z TRACE want] signal: Closed
^C[2022-09-07T01:53:07Z TRACE mio::poll] deregistering event source from poller
[2022-09-07T01:53:07Z TRACE mio::poll] deregistering event source from poller
[2022-09-07T01:53:07Z INFO  rss_forwarder] Watcher for "dinosaur" has stopped
[2022-09-07T01:53:07Z INFO  rss_forwarder] Watcher for "smbc" has stopped

processlist doesn't show bash somehow:

# pstree -Aap (pgrep -f forwarder-linux )
rss-forwarder-l,24350 feeds.toml
  |-tee,24365 -a /tmp/smbc.ndjson
  |-tee,24366 -a /tmp/qwantz.ndjson
  |-{rss-forwarder-l},24352
  |-{rss-forwarder-l},24353
  |-{rss-forwarder-l},24354
  |-{rss-forwarder-l},24355
  |-{rss-forwarder-l},24356
  |-{rss-forwarder-l},24357
  |-{rss-forwarder-l},24358
  |-{rss-forwarder-l},24359
  |-{rss-forwarder-l},24360
  |-{rss-forwarder-l},24361
  |-{rss-forwarder-l},24362
  |-{rss-forwarder-l},24363
  |-{rss-forwarder-l},24364
  `-{rss-forwarder-l},24367

This will be my main usecase, but because http-based sinks also don't see data, hopefully we can diagnose that with fewer moving parts.

tokio-console seems to show 3 new tasks at every smbc polling, the custom sinks are not polled beyond initial start up
image

update: 0.6.0 alpha (i copied from your ghcr image) running on fly does push events, here's a pushing debug log, whose downstream is erroring on one:

2022-09-06T07:34:13.890 app[93f339f6] nrt [info] 2022-09-06T07:34:13.890347Z DEBUG reqwest::async_impl::client: response '200 OK' for https://qwantz.com/rssfeed.php
2022-09-06T07:34:13.949 app[93f339f6] nrt [info] 2022-09-06T07:34:13.949490Z DEBUG reqwest::async_impl::client: response '200 OK' for https://www.smbc-comics.com/comic/rss

2022-09-06T07:34:14.009 app[93f339f6] nrt [info] 2022-09-06T07:34:14.009534Z DEBUG rss_forwarder::watcher: pushing items from feed feed="Dinosaur Comics!" count=1
2022-09-06T07:34:14.011 app[93f339f6] nrt [info] Error: invalid type: string "<center><table width=740 border=0 cellspacing=5 cellpadding=5><tr><td colspan=4 align=\"center\"><tr><td colspan=4 align=\"center\"><a href=\"http://www.qwantz.com/archive.php\">archive</a> - <a href=\"mailto:ryan@qwantz.com?subject=if%20you%20don't%20have%20a%20phone%20then%20maybe%20get%20one.%20%20if%20you%20have%20two%20phones%20then%20knock%20that%20off%20already.%20%20okay%20bye%20again\">contact</a> - <a href=\"http://www.topatoco.com/qwantz\">sexy exciting merchandise</a> - <a href=\"http://www.ohnorobot.com/index.php?comic=23\">search</a> - <a href=\"http://www.qwantz.com/about.php\">about</a></td></tr></table><img src=\"http://www.qwantz.com/comics/comic2-4148.png\" class=\"comic\" title=\"if there's any downsides we can vote on our phones for someone to fix them. okay thanks, no further questions will be entertained\"><table width=740 border=0 cellspacing=5 cellpadding=5><tr><td width=100 align=\"left\" valign=\"top\"><a rel=\"prev\" href=\"http://www.qwantz.com/index.php?comic=3946\">&larr; previous</a></td><td width=* align=\"center\" valign=\"top\">September 5th, 2022</td><td width=100 align=\"right\" valign=\"top\">next</td></tr><tr><td colspan=3 align=\"left\" valign=\"top\"><P><b>September 5th, 2022: </b>Last night we used an underwater drone and almost got it stuck on a rock 20m underwater, which I believe is pretty much the standard underwater drone experience! 10/10 would almost ruin someone else's very expensive piece of technology again.<p align=\"right\">&ndash; Ryan</p></p></td></tr></table></center>", expected a borrowed string at line 1 column 1678

2022-09-06T07:34:14.075 app[93f339f6] nrt [info] 2022-09-06T07:34:14.075443Z DEBUG rss_forwarder::watcher: pushing items from feed feed="Saturday Morning Breakfast Cereal" count=1
2022-09-06T07:34:14.076 app[93f339f6] nrt [info] 2022-09-06T07:34:14.075976Z ERROR rss_forwarder: Watcher stopped with an error feed="smbc" error=Sink("broken stdin task") 

something is cursed about my fedora36 then.

(heh, i have never noticed custom mail subject on dinosaur comic!)

I close it. If the issue persists, let me know.