mycognosist/solar

Data being re-written into sled ad infinitum?

Opened this issue ยท 1 comments

I'm running the connection manager branch on the same machine as Patchwork. I launched like this:

RUST_LOG=debug cargo run -- -p 9009 -l true -s false

This doesn't actually make solar show up in Patchwork's "Possible connections" list, but I can initiate the connection from the other side:

ssb-server conn.connect 'net:localhost:9009~shs:/RapX3ta4AgiNwqk4aDnfj/mvY3Jb8HfR7Ixe03CdIE='

At this point, solar tries to retrieve the latest messages:

[2023-02-08T20:12:04Z INFO  solar::actors::peer] ๐Ÿ’ƒ received connection from peer @MRiJ+CvDnD9ZjqunY1oy6tsk0IdbMDC4Q3tTC8riS3s=.ed25519
[2023-02-08T20:12:05Z DEBUG solar::actors::rpc::history_stream] initializing history stream handler
[2023-02-08T20:12:05Z INFO  solar::actors::rpc::history_stream] requesting messages authored by peer @MRiJ+CvDnD9ZjqunY1oy6tsk0IdbMDC4Q3tTC8riS3s=.ed25519 after Some(7998)
[2023-02-08T20:12:11Z WARN  solar::actors::rpc::history_stream] received out-of-order msg from @MRiJ+CvDnD9ZjqunY1oy6tsk0IdbMDC4Q3tTC8riS3s=.ed25519; recv: 7998 db: 7998
[2023-02-08T20:13:06Z INFO  solar::actors::peer] ๐Ÿ‘‹ finished connection with MRiJ+CvDnD9ZjqunY1oy6tsk0IdbMDC4Q3tTC8riS3s=.ed25519

As we see, patchwork connects to solar, replication succeeds, albeit with an "out of order" message.
I think I saw a discussion about this behaviour on ssb lately; Indeed it seems that requesting everything "after" seq=7998 returns seq==7998 as well.

Now, this was the "happy path". Replication got logged as complete.
It seems that this only happens when there's no actual new data though. In comparison, if I...

  • generate any new messages on the replicated feed
  • launch solar
  • tell patchwork to connect to solar

solar will still launch the replication, but it never prints the last line "finished connection". Instead, I see an infinite (?) loop of this sort of output:

Click to expand
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 2621440 to 2642184
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] wrote lsns 2712141824-2712162567 to disk at offsets 2621440-2642183, maxed false complete_len 20744
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] mark_interval(2712141824, 20744)
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] new highest interval: 2712141824 - 2712162567
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 2642184-3142829
[2023-02-08T20:06:37Z DEBUG sled::pagecache::segment] segment accountant returning offset: 6291456 for lsn 2712666112 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] storing lsn 2712666112 in beginning of buffer
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] wrote lsns 2712162568-2712666111 to disk at offsets 2642184-3145727, maxed true complete_len 503544
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] mark_interval(2712162568, 503544)
[2023-02-08T20:06:37Z DEBUG sled::pagecache::iobuf] new highest interval: 2712162568 - 2712666111
[2023-02-08T20:06:37Z DEBUG sled::pagecache::segment] freeing segment 3145728
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 6291456 to 6314656
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] wrote lsns 2712666112-2712689311 to disk at offsets 6291456-6314655, maxed false complete_len 23200
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] mark_interval(2712666112, 23200)
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] new highest interval: 2712666112 - 2712689311
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 6314656-6814078
[2023-02-08T20:06:38Z DEBUG sled::pagecache::segment] segment accountant returning offset: 3145728 for lsn 2713190400 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] storing lsn 2713190400 in beginning of buffer
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] wrote lsns 2712689312-2713190399 to disk at offsets 6314656-6815743, maxed true complete_len 501088
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] mark_interval(2712689312, 501088)
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] new highest interval: 2712689312 - 2713190399
[2023-02-08T20:06:38Z DEBUG sled::pagecache::segment] freeing segment 3670016
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 3145728 to 3163851
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] wrote lsns 2713190400-2713208522 to disk at offsets 3145728-3163850, maxed false complete_len 18123
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] mark_interval(2713190400, 18123)
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] new highest interval: 2713190400 - 2713208522
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 3163851-3667754
[2023-02-08T20:06:38Z DEBUG sled::pagecache::segment] segment accountant returning offset: 3670016 for lsn 2713714688 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:38Z DEBUG sled::pagecache::iobuf] storing lsn 2713714688 in beginning of buffer
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] wrote lsns 2713208523-2713714687 to disk at offsets 3163851-3670015, maxed true complete_len 506165
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] mark_interval(2713208523, 506165)
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] new highest interval: 2713208523 - 2713714687
[2023-02-08T20:06:39Z DEBUG sled::pagecache::segment] freeing segment 4194304
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 3670016 to 3685118
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] wrote lsns 2713714688-2713729789 to disk at offsets 3670016-3685117, maxed false complete_len 15102
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] mark_interval(2713714688, 15102)
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] new highest interval: 2713714688 - 2713729789
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 3685118-4193048
[2023-02-08T20:06:39Z DEBUG sled::pagecache::segment] segment accountant returning offset: 4194304 for lsn 2714238976 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] storing lsn 2714238976 in beginning of buffer
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] wrote lsns 2713729790-2714238975 to disk at offsets 3685118-4194303, maxed true complete_len 509186
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] mark_interval(2713729790, 509186)
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] new highest interval: 2713729790 - 2714238975
[2023-02-08T20:06:39Z DEBUG sled::pagecache::segment] freeing segment 5767168
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 4194304 to 4207763
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] wrote lsns 2714238976-2714252434 to disk at offsets 4194304-4207762, maxed false complete_len 13459
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] mark_interval(2714238976, 13459)
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] new highest interval: 2714238976 - 2714252434
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 4207763-4715490
[2023-02-08T20:06:39Z DEBUG sled::pagecache::segment] segment accountant returning offset: 5767168 for lsn 2714763264 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] storing lsn 2714763264 in beginning of buffer
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] wrote lsns 2714252435-2714763263 to disk at offsets 4207763-4718591, maxed true complete_len 510829
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] mark_interval(2714252435, 510829)
[2023-02-08T20:06:39Z DEBUG sled::pagecache::iobuf] new highest interval: 2714252435 - 2714763263
[2023-02-08T20:06:40Z DEBUG sled::pagecache::segment] freeing segment 4718592
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 5767168 to 5782063
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] wrote lsns 2714763264-2714778158 to disk at offsets 5767168-5782062, maxed false complete_len 14895
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] mark_interval(2714763264, 14895)
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] new highest interval: 2714763264 - 2714778158
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 5782063-6285966
[2023-02-08T20:06:40Z DEBUG sled::pagecache::segment] segment accountant returning offset: 4718592 for lsn 2715287552 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] storing lsn 2715287552 in beginning of buffer
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] wrote lsns 2714778159-2715287551 to disk at offsets 5782063-6291455, maxed true complete_len 509393
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] mark_interval(2714778159, 509393)
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] new highest interval: 2714778159 - 2715287551
[2023-02-08T20:06:40Z DEBUG sled::pagecache::segment] freeing segment 5242880
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 4718592 to 4736715
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] wrote lsns 2715287552-2715305674 to disk at offsets 4718592-4736714, maxed false complete_len 18123
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] mark_interval(2715287552, 18123)
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] new highest interval: 2715287552 - 2715305674
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 4736715-5237360
[2023-02-08T20:06:40Z DEBUG sled::pagecache::segment] segment accountant returning offset: 5242880 for lsn 2715811840 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] storing lsn 2715811840 in beginning of buffer
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] wrote lsns 2715305675-2715811839 to disk at offsets 4736715-5242879, maxed true complete_len 506165
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] mark_interval(2715305675, 506165)
[2023-02-08T20:06:40Z DEBUG sled::pagecache::iobuf] new highest interval: 2715305675 - 2715811839
[2023-02-08T20:06:40Z DEBUG sled::pagecache::segment] freeing segment 2621440
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] advancing offset within the current segment from 5242880 to 5265267
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] wrote lsns 2715811840-2715834226 to disk at offsets 5242880-5265266, maxed false complete_len 22387
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] mark_interval(2715811840, 22387)
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] new highest interval: 2715811840 - 2715834226
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] rolling to new segment after clearing 5265267-5762597
[2023-02-08T20:06:41Z DEBUG sled::pagecache::segment] segment accountant returning offset: 2621440 for lsn 2716336128 on deck: {6815744, 7864320, 8388608, 14155776, 14680064, 15204352, 15728640, 16252928}
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] storing lsn 2716336128 in beginning of buffer
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] wrote lsns 2715834227-2716336127 to disk at offsets 5265267-5767167, maxed true complete_len 501901
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] mark_interval(2715834227, 501901)
[2023-02-08T20:06:41Z DEBUG sled::pagecache::iobuf] new highest interval: 2715834227 - 2716336127
[2023-02-08T20:06:41Z DEBUG sled::pagecache::segment] freeing segment 6291456

This tells me (someone who has never really used sled before) that data is being written. And indeed, htop shows me that solar is writing ~1MB/s to the local disk. BUT the size of ~/.local/share/solar remains constant.

So I guess there's a bug somewhere?

@black-puppydog

This doesn't actually make solar show up in Patchwork's "Possible connections" list

LAN discovery is not working because the nodes are listening on different ports (8008 for Patchwork and 9009 for solar, per your config). I usually test LAN discovery by running solar on a separate machine.

As we see, patchwork connects to solar, replication succeeds, albeit with an "out of order" message.
I think I saw a discussion about this behaviour on ssb lately; Indeed it seems that requesting everything "after" seq=7998 returns seq==7998 as well.

Yup, this is the behaviour I noticed and reported. It's a bit annoying but certainly not a major issue.

solar will still launch the replication, but it never prints the last line "finished connection". Instead, I see an infinite (?) loop of this sort of output:

I recently noticed that solar can get stuck in the peer replication loop, even if the connection has been closed by a peer. This is generally because an end-of-stream message has not been received. For example, clicking the 'Force disconnect' button in Patchwork sends a binary RPC request to solar which is not recognised; as such, the peer loop continues.

In the (very WIP) connection manager branch I introduced an idle timer as a means of breaking out of the peer loop after it has been idle for n consecutive seconds (the current default is 30 seconds). This approach works but I hope I can find a more elegant solution in the replication logic.

Regarding the ongoing write activity by sled, I am not sure what is going on there (also not a sled expert). Since those debug logs from sled are very noisy, you might want to run some trace logging to get a better idea of what's happening in solar:

RUST_LOG=connection-manager,peer-loop=trace