spacejam/sled

Hi I wanted to know if this project is dead?

Raj2032 opened this issue · 13 comments

Hi I wanted to know if this project is dead?

Yeah, I wanna know too

I don't have an official answer, but the README mentions:

A full rewrite of sled's storage subsystem is happening on a modular basis as part of the komora project, in particular the marble storage engine. This will dramatically lower both the disk space usage (space amplification) and garbage collection overhead (write amplification) of sled.

It seems like sled's author, spacejam, is still actively contributing to various projects, including the komora repos, which AFAICT will become part of sled when ready.

You might also find this relevant: #1417 (comment)

I've been working on a rewrite of smaller components in the komora github org, which will be released as part of sled's new storage engine, solving the current space issues. It's much easier for me to work on it in a quiet way off to the side like that, but it is coming together nicely.

@spacejam is there a writeup on what features the new sled will have? I am mostly wondering if multi-threaded support will be improved?

vi commented

Will the renewed Sled have the same license (and approach towards distribution) as current one?

Or will there be any scheme like a perpetual beta "Community edition" plus production-ready "Enterprise edition"?

@vi generally I see sled remaining MIT/Apache-2.0. There are a variety of non-open systems that build on top of sled in interesting ways, some of which I've built myself, but I don't plan to monetize sled directly. sled's openness is good for the reliability of the core.

@physics515 could you elaborate on which multi-threaded aspects could help your use cases if they were changed? In general, I see sled's multi-threaded aspects as one of its strong points, given the high-throughput lock-free indexing, etc., and the next version is more about operational metric improvements around space and efficiency than new features. That said, there are some aspects of transactions, merge operators, and subscription that I feel could be unified into a single, higher-consistency concept, so it's possible that the pre-release will not have transactions/merge operators/subscription (at least at first, while the replacements are experimented with), but for the stable release the replacements / improvements for these 3 features should be in place. Async is likely to see more attention, too.

@spacejam maybe I misspoke. I meant more along the lines of being able to have multiple instances. I have run into a few use cases where it would have been handy to access the same "data" from multiple programs simultaneously.

As I understand it, this is not currently possible, but I could be wrong.

@physics515 that makes sense! Multi-process is definitely one of the major draws for embedded DBs like LMDB or SQLite, and I see what they enable as being very valuable in certain use cases, but they require a very high ratio of syscalls to write operations to ensure consistency, which severely limits write throughput. RocksDB also supports a form of multi-process access, but it is much more restrictive. Multi-process has been elusive to me, because I feel like it always involves some sharp performance and complexity trade-offs that haven't felt right for sled. When sled supported a read-only mode, it was a constant source of bugs, so I decided to remove it to avoid causing data loss for users by exposing them to bug densities that I didn't feel comfortable about.

The thought that has kept my attention over the years is to include a server binary with sled that can be connected to from the sled library instead of opening a file, and support a more restrictive set of features, but once sockets start to become involved, users start to have far more specific requirements that are difficult to meet well for 80%+ of users. Operations that involve multiple communications with a server start to become interesting from a distributed systems perspective, which entail even sharper performance and complexity trade-offs.

So, I tend to suggest that people who want multi-process support achieve it by using sockets in a way that meets their own goals. And it's possible that I'll provide a network access primitive for assisting users with this over time. But I don't see it as something that I'm likely to release in the short term.
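
For readers wondering what "using sockets" might look like in practice, here is a minimal sketch, not anything sled ships. It assumes a made-up line protocol (`SET <key> <value>` / `GET <key>`), and it uses an in-memory `BTreeMap` behind a mutex as a hypothetical stand-in for the database handle so that the example needs no external crates. In a real wrapper, one process would own the sled database and the other processes would talk to it over the socket. The names `Store`, `handle_line`, `serve_client`, and `serve` are all invented for illustration.

```rust
use std::collections::BTreeMap;
use std::io::{BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the database handle owned by the server process.
// A real wrapper would hold the sled database here instead of a BTreeMap.
type Store = Arc<Mutex<BTreeMap<String, String>>>;

/// Applies one request line of the made-up protocol and returns the reply.
/// Supported requests: `SET <key> <value>` and `GET <key>`.
fn handle_line(store: &Store, line: &str) -> String {
    let mut parts = line.trim().splitn(3, ' ');
    match (parts.next(), parts.next(), parts.next()) {
        (Some("SET"), Some(key), Some(value)) => {
            store
                .lock()
                .unwrap()
                .insert(key.to_string(), value.to_string());
            "OK".to_string()
        }
        (Some("GET"), Some(key), None) => {
            // Clone the value so the lock is released before building the reply.
            let found = store.lock().unwrap().get(key).cloned();
            match found {
                Some(value) => format!("VALUE {}", value),
                None => "NOT_FOUND".to_string(),
            }
        }
        _ => "ERR".to_string(),
    }
}

/// Serves one client: reads request lines and writes one reply line per request.
fn serve_client(store: Store, stream: TcpStream) -> std::io::Result<()> {
    let mut writer = stream.try_clone()?;
    for line in BufReader::new(stream).lines() {
        let reply = handle_line(&store, &line?);
        writeln!(writer, "{}", reply)?;
    }
    Ok(())
}

/// Accepts connections in a loop, one thread per client, all sharing one store.
fn serve(listener: TcpListener, store: Store) {
    for stream in listener.incoming() {
        if let Ok(stream) = stream {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                // Per-connection I/O errors are ignored in this sketch.
                let _ = serve_client(store, stream);
            });
        }
    }
}
```

A server process would bind a `TcpListener` (for example on `127.0.0.1:0`), create the store, and call `serve`; any other local process can then connect and exchange lines. Note that each request here is a single round trip. As soon as an operation needs several round trips, the consistency trade-offs mentioned above start to apply.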

@spacejam that has been exactly my solution a few times with pet projects. I have implemented lightweight Axum wrappers around sled and have had a bit of success. Maybe when I get around to formalizing something I'll publish a separate crate.

@spacejam Is there any expected date for the new rewritten version of sled?