duckdb/duckdb-rs

Multiple connections & threads from a single process

Closed this issue · 5 comments

I tried wrapping Connection in an Arc in order to use it across multiple threads, only to run into an error: InnerConnection is wrapped in a RefCell.

Per the DuckDB docs, multiple reads and writes should be possible from a single process. The Python library allows that, and I was wondering if there's a way to achieve this here.

My current implementation creates a database with Connection::open("file_name") and then calls this same function within each thread. It didn't work: I kept getting an error saying the file was being used by an existing process.

Is there something I'm missing? My current fallback is to use an mpsc channel, where the consumer holds a connection to the database and performs the inserts sent by the producers running in multiple threads.
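A minimal sketch of that channel-based fallback, using only the standard library. The `Row` type, the function name, and the `Vec` that stands in for the database connection are all hypothetical; a real consumer would hold a duckdb Connection and run an INSERT for each received row.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical row type; a real program would send whatever its INSERT needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub producer: usize,
    pub value: u64,
}

/// Spawns `producers` threads that each send `rows_each` rows to one consumer
/// thread. Only the consumer touches the "database" (a Vec standing in for a
/// connection), so nothing has to be shared between threads.
pub fn run_single_writer(producers: usize, rows_each: u64) -> Vec<Row> {
    let (tx, rx) = mpsc::channel::<Row>();

    // Consumer: owns the storage and performs every insert.
    let consumer = thread::spawn(move || {
        let mut table = Vec::new(); // stand-in for the connection
        for row in rx {
            table.push(row); // a real consumer would execute an INSERT here
        }
        table
    });

    // Producers: each gets its own clone of the sending half.
    let handles: Vec<_> = (0..producers)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                for value in 0..rows_each {
                    tx.send(Row { producer: id, value })
                        .expect("consumer hung up");
                }
            })
        })
        .collect();

    // Drop the original sender so the consumer loop ends once all producers finish.
    drop(tx);
    for handle in handles {
        handle.join().expect("producer panicked");
    }
    consumer.join().expect("consumer panicked")
}
```

Calling `run_single_writer(4, 25)` returns the rows the consumer collected. The trade-off is that every write is serialised through one thread.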

Thank you for the great work.

Unfortunately, the Rust client API does not separate the Connection and Database types, so this is not straightforward. Instead, you could put a connection in an Arc<Mutex<...>> and call try_clone() to make a new connection within each thread. Alternatively, you could use a pool of connections (for example, deadpool) and populate the pool by calling try_clone() on the initial connection.
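The first suggestion could look roughly like the sketch below. It uses only the standard library: `FakeConn` is a hypothetical stand-in for duckdb's Connection, written so that every clone shares one underlying store. Its `insert` and `row_count` methods and its error type are made up for illustration; only the name `try_clone` comes from the real API.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for duckdb's `Connection`. Every clone points at the
/// same shared "database" (a Vec here).
pub struct FakeConn {
    db: Arc<Mutex<Vec<u64>>>,
}

impl FakeConn {
    pub fn open_in_memory() -> Self {
        FakeConn { db: Arc::new(Mutex::new(Vec::new())) }
    }

    /// Mirrors the shape of `Connection::try_clone`; the error type is made up.
    pub fn try_clone(&self) -> Result<Self, String> {
        Ok(FakeConn { db: Arc::clone(&self.db) })
    }

    pub fn insert(&self, value: u64) {
        self.db.lock().expect("db lock poisoned").push(value);
    }

    pub fn row_count(&self) -> usize {
        self.db.lock().expect("db lock poisoned").len()
    }
}

/// Shares one root connection behind Arc<Mutex<..>>; each thread clones its
/// own connection from it and then works without holding the outer lock.
pub fn insert_from_threads(threads: u64) -> usize {
    let root = Arc::new(Mutex::new(FakeConn::open_in_memory()));

    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let root = Arc::clone(&root);
            thread::spawn(move || {
                // Hold the outer lock only long enough to clone a connection.
                let conn = root
                    .lock()
                    .expect("root lock poisoned")
                    .try_clone()
                    .expect("try_clone failed");
                conn.insert(i); // work happens on the thread's own connection
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker panicked");
    }

    // Bind before returning so the lock guard is dropped before `root`.
    let total = root.lock().expect("root lock poisoned").row_count();
    total
}
```

The outer mutex is there only so threads can reach the root connection to clone it. Once a thread has its own connection, it no longer contends on that lock.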

I will do a POC per your pointers and share my findings. Thank you.

dvic commented

@dowusu care to share your findings? What worked best for you in the end?

I ended up with something like the code below. By the way, try_clone works as advertised, and I replaced my channel implementation, which serialised access to duckdb on a single thread, with multiple connections.

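use duckdb::{Connection, Result};

// PrestoSettings is my own config type; info! comes from a logging crate.
pub struct MemCache {
    conn: Connection,
    conf: PrestoSettings,
}

impl MemCache {
    pub fn new(conf: PrestoSettings) -> Result<Self> {
        let conn = Connection::open_in_memory()?;
        info!("Creating an in-memory database.");

        Ok(Self { conn, conf })
    }

    // Hand out a new connection to the same in-memory database.
    pub fn get_connection(&self) -> Result<Connection> {
        self.conn.try_clone()
    }
}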

Usage was along these lines:

let db = MemCache::new(config).expect("failed to create the in-memory database");

for _ in 0..10 {
    // Clone on the spawning thread, then move the clone into the worker.
    let conn = db.get_connection().expect("Handle the error, I've not had a single failure");
    thread::spawn(move || {
        // you can work with the connection object/struct
    });
}

I hope this helps. Thank you.

dvic commented

Nice! Thanks for the reply!