pacman82/odbc-api

Support for BLOB

Closed this issue · 8 comments

Hi. First of all, thank you for your great work with both odbc-safe and odbc-api.

After having used odbc-rs for a while (which has not seen an update in some time), seeing this new approach is quite refreshing. I especially like the fact that bind_parameters is no longer as lifetime-bound as in odbc-rs.

One thing that's preventing us from starting to experiment with odbc-api is the lack of support for BLOB types (Vec<u8>). I've read in the README that an expected feature is Support for streaming binary data, which is going to be very useful with some databases that require it.

However, will the support for BLOBs arrive with that feature, or is the implementation of InputParameter for Vec<u8> to be expected soon?

Thanks a lot!

Hello, and thank you. odbc-api's design very much benefits from the hindsight gained writing odbc-safe. I am happy to see that you want to give it a try.

I am not quite sure which features exactly are missing for you, as there are a lot of things you might want to do with binary data. The checkbox in the readme refers to support for streaming large values from the database, either storing them completely in memory or processing them chunk by chunk, similar to what you can see for character data in these two code examples:

However, will the support for BLOBs arrive with that feature, or is the implementation of InputParameter for Vec<u8> to be expected soon?

Your follow-up question leads me to believe that you are more interested in inserting binary data. PR #58 implements that, but I need to look over it again when I am less tired. While it does not implement InputParameter for Vec<u8>, it does implement IntoParameter for both &[u8] and Vec<u8>. The indirection is required because we need to indicate the length of the data to ODBC via an &isize rather than the usize that Vec<u8>::len offers.
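To illustrate that indirection: ODBC wants the parameter length behind a pointer to a signed integer (SQLLEN, an isize in odbc-sys), while Vec<u8>::len returns a usize. A minimal plain-Rust sketch of the conversion involved (no ODBC calls; length_indicator is a hypothetical helper for illustration, not part of odbc-api):

```rust
use std::convert::TryInto;

/// Hypothetical helper: convert a buffer length (usize) into the signed
/// length indicator ODBC expects. A binding must keep the resulting
/// isize at a stable address and hand its pointer to SQLBindParameter.
fn length_indicator(data: &[u8]) -> isize {
    data.len()
        .try_into()
        .expect("buffer too large for an ODBC length indicator")
}

fn main() {
    let blob: Vec<u8> = vec![0u8; 1024];
    let indicator = length_indicator(&blob);
    // ODBC would receive this as `&indicator as *const isize as *mut isize`.
    println!("{indicator}");
}
```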

The current version 0.22.0 actually already supports fetching and inserting binary data via ColumnarRowSet, as AnyColumnView has a binary variant (https://docs.rs/odbc-api/0.22.0/odbc_api/buffers/enum.AnyColumnView.html#variant.Binary). This is the fastest way to insert or query a lot of binary buffers, but it is admittedly very awkward if you just have a single buffer you want to use as a parameter.

Would you mind telling me a bit more about your usecase, so I can implement the features missing for you first?

Cheers, Markus

Hi!
First of all, thank you for your time writing that thorough answer!

The features I was talking about were:

  • Inserting and retrieving binary data from the database (plain old BLOBs).
  • Inserting BLOBs via data streaming.

Your follow-up question leads me to believe that you are more interested in inserting binary data. PR #58 implements that, but I need to look over it again when I am less tired

Yes, I'm especially interested in inserting binary data. Don't worry and don't rush it, it's a tricky enough behaviour that it deserves its time :)

The part about inserting BLOBs via data streaming is due to a behaviour of the Oracle database ODBC driver. It only supports "direct insert" of BLOBs up to 2-4 kB of data. If you want to insert something larger than that, you must either:

a) Use their proprietary library (rust bindings: https://github.com/kubo/rust-oracle)
b) Insert the blob by using a binary stream.
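Option (b) boils down to handing the driver fixed-size pieces of the buffer until it is exhausted, which is what SQLPutData enables. A sketch of just the batching arithmetic in plain Rust (no ODBC calls; batch_sizes is a hypothetical helper used only to show how a 12 kB blob splits under such a driver limit):

```rust
/// Hypothetical helper: split a blob of `total` bytes into the chunk
/// sizes a streaming insert would send, mirroring a SQLPutData loop
/// (every full batch first, then the remainder).
fn batch_sizes(total: usize, batch: usize) -> Vec<usize> {
    let mut sizes = Vec::new();
    let mut bytes_left = total;
    while bytes_left > batch {
        sizes.push(batch);
        bytes_left -= batch;
    }
    // Final, possibly shorter batch.
    sizes.push(bytes_left);
    sizes
}

fn main() {
    // A 12 kB blob streamed in 4 kB batches, e.g. to stay under the
    // Oracle driver's direct-insert limit.
    println!("{:?}", batch_sizes(12_000, 4_096));
}
```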

Soon I'll try to make a first change of our persistence layer! Thanks for your support :)

Released odbc-api 0.22.1. It adds support for binary types. You can fetch arbitrarily large binary data with https://docs.rs/odbc-api/0.22.1/odbc_api/struct.CursorRow.html#method.get_binary, or with repeated calls to .get_data if you do not want to keep it in memory all at once. Vec<u8> now also supports into_parameter. Inserting arbitrarily large binary data is not yet supported, however. I updated the readme to reflect that fact.

Thanks a lot for your work and for the update!

There is no safe abstraction yet, but you may find this test interesting:

#[test_case(MSSQL; "Microsoft SQL Server")]
#[test_case(MARIADB; "Maria DB")]
#[test_case(SQLITE_3; "SQLite 3")]
fn insert_text_blob_in_stream(profile: &Profile) {
    let table_name = "InsertLargeTextInStream";
    let conn = ENV
        .connect_with_connection_string(profile.connection_string)
        .unwrap();
    setup_empty_table(&conn, profile.index_type, table_name, &["text"]).unwrap();

    let insert = format!("INSERT INTO {} (a) VALUES (?)", table_name);
    let insert = U16String::from_str(&insert);

    let mut statement = conn.preallocate().unwrap().into_statement();
    let text_size = 12000;

    let batch = "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\
      abcdefghijklmnopqrstuvwxyz";

    unsafe {
        let indicator = sys::len_data_at_exec(text_size.try_into().unwrap());
        let ret = sys::SQLBindParameter(
            statement.as_sys(),
            1,
            sys::ParamType::Input,
            sys::CDataType::Char,
            sys::SqlDataType::EXT_LONG_VARCHAR,
            text_size,
            0,
            1 as sys::Pointer,
            0,
            &indicator as *const isize as *mut isize,
        );
        assert_eq!(sys::SqlReturn::SUCCESS, ret);

        let ret = sys::SQLExecDirectW(
            statement.as_sys(),
            insert.as_ptr(),
            insert.len().try_into().unwrap(),
        );
        assert_eq!(sys::SqlReturn::NEED_DATA, ret);

        let need_data = statement.param_data().unwrap();
        assert_eq!(Some(1 as Pointer), need_data);

        let mut bytes_left = text_size;
        while bytes_left > batch.len() {
            let ret = SQLPutData(
                statement.as_sys(),
                batch.as_ptr() as Pointer,
                batch.len().try_into().unwrap(),
            );
            assert_eq!(sys::SqlReturn::SUCCESS, ret);
            bytes_left -= batch.len();
        }
        // Put final batch
        let ret = SQLPutData(
            statement.as_sys(),
            batch.as_ptr() as Pointer,
            bytes_left.try_into().unwrap(),
        );
        assert_eq!(sys::SqlReturn::SUCCESS, ret);

        let need_data = statement.param_data().unwrap();
        assert_eq!(None, need_data);
    }
}

That's definitely a great start!

You are doing awesome work, thanks a lot 🎉

The release 0.24.0 ships with a safe abstraction for sending large BLOBs. You can find some code examples here: https://docs.rs/odbc-api/0.24.0/odbc_api/parameter/index.html#sending-long-data

Please let me know if this solves your use case.

Cheers, Markus