Transaction::lock_keys should wait instead of return a `Failed to resolve lock` Error.
blacktear23 opened this issue · 11 comments
Transaction::lock_keys should wait instead of return a Failed to resolve lock
Error. As other language's standard library lock
call will block current thread or coroutine. So Transaction::lock_keys should block until got the lock. And we can provide a try_lock_keys
function that return immediately.
I think it's by design. By default the resolve lock process will retry for several times, but not retry forever.
Your requested behavior can be achieved by providing a custom RetryOptions
via TransactionOptions::retry_options
I have wrote a test code like below:
async fn tikv_transaction1(txn: Transaction) {
let mut txn = txn;
println!("txn1 start Lock");
match txn.lock_keys(vec!["tk2".to_owned()]).await {
Ok(_) => {}
Err(err) => {
println!("txn2 lock key error: {}", err.to_string());
txn.rollback().await.unwrap();
return;
}
};
println!("txn1 start put");
match txn.put("tk2".to_owned(), "tv1".to_owned()).await {
Ok(_) => {}
Err(err) => {
println!("txn1 {:}", err.to_string());
txn.rollback().await.unwrap();
return;
}
}
sleep(5000).await;
txn.commit().await.unwrap();
println!("txn1 Commit");
}
async fn tikv_transaction2(txn: Transaction) {
let mut txn = txn;
println!("txn2 start Lock");
match txn.lock_keys(vec!["tk2".to_owned()]).await{
Ok(_) => {}
Err(err) => {
println!("txn2 lock key error: {}", err.to_string());
txn.rollback().await.unwrap();
return;
}
}
println!("txn2 start put");
match txn.put("tk2".to_owned(), "tv2".to_owned()).await {
Ok(_) => {}
Err(err) => {
println!("txn2 {:}", err.to_string());
txn.rollback().await.unwrap();
return;
}
}
sleep(1000).await;
txn.commit().await.unwrap();
println!("txn2 Commit");
}
pub fn get_transaction_option() -> TransactionOptions {
let opts = TransactionOptions::new_pessimistic();
let mut retry_opts = RetryOptions::default_pessimistic();
retry_opts.lock_backoff = Backoff::full_jitter_backoff(2, 500, 1000);
retry_opts.region_backoff = Backoff::full_jitter_backoff(2, 500, 1000);
opts.drop_check(CheckLevel::Warn)
.use_async_commit()
.try_one_pc()
.retry_options(retry_opts)
}
#[tokio::main]
async fn main() {
let runtime = Runtime::new().unwrap();
let pd_addrs = vec![String::from("127.0.0.1:2379")];
let conn = TransactionClient::new(pd_addrs.clone()).await.unwrap();
let conn2 = TransactionClient::new(pd_addrs.clone()).await.unwrap();
let txn1 = conn.begin_with_options(get_transaction_option()).await.unwrap();
runtime.spawn(async move {
// tikv_test().await;
tikv_transaction1(txn1).await;
});
sleep(1000).await;
let txn2 = conn2.begin_with_options(get_transaction_option()).await.unwrap();
runtime.spawn(async move {
tikv_transaction2(txn2).await;
});
sleep(10000).await;
println!("Finished");
runtime.shutdown_background();
}
As expected txn2's lock_keys
should be wait until txn1 commit. But the execute result is:
txn1 start Lock
txn1 start put
txn2 start Lock
txn2 start put
txn2 Commit
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: KeyError(KeyError { locked: None, retryable: "", abort: "Error(Txn(Error(Mvcc(Error(PessimisticLockNotFound { start_ts: TimeStamp(430570303956975617), key: [116, 107, 50] })))))", conflict: None, already_exist: None, deadlock: None, commit_ts_expired: None, txn_not_found: None, commit_ts_too_large: None })', src/main.rs:37:24
stack backtrace:
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: core::result::unwrap_failed
3: core::result::Result<T,E>::unwrap
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/core/src/result.rs:1299:23
4: tikvclienttest::tikv_transaction1::{{closure}}
at /Users/rain/work/tikvclienttest/src/main.rs:37:5
5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/core/src/future/mod.rs:80:19
6: tikvclienttest::main::{{closure}}::{{closure}}
at /Users/rain/work/tikvclienttest/src/main.rs:86:9
7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/core/src/future/mod.rs:80:19
8: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/core.rs:161:17
9: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/loom/std/unsafe_cell.rs:14:9
10: tokio::runtime::task::core::CoreStage<T>::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/core.rs:151:13
11: tokio::runtime::task::harness::poll_future::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:461:19
12: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/core/src/panic/unwind_safe.rs:271:9
13: std::panicking::try::do_call
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panicking.rs:403:40
14: <unknown>
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panicking.rs:431:6
15: std::panicking::try
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panicking.rs:367:19
16: std::panic::catch_unwind
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panic.rs:129:14
17: tokio::runtime::task::harness::poll_future
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:449:18
18: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:98:27
19: tokio::runtime::task::harness::Harness<T,S>::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:53:15
20: tokio::runtime::task::raw::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/raw.rs:113:5
21: tokio::runtime::task::raw::RawTask::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/raw.rs:70:18
22: tokio::runtime::task::LocalNotified<S>::run
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/mod.rs:343:9
23: tokio::runtime::thread_pool::worker::Context::run_task::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/thread_pool/worker.rs:420:13
24: tokio::coop::with_budget::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/coop.rs:102:9
25: std::thread::local::LocalKey<T>::try_with
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/thread/local.rs:399:16
26: std::thread::local::LocalKey<T>::with
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/thread/local.rs:375:9
27: tokio::coop::with_budget
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/coop.rs:95:5
28: tokio::coop::budget
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/coop.rs:72:5
29: tokio::runtime::thread_pool::worker::Context::run_task
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/thread_pool/worker.rs:419:9
30: tokio::runtime::thread_pool::worker::Context::run
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/thread_pool/worker.rs:386:24
31: tokio::runtime::thread_pool::worker::run::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/thread_pool/worker.rs:371:17
32: tokio::macros::scoped_tls::ScopedKey<T>::set
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/macros/scoped_tls.rs:61:9
33: tokio::runtime::thread_pool::worker::run
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/thread_pool/worker.rs:368:5
34: tokio::runtime::thread_pool::worker::Launch::launch::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/thread_pool/worker.rs:347:45
35: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/blocking/task.rs:42:21
36: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/core.rs:161:17
37: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/loom/std/unsafe_cell.rs:14:9
38: tokio::runtime::task::core::CoreStage<T>::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/core.rs:151:13
39: tokio::runtime::task::harness::poll_future::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:461:19
40: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/core/src/panic/unwind_safe.rs:271:9
41: std::panicking::try::do_call
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panicking.rs:403:40
42: <unknown>
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panicking.rs:438:2
43: std::panicking::try
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panicking.rs:367:19
44: std::panic::catch_unwind
at /private/tmp/rust-20211101-73771-1w3cv5q/rustc-1.56.1-src/library/std/src/panic.rs:129:14
45: tokio::runtime::task::harness::poll_future
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:449:18
46: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:98:27
47: tokio::runtime::task::harness::Harness<T,S>::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/harness.rs:53:15
48: tokio::runtime::task::raw::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/raw.rs:113:5
49: tokio::runtime::task::raw::RawTask::poll
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/raw.rs:70:18
50: tokio::runtime::task::UnownedTask<S>::run
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/task/mod.rs:379:9
51: tokio::runtime::blocking::pool::Inner::run
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/blocking/pool.rs:264:17
52: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
at /Users/rain/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/blocking/pool.rs:244:17
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Finished
As we can see txn1's commit is failed.
This is unexpected, thanks for finding the issue!
The problem is that the pessimistic lock requests use the default TTL (3s, which is for prewrites) instead of the MAX_TTL(20s) so that the heartbeat mechanism isn't able to keep the locks alive.
@ekexium I test your PR in my test project, I found if I change the txn1's sleep time to 30 seconds, txn2 still can get the lock.
As we expected, txn2 should not get the lock before txn1 commit. txn2 lock_keys return an Error is acceptable.
@blacktear23 Did you use the 0.1 version or master? 0.1 doesn't include the heartbeat feature (its support for pessimistic transactions is incomplete).
your pr branch #329
your pr branch #329
emmm I can't reproduce it. Did you just change the sleep time from 5 secs to 30?
Btw what sleep
are you using?
yes just change 5sec to 30sec. The sleep function is using tokio::time::sleep
@ekexium https://github.com/blacktear23/tikv-client-test/blob/master/src/main.rs this is my test code repo.
If you build it with debug and execute the binary, you can get the result below:
txn1 start Lock
Jan 19 20:26:55.217 INFO connect to tikv endpoint: "127.0.0.1:20160"
txn1 start put
txn2 start Lock
Jan 19 20:26:56.223 INFO connect to tikv endpoint: "127.0.0.1:20160"
txn2 start put
txn2 Commit
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: MultipleKeyErrors([KeyError(KeyError { locked: None, retryable: "", abort: "Error(Txn(Error(Mvcc(Error(PessimisticLockNotFound { start_ts: TimeStamp(430596480093650945), key: [116, 107, 50] })))))", conflict: None, already_exist: None, deadlock: None, commit_ts_expired: None, txn_not_found: None, commit_ts_too_large: None })])', src/main.rs:37:24
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Finished
@blacktear23 I see. The new TTL is not updated correctly. I've fixed that in the same PR. Please check if it works.
@ekexium after update your branch it works!