ClickHouse/clickhouse-cpp

"Unexpected packet Data received from client" on good data after trying to insert bad data

Akvinikym opened this issue · 3 comments

Hello.

I have encountered what I believe isn't a good behaviour from the library. As you can see on the minimum example below, firstly I've tried to insert String to the UInt64 column, and I get an exception with text: DB::Exception: Cannot convert: String to UInt64. That's good.

However, when I try to insert correct data as a second step, I also get an exception saying: Unexpected packet Data received from client. Do I need to reset the client somehow after an exception happened? I would expect to successfully insert the correct data even after the first failure. Example:

int main() {
	clickhouse::Client client(clickhouse::ClientOptions().SetHost("localhost").SetPassword("clickhouse"));
	client.Execute("CREATE DATABASE IF NOT EXISTS test");
	client.Execute("CREATE TABLE IF NOT EXISTS test.numbers (id UInt64) ENGINE = Memory");

	clickhouse::Block block_bad;
	auto id_column_bad = std::make_shared<clickhouse::ColumnString>();
	id_column_bad->Append("Foo");
	block_bad.AppendColumn("id", id_column_bad);
	try
	{
		client.Insert("test.numbers", block_bad);
	}
	catch (const std::exception& e)
	{
		// says "arrived at 1: DB::Exception: Cannot convert: String to UInt64"
		printf("arrived at 1: %s\n", e.what());
	}

	clickhouse::Block block_good;
	auto id_column_good = std::make_shared<clickhouse::ColumnUInt64>();
	id_column_good->Append(1);
	block_good.AppendColumn("id", id_column_good);
	try
	{
		client.Insert("test.numbers", block_good);
	}
	catch (const std::exception& e)
	{
		// says "arrived at 2: DB::Exception: Unexpected packet Data received from client"
		printf("arrived at 2: %s\n", e.what());
	}
}

Initially I have discovered this issue in my code, but it was slightly different: the second attempt to insert the same incorrect data (I have a retrial mechanism in the code to mitigate possible network issues) led me to an infinite wait in client.Insert(..) function. After debugging a little I have found out that the library does not actually send a request to the server in that case: BufferedOutput::DoFlush function skips the if body completely. After that it tries to read the response from the server, but it never comes because there was no request.

Unfortunately, I could not reproduce the original issue with a minumum example, but I believe it can have something in common with the two exceptions case I've described in the beginning.

Enmk commented

Hi @Akvinikym, thank you for reporting.

Imagine the client was sending some data packets and got an error in the middle of it. So some data, like headers (that tell server to read next X bytes as payload) were already sent and interpreted by server, some data is in the userspace buffer, some in kernel internal TCP buffers.

Even if we somehow flush everything and send it to the server (or if we discard all of it) there is still not enough data on the receiving end. And anything sent over the same connection is going to be treated as payload (and potentially misinterpreted).

Furthermore, there is no way to tell server 'drop processing current stream of data and handle next request' in the CH native protocol.

SO in my opinion, it is much safer to just call a client->ResetConnection() before performing the next attempt.
i.e. there is no way to fix this behavior other than calling ResetConnection internally on any exception, but IMO this would bring more harm than good.

Hi @Enmk!

I see the issue. Of course, it's not a problem to call ResetConnection() after a failed attempt to send data, but I had to debug the library to figure it out :) Is it maybe possible to leave some comment in the documentation or the usage example, just for the future generations?

Enmk commented

Updated readme