LogDevice client API reads slower than expected
zhengxiaochuan-3 opened this issue · 9 comments
I use BufferedWriter to write logs to LogDevice, one line (one record) per second, for 14 hours. Then I read the log back through the ldcat tool, and reading the total of 5MB of data takes 12 seconds.
Is this normal? What am I missing?
std::unique_ptr<facebook::logdevice::Reader> reader = client->createReader(1);
...
reader->startReading(log, facebook::logdevice::LSN_OLDEST, until_lsn);
...
ssize_t nread = reader->read(100, &data, &gap);
...
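(For reference, here is a filled-in sketch of the elided parts, with the buffer types per the public Reader.h; `client`, `log`, and `until_lsn` are assumed to be defined elsewhere in the program, and the loop-termination condition is my assumption:)

```cpp
#include "logdevice/include/Client.h"
#include "logdevice/include/Reader.h"

using facebook::logdevice::DataRecord;
using facebook::logdevice::GapRecord;

std::unique_ptr<facebook::logdevice::Reader> reader = client->createReader(1);
reader->startReading(log, facebook::logdevice::LSN_OLDEST, until_lsn);

std::vector<std::unique_ptr<DataRecord>> data;  // read() appends records here
GapRecord gap;                                  // filled when read() returns -1

while (reader->isReadingAny()) {  // stops once until_lsn has been delivered
  ssize_t nread = reader->read(100, &data, &gap);
  if (nread < 0) {
    continue;  // hit a gap in the LSN sequence; details are in `gap`
  }
  // process the `nread` records just appended to `data` ...
}
```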
Should I use multiple threads to read, e.g. one thread per hour-long range, in my client program?
Looks like the log has many small records (14*3600 = 50400), so it could be bottlenecked by moving the flow-control window between client and server (moving the window requires one RTT). One thing you can try is to increase the read buffer/window size, which can be done either by: 1) specifying buffer_size in Client::createReader(), or 2) setting client-read-buffer-size in the Client settings (the default is 512), e.g. ClientFactory().setSetting("client-read-buffer-size", "8192").create(...). Both are sketched below.
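Something like this (a sketch; config_path is a placeholder for your config, and the optional buffer_size argument to createReader() is per the public Client.h):

```cpp
// Option 2: raise the client-wide default at client creation time.
std::shared_ptr<facebook::logdevice::Client> client =
    facebook::logdevice::ClientFactory()
        .setSetting("client-read-buffer-size", "8192")
        .create(config_path);

// Option 1: override the buffer size for a single reader.
auto reader = client->createReader(/*max_logs=*/1, /*buffer_size=*/8192);
```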
Also, if you don't care about append/delivery latency, you can try batching more in BufferedWriter by setting a bigger time trigger; bigger batches may improve efficiency.
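For illustration, a bigger time trigger would look something like this (Options field names assumed from the public BufferedWriter.h; my_callback is a placeholder AppendCallback, and client/log/payload come from the surrounding program):

```cpp
#include "logdevice/include/BufferedWriter.h"
#include <chrono>

facebook::logdevice::BufferedWriter::Options opts;
opts.time_trigger = std::chrono::seconds(5);  // flush a batch at most every 5s
opts.size_trigger = 1024 * 1024;              // ...or sooner, once ~1MB is buffered

auto writer =
    facebook::logdevice::BufferedWriter::create(client, &my_callback, opts);
writer->append(log, std::move(payload), /*context=*/nullptr);
```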
My BufferedWriter time trigger is 1 second, and I do care about append/delivery latency.
I set facebook::logdevice::ClientFactory().setSetting("client-read-buffer-size", "8192").create(...), and the read still takes 12 seconds. @runemaster
With the time trigger set to 5 seconds, it takes 2 seconds to read all the records (50400).
I can only do this for now.
So it seems the issue is that you have a lot of small records, as @runemaster said earlier :)
But this is a compromise: with a worst-case latency of 5 seconds, users cannot get real-time logs.
@MohamedBassem
Ok, I need some debugging metrics from you in the old mode (1-second time trigger with client-read-buffer-size set to 8192) to be able to debug what's going on (a sketch for collecting these follows the list):
- How long does it take from the start of the binary until you reach the read loop?
- How long on average do you stay blocked in the read function call? (Basically, how long are you waiting for data from LogDevice?)
- How many gaps do you encounter?
- Would you mind also trying to call waitOnlyWhenNoData() on your reader before starting to read?
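A sketch of one way to gather those numbers, reusing the reader and loop from the original post (the loop-exit condition via isReadingAny() is my assumption):

```cpp
#include <chrono>
#include <cstdio>

// Instrumented read loop: measures time spent blocked in read() and counts
// gaps. waitOnlyWhenNoData() makes read() return as soon as any records are
// available, instead of waiting to fill the whole batch.
reader->waitOnlyWhenNoData();

size_t gap_count = 0;
std::chrono::microseconds blocked{0};
std::vector<std::unique_ptr<facebook::logdevice::DataRecord>> data;
facebook::logdevice::GapRecord gap;

while (reader->isReadingAny()) {  // false once until_lsn has been delivered
  auto t0 = std::chrono::steady_clock::now();
  ssize_t nread = reader->read(100, &data, &gap);
  blocked += std::chrono::duration_cast<std::chrono::microseconds>(
      std::chrono::steady_clock::now() - t0);
  if (nread < 0) {
    ++gap_count;  // gap covers LSN range [gap.lo, gap.hi]; kind is gap.type
  }
}
std::printf("blocked %lld us in read(), %zu gaps\n",
            static_cast<long long>(blocked.count()), gap_count);
```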
I added some timestamps and found that reader->read(100, &data, &gap) costs most of the time.
Changing the batch size from 100 to 1000 made no difference to the total read time.
Adding waitOnlyWhenNoData() made no difference either.
Hey, @zhengxiaochuan-3. Is this issue still affecting you?
If it is, it would be helpful to know your current settings, both for the specific log you're reading and for the cluster. It would also be great to know the types of gaps you are seeing (if any).
There is a chance that a slow node is increasing latency. In that case, you may see improvements if you call reader->forceNoSingleCopyDelivery(). Note, however, that this increases network utilization, since you will receive every replica of every record.
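A sketch of where that call goes, reusing the reader from the original post (calling it before startReading() is my assumption):

```cpp
// Ask every eligible storage node to ship its copy of each record, so a
// single slow node cannot stall the read stream; duplicate copies are
// filtered on the client, at the cost of extra network traffic.
reader->forceNoSingleCopyDelivery();
reader->startReading(log, facebook::logdevice::LSN_OLDEST, until_lsn);
```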