an issue with the function implementation of mobi_buffer_get_varlen_internal in src/buffer.c
gkarchemsky opened this issue · 2 comments
When you create a MOBIBuffer
object:
typedef struct {
size_t offset; /**< Current offset in respect to buffer start */
size_t maxlen; /**< Length of the buffer data */
unsigned char *data; /**< Pointer to buffer data */
MOBI_RET error; /**< MOBI_SUCCESS = 0 if operation on buffer is successful, non-zero value on failure */
} MOBIBuffer;
the initial value of buf->offset
is 0:
MOBIBuffer * mobi_buffer_init_null(unsigned char *data, const size_t len) {
MOBIBuffer *buf = malloc(sizeof(MOBIBuffer));
if (buf == NULL) {
debug_print("%s", "Buffer allocation failed\n");
return NULL;
}
buf->data = data;
buf->offset = 0;
buf->maxlen = len;
buf->error = MOBI_SUCCESS;
return buf;
}
I think there is a problem calling mobi_buffer_get_varlen_internal
when direction
is -1
(read buffer backwards) with a value of buf->offset
that is 3.
If buf->offset
is 3, it should Reads maximum 4 bytes from the buffer. Stops when byte has bit 7 set.
so it should read byte number 3, byte number 2, byte number 1, and then byte number 0.
but when it comes to read byte number 0, we can see the following check at line 267:
if (buf->offset < 1) {
it checks if zero is less than 1 and it is, so an error is printed and only the last 3 bytes that have been read return and not the 4.
(even though according to pull request it should return 0)
if it needs to read byte number 0 - it should read it and then return without decrementing buf->offset
of 0 because if it does it, it will lead to an integer underflow and we will get the max value for size_t in buf->offset
, so I suggest checking if it is 0 after reading the byte to the value byte
and after updating the value of val
, and if buf->offset
is 0,
we should check byte_count
and according to that decide whether to execute
debug_print("%s", "End of buffer\n");
buf->error = MOBI_BUFFER_END;
return 0;
or to set byte
to stop_flag
so it will stop reading and return val
, while keeping buf->offset
at 0,
Thanks for this deep analysis.
Generally you are right, because the function doesn't work for all use cases.
On the other hand, the situation you describe should never happen, because the function is only used to read bytes in sections that are appended at the end of text record. In a valid file it should never go to offset 0.
The reason for the function to work like this is that, as you noticed, after reading the byte at offset zero the offset would underflow. But we cannot just leave the offset at position zero without decrementing it, as you propose. Imagine that after calling this function there is another call to any other function that reads the same buffer. That call would read the same byte again.
Maybe we could just let the offset underflow. The offset would point beyond the buffer. So any try to read the same buffer with any function would read 0 bytes and set error. The only problem is that someone could use mobi_buffer_seek
function on the buffer to advance the pointer leading to unexpected behavior.
This is a limitation of the implementation of the MOBIBuffer
, because it is always used in forward direction with the only exception of this function.
I don't see any perfect solution here. Maybe we should just update documentation of the mobi_buffer_get_varlen_dec
function to say that it will never read the byte at offset zero.
What do you think?
I had a closer look. For now, I just refactored the function a bit to make it more compatible with the other buffer functions. As you noticed it should return zero in case of error. But it also should not modify the buffer in case of error, so that the offset is not changed in that case. It should probably also fail in case of missing stop flag. The question how to treat a variable length integer at the buffer start in case of going backwards remains open.
static uint32_t mobi_buffer_get_varlen_internal(MOBIBuffer *buf, size_t *len, const int direction) {
bool has_stop = false;
uint32_t val = 0;
uint8_t byte_count = 0;
size_t max_count = direction == 1 ? buf->maxlen - buf->offset : buf->offset;
if (buf->offset < buf->maxlen && max_count) {
max_count = max_count < 4 ? max_count : 4;
uint8_t byte;
const uint8_t stop_flag = 0x80;
const uint8_t mask = 0x7f;
uint32_t shift = 0;
unsigned char *p = buf->data + buf->offset;
do {
if (direction == 1) {
byte = *p++;
val <<= 7;
val |= (byte & mask);
} else {
byte = *p--;
val = val | (uint32_t)(byte & mask) << shift;
shift += 7;
}
byte_count++;
has_stop = byte & stop_flag;
} while (!has_stop && (byte_count < max_count));
}
if (!has_stop) {
debug_print("%s", "End of buffer\n");
buf->error = MOBI_BUFFER_END;
return 0;
}
*len += byte_count;
buf->offset = direction == 1 ? buf->offset + byte_count : buf->offset - byte_count;
return val;
}