`apache_avro::Writer::flush` does not call `std::io::Write::flush` on the inner writer
Closed this issue · 2 comments
(Moving this over from AVRO-4063)
Issue Overview
The Rust documentation for `apache_avro::Writer::flush` describes the function as follows:
> Flush the content appended to a `Writer`. Call this function to make sure all the content has been written before releasing the `Writer`.
However, this function does not actually guarantee that all the content will be written out after the `flush()` call, because it does not call `std::io::Write::flush` on the inner writer.
This can be a problem when the inner writer uses its own buffer.
Example
fn main() {
    // Wrap the file in a BufWriter, which keeps its own in-memory buffer.
    let buffered_writer = std::io::BufWriter::new(std::fs::File::create("test.avro").unwrap());
    let schema = apache_avro::Schema::parse_str(
        r#"
        {
            "type": "record",
            "name": "example_schema",
            "fields": [
                {"name": "example_field", "type": "string"}
            ]
        }
        "#,
    )
    .unwrap();
    let mut writer = apache_avro::Writer::new(&schema, buffered_writer);
    let mut record = apache_avro::types::Record::new(writer.schema()).unwrap();
    record.put("example_field", "value");
    writer.append(record).unwrap();
    writer.flush().unwrap();
    // Read the file back while `writer` is still alive (not yet dropped).
    let test_file_contents = std::fs::read("test.avro").unwrap();
    assert_ne!(test_file_contents.len(), 0); // this will fail
}
In this example, the internal `BufWriter` has not yet flushed its own buffer by the time `writer.flush().unwrap()` returns. In fact, the buffer is only written out once `writer` is dropped.
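The buffering behaviour itself can be reproduced with the standard library alone. The following is a minimal sketch, assuming an in-memory `Vec<u8>` as the sink instead of a file; the helper name `bytes_visible_before_and_after_flush` is made up for illustration and is not part of `apache_avro`.

```rust
use std::io::{BufWriter, Write};

/// Hypothetical helper: writes `payload` through a `BufWriter` and reports how
/// many bytes have reached the inner sink before and after `flush()`.
/// Assumes `payload` is smaller than the `BufWriter`'s internal buffer.
fn bytes_visible_before_and_after_flush(payload: &[u8]) -> std::io::Result<(usize, usize)> {
    let mut writer = BufWriter::new(Vec::new());
    writer.write_all(payload)?;

    // The payload is still held in the BufWriter's buffer, not in the Vec.
    let before = writer.get_ref().len();

    // Only an explicit flush (or dropping the BufWriter) pushes it through.
    writer.flush()?;
    let after = writer.get_ref().len();

    Ok((before, after))
}
```

For a small payload such as `b"value"`, nothing reaches the inner `Vec` until `flush()` is called on the `BufWriter`, which is the step the report above finds missing.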
Solution
`std::io::Write::flush` should be called on the inner writer at the end of `apache_avro::Writer::flush`.
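The shape of the proposed change might look roughly like the sketch below. `BlockWriter` is a hypothetical stand-in written for this illustration, not the actual `apache_avro::Writer` implementation; it only assumes a writer that stages bytes in its own block buffer and owns an inner `std::io::Write` sink.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for an Avro-style writer (not the apache_avro type):
/// it stages appended bytes in `block` and owns an inner sink.
struct BlockWriter<W: Write> {
    inner: W,
    block: Vec<u8>,
}

impl<W: Write> BlockWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, block: Vec::new() }
    }

    /// Stage bytes in the writer's own buffer.
    fn append(&mut self, bytes: &[u8]) {
        self.block.extend_from_slice(bytes);
    }

    /// Write the staged block to the inner sink, then flush the inner sink.
    fn flush(&mut self) -> io::Result<usize> {
        let written = self.block.len();
        self.inner.write_all(&self.block)?;
        self.block.clear();
        // The step proposed in this issue: propagate the flush to the inner
        // writer so that its own buffer (e.g. a BufWriter's) is emptied too.
        self.inner.flush()?;
        Ok(written)
    }

    fn get_ref(&self) -> &W {
        &self.inner
    }
}

/// Hypothetical check: number of bytes that reached the innermost sink after
/// `BlockWriter::flush`, with a `BufWriter` sitting in between.
fn flushed_len(payload: &[u8]) -> io::Result<usize> {
    let mut writer = BlockWriter::new(BufWriter::new(Vec::new()));
    writer.append(payload);
    writer.flush()?;
    Ok(writer.get_ref().get_ref().len())
}
```

With the inner `flush()` call in place, the bytes reach the innermost sink as soon as the outer `flush()` returns, without waiting for the writer to be dropped.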
@martin-g Could you add me as an assignee to this issue? Thanks!
Only members of the Avro team can be assigned.
Your comment is enough!
Thank you!