sheharyarn/memento

Data is not always flushed to disk

sheharyarn opened this issue · 6 comments

Original post by @cpilka, moved from #3:


I have a weird issue with disc persistence. Data is flushed to disc only once a certain number of records (a few hundred) has been written; single inserts are not written to disc. I wonder if there's a :disc_only_copies option in Memento that would write all data straight to the Mnesia files.

This one doesn't write to disc:

Memento.transaction! fn ->
  Memento.Query.write(record)
end

This one does:

Memento.transaction! fn ->
  for _ <- 1..100 do
    Memento.Query.write(record)
  end
end

Is there any setting that tells Mnesia to always flush the in-memory data to disc, instead of flushing only once a certain buffer size is reached? Or whatever else causes this issue ...

Original post by @cpilka, moved from #3:


From https://learnyousomeerlang.com/mnesia#from-record-to-table what I read is:

disc_copies
This option means that the data is stored both in ETS and on disk, so both memory and the hard disk. disc_copies tables are not limited by DETS limits, as Mnesia uses a complex system of transaction logs and checkpoints that allow to create a disk-based backup of the table in memory.

To tell Mnesia to write data straight to disk, :disc_only_copies exists, but it is not supported in Memento, right?

Memento.Table.create!(YourTable, disc_only_copies: nodes)

returns:

Mnesia Error: {:bad_type, YourTable, {:not_supported, :ordered_set, :disc_only_copies}}

Original post by @cpilka, moved from #3:


OK, the issue is actually documented at http://erlang.org/doc/man/mnesia.html: "Notice that currently ordered_set is not supported for disc_only_copies". The :bad_type error above was caused by combining ordered_set in the table definition with :disc_only_copies.
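A minimal sketch of a combination that should avoid the :bad_type error, assuming a hypothetical table module `MyApp.Event` with illustrative attributes and a single local node. The only change that matters is `type: :set` in place of `:ordered_set`; the schema setup follows the steps in the Memento README and has not been checked against a specific Memento version.

```elixir
defmodule MyApp.Event do
  # Hypothetical table; the name and attributes are illustrative only.
  # :set is used instead of :ordered_set, since Mnesia rejects
  # :ordered_set together with :disc_only_copies.
  use Memento.Table,
    attributes: [:id, :payload],
    type: :set
end

nodes = [node()]

# Disc-backed tables need an on-disk schema, which has to be
# created while Mnesia is stopped.
Memento.stop()
Memento.Schema.create(nodes)
Memento.start()

Memento.Table.create!(MyApp.Event, disc_only_copies: nodes)
```

With `:disc_only_copies` the table lives in DETS only, so reads also go to disk; that is the trade-off for skipping the in-memory copy.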

@cpilka That's interesting, I didn't know disc_only_copies did not support ordered_set tables. I'll make sure that it gets documented in the docs for create/2.

As for flushing the data, here's a discussion about it on the Erlang mailing list:

With Mnesia using the disk_log module, which in turn usually uses write(2) only, you are not certain that the OS will have copied write(2)'s data to the disk device. In most cases, the kernel can (and will) wait for many seconds before flushing that data to the disk device.

But I think you should be able to sacrifice some performance for guaranteed writes by using Memento.Transaction.execute_sync!/2:

Memento.Transaction.execute_sync! fn ->
  Memento.Query.write(record)
end
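If a synchronous transaction alone turns out not to be enough, one option is to also ask Mnesia to sync or dump its log explicitly. The sketch below assumes a hypothetical helper module `MyApp.Durable` and a running Memento instance with a disc-backed table. `:mnesia.sync_log/0` and `:mnesia.dump_log/0` are standard Mnesia functions, but whether calling them after every write is acceptable depends on the workload.

```elixir
defmodule MyApp.Durable do
  # Hypothetical helper, not part of Memento.

  # Writes one record in a synchronous transaction, then asks Mnesia
  # to sync the local transaction log file to disk.
  def write!(record) do
    result =
      Memento.Transaction.execute_sync!(fn ->
        Memento.Query.write(record)
      end)

    :ok = :mnesia.sync_log()
    result
  end

  # Forces a dump of the transaction log into the table files,
  # instead of waiting for Mnesia's own dump thresholds.
  def dump! do
    :dumped = :mnesia.dump_log()
    :ok
  end
end
```

Usage would look like `MyApp.Durable.write!(record)` for individual writes, with `MyApp.Durable.dump!()` called occasionally (for example before a planned shutdown) rather than after every insert.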

Hey @cpilka. Any updates on this? Did Memento.Transaction.execute_sync! solve your problem?

Apache Kafka relies heavily on disk I/O, and it has settings to control how long the operating system takes to flush the OS page cache to disk.

So if Mnesia uses the same approach as Kafka to commit data from memory to disk, then fsync can help flush it.

See this answer, which contains more info and several links, including the one I mentioned for fsync.
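For reference, this is roughly what an explicit fsync looks like from Elixir when writing a plain file through Erlang's `:file` module. It only illustrates the difference between writing and syncing; it is not a description of how Mnesia writes its logs internally, and the file name is a hypothetical placeholder.

```elixir
# Hypothetical scratch file; any writable path works.
path = Path.join(System.tmp_dir!(), "fsync_demo.log")

{:ok, file} = :file.open(path, [:write, :raw, :binary])

# write/2 hands the data to the OS, which may keep it in the page cache.
:ok = :file.write(file, "record 1\n")

# sync/1 asks the OS to flush its buffers for this file to disk.
:ok = :file.sync(file)

:ok = :file.close(file)
```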

Closing because of inactivity.