yandex-cloud/geesefs

v0.40.0: Unexpected disk cache growth

nixargh opened this issue · 3 comments

Hi, team!

We used 0.39.1 before, and disk cache consumption was quite stable, around 100 GB of a 1 TB partition.
But with 0.40.0 the cache started to grow continuously and reached about 600 GB within a few hours. Rolling back fixed the issue.
So something has changed, and I don't understand what will happen to geesefs when the cache partition is full.
Update: our system does a lot of writes and moves but few reads, so I believe it is the write cache that grows.

  • Can you shed some light on how the disk cache works?
  • Could you give me some advice about suitable parameters?

Parameters we use:
/usr/local/bin/geesefs --endpoint 'https://s3.host' --region foo --storage-class STANDARD --uid 5001 --gid 5001 --no-checksum --memory-limit 3072 --read-ahead-large 20 --max-flushers 32 --max-parallel-parts 32 --part-sizes '50' --single-part 50 --cache '/media/cache/s3fuse/%i' -o allow_other --cheap --no-specials '%i' '/media/%i'
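
If I read the docs right, the two options here that actually affect caching are the memory limit and the disk cache directory:

--memory-limit 3072               # in-memory cache limit, in MB
--cache '/media/cache/s3fuse/%i'  # directory for the on-disk data cache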

Hi. Sorry for not answering :-)
It's a great surprise to me that there are real users of the disk cache :-)
The disk cache was in fact never a 100% complete implementation - it has always lacked eviction, i.e. the cache size was never limited.
Before 0.40 there was an ugly "popularity" tracking implementation, but I thought it was a complete mess and removed it in 0.40. In fact, some people who tried the disk cache before 0.40 even filed bugs along the lines of "why is the disk cache empty?!" - the idea was that only "popular" files were evicted from memory to the disk, while "non-popular" files were simply removed from memory without being copied to disk.
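Roughly, the old behaviour was like this (a simplified sketch of the idea, not the actual geesefs code; the threshold and all names are invented):

package main

import (
    "fmt"
    "os"
    "path/filepath"
)

// buffer is an in-memory chunk of an object, with a read counter
// used as its "popularity".
type buffer struct {
    key   string
    data  []byte
    reads int
}

// popularityThreshold is invented for this sketch.
const popularityThreshold = 2

// evict frees memory until usage fits under memoryLimit: "popular"
// buffers are spilled to the disk cache, the rest are just dropped.
func evict(buffers []*buffer, cacheDir string, memoryLimit int) []*buffer {
    used := 0
    for _, b := range buffers {
        used += len(b.data)
    }
    kept := buffers[:0]
    for _, b := range buffers {
        if used <= memoryLimit {
            kept = append(kept, b)
            continue
        }
        if b.reads >= popularityThreshold {
            // "Popular" buffer: copy it to the disk cache before freeing it.
            if err := os.WriteFile(filepath.Join(cacheDir, b.key), b.data, 0644); err != nil {
                fmt.Fprintln(os.Stderr, "disk cache write failed:", err)
            }
        }
        // "Non-popular" buffers never touch the disk - which is why the
        // disk cache could stay empty before 0.40.
        used -= len(b.data)
        b.data = nil
    }
    return kept
}

func main() {
    dir, _ := os.MkdirTemp("", "cache-sketch")
    bufs := []*buffer{
        {key: "hot", data: make([]byte, 1024), reads: 5},
        {key: "cold", data: make([]byte, 1024), reads: 0},
    }
    evict(bufs, dir, 512) // over the limit: only "hot" ends up on disk
    entries, _ := os.ReadDir(dir)
    for _, e := range entries {
        fmt.Println("spilled to disk:", e.Name())
    }
}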
Your issue means that it was effectively limiting the disk cache, so it was useful to some extent :)
I can think about bringing it back, at least in some modified form...
What's your use case, by the way? Do you have a subset of "hot" files?

Hi, thanks for answering )
As I wrote before, we do a lot of writes and only a few reads, so I believe these are not "hot" files at all but rather intermediate files on their way to S3.

In that case I suppose it's better for you to disable the disk cache altogether :)
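E.g. just drop the --cache option from your mount command - without it geesefs keeps data only in memory, bounded by --memory-limit. Based on your command above, something like:

/usr/local/bin/geesefs --endpoint 'https://s3.host' --region foo --storage-class STANDARD --uid 5001 --gid 5001 --no-checksum --memory-limit 3072 --read-ahead-large 20 --max-flushers 32 --max-parallel-parts 32 --part-sizes '50' --single-part 50 -o allow_other --cheap --no-specials '%i' '/media/%i'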