Invalid 8 Byte Files, Frequent Hangs of Applications, ERROR - "path/file.something" is not a valid cache file, deleting
GeorgiaM-honestly opened this issue · 1 comments
Hello,
I love catfs. I'm using it to provide caching for a goofys s3 "mount". It makes a huge difference!
I do however have some nagging issues which happen on a regular basis. I hope I have provided enough information, and I would like to help resolve this if it is not specific to my local setup.
System:
- Gentoo linux
- uname -a: Linux deleted 5.12.13-gentoo-x86_64 #1 SMP Thu Jun 24 12:24:13 DELETED 2021 x86_64 Intel(R) Core(TM) i7-7820X CPU @ 3.60GHz GenuineIntel GNU/Linux
- 16 cores
- 32GB RAM
- Filesystem which the cache is using: 100GB, EXT4-fs (sde1): mounted filesystem with ordered data mode. Opts: errors=remount-ro,user_xattr. Quota mode: disabled.
catfs issues:
- When viewing a directory full of .tif files, the first file requested is always 8 bytes, which appears to be the bare minimum of data for `file` to report it as a TIFF image.
- The first 8 bytes of the real source file are indeed the same; however, the source file is far larger than 8 bytes.
- When viewing a directory full of .tif files, first request or not, catfs prints a screenful of errors. Examples:

      2021-06-30 16:53:22 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w295700n.tif" is not a valid cache file, deleting
      2021-06-30 16:53:23 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w295830n.tif" is not a valid cache file, deleting
      2021-06-30 16:53:23 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w300000n.tif" is not a valid cache file, deleting
      2021-06-30 16:53:24 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w300130n.tif" is not a valid cache file, deleting
      2021-06-30 16:53:24 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w295530n.tif" is not a valid cache file, deleting
      2021-06-30 16:53:25 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w295700n.tif" is not a valid cache file, deleting
      2021-06-30 16:53:25 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w295830n.tif" is not a valid cache file, deleting
      2021-06-30 16:53:26 ERROR - "2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0903000w300000n.tif" is not a valid cache file, deleting
- During this time, which can be several minutes, the application trying to use the data hangs and usually needs to be force-killed.
- While catfs is "deleting" files, the total size of the cache directory root grows.
- After these few minutes, everything works, except for that first invalid 8-byte file.
- Often, but not always, after this delay, files that had already been fetched before are fetched again (increased inbound network traffic is observed).
- Gwenview STDERR on the 8-byte file:

      gwenview.libtiff: Error JPEGLib "Not a JPEG file: starts with 0xd5 0x7e"
      org.kde.kdegraphics.gwenview.lib: Could not generate thumbnail for file "file:///media/s3-noaa-eri-pds-LOCAL-catfs/2005_Hurricane_Katrina/sep02JpegTiles_GCS_NAD83/sep02C0890130w290900n.tif"
      gwenview.libtiff: Error JPEGLib "Not a JPEG file: starts with 0xd5 0x7e"
      org.kde.kdegraphics.gwenview.lib: Could not generate thumbnail for file "file:///media/s3-noaa-eri-pds-LOCAL-catfs/2005_Hurricane_Katrina/sep02JpegTiles_GCS_NAD83/sep02C0890130w290900n.tif"

- Sometimes, after all the other images have been downloaded and their thumbnails created, I can double-click the first file that had been 8 bytes; it has by then been fully downloaded and displays without error. Other times, it stays 8 bytes.
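To confirm the "same first 8 bytes" observation, a small prefix comparison can be run against the goofys mount and the cache directory. This is a sketch: `first_bytes_match` is a hypothetical helper (not part of catfs), and the commented example assumes the cache mirrors the bucket's relative paths, using the mount and cache locations from this report.

```shell
#!/bin/sh
# Sketch: compare the first N bytes of two files.
# first_bytes_match is a hypothetical helper, not part of catfs.
first_bytes_match() {
    # $1: source file, $2: cached file, $3: number of bytes to compare
    cmp -s -n "$3" "$1" "$2"
}

# Example (assumes the catfs cache mirrors the bucket's relative paths,
# with paths taken from this report):
# first_bytes_match \
#     /media/s3-noaa-eri-pds-LOCAL/2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w295700n.tif \
#     /home/myusername/.goofys-cache/2005_Hurricane_Katrina/aug31JpegTiles_GCS_NAD83/aug31C0902830w295700n.tif \
#     8 && echo "first 8 bytes match"
```

Running `xxd -l 8` on both files shows the same information by eye.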
Reproduction
- Mount a public AWS S3 bucket with goofys: `goofys -o allow_other -o ro noaa-eri-pds /media/s3-noaa-eri-pds-LOCAL`
- Start catfs appropriately: `catfs -o ro --free 1G /media/s3-noaa-eri-pds-LOCAL /home/myusername/.goofys-cache /media/s3-noaa-eri-pds-LOCAL-catfs`
- Explore the images with a tool such as gwenview that shows thumbnails.
- Optional: start gwenview on the command line, redirecting STDOUT and STDERR to their own files, and then watch those files: `gwenview 1> /tmp/gwenview-STDOUT.txt 2> /tmp/gwenview-STDERR.txt & tail -f /tmp/gwenview-STD*`
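While reproducing, the symptom that the cache directory grows during the "deleting" phase can be observed with a quick size sampler. This is a sketch: `cache_growth_kib` is a hypothetical helper, and the cache path in the example is the one from the catfs invocation above.

```shell
#!/bin/sh
# Sketch: report how many KiB a directory grew over an interval.
# cache_growth_kib is a hypothetical helper for watching the cache.
cache_growth_kib() {
    # $1: directory to watch, $2: interval in seconds
    before=$(du -sk "$1" | cut -f1)
    sleep "$2"
    after=$(du -sk "$1" | cut -f1)
    echo $((after - before))
}

# Example: cache_growth_kib /home/myusername/.goofys-cache 5
```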
I noticed I was getting complaints from catfs about too many open files. The default limit was 1024 for the user, so I increased it to 10240 and rebooted.
I am no longer getting the "too many open files" errors; however, the other issues continue. I checked how many files were open for which processes, and here is the result. I am not entirely sure this is the right way to count "open files", but I think these stats are interesting. Is this normal?
As root:

    lsof -n | grep myusername | awk '{print $1}' | sort -n | uniq -c | sort -nr | head -n 4
    133961 catfs
     78938 chrome
     49162 thunderbi
     16426 gwenview
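One caveat on the lsof totals above: lsof also reports non-descriptor entries (current directory, program text, memory-mapped files), so its per-process counts can overstate the number of open file descriptors. Counting the entries in `/proc/PID/fd` gives the kernel's own per-process figure. A sketch, where `fd_count` is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: count a process's open file descriptors via /proc (Linux only).
# fd_count is a hypothetical helper, not part of catfs.
fd_count() {
    # $1: PID of the process to inspect
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Example: fd_count "$(pgrep -o catfs)"
```

Comparing this number against the user's `ulimit -n` shows how close catfs actually is to the limit.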