rfjakob/gocryptfs

Slow write speed with nfs

DerDonut opened this issue · 14 comments

I encountered a problem when using my "archive" folders from my home server. When writing to a gocryptfs plain dir whose cipher dir is mounted via nfs, the speed is surprisingly slow.
This is my setup:

   Client                                   Server
   ┏━━━┓                                    ┏━━━┓
   ┃   ┃═══════════1 GBit/s LAN═════════════┃   ┃
   ┗━━━┛         max ~90 MByte/s            ┗━━━┛
    nfs                                      nfs
     ↑                                        ↓
data/Cipher                              data/Cipher
     ↑                                        ↓
 gocryptfs                                HDD (ext4)
     ↑
data/plain
     ↑
   files

This is my configuration for the mounts (a rough sketch of the corresponding commands follows the list):

  • nfs exports: rw,nohide,no_subtree_check,sync,no_root_squash
  • nfs mount: rw,hard,timeo=10,user,noauto
  • gocryptfs mount: passfile,quiet
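
For reference, this roughly translates into the following commands (server name, paths and the passfile location are placeholders, not the literal values):

# on the server: export the cipher dir in /etc/exports, then apply
/data/Cipher  192.168.0.0/24(rw,nohide,no_subtree_check,sync,no_root_squash)
$ sudo exportfs -ra

# on the client: mount the exported cipher dir, then mount gocryptfs on top of it
$ sudo mount -t nfs -o rw,hard,timeo=10 server:/data/Cipher /data/Cipher
$ gocryptfs -passfile ~/.gocryptfs-pass -quiet /data/Cipher /data/plain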

And these are the results, measured with dd using bs=16k (the invocation is sketched after the list):
client:/dev/zero -> client:data/plain: 132 kByte/s
client:/dev/zero -> client:data/cipher: 47,3 MByte/s
server:/dev/zero -> server:data/plain: 104 MByte/s
server:/dev/zero -> server:data/Cipher: 203 MByte/s
client:/data/plain -> client:/desktop: 78 MByte/s
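
The numbers above come from runs along these lines (file name and target dir are just placeholders):

$ dd if=/dev/zero of=/data/plain/testfile bs=16k status=progress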

The value that bothers me is the 132 kByte/s when writing to the plain dir on the client. Reading from the same dir runs at nearly the full speed of the LAN connection. I think I may have messed up some configuration of nfs or gocryptfs, but since only the gocryptfs plain dir is this slow, it seems that gocryptfs is struggling with something.

Hi,

I have used a similar setup for about eight years.

Using a larger block size than you, I recently saw this for plain NFS:

# client:/dev/zero to NFS
> dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct
2048+0 records in
2048+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 465.452 s, 2.3 MB/s

This was a lot slower than your data. The destination was a Btrfs file system in an LVM partition on a rotational RAID-1 array managed by mdadm.

For encrypted data, I saw:

# client:/dev/zero to plaintext folder locally handled by Gocryptfs
> dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct
2048+0 records in
2048+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1560.15 s, 688 kB/s

I used no additional flags to mount (but had a standard entry in /etc/fstab) and relied on system security (i.e. no Kerberos) due to a hardware migration.

On the server, my exports were declared with *(rw,sync,no_subtree_check,sec=sys:krb5i:krb5p,mountpoint).
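
Spelled out, the /etc/exports entry looked roughly like this (the exported path is a placeholder):

/srv/archive  *(rw,sync,no_subtree_check,sec=sys:krb5i:krb5p,mountpoint)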

Kind regards
Felix Lechner

Beautiful ASCII art ❤️

I'm running essentially the same setup. The 132 kByte/s is horrifying. Here's what I get:

nfssecure.mnt$ dd if=/dev/zero of=zero bs=16k status=progress
416890880 bytes (417 MB, 398 MiB) copied, 28 s, 14,9 MB/s^C

Looking at /etc/exports on this Synology NAS:

/volume1/jakob	192.168.0.0/24(rw,async,no_wdelay,no_root_squash,insecure_locks,sec=sys,anonuid=1025,anongid=100)

Maybe "async" is what makes the difference? Can you test?

Changed the exports to async and the speed goes up to 6,8 MByte/s - much better! Is the speed on the plain dir dependent on the CPU? The gocryptfs thread uses 6-8% CPU on an i7-6500U with 4 cores.
The achieved speed would be okay for me, I just wondered why you reach 15 MB/s. When my cipher dir can be written with 47 MB/s it relies on gocryptfs, correct?
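
For reference, per-thread CPU usage can be watched while dd is running with something like this (the pid lookup is just one common way to do it):

$ top -H -p "$(pgrep gocryptfs | head -1)"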

What do your effective mount flags look like? Here are mine:

$ mount | grep 192.168.0.3:/volume1/jakob

192.168.0.3:/volume1/jakob on /mnt/synology/jakob type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.3,mountvers=3,mountport=892,mountproto=udp,local_lock=none,addr=192.168.0.3)

When my cipher dir can be written with 47 MB/s it relies on gocryptfs, correct?

I did not understand this question

What do your effective mount flags look like?

mount | grep 192.168.0.42:/media/shared/Daten/Gemeinsam/VLbRR8bFBUOjQSYfSsP2Gw
192.168.0.42:/media/shared/Daten/Gemeinsam/VLbRR8bFBUOjQSYfSsP2Gw on /home/donut/Daten/Encrypted/Gemeinsam Archiv type nfs4 (rw,nosuid,nodev,noexec,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=10,retrans=2,sec=sys,clientaddr=192.168.0.20,local_lock=none,addr=192.168.0.42,user=donut)

Additional info on this: I mount sub-dirs of the topmost cipher dir on my client. For that purpose I copied gocryptfs.conf from the topmost cipher dir into the corresponding subdirs (only at subdir level 1); see the command sketch after the tree:

Cipher Dir
  ┗ Subdir 1    <━━━━ Mounted with gocryptfs
     ┗ Subsubdir 1.1
     ┗ Subsubdir 1.2
     ┗ ...
     ┗ gocryptfs.conf
     ┗ gocryptfs.diriv
  ┗ Subdir 2
     ┗ Subsubdir 2.1
     ┗ Subsubdir 2.2
     ┗ ...
     ┗ gocryptfs.conf
     ┗ gocryptfs.diriv
gocryptfs.conf
gocryptfs.diriv
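
A rough command sketch of that setup (paths and the passfile are placeholders; if I remember correctly, gocryptfs also has a -config flag pointing at the top-level config, which avoids copying it into the subdir):

$ cp Cipher/gocryptfs.conf Cipher/Subdir1/     # gocryptfs.diriv already exists in every subdir
$ gocryptfs -passfile ~/.pass -quiet Cipher/Subdir1 ~/plain/Subdir1

# alternative without copying, pointing at the top-level config:
$ gocryptfs -passfile ~/.pass -quiet -config Cipher/gocryptfs.conf Cipher/Subdir1 ~/plain/Subdir1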

I did not understand this question

The difference between my 6,8 MB/s and your 14,0 MB/s does not seem to depend on the network, the target HDD, or similar underlying layers, since I am able to write to the mounted cipher dir on the server at 47,3 MB/s, correct?

I get about the same directly to nfs:

$ dd if=/dev/zero of=zero bs=16k status=progress
834486272 bytes (834 MB, 796 MiB) copied, 17 s, 49,1 MB/s

I'm not sure why you only get half the data rate via gocryptfs. One difference I see is that you use NFSv4, while I have NFSv3 (vers=3). Your CPU has AES acceleration and should not be the problem (as seen in the low CPU usage).
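
If in doubt, gocryptfs ships a built-in crypto benchmark that shows how fast the individual encryption backends are on your machine:

$ gocryptfs -speed    # benchmarks the available encryption backends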

Hm okay, thank you very much. The 7 MB/s will work for me. If I have time, I may run some more tests with different parameters.

Hi, I can also warmly recommend running cachefilesd (at least, when running in the slower sync mode).
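
For anyone who wants to try it: cachefilesd needs the daemon running and the NFS mount needs the fsc option, roughly like this (package and service names depend on the distro):

$ sudo apt install cachefilesd            # or your distro's equivalent
$ sudo systemctl enable --now cachefilesd
$ sudo mount -t nfs -o rw,hard,fsc server:/data/Cipher /data/Cipher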

Wow, looks like I broke the isConsecutiveWrite optimization during the v2.0 refactor!
Fixed now: 8f3ec5d

And my numbers got a 4x boost:

before:

nfssecure.mnt$ dd if=/dev/zero of=zero bs=16k status=progress
416890880 bytes (417 MB, 398 MiB) copied, 28 s, 14,9 MB/s^C

after:

nfssecure.mnt$ dd if=/dev/zero of=zero bs=16k status=progress
555319296 bytes (555 MB, 530 MiB) copied, 11 s, 50,5 MB/s

Wow, thanks! Is that enough for a release? It's been a little while.

Hi,

Thanks for the release!

For the purpose of completeness, here are my numbers before upgrading but after dropping sync from my NFS export.

For straight NFS:

# client:/dev/zero to NFS
> dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct
2048+0 records in
2048+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 306.617 s, 3.5 MB/s

For encrypted data:

# client:/dev/zero to plaintext folder locally handled by Gocryptfs
> dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct
2048+0 records in
2048+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1268.88 s, 846 kB/s

Those speed-ups are modest, but meaningful.

I also ran the smaller block size of 16k (and waited), but am not sure why those numbers would be a better indication of performance.

For straight NFS:

# client:/dev/zero to NFS
> dd if=/dev/zero of=zero bs=16k status=progress
4808802304 bytes (4.8 GB, 4.5 GiB) copied, 45 s, 107 MB/s^C

(That process did not engage well with my Emacs/EXWM environment.)

For encrypted data:

# client:/dev/zero to plaintext folder locally handled by Gocryptfs
> dd if=/dev/zero of=zero bs=16k status=progress
3571712 bytes (3.6 MB, 3.4 MiB) copied, 24 s, 148 kB/s^C
219+0 records in
219+0 records out
3588096 bytes (3.6 MB, 3.4 MiB) copied, 24.3492 s, 147 kB/s

I'll also post additional results after upgrading to 2.3.1 in a little while.

Kind regards
Felix Lechner

This is amazing. I just updated to v2.3.1 (77a0410), still async for nfs. Results:

client:/dev/zero -> client:data/plain: 
    v2.2.1 + sync: 132 kByte/s 
    v2.2.1 + async: 6,8 MByte/s
    v2.3.1 + async: 110 MByte/s
client:/dev/zero -> client:data/cipher: 
    v2.2.1 + sync: 47,3 MByte/s
    v2.3.1 + async: 114 MByte/s
server:/dev/zero -> server:data/plain: 
    104 MByte/s
server:/dev/zero -> server:data/Cipher:
    203 MByte/s
client:/data/plain -> client:/desktop: 
    78 MByte/s

I also measured over WiFi:
client:/dev/zero -> WiFi -> client:data/plain: 47,5 MByte/s

This is a factor of 15,7 compared to v2.2.1 with async and 833x compared to v2.2.1 with sync 😮 All measurements lasted 20 s. Maybe the number would go down a bit if the test lasted longer, since dd starts at 240 MB/s, which is impossible over my 1 GBit LAN (seems to be a caching thing). Basically I have full LAN speed now, could not wish for more.