microsoft/Windows-Containers

Writes to emptyDir are much slower compared to writes within the container root fs

ibabou opened this issue · 28 comments

ibabou commented

Describe the bug
We have been doing perf testing after complaints arose about slower writes to emptyDir. Based on the experiments, writes are definitely slower, especially with large sets of small files (e.g. cloning a repo).

To Reproduce
These tests were run on GKE clusters, but the results should be identical on OSS clusters with the GCE provider. We looked into the performance of the backing managed boot disk, and there was no overhead or throttling. The tests were run on regular disks as well as higher-performance disks (pd-ssd), and the slowness remains compared to writes within the container FS.

Expected behavior
We expect similar performance for writes to emptyDir.

Example for a test container:

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-empty-dir
spec:
  containers:
  - args:
    - netexec
    image: k8s.gcr.io/e2e-test-images/agnhost:2.20
    name: agnhost
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 8080
    volumeMounts:
    - mountPath: "C:\\test-ssd"
      name: "test-ssd-empty-dir"
    - mountPath: "C:\\anotherdir_mount"
      name: another-dir
      readOnly: false
  nodeSelector:
    kubernetes.io/os: windows
  volumes:
  - name: test-ssd-empty-dir
    emptyDir: {}
  - name: another-dir
    hostPath:
      path: "C:\\anotherdir"

Test 1: emptyDir vs container fs (high perf disk - cloning a repo - ~300% slower):

C:/test-ssd $ powershell
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

PS C:\test-ssd> Measure-Command {git clone https://github.com/openjdk/jdk}
Cloning into 'jdk'...
remote: Enumerating objects: 1346798, done.
remote: Counting objects: 100% (478/478), done.
remote: Compressing objects: 100% (309/309), done.
Receiving objects: 100% (1346798/1346798), 1.06 GiB | 17.64 MiB/s, done.

Resolving deltas: 100% (1000847/1000847), done.
Updating files: 100% (66565/66565), done.


Days              : 0
Hours             : 0
Minutes           : 15
Seconds           : 53
Milliseconds      : 518
Ticks             : 9535184729
TotalDays         : 0.0110360934363426
TotalHours        : 0.264866242472222
TotalMinutes      : 15.8919745483333
TotalSeconds      : 953.5184729
TotalMilliseconds : 953518.4729


PS C:\> Measure-Command {git clone https://github.com/openjdk/jdk}
Cloning into 'jdk'...
remote: Enumerating objects: 1346813, done.
remote: Counting objects: 100% (528/528), done.
remote: Compressing objects: 100% (332/332), done.
Receiving objects: 100% (1346813/1346813), 1.06 GiB | 17.66 MiB/s, done.

Resolving deltas: 100% (1000206/1000206), done.
Updating files: 100% (66565/66565), done.


Days              : 0
Hours             : 0
Minutes           : 3
Seconds           : 56
Milliseconds      : 546
Ticks             : 2365465153
TotalDays         : 0.0027378068900463
TotalHours        : 0.0657073653611111
TotalMinutes      : 3.94244192166667
TotalSeconds      : 236.5465153
TotalMilliseconds : 236546.5153

Test 2: emptyDir vs container fs (high perf disk - writing 10,000 small files - ~57% slower):

PS C:\test-ssd> Measure-Command {1..10000 | %{ ($_ * (Get-Random -Max ([int]::maxvalue))) > "file$_.txt"}}


Days              : 0
Hours             : 0
Minutes           : 1
Seconds           : 57
Milliseconds      : 174
Ticks             : 1171741456
TotalDays         : 0.00135618224074074
TotalHours        : 0.0325483737777778
TotalMinutes      : 1.95290242666667
TotalSeconds      : 117.1741456
TotalMilliseconds : 117174.1456

PS C:\> Measure-Command {1..10000 | %{ ($_ * (Get-Random -Max ([int]::maxvalue))) > "file$_.txt"}}


Days              : 0
Hours             : 0
Minutes           : 1
Seconds           : 15
Milliseconds      : 466
Ticks             : 754661779
TotalDays         : 0.000873451133101852
TotalHours        : 0.0209628271944444
TotalMinutes      : 1.25776963166667
TotalSeconds      : 75.4661779
TotalMilliseconds : 75466.1779

Test 3: container fs with symlink vs emptyDir (to check whether symlink overhead could be the cause - still ~46% slower):

PS C:\test> mkdir src
PS C:\test> New-Item -Path C:\test\sym -ItemType SymbolicLink -Value C:\test\src
PS C:\test> cd sym
PS C:\test\sym> Measure-Command {1..10000 | %{ ($_ * (Get-Random -Max ([int]::maxvalue))) > "file$_.txt"}}
 
 
Days              : 0
Hours             : 0
Minutes           : 1
Seconds           : 32
Milliseconds      : 173
Ticks             : 921737106
TotalDays         : 0.00106682535416667
TotalHours        : 0.0256038085
TotalMinutes      : 1.53622851
TotalSeconds      : 92.1737106
TotalMilliseconds : 92173.7106
 
 
PS C:\emptydir> Measure-Command {1..10000 | %{ ($_ * (Get-Random -Max ([int]::maxvalue))) > "file$_.txt"}}
 
 
Days              : 0
Hours             : 0
Minutes           : 2
Seconds           : 13
Milliseconds      : 681
Ticks             : 1336813105
TotalDays         : 0.0015472373900463
TotalHours        : 0.0371336973611111
TotalMinutes      : 2.22802184166667
TotalSeconds      : 133.6813105
TotalMilliseconds : 133681.3105

Configuration:

  • Edition: Windows Server 2019 LTSC
  • Base Image being used: Windows Server Core
  • Container engine: containerd
  • Container Engine version: 1.6.13


Thank you, James, for bringing this to my attention. @ibabou, I was able to reproduce this issue and capture the performance traces we needed. The issue is related to Defender behaving differently when scanning a local volume vs. a bind folder. We are in the process of addressing the issue.

ibabou commented

Thanks @jsturtevant / @Howard-Haiyang-Hao! Howard, does Defender still have this problematic effect even if an exclusion is set on the container-runtime process (we do so to avoid the impact on image unpacking)? Or will real-time detection still take effect for the container shims/processes?

We set these settings on the VMs, for example:

Set-MpPreference -SubmitSamplesConsent NeverSend
Set-MpPreference -MAPSReporting Disabled
Add-MpPreference -ExclusionProcess "$NODE_DIR\containerd.exe"
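The only other knob we could think of is a path-based exclusion on the directory that backs emptyDir volumes; a rough sketch (we have not validated this, and the exact path depends on the node's kubelet root):

# hypothetical: exclude the kubelet pods directory, which backs emptyDir volumes
Add-MpPreference -ExclusionPath "C:\var\lib\kubelet\pods"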

Is there a workaround/suggestion to disable it completely?

No, I don't think we should disable Defender completely. This is an issue that we need to address at the system level. Let me ask the Defender team to see if there is a temporary workaround for this issue.

We are trying to understand why file I/O on emptyDir was reissued from the cache manager as write operations. We will keep you posted.

Hi @Howard-Haiyang-Hao, is there any update on this issue?

@ruiwen-zhao The Defender team is working on a solution to address this issue. We will keep you posted on any progress.

@Howard-Haiyang-Hao are there any updates? Is there a workaround? Is turning off Defender the only option?

I am also looking for a temporary workaround; is turning off Defender an option? @ibabou, did you solve the issue with the commands above?

Hey @SergeyKanzhelev @martinelli-francesco, the Defender team is working on a solution, but it may take some time. For now you can use the command Set-MpPreference -DisableRealtimeMonitoring $true to disable real-time scanning.
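To confirm the setting took effect, you can read the status back (a quick sanity check, not an official verification step):

# should report False once real-time monitoring is disabled
(Get-MpComputerStatus).RealTimeProtectionEnabled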

The fix has been checked in and should be rolled out with Windows Update soon.

ibabou commented

@martinelli-francesco The commands above won't solve the issue; it seems the only way is to disable real-time monitoring completely.

The fix has been checked in and should be rolled out with Windows Update soon.

Thanks @Howard-Haiyang-Hao for the update! Can you please share which monthly release this is expected in (and whether the fix will require any opt-in configuration)? Also, will it be for both 2019 & 2022 on the same cadence?

@Howard-Haiyang-Hao gentle ping on this. Any pointers on when the fix will be available?

Also, what is the best way to disable it completely, as per @ibabou's comment above?

The fix has successfully passed most release rings and will be available to the public within 2-3 weeks through the Windows Update channel. Thank you for your patience!

@Howard-Haiyang-Hao - does this have a KB number or something?
How can we tell whether the fix is present?

Is this the relevant Defender update @Howard-Haiyang-Hao ?

Exciting update! We've released the fix for the Defender issue. To confirm that your system has the fix, please execute the following command:

(Get-MpComputerStatus).AMProductVersion
4.18.23070.1004

ibabou commented

Thanks @Howard-Haiyang-Hao for the update!

It is still incredibly slow...

physical machine Windows: 4.18.23070.1004
| -- Hyper-V Windows VM k8s: 4.18.23070.1004
| -- Hyper-V linux VM k8s: debian 12

Testing

copy (cp -R) 250 MB in about 20,000 files

Windows k8s

from emptyDir to same emptyDir,

with defender on: 271 sec
with defender exclude c:\var folder in host: 81 sec
with defender off in host: 61 sec

from emptyDir to container c:\ with defender off

with defender off in host: 31 sec

from container c:\ to container c:\ with defender off

with defender off in host: 31 sec

Windows VM (i.e. the host)

from host c:\var\lib\kubelet\pods...\ to host c:\ with defender off

with defender off in host: 26 sec

from host c:\ to host c:\ with defender off

with defender off in host: 24 sec

linux k8s

emptydir to same emptydir

6 seconds
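For reference, the copy test above can be repeated from PowerShell roughly like this (a sketch; the source and destination paths are placeholders for the emptyDir mount and a container-FS directory):

# time copying the ~20,000-file tree from the emptyDir mount into the container FS
Measure-Command { Copy-Item -Path "C:\test-ssd\repo" -Destination "C:\copy-target" -Recurse -Force }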

AbelHu commented

It is still incredibly slow...

physical machine Windows: 4.18.23070.1004 | -- Hyper-V Windows VM k8s: 4.18.23070.1004 | -- Hyper-V linux VM k8s: debian 12

@davhdavh have you checked the defender version?

(Get-MpComputerStatus).AMProductVersion
4.18.23070.1004

@AbelHu

It is still incredibly slow...
physical machine Windows: 4.18.23070.1004 | -- Hyper-V Windows VM k8s: 4.18.23070.1004 | -- Hyper-V linux VM k8s: debian 12

@davhdavh have you checked the defender version?

(Get-MpComputerStatus).AMProductVersion 4.18.23070.1004

It was already in the post. Same version for both hardware host and k8s node host.

I've conducted a comparative study by following these steps within an AKS container:

  1. git.exe clone https://github.com/microsoft/hcsshim.git C:\osdisk
  2. git.exe clone https://github.com/microsoft/hcsshim.git C:\tmp\temp
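To get comparable numbers, each step can be wrapped in Measure-Command the same way as the earlier tests in this thread, e.g.:

Measure-Command { git.exe clone https://github.com/microsoft/hcsshim.git C:\osdisk }
Measure-Command { git.exe clone https://github.com/microsoft/hcsshim.git C:\tmp\temp }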

In this scenario, c:\tmp serves as a bind folder. Here are the findings:

[image: BugFix - clone timing comparison]

The disparity between a regular folder and a bind folder has diminished from 3.5x to 32%. Please do let me know if you've observed any differing results.

@Howard-Haiyang-Hao Like I wrote above, we are still seeing a ~800% difference between defender off and defender on using version 4.18.23070.1004, and ~95% in bind vs regular.

git clone is one of the worst ways to test this issue because it involves network traffic, while the reporter's complaint is about storage I/O.

Agree. And that is why my test was to copy an already checked-out repo.

With this bug fix, I've observed a significant performance improvement, as mentioned earlier. However, we still have to tackle virtualization overhead and other additional issues stemming from Defender. It's worth noting that I'm not observing ~800% gaps. Here's a summary of Defender on vs. off with my measurements:

[image: Defender on vs. off measurements]

@davhdavh, I'd appreciate it if you could provide the traces from your reproduction. On your container host, could you please follow these steps with Defender turned on:

  1. wpr -start cpu -start fileio -start diskio
  2. Inside your container with emptydir mapped to c:\tmp, please run:
    git.exe clone https://github.com/microsoft/hcsshim.git C:\osdisk
    git.exe clone https://github.com/microsoft/hcsshim.git C:\tmp\temp
  3. wpr -stop gitclone_on.etl

Repeat the same steps with Defender turned off.

Kindly share the files gitclone_on.etl and gitclone_off.etl with me. Thank you very much for your assistance!

@Howard-Haiyang-Hao how do I send the files to you? They are 80 MB zipped.

@davhdavh, an efficient way to share sizable files is AzCopy. If you share the Azure storage file link along with an access token, I can download the file. It's worth noting that AzCopy has excellent performance.
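For example, an upload would look something like this (the storage account, container, and SAS token below are placeholders):

# hypothetical: upload the zipped traces to a blob container using a SAS URL
azcopy copy "C:\traces\gitclone_traces.zip" "https://<storageaccount>.blob.core.windows.net/<container>/gitclone_traces.zip?<sas-token>"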

@Howard-Haiyang-Hao you are not sharing your email on GitHub, so how do I send it to you?