elastic/support-diagnostics

Error trying to clean files at end of running the Diagnostics.bat on windows

pl853 opened this issue · 2 comments

pl853 commented

Dear Elastic,

First of all, thanks for the awesome tool, it's very useful!

In version 8.5.0 of the diagnostic tool, an error occurs at the end of the scanning process. Once the scanning is done, the script seems to clean up all the unnecessary files. However, deleting diagnostics.log fails with an error, since the file is still open and being written to.

image

I hope this can be fixed! Thanks in advance

Kind regards

We have been able to reproduce this behaviour with the following set-up:

  • Windows Server 2019
  • OpenJDK 21
  • Diagnostics utility version 8.5.0
  • An ESS deployment, used to generate a diagnostic in api mode

TL;DR
The error related to diagnostics.log does not prevent the diagnostics from being generated. However, it does prevent the utility from cleaning up its temporary files.

Investigation

  1. As per the report from @pl853, the following errors are observed:
02:54:44.538 [main] INFO  co.elastic.support.BaseService - Closing loggers.
02:54:44.538 [main] INFO  co.elastic.support.BaseService - Archiving diagnostic results.
02:54:44.711 [main] INFO  co.elastic.support.util.ArchiveUtils - Archive: C:\Users\romain\Downloads\diagnostics-8.5.0-dist\diagnostics-8.5.0\api-diagnostics-20231208-025444.zip was created
02:54:44.836 [main] ERROR co.elastic.support.util.SystemUtils - Delete of directory:C:\Users\romain\Downloads\diagnostics-8.5.0-dist\diagnostics-8.5.0\api-diagnostics failed. Usually this indicates a permission issue
org.apache.commons.io.IOExceptionList: C:\Users\romain\Downloads\diagnostics-8.5.0-dist\diagnostics-8.5.0\api-diagnostics
	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:331) ~[commons-io-2.11.0.jar:2.11.0]
	at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1192) ~[commons-io-2.11.0.jar:2.11.0]
	at co.elastic.support.util.SystemUtils.nukeDirectory(SystemUtils.java:51) [diagnostics-8.5.0.jar:8.5.0]
	at co.elastic.support.diagnostics.DiagnosticService.exec(DiagnosticService.java:98) [diagnostics-8.5.0.jar:8.5.0]
	at co.elastic.support.diagnostics.DiagnosticApp.main(DiagnosticApp.java:51) [diagnostics-8.5.0.jar:8.5.0]
Caused by: java.io.IOException: Cannot delete file: C:\Users\romain\Downloads\diagnostics-8.5.0-dist\diagnostics-8.5.0\api-diagnostics\diagnostics.log
	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1344) ~[commons-io-2.11.0.jar:2.11.0]
	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:324) ~[commons-io-2.11.0.jar:2.11.0]
	... 4 more
Caused by: java.nio.file.FileSystemException: C:\Users\romain\Downloads\diagnostics-8.5.0-dist\diagnostics-8.5.0\api-diagnostics\diagnostics.log: The process cannot access the file because it is being used by another process
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) ~[?:?]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[?:?]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:108) ~[?:?]
	at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:273) ~[?:?]
	at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:109) ~[?:?]
	at java.nio.file.Files.deleteIfExists(Files.java:1191) ~[?:?]
	at org.apache.commons.io.file.PathUtils.deleteFile(PathUtils.java:487) ~[commons-io-2.11.0.jar:2.11.0]
	at org.apache.commons.io.file.PathUtils.delete(PathUtils.java:392) ~[commons-io-2.11.0.jar:2.11.0]
	at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1341) ~[commons-io-2.11.0.jar:2.11.0]
	at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:324) ~[commons-io-2.11.0.jar:2.11.0]
	... 4 more
  2. We can confirm that the problem is not related to Windows Defender: with the above folder added to the exclusion list, the problem still occurs.

Note: the same error can occur whenever a user runs the diagnostics utility while diagnostics.log is open in another process (e.g. Notepad), which would then be expected behaviour.

  3. From a first look at the code, these points need further investigation (and fixing):
  • diagnostics.log is not tied to any specific appender
  • the diag appender does not seem to be used anywhere in the code
  • given the above, the closeLogs method will not close the appender / diagnostics.log file
  • the nukeDirectory method triggers an exception because the appender / diagnostics.log file is still in use by a process (in this case, it seems to be java.exe itself)
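
The points above come down to ordering: the handle on diagnostics.log has to be released before the directory is deleted. A minimal, standard-library-only sketch of that ordering follows. The class `CloseBeforeDelete` and the method `closeThenDelete` are hypothetical names used for illustration, not part of the diagnostics utility, and the `FileOutputStream` is only a stand-in for whichever appender ends up owning diagnostics.log.

```java
import java.io.Closeable;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseBeforeDelete {

    // Hypothetical helper: release the open handle first, then delete the file.
    // On Windows, deleting while a java.io stream is still open typically fails
    // with "The process cannot access the file because it is being used by
    // another process".
    static boolean closeThenDelete(Closeable handle, Path file) throws IOException {
        handle.close();
        return Files.deleteIfExists(file);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("api-diagnostics");
        Path log = dir.resolve("diagnostics.log");

        // Stand-in for the logging appender that keeps diagnostics.log open.
        FileOutputStream appender = new FileOutputStream(log.toFile());
        appender.write("Closing loggers.\n".getBytes(StandardCharsets.UTF_8));

        boolean deleted = closeThenDelete(appender, log);
        Files.delete(dir); // the directory is empty now, so this succeeds
        System.out.println("log deleted: " + deleted
                + ", directory exists: " + Files.exists(dir));
    }
}
```

In the utility itself the equivalent step would presumably be stopping the file appender in closeLogs before nukeDirectory runs, but that is an assumption about the fix rather than something confirmed in the code.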
  4. It has also been observed that the default Extract all... option in Windows reports the generated zip file as invalid:
Windows cannot complete the extraction.
The Compressed (zipped) Folder 'C:\api-diagnostics-20231208-023550.zip' is invalid.

It is unclear why this is happening. The zip file is definitely valid and can be opened using third-party tools like 7zip.
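
One way to check the archive independently of any particular extractor is to read every entry with `java.util.zip`, which verifies each entry's CRC while reading. The sketch below is only an illustration: the class `ZipCheck` and the method `countReadableEntries` are hypothetical names, and the archive path is supplied on the command line. Passing this check shows that the JDK can read the archive; it does not explain why the Windows extractor rejects it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipInputStream;

public class ZipCheck {

    // Reads every entry to the end. ZipInputStream throws a ZipException
    // (an IOException) if an entry is corrupt or its CRC does not match.
    // A result of 0 means no zip entries were found at all.
    static int countReadableEntries(Path zip) throws IOException {
        int count = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(zip);
             ZipInputStream zin = new ZipInputStream(in)) {
            while (zin.getNextEntry() != null) {
                while (zin.read(buffer) != -1) {
                    // discard the data; only readability matters here
                }
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ZipCheck <archive.zip>");
            return;
        }
        Path zip = Paths.get(args[0]);
        System.out.println(zip + ": " + countReadableEntries(zip) + " readable entries");
    }
}
```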

Sometimes zip formats are slightly different, and less compatible clients fail on them.
A previous issue with the Windows extractor: #517 (comment)