PowershellFrameworkCollective/psframework

Prevent creating empty log files

Zerowalker opened this issue · 10 comments

Not sure which version this started happening, but if you do something like

$paramSetPSFLoggingProvider = @{
    Name          = "logfile"
    FilePath      = "$PSScriptRoot\logs\%Date%.csv"
    LogRotatePath = "$PSScriptRoot\logs\*.csv"
    Enabled       = $true
}
Set-PSFLoggingProvider @paramSetPSFLoggingProvider

This will end up creating a log file even if nothing is ever written.
For cases where it's common that nothing happens for most days, it's a bit annoying to have a ton of log files even though just some actually may contains something.

So if possible it would be nice if the files are only created when something is actually written to it.

Heya @Zerowalker ,
sorry for the late reply, was on vacation.
Thanks for brining this up - I think it was somewhere in 1.7* - 1.9* where I changed up the log writer in favor of significantly better logging performance and less interference from the AV.
Can't really help it being created though, but if that works for you, I can add a deletion option to the config (i.e. when you run Set-PSFLoggingProvider).
The file will still be created, but it will then, when you stop the logging provider instance, kill the logfile, if it is still empty.

Would be something like this:

$paramSetPSFLoggingProvider = @{
    Name          = "logfile"
    FilePath      = "$PSScriptRoot\logs\%Date%.csv"
    LogRotatePath = "$PSScriptRoot\logs\*.csv"
    Enabled       = $true
    NoEmptyLog    = $true
}
Set-PSFLoggingProvider @paramSetPSFLoggingProvider

And at the end of your script:

Disable-PSFLoggingProvider -Name logfile

Can't make it automatic, as people might have switched to considering the empty files as proof of execution (just nothing going wrong) and alert on them missing, so I can't just drop that. But with the parameter as described, that I can do.

I'll keep this issue open as a marker to make it happen in the next release (no solid ETA, sorry), but I'll update here once it's shipped, so you'll get notified :)

Hey @FriedrichWeinmann,
No worries at all:)

Hmm, what is this change that requires the file to be created before hand?
As wouldn't creating it before hand or at time of writing act almost the same, the AV would still want to read it upon writing,
or rather when you just access a file I believe.

As you say the empty file can be used as proof of execution, didn't think of that, a good reason to have it really.

Why would you need Disable-PSFLoggingProvider -Name logfile though?
Wouldn't it "do that" when the script exits (and powershell is closed)?

Having an option NoEmptyLog would be nice from my perspective, but of course if that's just me it makes no sense in adding it.
It should be something that more would want, as adding a feature for 1 person that will increase the code and maintenance (however small) is hard to justify:P

Nevertheless I do appreciate the suggestion, many thanks!

Hmm, what is this change that requires the file to be created before hand?

Originally, each message would get written using a new Add-Content call. The problem with that approach is twofold:

  • Since I open/write/close the file each time, that is inefficient
  • When writing multiple entries close together, AV can interfere: We first write line 1. AV notices the file was modified and opens it to check for malware. We now try to open the file again to write line 2, but the file is currently in use and it fails.

This meant I needed to switch the way I write the file, keep it in access.
I tried steppable pipelines first, but I have absolutely no control over just when the entry is actually flushed to disk, which in combination with the problem how PowerShell handles its termination event meant I would lose log entries each time I used it in a scheduled task.

So I shifted over to a [System.IO.StreamWriter], allowing me direct control over everything I needed, at the cost of creating the file always.

Why would you need Disable-PSFLoggingProvider -Name logfile though?
Wouldn't it "do that" when the script exits (and powershell is closed)?

The background logging happens in 5 stages: Initialize, Begin, Process, End, Finalize. Initialize happens once only at the beginning. Then it will keep looping Begin/Process/End until either a) PowerShell closes, b) The logging Provider has been disabled or C) The End Of Time. Finally it will once only run Finalize.
If you just close PowerShell, Finalize never happens, but the logging plugin only releases the file on its own during that step, so it can't delete it any sooner.

It should be something that more would want

In my experience, most of the times we have some more messages in a script than just when things fail, meaning the empty-logfile-syndrome should be quite rare.
That said, I am quite happy with fine-tuning my logging features to be compatible even with rarer interests/needs, so don't worry about me adding it to the backlog - I'm taking a certain amount of pleasure from people approaching me at conferences and expressing their utter bafflement, that my tool enabled even their odd scenario :)

Ah that make sense.
I recall having issues with Add-Content breaking at times, not being able to access the file or something.
But using [System.IO.File]::AppendAllText solved that issue, not sure if that would do anything here though.

I unserstand and agree about your approach on this, lack of control and potential issues are quite a big bottleneck compared to the "empty file issue":)

If you just close PowerShell, Finalize never happens, but the logging plugin only releases the file on its own during that step, so it can't delete it any sooner.

Doesn't Finalize ensure the data is flushed as well?
Meaning closing powershell before the data is written means it might get lost?
Or is the background logging alive anyway but it just doesn't know it can release the file?

In my experience, most of the times we have some more messages in a script than just when things fail, meaning the empty-logfile-syndrome should be quite rare.

My case is probably rare than, I did log more things before, but as the log files are really only used about errors for me,
it just made looking for errors more messy. If only errors are logged it's easy to see what days it failed right of the bat, as those are the only log files that exists (or aren't empty).

But in that case please do with my request as you will, you taking your time to consider it is appreciated just in itself!

Doesn't Finalize ensure the data is flushed as well?

No. By the time we reach Finalize through Disable-PSFLoggingProvider we can be certain all logs have been processed, but that is because it ensures all pending messages have run through Process, which is the phase where messages get written (and flushed to disk).
PSFramework immediately flushes the content of the FileWriter on each message, but if you spam many messages, you might actually manage to create a backlog where the log-writing needs a bit extra time.

PSFramework registers an event when it is first imported that triggers when PowerShell shuts down, in order to flush the messages in the queue (but it does not disable the logging providers or wait for them to be Finalized).
In most cases that is just fine and works, but you don't get that "Shutting down my logging provider"-event, which we'd need for that feature.

Btw., there's some rare edge-cases where we lose a last message anyway and Disable-PSFLoggingProvider never hurts, so you can use that now already, if you want to make 110% sure all is logged correctly.

I'd also always recommend setting an InstanceName when defining your logging.
Usually makes not much of a difference, but if later for some reason you have ScriptA call ScriptB, ScriptB does not interfere with the logging of ScriptA (without an instancename, both will try to configure the same logger instance, meaning ScriptA's log will cut off the moment you call ScriptB and not return, even after ScriptB is done.).

Ah okay that's great.
Currently i have used
Wait-PSFMessage -Timeout 30
at the end (in a Finally) to hopefully ensure everything is written without have a possible freeze.

As for the InstanceName, I thought that logfile was special, cause when using other names the "log file" settings doesn't work if I recall.

EDIT:

Oh wait, that's Name not InstanceName.
Thought InstanceName just used the script name or something.

InstanceName

You can have multiple parallel enabled logging providers by giving each instance its own name.
With that you can log to two different files in parallel (with completely different configuration or filter settings)

Wait-PSFMessage

Wait-PSFMessage is another way to ensure the flush.
It will not trigger the Finalize event - all it does is ensure the entire Queue of log entries is emptied and a full cycle of Process has completed.
It is perfectly sufficient to ensure all log-entries have completed.

It has one more edge-case issue (your scenario probably doesn't suffer from):
If you use runspaces and have multiple scripts running in parallel, it can hang (hence the timeout) if messages get queued in the background faster than they get written by the logger. Extremely rare case, usually when doing massive runspace-counts or having just too many logs enabled in parallel.

You can have multiple parallel enabled logging providers by giving each instance its own name.
With that you can log to two different files in parallel (with completely different configuration or filter settings)

Got it, so in that case having a script with a log file setup calling another script that also does this can mess things up if they don't specify an InstanceName?

I just tried using Disable-PSFLoggingProvider, but the file is still locked afterwards.
I have to either call Wait-PSFMessage -Terminate or just close powershell in order to delete/move the file.
Am i missing something, as i thought Disable-PSFLoggingProvider would release it's hold on the file?
(not that it really matters in the test here, just tried it out and was surprised).

Multiple scripts & logging

Exactly. I usually just pick the name of the script as the instance name.

Disable-PSFLoggingProvider

Works for me:
image

If you now started using an InstanceName, make sure to also use it when calling Disable-PSFLoggingProvider

Got it thanks!

Works for me:
Must have done something wrong when testing it clearly.