Add -Parallel switch
nohwnd opened this issue · 28 comments
Add the ability to run tests in parallel by providing the number of concurrent runners, and making sure the output is written in blocks and not one over the other.
At what scope are we talking about parallelizing? I mean, what "things" would become parallel?
Fixtures? Describe? It?
It is whole files (for now?). They usually have common setup at the top of the file and one or more Describes.
Moving this up to 5.x, it should be technically possible but needs more research to make the API simple and useful at the same time.
Would love to see this feature; our build times are getting longer as we keep adding tests for new PowerShell code.
> Would love to see this feature; our build times are getting longer as we keep adding tests for new PowerShell code.
As a workaround, if your tests are not dependent on each other you can just use ForEach-Object -Parallel to run each test file, or even each test separately, and then stitch the returned result objects back together; that's what we do.
^^ was what I had been doing with PoshRSJob ages back
If Pester took a dependency on the ThreadJob module then you could do this quite quickly and easily.
I think, as mentioned, it would need to be file-level based testing, and not Describe/Context/It block levels, to make the most sense.
@kilasuit You wouldn't really need ThreadJob; runspaces and runspace pools are "relatively" simple to work with compared to a lot of what is done in Pester (ThreadJob is just a wrapper around them).
My opinion is there are so many potential issues around state and race conditions, and it's pretty easy to "bring your own threading" with Pester now, so I wouldn't say it's a "need to have" but definitely a "nice to have".
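For context, a minimal sketch of that "bring your own threading" approach, assuming PowerShell 7+ (for ForEach-Object -Parallel), Pester v5, and test files that don't depend on each other; the .\tests path and throttle value are placeholders:

```powershell
# Run each independent test file in its own runspace and collect the
# result objects; 4 is an arbitrary throttle value.
$results = Get-ChildItem -Path .\tests -Filter *.Tests.ps1 |
    ForEach-Object -Parallel {
        Invoke-Pester -Path $_.FullName -PassThru -Output None
    } -ThrottleLimit 4

# Stitch the per-file result objects back together into simple totals.
[PSCustomObject]@{
    Passed = ($results | Measure-Object -Property PassedCount -Sum).Sum
    Failed = ($results | Measure-Object -Property FailedCount -Sum).Sum
}
```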
@nohwnd
> It is whole files (for now?).
@JustinGrote
> As a workaround, if your tests are not dependent on each other you can just use ForEach-Object -Parallel to run each test file, or even each test separately, and then stitch the returned result objects back together; that's what we do.
This is a workaround I considered on my project, but some test files are much heavier than the others, so it really needs to send each specific test to an available thread for the full parallelization effect. This is currently not trivially achievable, as you can't execute a test by name (something along that line is mentioned in #608). If test results actually provided info about LineNumber (the same as the VS Code Pester stub) it would be easy, although IMO the correct way is to make single-test execution by name possible and then orchestrate individual tests over available threads.
> My opinion is there are so many potential issues around state and race conditions, and it's pretty easy to "bring your own threading" with Pester now, so I wouldn't say it's a "need to have" but definitely a "nice to have".
How does your own threading solution resolve state and race conditions? This is a pretty expected option from any test runner, so I would categorize it as a must rather than a nice to have.
@majkinetor You can run tests by name, but doing Discovery and from there taking the tests to run would be a much easier way to do this. A cmdlet that can only do discovery is not published, but it is implemented and mostly works. That said, the reason I consider only file-level parallelization is that tests have their own setups and teardowns, and you would need to run those for every test based on where you are running. You would also have to make sure that your teardowns can run more than one time, because we would never know how many times they would run.
Another reason is that you need to start one executor for each, and load modules in it, so this is far from lightweight.
All of this would burden the user with a lot (we see the same problems and complaints in VSTest / MSTest), and for the majority of code bases the added performance is not worth the complication that per-test parallelization would introduce.
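For reference, running by name can be sketched with the v5 configuration object (the path and name pattern here are placeholders):

```powershell
# Sketch: run tests whose full name matches a pattern.
$config = New-PesterConfiguration
$config.Run.Path = '.\tests'
$config.Filter.FullName = '*does the thing*'  # wildcards are supported
Invoke-Pester -Configuration $config
```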
> That said, the reason I consider only file-level parallelization is that tests have their own setups and teardowns, and you would need to run those for every test based on where you are running.
It's IMO not generally wise to deduce how people will use the library. In my case, my setup routines are very fast, taking less than a second. In fact, I often do 1..1000 | % { Invoke-Pester -TagFilter debug } when chasing a flaky test.
But I get your point, and it could be easily solvable using options. Parallelization could be run using the following option:
- Number of tests to run from a single file at once: 1..1000.
If set to 1, it runs 1 test from a test file per thread (my scenario).
If set very large, like 1000, it runs all the tests in a file (your scenario).
People can then fine-tune it depending on the speed of the setup/teardown; let's say somebody uses 10. If I set 10, then a test file with 55 tests will run in 6 threads in parallel.
This allows maximal flexibility without complicating the implementation almost at all.
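To illustrate the arithmetic, a hypothetical sketch of the batching (nothing here is an existing Pester option; the test names are stand-ins):

```powershell
# Chunk one file's tests into groups of $batchSize; 55 tests with a batch
# size of 10 yields 6 batches, one per thread.
$testNames = 1..55 | ForEach-Object { "Test $_" }
$batchSize = 10
$batches = for ($i = 0; $i -lt $testNames.Count; $i += $batchSize) {
    , $testNames[$i..([Math]::Min($i + $batchSize, $testNames.Count) - 1)]
}
$batches.Count  # 6
```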
I agree with what's been said above: wrapping Pester in parallelization would be easy, and I don't think I need in-process or even multi-core-aware parallelization, but I do think that having slicing built in would help. I do not need ThreadJobs or any other multi-threading or multi-processing in Pester, but supporting splitting tests across multiple agents by taking a count and an index would be really helpful.
And of course, if we had that, I could parallelize in-process trivially:
```powershell
1..$count | foreach -parallel { Invoke-Pester -TotalSlices $count -SliceIndex $_ }
```
But more importantly, on big projects, I could spin up a dozen agents, like VSTest.
An anecdote:
In the biggest build I have, we tried splitting up the tests several different ways (tests, test files, etc.) and ended up tagging dependencies, splitting the test files based on the modules they need to import to run. Although our test suite originally took well over an hour, I was able to get it down to under 10 minutes (even on Windows PowerShell) without turning off code coverage. But so much of the test time was installing and importing dependencies, and then importing the modules under test and setting breakpoints for code coverage, that splitting the tests up further just didn't pay off, and splitting it up any other way could actually increase the run time.
The moral of the story is: sometimes, you're going to need to roll your own slicing anyway.
> And of course, if we had that, I could parallelize in-process trivially:
In that case TotalSlices should still not randomly choose tests, but prioritize those from the same file, in order to avoid having a single test per file per slice.
It would also require an option to join sliced results into a single test result, perhaps:
```powershell
$results = 1..$count | foreach -parallel { Invoke-Pester -TotalSlices $count -SliceIndex $_ }
$result = Join-PesterResults $results
```
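To make that concrete, a hypothetical sketch of what such a join might do; neither -TotalSlices/-SliceIndex nor Join-PesterResults exist in Pester, and a real merge would need more than summed counts:

```powershell
function Join-PesterResults {
    # Hypothetical: merge counts and containers from several v5 result objects.
    param([object[]]$Results)
    [PSCustomObject]@{
        PassedCount  = ($Results | Measure-Object -Property PassedCount  -Sum).Sum
        FailedCount  = ($Results | Measure-Object -Property FailedCount  -Sum).Sum
        SkippedCount = ($Results | Measure-Object -Property SkippedCount -Sum).Sum
        Containers   = $Results.Containers
    }
}
```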
@majkinetor I have to deduce at least a bit how the library will be used; I don't want to overcomplicate the common scenarios.
The splitting on count seems a bit arbitrary. We would have to do discovery locally, count the tests, then split them into batches. Then send them to the target station. Load Pester. Do discovery there again, to ensure that session states are bound correctly. Then run the tests.
Now if we get this wrong and split a file that has 100 tests into a 90-test and a 10-test batch, we will have to run the setup for that file 2 times, once on each agent. If that setup takes 1 second to finish, we just lost 1 second (of CPU time that could do something more useful) because we ran it twice.
If we split by files this won't happen, or will happen in a more controlled way. You could run discovery, find all files that have more than 1000 tests each, and split those in half. Otherwise parallelize on file.
@Jaykul Did you see this btw? PowerShell/PowerShell#13589. When implemented, it should give us Code Coverage with minimal overhead.
Agreed: I don't think you should split files.
In my opinion, the whole point of having a file with more than one test in it is for it to be a suite of tests. When I write RSpec syntax, I still prefer to write the way @splatteredbits described here, and I still teach others to write that way, so although I may not need the order, I do expect them to run and be reported in the order that they are in the file.
I wouldn't mind having the ability to use foreach -parallel functionality within single files at the Describe level, but personally, I'd want it to be: run BeforeAll once, and then spin out threads to parallelize the Describes. I think it would be hard to implement, and sufficiently complicated to work with that it might only be suitable for tests that were carefully written with that in mind. Not sure I'd use that functionality even if it was available.
> If we split by files this won't happen, or will happen in a more controlled way.
Let's not forget that this also has its own problems. If I have 10 threads and 5 files with 50, 30, 20, 10, and 5 tests respectively, half of the threads will do nothing, and all but the first two will be idle the majority of the time. So I think an added 1s for a repeated BeforeAll would do better in that scenario. Big differences in test counts per file are a usual thing, on my projects at least.
It depends entirely on the overhead, and there's more overhead than you think. Your hypothetical "1s for repeated beforeAll" doesn't take into account the bare minimum for parallelizing tests:
- starting a new runspace
- importing Pester
- importing whatever you're testing
- parsing all the tests to find the ones you want to run
In a CI scenario, where you're splitting onto different agents, there's even more overhead:
- Provisioning the agent
- Downloading the build artifact
- Installing dependencies (like the right version of Pester).
- ...
On top of all that, even the best production code isn't always thread-safe. Test code even less so. From my experience, if you want to run tests in parallel, you're going to need to take that into account when you're writing your tests. Therefore, for Pester it's going to be much easier to just say: if you want to parallelize, make sure each test suite takes roughly the same time to run, because we parallelize at the test suite level.
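A sketch of how one might measure where per-suite time actually goes before deciding how to split, assuming independent *.Tests.ps1 files and Pester v5 (the .\tests path is a placeholder):

```powershell
# Time each file in isolation; the per-file Duration includes the file's own
# setup/teardown, which is exactly the overhead repeated by finer splits.
Get-ChildItem -Path .\tests -Filter *.Tests.ps1 | ForEach-Object {
    $result = Invoke-Pester -Path $_.FullName -PassThru -Output None
    [PSCustomObject]@{ File = $_.Name; Duration = $result.Duration }
} | Sort-Object Duration -Descending
```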
> From my experience, if you want to run tests in parallel, you're going to need to take that into account when you're writing your tests.
Abso-fukin-lutely. It's hard to even come to those kinds of problems, because you first have to solve test dependencies and flakiness. To handle problems with parallelization, one needs to think about it in every test one writes, and it usually also requires adding some functionality to the backend that is used only by tests.
Has anyone spotted weird behaviour when trying to wrap tests with -ThrottleLimit of the ForEach-Object cmdlet?
Take this simple test.ps1 test file:
```powershell
param(
    $parameter
)

Describe 'test' {
    It 'test' {
        $parameter | Should -Be $parameter
    }
}
```
It seems -ThrottleLimit isn't managing the number of allowed concurrent threads, at least with Pester tests. The output seems to suggest it's only executing the first 2 objects in the pipeline, or 3 if -ThrottleLimit has a value of 3, etc.
```
PS C:\Users\acc\Desktop> 1..5 | % -Parallel { .\test.ps1 -parameter $_ } -ThrottleLimit 2
Starting discovery in 1 files.
Starting discovery in 1 files.
Discovery found 1 tests in 22ms.
Running tests.
Discovery found 1 tests in 22ms.
Running tests.
[+] C:\Users\acc\Desktop\test.ps1 87ms (5ms|60ms)
[+] C:\Users\acc\Desktop\test.ps1 88ms (4ms|62ms)
Tests completed in 88ms
Tests Passed: 1, Failed: 0, Skipped: 0 Tests completed in 89ms
NotRun: 0
Tests Passed: 1, Failed: 0, Skipped: 0 NotRun: 0
```
This isn't a problem with -ThrottleLimit itself, as with the below all objects in the pipeline are processed:
```
PS C:\Users\acc\Desktop> 1..5 | % -Parallel { $_ } -ThrottleLimit 2
1
2
3
4
5
```
What could Pester be doing to cause it to abandon the remainder of the pipeline?
Try five different ps1 files, use Invoke-Pester, or use 5.4.0-rc1.
Interactive execution (not calling Invoke-Pester explicitly) has a file check to avoid duplicate runs within 100ms. That logic was replaced in 5.4.0.
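For example, a sketch of the explicit Invoke-Pester route, using a container to pass the script parameter from the repro above (-PassThru and the throttle value are just for illustration):

```powershell
# Calling Invoke-Pester explicitly avoids the interactive duplicate-run check;
# New-PesterContainer forwards $_ to test.ps1's param() block.
1..5 | ForEach-Object -Parallel {
    $container = New-PesterContainer -Path .\test.ps1 -Data @{ parameter = $_ }
    Invoke-Pester -Container $container -PassThru
} -ThrottleLimit 2
```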
@nohwnd @fflaten Regarding running Pester tests in parallel: I have a custom in-house module that I use for infrastructure testing. Some tests take a long time to execute, so I run the Pester tests in parallel using the ThreadJob module.
The problem is that once in a while, and only during the initialization phase of Pester, Pester throws an exception.
You can see the exception happening in the Pester.psm1 module at line 13998:
I'm using Pester 5.5.0 on PowerShell Core 7.3.6, ThreadJob 2.0.3, and a terminal that supports virtual terminal sequences:
```powershell
# In the current session we get $true:
> $host.UI.psobject.Properties['SupportsVirtualTerminal'].Value
$true

# In a runspace we get $false:
> 1..3 | % -Parallel { $host.UI.psobject.Properties['SupportsVirtualTerminal'].Value }
$false
$false
$false
```
I will do some more investigation.
Edit:
I did some debugging, and it looks like sometimes $host.UI.psobject.Properties is $null during runspace initialization, hence the exception. Anyway, I made the following manual change at line 13998 and have had no exception since:
```powershell
elseif (($hostProperties = $host.UI.psobject.Properties) -and ($supportsVT = $hostProperties['SupportsVirtualTerminal']) -and $supportsVT.Value) {
```
Edit 2:
OK, $host.UI.psobject.Properties, when $null, should not throw an error unless strict mode is set (Set-StrictMode)? And I do not set strict mode in my modules. I need to poke some more...
@kborowinski Thanks for the report. Would you mind moving it to a new bug report issue so we can discuss further?
Is $host.UI also null? Would appreciate a sample of how Invoke-Pester is invoked to reproduce it. Expected this test to catch any errors, but clearly it's not enough:
Pester/tst/Pester.RSpec.InNewProcess.ts.ps1 (lines 372 to 393 in 7ca9c81)
@fflaten Will move it to bug reports, and I guess that $host.UI is not null, since my fix would throw an error too.
I'll try to come up with some repro sample, because I cannot publish the in-house module. I hope that's OK.
> I'll try to come up with some repro sample, because I cannot publish the in-house module. I hope that's OK.

No problem. Mostly wondering what the Start-ThreadJob call looks like: Invoke-Pester in InitializationScript vs. ScriptBlock (I assume the latter), StreamingHost or not, etc.
Ah, OK, that I can provide:
- Start-ThreadJob initialization:
```powershell
$splatThreadJobParameters = @{
    ThrottleLimit        = [Environment]::ProcessorCount
    StreamingHost        = $Host
    ErrorAction          = 'Stop'
    InitializationScript = {
        $modulesToImport = @(
            'Pester',
            'Storage',
            'SQLServer',
            'KBSystemFunctions',
            'KBSystemManagement',
            'KBServiceManagement',
            'KBDomainComputerManagement',
            'KBInfrastructureTesting'
        )
        Import-Module -Name $modulesToImport -Force -ErrorAction Stop
    }
}
```
- The main infrastructure monitoring loop:
```powershell
$threadJobs = @($Tag).ForEach{
    Start-ThreadJob -Name $_ -ScriptBlock {
        $WarningPreference = 'SilentlyContinue'
        $splatTestParameters = @{
            Tag             = $Using:_
            Credential      = $Using:Credential
            PassThru        = $true
            OutputVerbosity = 'None'
            ErrorAction     = 'Ignore'
        }
        [PSCustomObject]@{
            Test   = $Using:_
            Result = Test-KBInfrastructure @splatTestParameters
        }
    } @splatThreadJobParameters
}
```
- And this is the main body of the Test-KBInfrastructure function that is called in Start-ThreadJob:
```powershell
# Excerpt; assumes the module imports System.Collections.Concurrent and
# System.Collections.Generic via 'using namespace' for the types below.
[CmdletBinding()]
param(
    [Management.Automation.PSCredential]
    [Management.Automation.CredentialAttribute()]
    [ValidateNotNullOrEmpty()]
    $Credential = [Management.Automation.PSCredential]::Empty,

    [ValidateScript({ Assert-KBInfrastructureTestTag -Tag $_ })]
    [String[]]$Tag,

    [ValidateSet('None', 'Normal', 'Detailed', 'Diagnostic')]
    [String]$OutputVerbosity = 'Detailed',

    [ValidateSet('None', 'FirstLine', 'Filtered', 'Full')]
    [String]$StackTraceVerbosity = 'None',

    [Switch]$PassThru
)
try {
    $saveProgressPreference = $Global:ProgressPreference
    $Global:ProgressPreference = 'SilentlyContinue'
    if ($Credential -eq [Management.Automation.PSCredential]::Empty) {
        $Credential = try { Get-KBCredentialFromFile -Verbose:($PSBoundParameters['Verbose'] -eq $true) -ErrorAction Stop } catch { Get-Credential }
    }
    if ($Tag) {
        $testFilterRegex = [Regex]('(?i)({0})\.Tests\.ps1$' -f [String]::Join('|', ($uniqueTags = [HashSet[String]]::new($Tag))))
        $tagFilteredTests = @($Script:Tests).Where{ $testFilterRegex.IsMatch($_) }
    } else {
        $uniqueTags = $Script:Tags
        $tagFilteredTests = $Script:Tests
    }
    # Concurrent dictionary that allows exchanging data between Discovery and Run phases
    $Data = [ConcurrentDictionary[String, [ConcurrentDictionary[String, Object]]]]::new()
    $uniqueTags.ForEach{ $Data[$_] = [ConcurrentDictionary[String, Object]]::new() }
    $pesterConfiguration = New-PesterConfiguration
    $pesterConfiguration.Run.Container = New-PesterContainer -Path $tagFilteredTests -Data @{ Credential = $Credential; Data = $Data }
    $pesterConfiguration.Run.SkipRemainingOnFailure = 'Block'
    $pesterConfiguration.Output.Verbosity = $OutputVerbosity
    $pesterConfiguration.Output.StackTraceVerbosity = $StackTraceVerbosity
    if ($Tag) { $pesterConfiguration.Filter.Tag = $Tag }
    if ($PassThru) { $pesterConfiguration.Run.PassThru = $true }
    Invoke-Pester -Configuration $pesterConfiguration
} catch {
    Write-Error -ErrorRecord $_ -ErrorAction $ErrorActionPreference
} finally {
    $Global:ProgressPreference = $saveProgressPreference
}
```
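Not shown above is how the jobs are collected; presumably something like this (my assumption, not part of the module):

```powershell
# Wait for all thread jobs, collect the PSCustomObjects they emit,
# and clean the jobs up.
$results = $threadJobs | Receive-Job -Wait -AutoRemoveJob
$results | Format-Table Test, @{ Name = 'Failed'; Expression = { $_.Result.FailedCount } }
```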
@fflaten I created the issue and provided a minimal set of code to repro the issue.