should test files really be run once-per-@test? intentional? would patch be possible/acceptable?

Question

jzacsh opened this issue 8 years ago · 3 comments

tl;dr: can I change bats, per the note below[1]?

I was a bit confused that some failure-checking I placed atop a .bats file was executing once-per @test, as I expected that was solely the job of setup and teardown functions

but then I saw pull #23 and this helpful bit in the wiki:

Then, each test file is executed n+1 times, where n is the number of test cases in the file.

Is this intentional? If so, is there any functional difference between setup/teardown and all other top-level code (aside from teardown being run after tests)? If not intentional, and I can figure out how to fix it[1], would such a pull request be welcome?

[1]: "Fix it" I imagine means:

setup and teardown run on every @test
all other top-level code (just like @tests themselves) run once-per-file

Answer 1 · 2016-07-15T18:08:53.000Z

also, thanks for bats :)

Answer 2 · 2016-09-25T23:33:23.000Z

I'm not the author of Bats. I have a pretty good grip on its internals, but my explanation may be incomplete.

Why source multiple times?

Yes. This is intentional. This is done so that every test is run in a clean environment independent of each other. Bash has very limited scoping and running each test in a separate process is the only way of avoiding variables leaking into the environment of subsequent tests. This, of course, means that each test process has to source the test file.
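
The leak described above can be seen in plain Bash. This is a minimal sketch, not Bats code: `first_case` and `second_case` are hypothetical stand-ins for two test cases, and a subshell stands in for the separate process that each test gets.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for two test cases; not Bats code.
first_case()  { leaked="set by first_case"; }
second_case() { echo "${leaked:-unset}"; }

# Same process: the assignment made by the first case is visible to the second.
first_case
same_process=$(second_case)

# Separate processes: the first case runs in its own subshell,
# so its assignment is gone by the time the second case runs.
unset leaked
( first_case )
separate_processes=$(second_case)

echo "same process:       $same_process"
echo "separate processes: $separate_processes"
```

The first line reports `set by first_case` and the second reports `unset`, which is the isolation the per-test process is meant to provide.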

Global scope vs `setup` and `teardown`

You are right that setup, teardown and top-level code (code in global scope) are very similar. Top-level code is run every time the test file is sourced, including the first time, when only the number of tests is counted, while setup and teardown are run only when a test is being executed. This has subtle but important ramifications.

First, setup and teardown have access to certain global variables that top-level code does not, because those variables are not yet set when top-level code runs. The following table summarises which variable is accessible where.

variable	top-level	`setup` and `@test`	`teardown`
$BATS_TEST_NAMES	✗	✓	✓
$BATS_TEST_NAME	✗	✓	✓
$BATS_TEST_DESCRIPTION	✗	✓	✓
$BATS_TEST_NUMBER	✗	✓	✓

The following variables are used internally and are not part of the public interface. However, they can be useful in writing test helpers.

variable	top-level	`setup` and `@test`	`teardown`
$BATS_TEST_COMPLETED	✗	✗	✓
$BATS_TEST_SKIPPED	✗	✗	✓
$BATS_ERROR_STATUS	✗	✗	✓
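
As an illustration of the kind of test helper these internal variables allow, here is a minimal sketch of a teardown that prints a log file only when the test did not complete. `dump_log_on_failure` and `$APP_LOG` are hypothetical names, and the sketch assumes that `$BATS_TEST_COMPLETED` is empty in teardown when the test body failed and non-empty otherwise. Since the variable is not part of the public interface, that behaviour may differ between Bats versions.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of Bats: print a log file when the test failed.
# Assumption: $BATS_TEST_COMPLETED is empty in teardown if the test body failed.
dump_log_on_failure() {
  local log_file=$1
  if [[ -z "${BATS_TEST_COMPLETED:-}" && -r "$log_file" ]]; then
    echo "--- $log_file ---"
    cat "$log_file"
  fi
}

teardown() {
  # $APP_LOG is a hypothetical path to the log of the program under test.
  dump_log_on_failure "${APP_LOG:-/dev/null}"
}
```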

Second, code in global scope is run n+1 times, where n is the number of tests. If the code does something time-consuming, e.g. a large download or database setup, that extra run can become a significant overhead.
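
The n+1 count can be reproduced with a small plain-Bash simulation. This only mimics the sourcing pattern described above and is not how Bats is implemented. The file `example.sh`, the `case_*` functions and the `$LOG` variable are all made up for the illustration.

```bash
#!/usr/bin/env bash
# Simulation only: mimics the sourcing pattern described above, not Bats internals.
workdir=$(mktemp -d)

# A stand-in "test file": top-level code, a setup function and three cases.
cat > "$workdir/example.sh" <<'EOF'
echo "top-level" >> "$LOG"          # code in global scope
setup() { echo "setup" >> "$LOG"; }
case_a() { :; }
case_b() { :; }
case_c() { :; }
EOF

cases=(case_a case_b case_c)
export LOG="$workdir/run.log"

# Counting pass: the file is sourced, but neither setup nor any case is run.
( source "$workdir/example.sh" )

# Execution passes: one fresh subshell per case, each sourcing the file again.
for c in "${cases[@]}"; do
  ( source "$workdir/example.sh"; setup; "$c" )
done

top_level_runs=$(grep -c '^top-level$' "$LOG")
setup_runs=$(grep -c '^setup$' "$LOG")
echo "cases: ${#cases[@]}, top-level runs: $top_level_runs, setup runs: $setup_runs"
rm -rf "$workdir"
```

With three cases, the top-level line is logged four times and setup three times, so anything expensive placed at the top level is paid for once more than the same work placed in setup.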

My opinion

Load helpers in global scope. Loading all helpers in global scope at the top of the file improves readability, and sourcing them an extra time incurs virtually no overhead.
Initialise the test environment in setup. Setup and cleanup map naturally to setup and teardown, respectively. It's intuitive, clean, and ensures that they are only done when necessary, i.e. no wasteful operations during the first sourcing (test case counting) of the test file.

Answer 3 · 2016-11-14T15:16:45.000Z

I understand why test files are sourced on every run, but the user experience is confusing. It seems to me that most of the confusion here stems from the inability to do once-per-suite setup and teardown. I've had some success in using $BATS_TEST_NUMBER and $BATS_TEST_NAMES to determine when we're running the first and last tests. Here are a couple of examples:

Like @ztombol, I agree there's immense value in having a clean environment across runs, but there are times when you do need state to spill over. The file linked to above is testing a long running interactive process. Sure, it can be started for every test, but the fact is that having the tests share this process is beneficial for a few reasons:

It makes tests after the first quicker
Since it's a long running interactive process, we want to catch issues stemming from state pollution
We can more easily track what happens during the test if we have only one log for the process
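
The first-and-last-test check mentioned above could look roughly like this. It is a minimal sketch rather than the code from the linked files: `is_first_test` and `is_last_test` are hypothetical helper names, the `echo` lines are placeholders, and it assumes that `$BATS_TEST_NAMES` is an array holding every test in the file and that all of them run, in order.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; Bats itself does not provide them.
# Assumption: $BATS_TEST_NAMES is an array of all tests in the file.
is_first_test() { [[ "$BATS_TEST_NUMBER" -eq 1 ]]; }
is_last_test()  { [[ "$BATS_TEST_NUMBER" -eq "${#BATS_TEST_NAMES[@]}" ]]; }

setup() {
  if is_first_test; then
    echo "start shared process"   # placeholder for once-per-file setup
  fi
}

teardown() {
  if is_last_test; then
    echo "stop shared process"    # placeholder for once-per-file cleanup
  fi
}
```

Because every test runs in its own process, whatever the first test starts has to live outside the shell, for example as a background process or a file on disk. A shell variable set in the first test's setup will not be visible to later tests.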

I think the conditions in those setup and teardown functions could easily be extracted to allow two new suite-wide functions to be created, perhaps setup_suite and teardown_suite? It'd look something like this:

setup_suite() {
  # Runs *once* per file
  # The first function to be called, before setup() and before the first test
}

teardown_suite() {
  # Runs *once* per file
  # The last function to be called, after teardown() and after the last test
}

setup() {
  # Called *once* before each test, as usual
}

teardown() {
  # Called *once* after each test, as usual
}

@test "some thing" {
  # Run as usual
}

Happy to contribute a PR to this, if #150 can be resolved.
