shadow-maint/shadow

New system test framework proposal

ikerexxe opened this issue · 15 comments

Introduction

Currently, shadow has a very limited system test suite written in bash that helps us sanity-check new code. The framework is difficult to use, and it obliges the person writing the test to ensure that the system under test is left in the same state at the end of the test. That’s dangerous, and it requires knowledge that many people don’t have. In addition, the test suite only runs on Ubuntu. All of this makes test development hard work, when it should be an easy task.

Thus, I’d like to propose the usage of a new testing framework. One that keeps us up-to-date with the latest developments in this area.

Requirements

I have identified a series of requirements that the new framework must fulfill:

  • It must work on all the distributions that are part of our CI (i.e. Alpine, Debian and Fedora).
  • The system under test can’t be the one where the tests are run. We can use VMs or containers to deploy the system under test.
  • We must be able to run it locally.
  • The tests must have a proper setup and teardown, where the environment is cleaned up and left as it was before the test ran. This includes test cases that have failed, as we don’t want to pollute the environment (see the sketch after this list).
  • Automatically collect logs and artifacts when a test is finished.
  • Build a rich, high-level test API to access common functionality (e.g. adding a user, modifying it, checking the logs, etc.)
  • The APIs must be documented, so that test developers can check and understand their behaviour.
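
To give an idea of how the setup/teardown requirement could be met, here is a minimal sketch of a pytest fixture that backs up the shadow databases before each test and restores them afterwards, even when the test fails. The fixture name and the file list are my own illustration, not part of any existing framework; in the actual proposal this logic would live inside the framework rather than in individual tests.

# Illustrative sketch only: back up and restore the shadow databases around
# each test. In the proposed framework this would be provided by the plugin.
import shutil

import pytest

SHADOW_FILES = ["/etc/passwd", "/etc/group", "/etc/shadow", "/etc/gshadow"]

@pytest.fixture(autouse=True)
def restore_shadow_files(tmp_path):
    # Setup: copy the current databases aside.
    backups = {}
    for path in SHADOW_FILES:
        backup = tmp_path / path.replace("/", "_")
        shutil.copy2(path, backup)
        backups[path] = backup
    yield
    # Teardown: runs even if the test failed, so the environment is not polluted.
    for path, backup in backups.items():
        shutil.copy2(backup, path)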

Proposal

The proposal would be to use Python, pytest and at least two plugins. The first plugin is pytest-mh, a pytest plugin to run shell commands and scripts over SSH on remote hosts. The second plugin will be new, implemented to meet our needs: it will be in charge of handling the common functionality (e.g. useradd, chage…) that is required for this project, similar to sssd-test-framework but for shadow. These plugins are written in Python, so the tests will be written in this language and we will depend on it. I plan to use a Python virtual environment to avoid having to install all these dependencies directly on the test runner (a.k.a. our systems).

In addition, I’d like to add other Python plugins to make test engineers’ lives easier. An example would be pytest-ticket, which adds the ability to filter test cases by an associated ticket in the tracker of your choice. Another example, pytest-tier, adds the ability to filter test cases by their associated tier.
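
To make that concrete, marking tests could look roughly like the sketch below. The exact marker arguments and the command-line options used for filtering depend on how these plugins are configured, so treat this purely as an illustration of the idea rather than the final syntax.

import pytest

# Illustration only: associate a test case with a tracker ticket and a tier.
# The marker keywords shown here are assumptions about the plugin configuration.
@pytest.mark.ticket(gh=1234)
@pytest.mark.tier(1)
def test_example(client):
    ...

With something like this in place, running only tier 1 tests, or only the tests attached to a given ticket, becomes a matter of passing the corresponding filter option to pytest.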

Dependencies

The following dependencies will be incorporated into this project:

  • Python and its virtual environment
  • pytest
  • pytest-mh
  • shadow-test-framework, a new pytest plugin implemented specifically to meet our needs

The following are nice-to-have dependencies:

  • pytest-ticket
  • pytest-tier or pytest-importance
  • pytest-output

Examples

To give a good idea of how it works, I will take two tests that are already implemented in the old framework and port them to the one proposed in this issue. The first test case is simple: it adds a new user:

@pytest.mark.topology(KnownTopology.Client)
def test_useradd__add_user(client: Client):
    username = "tuser"
    client.local.user(username).add()

    # The following 3 lines weren't in the original test but I think they are useful
    result = client.tools.id(username)
    assert result is not None, f"Username {username} was not found using id"
    assert result.user.name == username, f"Username {result.user.name} is incorrect, {username} expected"

    for file in ["/etc/passwd", "/etc/group", "/etc/shadow", "/etc/gshadow"]:
        log = client.fs.read(file)
        assert username in log, f"Username {username} was not found in {file}"

    assert client.fs.exists(f"/home/{username}"), f"Home for username {username} was not found"

Note: I’m using the API as it is provided by sssd-test-framework; this will change in our environment, as I plan to use more direct APIs for shadow users.

For the second example, I took a more complicated test case: one that checks that no user is created when gshadow is locked.

@pytest.mark.topology(KnownTopology.Client)
def test_useradd__locked_gshadow(client: Client):
    client.fs.touch("/etc/gshadow.lock")

    username = "tuser"
    try:
        client.local.user(username).add()
    except Exception as e:
        assert 10 == e.rc
        assert "useradd: existing lock file /etc/gshadow.lock without a PID" in e.stderr
        assert "useradd: cannot lock /etc/gshadow; try again later." in e.stderr

    # The following 2 lines weren't in the original test but I think they are useful
    result = client.tools.id(username)
    assert result is None, f"Username {username} was found using id"

    for file in ["/etc/passwd", "/etc/group", "/etc/shadow", "/etc/gshadow"]:
        log = client.fs.read(file)
        assert username not in log, f"Username {username} was found in {file}"

As you can see, the code shrinks considerably because a large part of the complexity is handled by the framework itself. In particular, it is not necessary to clean up the environment after the test. In addition, the test cases are easier to understand, and more can be done (e.g. running `id` and asserting on its result) without much complication.

Conclusion

This is just a proposal and I would like to hear your opinion, so please take your time to read it and point out weaknesses, additional requirements, improvements, or even another framework that better suits our needs. I am open to anything, as long as it improves our current situation and meets the requirements I have explained above.

Additional references

pytest-mh documentation

sssd test framework documentation

hallyn commented

Thanks for this. I have two questions:

  1. You mention running commands over ssh. Do you have something in mind for spinning up the hosts? (If not, might I recommend that we consider using incus (formerly lxd) vms?)
  2. In your examples, you are using python code to run the shadow commands to test, like trying to add a user. A lot of what the many tests do is test various command options and flags in great detail (seeming quite shell dependent). Can you give an example of how that would look?
1. You mention running commands over ssh. Do you have something in mind for spinning up the hosts? (If not, might I recommend that we consider using incus (formerly lxd) vms?)

If you mean our CI (and the local environment), my idea was to use Docker and its Dockerfiles, as we are currently doing. However, we can use incus, but I'll need your help to set it all up, as I've never used it.

Is it possible to define a recipe in a file so that incus processes it?

2. In your examples, you are using python code to run the shadow commands to test, like trying to add a user. A lot of what the many tests do is test various command options and flags in great detail (seeming quite shell dependent). Can you give an example of how that would look?

As an example, if you'd like to set the password alongside the user creation, you can do it in the following way:

client.local.user("user1").add(password="Secret123")

Or the uid and password:

client.local.user("user1").add(password="Secret123", uid=5000)

As you can see, you can add optional parameters by naming them when you call add(). The underlying implementation for SSSD is located at https://github.com/SSSD/sssd-test-framework/blob/master/sssd_test_framework/utils/local_users.py#L130, in case you'd like to take a look.

In our case, as we'll have to reimplement this part of the framework, we should provide all options and flags from the very beginning.
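
As a rough sketch of what "all options and flags from the very beginning" could look like, the add() method could expose one keyword argument per useradd flag and translate them into the real command line. The class and method names below are hypothetical; only the underlying useradd flags are real.

# Hypothetical sketch: one keyword argument per useradd flag.
class LocalUser:
    def __init__(self, client, name):
        self.client = client
        self.name = name

    def add(self, *, uid=None, gid=None, home=None, shell=None, password=None,
            system=False, create_home=True):
        args = ["useradd"]
        if uid is not None:
            args += ["-u", str(uid)]
        if gid is not None:
            args += ["-g", str(gid)]
        if home is not None:
            args += ["-d", home]
        if shell is not None:
            args += ["-s", shell]
        if password is not None:
            args += ["-p", password]  # note: useradd -p expects an already encrypted password
        if system:
            args += ["-r"]
        if create_home:
            args += ["-m"]
        # The remaining useradd flags would follow the same pattern.
        args.append(self.name)
        return self.client.run(args)  # however the framework executes remote commands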

hallyn commented

Most of the shadow functionality is in binary programs. So I think it's important to test them that way. I worry that encoding the tests as something like 'client.local.user("user1").add(password="Secret123")' will paper over bugs in argument parsing.

I think the most useful thing would be to implement a testing framework that "just works" in more places. A docker command is probably an easy and familiar way to go about it. I'd still prefer incus (using either containers or vms) for several reasons. We might want to consider using vms to avoid restrictions in our testing of full subuid ranges.

Is it possible to define a recipe in a file so that incus processes it?

I've always just done 'incus exec -- ', but you can use https://linuxcontainers.org/incus/docs/main/cloud-init/ to automatically run setup commands. You can bind mount host directories (so, with your shadow.git tree) to install from, and it'll use virtiofs if you're using incus vms instead of containers.

Most of the shadow functionality is in binary programs. So I think it's important to test them that way. I worry that encoding the tests as something like 'client.local.user("user1").add(password="Secret123")' will paper over bugs in argument parsing.

So, you'd prefer to use the short version? Something like ...add(p="Secret123")? When adding a user, the only required argument is the user's name, so any other arguments will need to be specifically set.

By the way, take into account that CLIBuilderArgs will handle the argument building in Python. We just need to tell this object how to do it.

I think the most useful thing would be to implement a testing framework that "just works" in more places.

What would that be? Do you know of a framework that could help us?

I've always just done 'incus exec -- ', but you can use https://linuxcontainers.org/incus/docs/main/cloud-init/ to automatically run setup commands. You can bind mount host directories (so, with your shadow.git tree) to install from, and it'll use virtiofs if you're using incus vms instead of containers.

OK, I see they use a YAML format to provide a recipe for the most typical actions (e.g. package installation, upgrades) and a way to run commands.

hallyn commented

Most of the shadow functionality is in binary programs. So I think it's important to test them that way. I worry that encoding the tests as something like 'client.local.user("user1").add(password="Secret123")' will paper over bugs in argument parsing.

So, you'd prefer to use the short version? Something like ...add(p="Secret123")? When adding a user, the only required argument is the user's name, so any other arguments will need to be specifically set.

Well, I was thinking that I'd prefer the tests to actually, explicitly use the shell. Or at least expect spawn(cmd[0], cmd[1:]...). But I am interested in seeing what your proposal ends up looking like.

By the way, take into account that CLIBuilderArgs will handle the argument building in Python. We just need to tell this object how to do it.

I think the most useful thing would be to implement a testing framework that "just works" in more places.

What would that be? Do you know of a framework that could help us?

What's there is IMO actually not as bad as you think it is :) It's actually quite nice - it specifies the shadow files to use as a starting point, specifies the command(s) to run, and then the checks to do to ensure the command did the right thing. The problem with it is simply that it is invasive/destructive. Therefore, a thin custom-built layer around that which creates a new instance of a golden snapshot, then does the above steps (copies in the beginning files, runs the commands, checks the results) and then destroys the snapshot instance, would be perfect. That's why I believe incus vms would be perfect. We build or download the golden image as a pre-run step. Incus should be trivially installable on all distros and trivial to set up locally.

I'm sorry, I should've recommended this before. Perhaps rather than waste more of your time, I should take a stab at an implementation.

Or maybe we should do it in parallel, and see where we end up.

I see where you are going. Your proposal involves using incus to set up an ephemeral testing environment, run a test case, destroy the test environment, then jump to the next test case and go through all the steps again. Am I right?
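
If so, each test would roughly follow a launch/exec/destroy cycle. A minimal sketch of that loop, driving the incus CLI from Python, could look like the following; the image alias and the commands are placeholders, not anything that exists yet.

# Rough sketch of an ephemeral per-test cycle driven through the incus CLI.
# The image alias "shadow-golden" and the commands are placeholders.
import subprocess

def run_in_ephemeral_instance(commands, image="shadow-golden", name="shadow-test"):
    subprocess.run(["incus", "launch", image, name, "--ephemeral"], check=True)
    try:
        for cmd in commands:
            subprocess.run(["incus", "exec", name, "--", *cmd], check=True)
    finally:
        # Ephemeral instances are removed when stopped, so stopping here also
        # cleans up after a failed test.
        subprocess.run(["incus", "stop", name], check=True)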

I'm worried about log and artifact collection. How would you do this? Also, would we be able to remove all (or at least most) of the echo lines that plague the existing test cases? They make the tests difficult to read.

Or maybe we should do it in parallel, and see where we end up.

I would like to avoid spending time implementing the framework if we are not going to use it. It will take some time to get the basics ready, and the difference with the current sssd offering is minimal.

However, I'm up for answering any questions, and even porting some of our current test cases to this framework to demonstrate how it works and to try to reach a solution that the two of us like.

hallyn commented

I think what my gut feel has been trying to say to me is: we have lots of command line flags which interoperate in subtle ways, and they all need to be tested. If all of our command invocations are through an API (client.local.user(username).add()) then every flag would need an API option, and there's the added complexity of ordering.

Yes, all options need to be added manually. That's a disadvantage of the framework that I'm proposing. Still, do we add new options frequently? If we don't, then the drawback isn't that big.

CLIBuilderArgs uses a dictionary to hold the argument data, and this container is insertion-ordered (guaranteed since Python 3.7, and already the case in 3.6 as an implementation detail). So, if we want to have the arguments in a given order, we just need to create the dictionary in that order.
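
A tiny illustration of that ordering point (this is not the real CLIBuilderArgs, just the idea): iterating over the dictionary yields the flags in the order they were inserted, so the resulting command line preserves it.

# Illustration only: dictionary insertion order becomes argument order.
def build_argv(command, options):
    argv = [command]
    for flag, value in options.items():  # insertion order is preserved
        argv.append(flag)
        if value is not None:
            argv.append(str(value))
    return argv

argv = build_argv("useradd", {"-m": None, "-u": 5000, "-s": "/bin/bash"}) + ["tuser"]
# -> ["useradd", "-m", "-u", "5000", "-s", "/bin/bash", "tuser"]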

I understand your reluctance to adopt this framework, because it hides a lot of complexity in itself, including hardcoding certain things, and does not let you change them easily. As I said, I'm open to other ideas; I only want to improve our current situation.

Another possibility occurred to me while I was on vacation. Instead of creating classes that model the objects being handled (i.e. users, groups, passwords), we can provide the underlying CLI with a simple Python wrapper. Example:

client.useradd("user1")

or

client.useradd("user1", password="Secret123", uid=5000)

It might even be possible to eliminate the client part altogether. WDYT?

pbrezina commented

Hi, I am the author of pytest-mh and sssd-test-framework, so I'd like to provide my two cents...

First, I don't know how your current tests look, but pytest-mh would definitely work nicely for this project. I agree with @hallyn that you should have more direct and flexible control over the commands you run, since you are testing a command line tool and not a daemon.

The main purpose of wrapping stuff into a nice framework API is to allow proper backup/restore/cleanup functionality (if we add a user to LDAP, the user must be deleted when the test is finished). The other purpose is readability, documentation and ease of use. In the case of shadow-utils, you can just back up and restore selected files (like /etc/passwd and /etc/group), so you do not need to track precise changes. On the other hand, you need the flexibility, because you are testing each command line argument.

What you could do is run client.host.ssh.exec(["useradd", ...]) directly, but for better readability I suggest adding a wrapper around this, e.g. client.useradd(*args), where args is the command line. This will run the command over SSH in bash (bash -c), so you are not losing the shell; it is also possible to specify a different shell if desired.
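
A minimal sketch of such a wrapper, assuming the exec-over-SSH call mentioned above (the class layout and return handling are assumptions on my side):

# Minimal sketch of a readability wrapper around a raw remote command.
class Client:
    def __init__(self, host):
        self.host = host

    def useradd(self, *args):
        # Pass the command line through unchanged, so the argument parsing of
        # the real useradd binary is what gets exercised by the test.
        return self.host.ssh.exec(["useradd", *args])

# Usage: client.useradd("-u", "5000", "-m", "tuser")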

pytest-mh is a generic plugin that gives you the core functionality and building blocks required for testing. sssd-test-framework is for the SSSD project. You can use it as inspiration, perhaps even include some bits from it, but you definitely should not follow it 1:1. You need to come up with what works for your project.

Thanks @pbrezina and @ikerexxe, that sounds good.

zeha commented

The system under test can’t be the one where the tests are run. We can use VMs or containers to deploy the system under test.

I'll note that Debian's test infrastructure is often already running inside some form of virtualization, and the entire system is thrown away after the tests. It'd be good if tests could still be run in such an environment.