hknd23/idcempy

Testing: tests for correctness, coverage


The tests currently included check that some of the main functions can be called and that the code executes without error. Tests should be expanded to:

  1. Test that all of the functions execute without error.
  2. Add tests to verify that the functions work correctly, not just without error -- for example, can you assert what the coefficients should be when you fit a particular model? Or at least how many coefficients there should be? There are multiple ways you may get at whether the output is correct.

Tests for the correctness of the results are needed so that you know if future changes to your code change the output in any way. As this could have a significant impact on users (it could change reported research results), this is important.

If there are published examples of what the model should produce for various inputs (for example, in the papers you cite), those would be good cases to use for testing -- verify that your code produces the same results as the published algorithms.

Thank you very much for this important point about testing. We have incorporated more thorough tests for the functions in the package, implemented with unittest (from Python’s standard library) to address this excellent suggestion. The tests live in the following files in our package: “ziopc_unit_test.py,” “miopc_unit_test.py,” and “gimnl_unit_test.py”. These allow users to verify, via the unittest library, whether or not the main functions in the package execute without error: “ziopc_unit_test.py” evaluates the functions in IDCeMPy that fit the Zero-Inflated Ordered Probit models, “miopc_unit_test.py” the functions that fit the Middle-Inflated Ordered Probit models, and “gimnl_unit_test.py” the functions that fit the Generalized-Inflated MNL model in the gimnl module.
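For illustration, a minimal sketch of this kind of smoke test is shown below. Note that `fit_model_stub` is a hypothetical stand-in used only to make the example self-contained; the actual test files call the fitting functions in the package's zmiopc and gimnl modules.

```python
import unittest

def fit_model_stub(y, X):
    """Hypothetical stand-in for an IDCeMPy model-fitting function."""
    class Result:
        # One slope per covariate plus an intercept, initialized to zero.
        coefs = [0.0] * (len(X[0]) + 1)
    return Result()

class TestModelRuns(unittest.TestCase):
    """Smoke test: the function can be called and returns a result."""

    def test_executes_without_error(self):
        # The test fails if fitting raises any exception.
        result = fit_model_stub(y=[0, 1, 2], X=[[1.0], [0.5], [2.0]])
        self.assertIsNotNone(result)
        # We can also assert how many coefficients there should be.
        self.assertEqual(len(result.coefs), 2)

if __name__ == "__main__":
    unittest.main()
```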

Beyond ensuring that the functions run without error, the included tests also allow users to check whether the output of the main functions replicates or diverges from known, published output. As an empirical exercise, we selected coefficients obtained from estimating the inflated discrete choice models [e.g., MiOP(C) and GiMNL] with our code in IDCeMPy and used these tests to “assert” whether these coefficients match those reported by (i) Bagozzi and Mukherjee (2012) for the MiOP(C) models, whose data we employ for these models in IDCeMPy, and (ii) Bagozzi and Marchetti (2017) for the GiMNL model, whose data we use as a replication exercise. The coefficients we obtain from estimating these models with our IDCeMPy package match the respective output published by the aforementioned authors. Finally, the included tests permit researchers to assert what the values of the likelihood function and Vuong tests (for the models in the zmiopc module) should be. These tests are integrated into the GitHub repository’s Python package workflow: every time there is a new push to the repository, the tests run with Python 3.7, 3.8, and 3.9.
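The coefficient comparisons use unittest's standard assertAlmostEqual; the sketch below shows the pattern with illustrative placeholder numbers rather than the actual published estimates.

```python
import unittest

class TestPublishedCoefficients(unittest.TestCase):
    """Compare fitted coefficients against published reference values."""

    def test_coefficient_matches_reference(self):
        # Placeholder values: in the real tests, `fitted` comes from the
        # IDCeMPy model object and `reference` from the published tables.
        fitted = 0.4213
        reference = 0.42
        # Passes when round(fitted - reference, 1) == 0, i.e. the two
        # values agree to roughly one decimal place.
        self.assertAlmostEqual(fitted, reference, places=1)

if __name__ == "__main__":
    unittest.main()
```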

Furthermore, following your advice on asserting what the coefficients should be, we first compared the results from the functions in our IDCeMPy package that fit the Middle-Inflated Ordered Probit models without and with correlated errors [the MiOP(C) models] to the MiOP(C) results published by Bagozzi and Mukherjee (2012) in Political Analysis. The data on support for European Union membership that we employ to assess the MiOP(C) models in IDCeMPy is drawn entirely from the data those authors use in their published paper. The coefficient for each covariate in the MiOP(C) models estimated with IDCeMPy matches those reported by Bagozzi and Mukherjee (2012). We also compared the estimates from the Generalized-Inflated MNL (GiMNL) model in IDCeMPy to the results from the same model by, and using the same Presidential vote choice data as, Bagozzi and Marchetti (2017) in their published Political Science Research and Methods paper. The GiMNL model results generated by IDCeMPy mirror those reported by Bagozzi and Marchetti (2017). This exercise allowed us to verify that our code produces the same results as the published algorithms. The ZiOP(C) model results from IDCeMPy in the documentation's example (using the CDC’s 2018 National Youth Tobacco Survey dataset) are unique, as this set of models has not, to our knowledge, been applied to that dataset before. A brief summary of these comparisons is provided in the IDCeMPy package’s GitHub repository.

Great!

If you set places=0 though, doesn't this round the values to zero decimal places? For larger values, this may be appropriate (just comparing the integer part of the number), but for coefficient values < 1, this doesn't seem to be the right choice. I may be missing something here though.

Move your test files to your test directory.

This is an important question. With places=0, assertAlmostEqual passes whenever the difference between a coefficient and its asserted value rounds to zero at zero decimal places, i.e., for differences of up to about 0.5. It is thus plausible that places=0 is not appropriate for the smaller coefficients obtained (for example) from the data we use in our tests. To ensure that the functions in our package produce results that converge closely to the asserted values, we went over all our tests and made them more restrictive: we now set places=1 for tests comparing coefficients, and places=2 for tests of the likelihood and Vuong test functions. This makes the tests more sensitive and more likely to fail when the functions’ results diverge from the asserted values. Following your request, the test files have been moved to our /test directory.
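To make the rounding behavior concrete, the following standalone example demonstrates the semantics of the `places` argument (this is standard unittest behavior, not IDCeMPy-specific code):

```python
import unittest

class TestPlacesSemantics(unittest.TestCase):
    """Illustrates how assertAlmostEqual's `places` argument behaves."""

    def test_places_zero_is_loose(self):
        # round(0.9 - 0.5, 0) == 0.0, so this passes despite a 0.4 gap --
        # far too loose for coefficients smaller than 1.
        self.assertAlmostEqual(0.9, 0.5, places=0)

    def test_places_one_is_tighter(self):
        # round(0.93 - 0.90, 1) == 0.0, so agreement within ~0.05 passes.
        self.assertAlmostEqual(0.93, 0.90, places=1)
        # A 0.4 gap now fails, raising AssertionError as expected.
        with self.assertRaises(AssertionError):
            self.assertAlmostEqual(0.9, 0.5, places=1)

if __name__ == "__main__":
    unittest.main()
```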