facebook/duckling

[Question] Which elements are relevant for Test-Failures?

Opened this issue · 2 comments

Hello,

I came accross this project as it is mentioned in HaFla and two Ducklings PRs are used there as Datapoints for buggy (and then fixed) programs.

However, I don't fully understand how Duckling runs it's tests / build.
I used GHCUP to set my GHC to 8.2 and cabal to 3.10.1.0, checking out Duckling at Commit ddbb6fd (Issue #196 and PR #199 ).

It touches 3 Files: A Classifier, a Rule and a Test.
I try to reproduce a failing CI by doing the following:

git checkout ddbb6fd
git revert ddbb6fd --no-commit
git restore --staged ./Duckling/Time/DE/Rules.hs
git restore ./Duckling/Time/DE/Rules.hs
cabal exec duckling-regen-exe
cabal test

But it passes.
Instead I have to do:

git checkout ddbb6fd
git revert ddbb6fd --no-commit
git restore --staged ./Duckling/Ranking/Classifiers/DE_XX.hs
git restore ./Duckling/Ranking/Classifiers/DE_XX.hs
cabal test

Which also fails after re-running the duckling-regen-exe.
If I revert both files, I get a passing TestSuite again.

In a similar manner, the other PR used (Commit d051632 , Issue #273) touches a Rule, Corpus, Classifier and Test.
If I only revert the Classifier, the expected Tests fail. If I revert any other element, tests pass.
I would expect that I have to revert the Rule and Regenerate the classifiers for a test-failure.

This behaviour is a bit opaque to me, and any explanation would be greatly appreciated.

❯ ghcup install ghc 8.2

[ Info ] downloading: https://raw.githubusercontent.com/haskell/ghcup-metadata/master/ghcup-0.0.8.yaml as file /Users/kellybrower/.ghcup/cache/ghcup-0.0.8.yaml
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
[ Warn ] Assuming you meant version 8.2.2
[ Error ] [GHCup-00010] Unable to find a download for 8.2.2
[ Error ] Also check the logs in /Users/kellybrower/.ghcup/logs

Classifiers are generated from Rules. it's possible that there's a buggy rule, and a corresponding test failure.