idris-lang/Idris-dev

Regression: type checking failures and unacceptable type checking times

Closed this issue · 33 comments

Type checking SimpleProbBasicOperations.lidr from the https://github.com/nicolabotta/SeqDecProbs/tree/master/frameworks/14- framkework works fine in Idris-dev-769301b:

nicola@cirrus:~/lala/SeqDecProbs-master/frameworks/14-$ time idris +RTS -K32000000 -RTS -p contrib -p effects -V --check SimpleProbBasicOperations.lidr
Type checking ./SimpleProbBasicOperations.lidr

real    0m25.926s
user    0m24.828s
sys 0m1.104s
nicola@cirrus:~/lala/SeqDecProbs-master/frameworks/14-$

With the most recent Idris-dev revision, however, one gets

nicola@cirrus:~/lala/SeqDecProbs-master/frameworks/14-$ time idris +RTS -K32000000 -RTS -p contrib -p effects -V --check SimpleProbBasicOperations.lidr
Type checking ./SimpleProbBasicOperations.lidr
Killed

real    21m14.810s
user    13m31.180s
sys 0m56.064s
nicola@cirrus:~/lala/SeqDecProbs-master/frameworks/14-$

Notice also that, with the current Idris version, it takes about 20 minutes for type checking to fail. In contrast, type checking with the old Idris version succeeds in about 25 seconds.

Thus, as far as type checking capabilities are concerned, a dramatic regression seem to have occurred over the last three weeks!

@nicolabotta Did you try to bisect to find out which commit had this regression?

I do agree that it is problematic that code which took instantly to type check now uses around 20 minutes.

@nicolabotta Did you try to bisect to find out which commit had this regression?

Not yet. I'll try and report

Correction: 769301b was committed on Aug. 26. Thus, the regression has occurred over the last 10 days, not over the last three weeks!

Not yet. I'll try and report

Great, thanks!

Started bisecting: 5b2711f also fails to type check. Thus, the target is between 769301b (Aug. 26) and 5b2711f (Aug. 31).

Further bisecting: 037ea33 fails to install on Debian stable.

Further bisecting: f754d9c succeeds. Thus, the target is between f754d9c (Aug. 27) and 5b2711f (Aug. 31)

Further bisecting: 9997caf fails. Thus, the target is between f754d9c (Aug. 27) and 9997caf (Aug. 28)

Further bisecting: 160eb14 fails. Thus, the target is between f754d9c (Aug. 27) and 160eb14 (Aug. 27)

Further bisecting: ff776da succeeds. Thus, the target is between ff776da and 160eb14 (Aug. 27)

7c1a2c6 fails. Thus, if I have made no mistakes, this commit is responsible for the regression!

@nicolabotta Thank you very much for the bisection! I will take a look

@nicolabotta Could you please check whether you experience the same issues with 7e0c70a? As far as I can see 7c1a2c6 should not influence you directly, but it might be its parent commit which I mentioned.

jfdm commented

Just to let people know, I am too seeing a regression when compiling the DList data type from idris-containers.

@ahmadsalim The problem is that installing 7e0c70a (and 037ea33) fails on my Debian stable system:

nicola@cirrus:~/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems$ make
cabal install --enable-tests
Warning: The package list for 'hackage.haskell.org' is 20 days old.
Run 'cabal update' to get the latest list of available packages.
Resolving dependencies...
In order, the following will be installed:
idris-0.12.2 *test (reinstall) changes: tagged-0.7.1 added, tasty-0.11.0.3
added, tasty-golden-2.3.1 added, tasty-rerun-1.1.6 added
Warning: Note that reinstalls are always dangerous. Continuing anyway...
Configuring idris-0.12.2...
make[1]: Entering directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/rts'
rm -f idris_rts.o idris_heap.o idris_gc.o idris_gmp.o idris_bitstring.o idris_opts.o idris_stats.o idris_utf8.o idris_stdfgn.o mini-gmp.o getline.o idris_net.o libidris_rts.a 
make[1]: Leaving directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/rts'
fatal: Not a git repository (or any of the parent directories): .git
Generating dist/build/autogen/Version_idris.hs for prerelease PRE
Building idris-0.12.2...
Preprocessing library idris-0.12.2...
In-place registering idris-0.12.2...
Preprocessing executable 'idris' for idris-0.12.2...
Preprocessing executable 'idris-codegen-c' for idris-0.12.2...
Preprocessing executable 'idris-codegen-javascript' for idris-0.12.2...
Preprocessing executable 'idris-codegen-node' for idris-0.12.2...
Preprocessing test suite 'regression-and-feature-tests' for idris-0.12.2...
Building libraries...
make[1]: Entering directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs'
make -C prelude build
make[2]: Entering directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs/prelude'
../../dist/build/idris/idris --build prelude.ipkg
make[2]: Leaving directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs/prelude'
make -C base build
make[2]: Entering directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs/base'
../../dist/build/idris/idris --build base.ipkg
Loading /home/nicola/.cabal/share/x86_64-linux-ghc-7.6.3/idris-0.12.2/base/System/Concurrency/Sessions.ibc failed: Incompatible ibc version for: "/home/nicola/.cabal/share/x86_64-linux-ghc-7.6.3/idris-0.12.2/base/System/Concurrency/Sessions.ibc"
This external module was built with a later version of Idris.
Please reinstall: base

Makefile:5: recipe for target 'build' failed
make[2]: *** [build] Error 1
make[2]: Leaving directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs/base'
Makefile:2: recipe for target 'build' failed
make[1]: *** [build] Error 2
make[1]: Leaving directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs'
Failed to install idris-0.12.2
cabal: Error: some packages failed to install:
idris-0.12.2 failed during the building phase. The exception was:
ExitFailure 2
Makefile:18: recipe for target 'install' failed
make: *** [install] Error 1
nicola@cirrus:~/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems$

These commits are also marked with a "All checks have failed" warning.

@nicolabotta Does it still fail if you use make relib?

@ahmadsalim Yes, same error with make relib:

...
Type checking ./Control/Catchable.idr
Type checking ./System/Concurrency/Raw.idr
Loading /home/nicola/.cabal/share/x86_64-linux-ghc-7.6.3/idris-0.12.2/base/System/Concurrency/Sessions.ibc failed: Incompatible ibc version for: "/home/nicola/.cabal/share/x86_64-linux-ghc-7.6.3/idris-0.12.2/base/System/Concurrency/Sessions.ibc"
This external module was built with a later version of Idris.
Please reinstall: base

Makefile:5: recipe for target 'build' failed
make[2]: *** [build] Error 1
make[2]: Leaving directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs/base'
Makefile:2: recipe for target 'build' failed
make[1]: *** [build] Error 2
make[1]: Leaving directory '/home/nicola/src/Idris-dev-7e0c70a09c634771ec9823d99f8d9bf7ece68b82-installation_problems/libs'
Failed to install idris-0.12.2
cabal: Error: some packages failed to install:
idris-0.12.2 failed during the building phase. The exception was:
ExitFailure 2
Makefile:49: recipe for target 'relib' failed
make: *** [relib] Error 1

Anyway, I think that the issue is from this 7e0c70a commit, since it seems to sometimes use the normalized version of types in mkClaims now (where it didn't before), which I would guess might hurt performance (there was even a comment that it would be better if WHNF was used).

Unfortunately, I am not sure how to fix this issue; I guess we need WHNF, but I am not experienced enough with the compiler to know how to implement it I think.
Maybe @edwinb or anyone else from @idris-lang/commiters could provide their opinion?

I had similar error msgs today when I built idris from source. Probably because I mixed the idris from the haskell package and that from the github source.

Unfortunately, I am not sure how to fix this issue; I guess we need WHNF, but I am not experienced enough with the compiler to know how to implement it I think.
Maybe @edwinb or anyone else from @idris-lang/commiters could provide their opinion?

An obvious fix would be to revert to the original implementation of the incriminated function.

But, of course, I do not know what was the intention behind the commit and which advantaged it actually brings to Idris. In my specific case, it is obviously a dealbreaker.

The other question is that of testing. This is the second time in a few months that commits essentially break type checking (parsing) on perfectly legal code. Once the damage is done, identifying the culprit through bisection is very time consuming. It seems to me that these kind of regressions could be easily avoided by including relevant examples in the test suite. If I can help in providing such examples, please let me know.

@xekoukou I only have Idris from the github source. In my case, the installation problem seems to only appear for commits that are marked with a red cross.

jfdm commented

As a reminder: When building idris from source, make relib may fail if the idris packages in libs had namespace/name changes. It is a good idea when switching between idris versions (i.e. during a bisect) to clobber the build files in libs and in idris --libdir. This will remove the offending ibc and allow for a clean environment.

jfdm commented

To provide a concrete MFE, one can examine: https://github.com/jfdm/idris-containers/blob/master/Data/DList.idr#L360-L367

When given said file to Idris, it 'hangs' on this function.

jfdm commented

I've created a MFE example here for people to use and to be added to the regression tests:

https://gist.github.com/jfdm/751a44887d24dfa0f743fb7472290421

The other question is that of testing. This is the second time in a few months that commits essentially break type checking (parsing) on perfectly legal code. Once the damage is done, identifying the culprit through bisection is very time consuming. It seems to me that these kind of regressions could be easily avoided by including relevant examples in the test suite. If I can help in providing such examples, please let me know.

In most cases once a regression has been fixed, there is a tendency to add the regression to the test suite. Remember the suite already has a plethora of regression tests under regxxx or regressionxxx.

If you do notice that a fix doesn't have an accompanying regression test, then please do raise an issue. Further when reporting an issue, if you can construct a small (as in compile/checking time) examples their contribution would be appreciated. Such an example can then become the test case in the regression tests.

Further, we do have documents for how to add a test case, any problems with those instructions please do raise your problem in a separate issue.

@jfdm Could it be a good idea, to add major Idris repositories such as @nicolabotta's and others to the CI now?
We are in a way reaching more "stable" versions of Idris now, and so it might not be bad to revive the old suggestion of adding third party libraries to CI.

jfdm commented

Could it be a good idea, to add major Idris repositories such as @nicolabotta's and others to the CI now?

No. The reason is that we want our tests to run in a reasonable time, and having large projects in the CI is counterproductive to this. However, it would be nice to test idris against these projects when doing a release.

Maybe a better idea would be to have a script that takes several known projects and builds them?

jfdm commented

In other news commit 566b6a0 seems to have addressed the regression for myself. @nicolabotta Is it possible for you to check if your issue has been addressed as well.

In other news commit 566b6a0 seems to have addressed the regression for myself. @nicolabotta Is it possible for you to check if your issue has been addressed as well.

I can install 566b6a0 but ...

nicola@cirrus:~/lala/SeqDecProbs-master/frameworks/14-$ time idris +RTS -K32000000 -RTS -p contrib -p effects -V --check SimpleProbBasicOperations.lidr
Can't find import Builtins
Can't find import Prelude
Can't find import Data/List

real    0m0.547s
user    0m0.500s
sys 0m0.048s
nicola@cirrus:~/lala/SeqDecProbs-master/frameworks/14-$

I am going to try with the latest Idris commit and report.

I can confirm that the latest Idris commit 62751fa still suffers from the regression. The most recent Idris version (that I know of) that does not exhibit the regression is ff776da.

I have added a stripped down test to the repository. You should be able to do

cd SeqDecProbs-master/frameworks/14-/
idris +RTS -K64000000 -RTS --checkpkg issue3405.ipkg
rm Issue3405.ibc
time idris +RTS -K64000000 -RTS --checkpkg issue3405.ipkg

on a fresh installation of https://github.com/nicolabotta/SeqDecProbs. With ff776da, the last command takes about 30 seconds to terminate. With the commits affected by the regression, it takes ages to fail to type check. The test file is

> module Issue3405

> import SimpleProb
> import NonNegRational
> import NonNegRationalBasicOperations
> import NonNegRationalBasicProperties
> import NatPositive
> import Fraction
> import NumRefinements
> import FractionNormal
> import ListOperations
> import ListProperties

> %default total
> %access public export
> %auto_implicits off

> -- %logging 5

> normalize : {A : Type} -> SimpleProb A -> SimpleProb A
> normalize {A} (MkSimpleProb Nil sum) = MkSimpleProb Nil sum
> normalize {A} (MkSimpleProb (ap :: Nil) sum) = MkSimpleProb (ap :: Nil) sum
> normalize {A} (MkSimpleProb (ap :: ap' :: aps) sum) = MkSimpleProb aps' sum' where
>   aps' : List (A, NonNegRational)
>   aps' = discardBySndZero (ap :: ap' :: aps)
>   sum' : sumMapSnd aps' = 1
>   sum' = trans (discardBySndZeroLemma (ap :: ap' :: aps)) sum

With Idris revisions affected by the regression, uncommenting the log option clearly shows where type checking hangs.

Since the beginning of September I am stuck with ff776ds because of this issue. Today, I have checked the most recent Idris version and the probems have worsened. Type checking takes meanwhile hours if not days! Is there a chance to get the problem fixed before 1.0? This is finally a regression and fixing it a low-hanging fruit. Thanks, Nicola

I have retested the MFW by @jfdm and the provided issue test code by @nicolabotta and it seems to no longer be an issue.

@jfdm You are of course welcome to add the regression test.

Thanks Ahmad, I can now move up from 0.12 to 0.99!