haskell/cabal

Make cabal new-doctest

Closed this issue · 41 comments

cabal doctest doesn't work with nix builds:

% git diff
diff --git a/distributive.cabal b/distributive.cabal
index 94ad11e..f0d049e 100644
--- a/distributive.cabal
+++ b/distributive.cabal
@@ -12,7 +12,7 @@ bug-reports:   http://github.com/ekmett/distributive/issues
 copyright:     Copyright (C) 2011-2016 Edward A. Kmett
 synopsis:      Distributive functors -- Dual to Traversable
 description:   Distributive functors -- Dual to Traversable
-build-type:    Custom
+build-type:    Simple
 extra-source-files:
   .travis.yml

...

% cabal --version
cabal-install version 2.1.0.0
compiled using version 2.1.0.0 of the Cabal library 

% cabal doctest
Run the 'configure' command first.

I assume we are talking about nix-style builds, and the need for new-doctest?

Yes.

If this can be fixed on the cabal-install side only, the fix can still make it into cabal-install 2.0 (or 2.0.1).

I think that the design of the current cabal doctest will want some changes when it becomes cabal new-doctest. I'm just going to jot down some thoughts here.

  • I'm a bit worried that a naive implementation will take a long time to run on larger projects. I think incrementality is likely necessary: some kind of state will have to be kept, and only the actually-changed components (including transitively changed) re-doctested. This may not be an issue if doctest is actually blazing fast, but I don't believe it is.

  • There should probably be a way to opt out of doctests on a module-by-module basis, as well as a component-by-component or package-by-package method. The module-by-module method could be something like an {-# OPTIONS_DOCTEST no-doctest #-} pragma, which I guess would be doctest's concern and not Cabal's, but the coarser-grained method would be in Cabal's court, and should probably go in the .cabal file.

  • We'll need a doctest-depends field. Motivating example: a library that provides lenses without depending on lens, but which wants to use lens in its examples.

  • Should doctests be their own components? We'd need some way to distinguish, e.g., the doctests of lib:foo and exe:foo, since we can't map both of them to doctests:foo.

  • Should cabal new-test run doctests by default? I think it probably should.

@quasicomputational I do like your approach. Note that cabal doctest was only a first stab at the issue, mostly driven by my dislike of custom setups.

I'm on the fence, though, about whether doctest should be considered a test-suite. If it is, we should rather have a test-suite-extra-depends instead of doctest-depends.

That said, on the whole I'm not convinced that treating doctest specially in any form is the right approach anymore. I'd rather see some form of wrapping doctest into a proper cabal test-suite and treating it in a more generic fashion.

I think I'm coming around to the idea of doctests being a full-blown test-suite of their own, with all the benefits of being able to refer to them with component syntax, use conditionals, etc. I think some kind of special treatment is inevitable, because doctesting is profoundly unlike other tests (e.g., it requires source access, it needs extra dependency information), but minimising that special treatment is a good goal.

How about with a new test-type? Something like this:

library
  -- Does have doctests
  ...
  haddock-example-depends:
    lens

executable foo
  -- Does have doctests
  ...

executable bar
  -- Doesn't have doctests
  ...

test-suite doctests
  test-type: doctests-1.0
  doctest-components:
    lib
    exe:foo

haddock-example-depends is, I think, the most specific name for the type of dependency incurred here, which doctest needs to know about. It ought to go on the stanza of the component under test, because it's really information about it. In theory other tools might also want to consume this information, but I'm struggling to think of concrete examples that aren't doctest shaped.

A doctest test-suite wouldn't have the usual build information fields. buildable makes sense; other-modules, cc-options, etc, less so.

doctest-components would only be able to refer to components from the same package; if it's not present I think it should default to *, meaning "all components in this package". This makes the decision to use doctest at all in a package opt-in but with a low threshold (literally two lines), which is about as nice a UX as we might hope for.

I also realised that we likely don't want special-cased time-saving mechanisms for doctests: other test suites run unconditionally, and if we give users the power to toggle doctests on and off they can come up with schemes of their own (e.g., a fast-tests flag enabled by default that turns off the doctests).

The fun thing here is that it requires practically no cabal-install changes: the knowledge about how to run test suites is in Cabal.

I can see a path to implementing this. @angerman, are you still intent on working on this, or should I put together a 1.0 myself and see what people think? This is missing the boat for 2.4, but if we can get it in for 3.0 and I can start getting rid of cabal-doctest (a wonderful hack, but still a hack), I'd be quite happy.

doctest-components should be singular, it will make things simpler. With common stanzas repetition can be reduced, so need for copy & paste is not a valid concern.

And yes, implementing this is simple:

  • Install doctest as you would install it with build-tool-depends
  • prepare component (i.e. install additional dependencies, haddock-example-depends above)
  • build a command line to run doctest
  • ...
  • profit!
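
The "build a command line" step above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Component record and its field names are hypothetical (they are not Cabal types), and the sketch assumes doctest is handed GHC-style -i and -package flags followed by source files, which is how it is commonly invoked by hand.

```haskell
-- Hypothetical description of a component under test; NOT a Cabal type.
data Component = Component
  { sourceDirs   :: [FilePath] -- where the component's modules live
  , extraDepends :: [String]   -- e.g. from the proposed haddock-example-depends
  , modulePaths  :: [FilePath] -- source files to run doctest on
  }

-- Assemble the argument list for a doctest invocation (assumed flag layout).
doctestArgs :: Component -> [String]
doctestArgs c =
     map ("-i" ++) (sourceDirs c)                       -- search path flags
  ++ concatMap (\p -> ["-package", p]) (extraDepends c) -- extra packages for examples
  ++ modulePaths c                                      -- the files themselves

main :: IO ()
main = putStrLn (unwords ("doctest" : doctestArgs example))
  where
    example = Component ["src"] ["lens"] ["src/Data/Foo.hs"]
```

Running it prints the command that would be executed (doctest -isrc -package lens src/Data/Foo.hs); a real implementation would also need the GHC options, extensions, and package database of the component.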

There should be a doctest-version field; cabal-doctest works around that by having doctest in build-depends. FWIW, every feature in cabal-doctest is needed by someone, so it's a good experiment.


BONUS: for interested people, I'm working on another HACK solution cabal-doctest-cli:
haskell-servant/servant@master...phadej:cabal-doctest-cli
It's less reliable than cabal-doctest, but doesn't require build-type: Custom. Its TODO list isn't that long:

  • implement hardcoded parts (7.10.3)
  • recognise stack, support it
  • refactor

In short: it's kind of doctest using .ghc.environment files, yet we "read" the environment from plan.json (or whatever is in Stack).

Good shout on N test suites for N doctestable components simplifying things; I'd been attracted to the ease of use of having a "doctest everything" default, but that would definitely be more complex. Also, yeah, doctest-version needs to be a thing, along with doctest-options. doctest-version needs to be a required field, right?

Can we get away without doctest-source-dirs and doctest-modules? IIUC the point of both of those is to support doctesting modules which aren't in any other component, but I'm not convinced that that's something we would want to support. Since there's a workaround (create a fake executable with buildable: false solely to doctest it), I don't think they're necessary for a 1.0; we can add them later if it's really a pain point and we decide it's sensible.

Yes, doctest-source-dirs + doctest-modules is used (as in cabal-doctest) only in servant, to test a few "this should fail to compile" things. They are a corner case we don't need to support. They could also be moved somewhere else, even to an external "test-only" package in servant; there are indeed workarounds.

Discussion in #3788 suggests that a new test-type won't work as well as it could: versions of Cabal that don't understand it will noisily fail, but doctests not working isn't something that should break the whole package. So it'll probably be better to have a doctests stanza, since that does get ignored silently, so it can be used in packages that want to keep their cabal-version low.

Still thinking aloud, but the doctest binary used can't be solved as an independent component in the way that build-tool-depends or setup can: it needs to agree with the component under test about GHC and QuickCheck, at least. My hunch is that this isn't a particularly big complication: just treat doctest as an unqualified goal.

Why does it need to agree with QuickCheck? doctest doesn't even depend on QuickCheck.

Oh yeah, you're right. I had it in my head that the runner binary needed to be linked against it, but it actually only touches it in interpreted code.

erikd commented

I've been having a look at this issue, and can't really figure out what a solution to this would look like.

I am currently using cabal-install from git and running cabal test in a project that uses doctests and passes the test when running stack test. When this fails under cabal, it seems to be that the tests are not being built with the packages listed in the cabal file.

Over the last few days I stumbled into an intersection of Cabal, doctests, and also the impact on some build ecosystems like Nix.

Since new-doctest is not yet implemented, I'd like to put a perspective out there in hopes it might have some influence. Specifically around two sentiments:

  • Not many people do "does not compile tests."
  • Setup.hs is so open-ended that we can not support building components one-at-a-time.

Regarding both points, I feel like there's a bias towards compromising to the legacy of what we have over giving users more features. In some ways, this feels inverted from how I see Haskell the language. But at this point, if resources were finite and fungible (definitely not fungible in real life), I'd trade off progressive language features for progressive tooling.

With respect to the argument that "does not compile tests" are limited to just a few projects like Servant, I'd like to consider that we might have far more of these if they were better supported by tooling. We're often trying to make typesafe DSLs in Haskell, and free theorems, though amazing, are never comprehensive enough in practice. To really know if something is typesafe, I need to test more than the happy path, especially with a system with as much complexity as GHC.

Regarding building components one at a time, I feel we can use all the help we can get to have a better caching story in Haskell. Build times can be really bad, which affects developer productivity more than I think we admit to ourselves. Every little step helps. I think building components one-at-a-time is one of those steps. I know what we have is cached locally, but not necessarily in a distributable way (Nix- or Bazel-style).

So I agree that Setup.hs is extremely free-form. But would it be possible to at least provide a specified path forward to allowing packages to opt into a way to build components one at a time and pass artifacts amongst them? Obviously, legacy projects with extremely free-form Setup.hs files would not be ready to opt into this new world without some rework. They'd have to be compiled all in one step, as they are now. But most projects with a simple/default Setup.hs are already good to go.

So I think for me, this is in part about doctests, which I do think are great, for the reasons I mentioned above. But it's also larger than doctests for me. I'm interested in thinking about how to begin a process of constraining the freedom Setup.hs gives us, with the goal of setting us free to not only have less hacky doctests, but also any number of neat things people have a mind to do in the future.

So I haven't said much technical here. Part of that is because I think there are devils in the details that I don't know about yet, and I don't want to speak out of turn. But I would like to help where I can. Mostly I wanted to see if I could get my sentiment to have some traction. And also learn more about these details where devils are, so I can see how I can contribute best.

Related to Setup.hs @michaelpj makes a good point in another thread that a primary motivation for Haskell.nix compiling components separately is for cross-compilation, which I think might be a compelling motivation in addition, and beyond doctests even.

input-output-hk/haskell.nix#388 (comment)

doctests (or tests in general) in a cross-compilation setting are problematic to begin with. How to run them?

I think @michaelpj's comment alludes to the kernel of a solution.

the setup has to be compiled for the build arch, while the other modules are compiled for the host arch.

But it's certainly work to get there. I think this is where the Haskell.nix folk have a perspective I'd be really interested in, because I think they actually cross-compile a non-trivial number of packages, and have a better sense for what went well and where there are problems. I'm not sure who in that project would be worth tagging into this conversation. Maybe they are already watching/participating in this thread.

@phadej

doctests (or tests in general) in a cross-compilation setting are problematic to begin with. How to run them?

That's why we have --test-wrapper 😄 You may have a way to invoke a program either through emulation or scp+ssh or something. Cabal shouldn't be aware of the details here, and a shell script sandwiched between cabal and the executable can do the trick most of the time. I think we did talk about this at some point prior? 🤔

On to the doctest issue at hand. We should acknowledge that @sol did a great job with providing an inline testing facility. By design doctest relies on ghci, and this makes it a bit special compared to other test facilities we have. Most of the haskell tests we write end up being executables that are then executed. Thus when cross compiled they end up being a regular executable for the target architecture and as such can be executed there. For doctest this would mean we need ghci on the target, which is non-trivial as it involves reading and loading object code.
We do have iserv though, and we use it extensively to get TemplateHaskell to work. Similarly we can get a remote ghci via iserv (with some limitation). And I believe @Ericson2314 has pushed most of the required changes into ghc by now. If it's not already working out of the box, I believe it's very little that would be left to fix up.

With that said, I believe we could make doctest work in a cross compilation setting (where we have iserv), by making doctest use ghci via iserv. This of course complicates things a bit, as doctest now needs to run on the build machine, unlike all other tests, which would run on the target.

The most practical way forward for now would be to test native -> native, and hope for the best with cross compilation without testing. That's not very satisfactory, and as laid out above I believe we have most of the infrastructure in place to make cross testing more viable; I don't see myself being able to spend much time on this right now though.

I envision some metaprogramming thing to extract the doctests, after which they can be compiled and run like any other test suite. Today's GHCi monster conflates the metaprogramming and test running parts, breaking cross.

Even simpler than --test-wrapper is just treating benchmarks and test suites like plain executables. I don't need cabal-install to run my tests, I'll happily make another Nix derivation for that which Nix will schedule on the right sort of remote builder to run it. Problem solved!

Even simpler than --test-wrapper is just treating benchmarks and test suites like plain executables. I don't need cabal-install to run my tests, I'll happily make another Nix derivation for that which Nix will schedule on the right sort of remote builder to run it. Problem solved!

haskell.nix does this for Setup.hs not behaving with stdout/stderr properly (and simplicity). But there are those that don't use nix and like to use cabal (cabal-install).

Right, I do think that --test-wrapper is a good idea for precisely those reasons, and I support it. I just want to convey that it is inessential, and to illustrate how things become simple once the concerns are separated: 1) extracting the code from the docs, 2) building the code, 3) running the code.
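
To make step 1 concrete, here is a minimal sketch of the extraction on its own. It is an illustration only, not how doctest or any existing tool parses sources: the function name extractExamples is made up for this example, and it only recognises line comments of the form "-- >>> expr", ignoring block comments, multi-line examples, and expected output.

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- Collect the expression from every line comment that starts with ">>>".
-- Hypothetical helper: it handles only the simplest "-- >>> expr" form.
extractExamples :: String -> [String]
extractExamples = mapMaybe example . lines
  where
    example l = do
      rest <- stripPrefix "--" (dropWhile isSpace l)      -- must be a line comment
      expr <- stripPrefix ">>>" (dropWhile isSpace rest)  -- must be an example line
      pure (dropWhile isSpace expr)

main :: IO ()
main = mapM_ putStrLn (extractExamples sample)
  where
    sample = unlines
      [ "-- | Add one."
      , "--"
      , "-- >>> succ 1"
      , "-- 2"
      , "succ' :: Int -> Int"
      , "succ' = succ"
      ]
```

The point of the split is that this step needs only the source text; building and running whatever is produced from the examples can then be left to the ordinary build machinery.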

@angerman hi, I don't want to bother you too much, but it would be great to know what the plans are to continue working on this, or if it is blocked for some reason and we could help in any way.
Many thanks for your work here! 😄

fgaz commented

In the meantime we could advertise https://github.com/phadej/cabal-extras/tree/master/cabal-docspec in the docs

gbaz commented

migrating a comment from the above linked ticket:

One strawman idea -- what if instead of a doctest stanza as such, we extended the test stanza with a code generation step that invoked an executable that took the code directory itself as an argument?

So we have a test stanza, with a build-tool-depends on my-doctest-program and further with a field test-code-generator: my-doctest-program. This would then invoke the program (with sourcedirs as arguments) to generate some source, and then cabal would proceed to compile and test the result like normal?

Note, this stanza would still need to duplicate the build-depends of the library, and also the extensions. With common stanzas this seems not particularly onerous, but maaaaybe a little confusing? I can't think of a way to avoid this that doesn't violate least expectations and make things even further confusing however.
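
As a rough illustration of what such a generator might emit, the sketch below renders a standalone test program from already-extracted examples. Everything here is an assumption made for the example: the Example type and renderTestModule are hypothetical names, the generated program simply compares show of each expression against the documented output, and ghci-only commands such as :t have no compiled equivalent in this scheme.

```haskell
-- One extracted example: an expression and the output the docs promise.
-- Hypothetical type, used only for this sketch.
data Example = Example { expression :: String, expected :: String }

-- Render the source of a test program that exits with failure if any
-- example's shown result differs from its documented output.
renderTestModule :: [Example] -> String
renderTestModule exs = unlines $
  [ "module Main (main) where"
  , ""
  , "import Control.Monad (unless)"
  , "import System.Exit (exitFailure)"
  , ""
  , "main :: IO ()"
  , "main = unless (and checks) exitFailure"
  , "  where"
  , "    checks ="
  , "      [ True"
  ] ++ map check exs ++
  [ "      ]" ]
  where
    -- 'show' on the expected text yields a correctly escaped string literal.
    check ex = "      , show (" ++ expression ex ++ ") == " ++ show (expected ex)

main :: IO ()
main = putStr (renderTestModule
  [ Example "1 + 1" "2"
  , Example "reverse \"abc\"" "\"cba\""
  ])
```

The generated module still has to be compiled against the library under test, which is why the stanza would need to repeat the library's build-depends and extensions as noted above.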

gbaz commented

I'm playing with implementing the above, and I ran into a question. Why do doctest drivers prefer to run things through ghci (either as a library, or as an executable) rather than just generating and compiling code? Is there any particular reason for this, or is it just a historical decision that's continued indefinitely...

gbaz commented

PR for this approach here: #7688

sol commented

With doctest-0.20.0 it is possible to run doctest via cabal repl --with-ghc=doctest. Something that I hope is reasonably bullet proof and suitable for a CI setup is:

cabal install doctest --overwrite-policy=always && cabal build && cabal repl --build-depends=QuickCheck --with-ghc=doctest

The cabal build is not strictly necessary, but we don't ever want ghc-paths to be built with --with-ghc=doctest. So I included it basically to guard against the case that somebody (a) depends on ghc-paths + (b) cabal repl triggers a rebuild.

If we want to improve on this, adding two new flags could help:

  1. Allow specifying --build-tool-depends for cabal repl on the command line (similar to how you can already specify --build-depends)
  2. Add a new flag --with-repl which tells cabal repl which tool to use as the repl command. This way ghc can be used for any builds, while doctest is only used when invoking the repl.

At that point I could simply say something like

cabal repl --build-tool-depends=doctest --with-repl=doctest

to run doctest.

The big difference here, apart from being more concise, is that you don't end up with a globally installed doctest.

@gbaz

I'm playing with implementing the above, and I ran into a question. Why do doctest drivers prefer to run things through ghci (either as a library, or as an executable) rather than just generating and compiling code? Is there any particular reason for this, or is it just a historical decision that's continued indefinitely...

Examples like:

>>> :t traversed % to not
...

or

>>> :kind! SomeTypeFamily 1 2 3
Result

For example in https://hackage.haskell.org/package/symbols-0.3.0.0/docs/Data-Symbol-Ascii.html

@sol

With doctest-0.20.0 it is possible to run doctest via cabal repl --with-ghc=doctest.

That's clever.

sol commented

@gbaz doctests are ghci examples. So by using ghci to verify them you (a) get all features of ghci (basically what @phadej pointed out) and (b) you have the guarantee that your docs are in sync with the user experience (that is that your examples work in ghci and that the output matches that of ghci).

That said, going through ghci is (a) slow and (b) more involved. I hope that using cabal repl addresses (b). If you need to address (a) then go with @phadej's approach.

If you refer to cabal-docspec as "@phadej's approach", then it still uses ghci to drive the examples. The difference is that it uses compiled libraries.

gbaz commented

Glad you two have chimed in. I'd appreciate it if you took a look at #7688 and let me know if it might be useful. I think it could provide the same functionality (via a different route) as the --with-repl flag suggested above.

@angerman should we unassign you from this or do you still plan to work on it?

gbaz commented

I'm going to close as obsoleted by #7688 -- if you feel there's a use-case not covered by that, feel free to reopen.

I don't feel like #7688 really solves this ticket.

The approach along the lines of cabal repl --with-compiler=doctest is imo the right way to approach things. Interpreting doctests in a project is really another way to interpret the module source, just like cabal build, cabal repl, cabal haddock, etc.

The comment by @quasicomputational seems to lay out the design considerations for the command quite nicely (#4500 (comment))

It seems there has been quite a variety of mental gymnastics over the years to try to work out other solutions but is there a convincing explanation about why cabal doctest is different in nature to cabal repl or cabal haddock?

gbaz commented

My belief is 7688 solves the ticket because the demos show that existing doctest runners can be retrofitted to work with it. I.e. from a feature-completeness standpoint, it subsumes this.

One reason cabal doctest is different than repl or haddock is cabal knows specifically about ghci and haddock but there is no single doctest executable to teach it about canonically, and even if we picked one, it would be more unfortunate to introduce further coupling than to remain agnostic.

I don't think #7688 is right on the conceptual level (see #9238).

There have been tickets for years about adding cabal support for running doctests.

Perhaps another solution is to have a cabal-docspec command which works with the hypothetical support for external cabal commands which simply calls cabal repl -w /path/to/docspec. Then a user can type cabal docspec <target> and things will hopefully work nicely.

gbaz commented

I don't think that issue points to the "wrong conceptual level" -- it's easy to set the flags you want in the test stanza.

sol commented

I'm going to close as obsoleted by #7688 -- if you feel there's a use-case not covered by that, feel free to reopen.

@gbaz at the risk of repeating myself, the documented way to run doctests is:

cabal install doctest --overwrite-policy=always && cabal build && cabal repl --build-depends=QuickCheck --build-depends=template-haskell --with-ghc=doctest --repl-options='-w -Wdefault'

This is more lengthy than I would hope it to be, but it is robust.

If somebody wants to do something on the cabal side of things, then my request would be to start with basically wrapping this into a nicer user interface.

Regarding any alternative approaches to running doctests (including #7688), people are of course free to experiment. However, I'm not prepared to provide support for those alternative approaches. I also want to avoid the situation where people open issues on the sol/doctest repository when they didn't use the documented approach. For that reason, I'm not eager to recommend any alternative approaches to new users, and I discourage others from doing so as well.

gbaz commented

Thanks sol, that's very clear. I appreciate you don't want to take on the burden of supporting alternative approaches -- I hope in time others will provide solutions that do support such approaches.

We had some further discussion on IRC and I think what it comes down to is that "wrapping that up" is no more than dropping it into a more elegantly named shell-script -- perhaps one named cabal-doctest to be used with the external command system.

What an approach along the lines of code-generators provides is that doctests can be executed just like other tests with cabal test, which means that all downstream tooling does not need to special case them.