idris-lang/Idris2

[RFC] Library organisation

edwinb opened this issue ยท 15 comments

I'm opening this, with some trepidation, because I'd like to do something about the state of contrib, and it'd be interesting to hear people's opinions. Essentially, the problem is that it's become - or maybe it always was - something of a dumping ground for experimental libraries, but it also includes some libraries that are being actively used (syntax/proof related things), and libraries that are good to have in a "batteries included" language implementation (e.g. JSON, parsing, pretty printing). But, everything that's in this repository implies a committment on our part (and, ultimately, on my part) to maintain it. There's only a certain amount of that I can do, realistically, as many of you have probably noticed lately.

So, what I want to do is organise them better. Some should be carved out and put in a library that's listed on https://github.com/idris-lang/Idris2/wiki/Libraries. Some should be moved into more meaningfully named packages that are shipped with Idris itself. Anything left over, maybe it's a sign that it's not that useful after all.

Please let me know:

  • Any parts of the library you're willing to claim and maintain. In this case, I think the best thing to do would be to start a new repo with those parts, and add them to https://github.com/idris-lang/Idris2/wiki/Libraries
  • Opinions - I don't promise to respond to them or necessarily agree, but I will at least read them. Including:
    • Any parts of the library you really can't live without
    • Suggestions for good names for subsets of contrib that you think would be useful ways of organising things

I think base is mostly okay - these are things that are effectively part of the language. I don't think Control.App belongs there, for example, because it's too experimental, although I'd still like that to be available because I know people are using it.

I'd still like to be a bit careful about dropping things, because without package management we could end up with awkward dependency problems.

I'm also willing to consider moving all libraries that aren't essential to the Idris 2 build into a new repository. In this case, we'd need a volunteer to maintain that new repository. I would also update the release scripts to package Idris 2 and the libraries together in a single distribution if we did this. Actually, this is my preferred approach to providing a "batteries included" installation.

To make things more complicated (but recognizing that at the end of the day as long as I can get this stuff from somewhere I am happy to pull in other additional libraries), some of the Extra modules are a mix of really useful and maybe less often needed things even within the same module; the name "Extra" also doesn't give much guidance on what to put there or what to expect to find there.

So, on the topic of reorganization, maybe it's still worth some things in contrib being considered for base and maybe this reorganization will motivate a closer look and ultimatum for those Extra modules.

Yes, the Extra ones did leap out as being the most in need of organisation! I bet a lot of that belongs in the relevant base modules by now.

We need criteria for what we will consider to include into the libraries shipped with Idris2, maybe we should start by considering what use cases we want to support out of the box. Should it contain all the stuff needed to write a server app (HTTP, JSON, URIs etc) for example?

I think the best thing we can do to take the pressure off contrib is to ensure that there is a blessed package manager that can help with discoverability of packages.

I'm also willing to consider moving all libraries that aren't essential to the Idris 2 build into a new repository
In this case, we'd need a volunteer to maintain that new repository.

I would lean more toward having people maintain their own libraries rather than attempt to centralise them. I suspect having a central repository of libraries will not decrease the burden of managing external contributions. Granted it will move them away from the people working on the compiler, but given how small the community is, I would assume that the intersection between the people managing the libraries and the people working on the compiler is fairly large.

My experience with Idris1 has mostly been based on cloning libraries from people's repositories and installing them with idris --install I had to fork a couple of abandoned projects but that was not really an issue since I was able to take over the maintenance of the forked package. This is only anecdotal evidence but "it worked for me".

Personally, I think the Idris2 project should at least keep supporting those parts of contrib that also make an appearance in the compiler, such as lexing, parsing, and pretty printing as these have to be maintained by the Idris2 team anyway. Ideally, these things would not be part of idris2api but reside in their own libraries (as part of the Idris2 project), on which the compiler would then depend.

In addition, if people decide not to keep the JSON stuff in contrib, I could volunteer to move this to the idris2-json library.

The contrib stuff that I have added or worked on is all code that doesn't "do" anything. It's all proofs of facts that get verified at compile-time and then I guess vanish.

Although it doesn't have any "use", this code does have some value as a set of test cases for inference and theorem proving. It ought to compile, and compiling it regularly verifies that it does. Examples of challenging code to compile:

Testing the compiler is the most tangible benefit. I also think that code is pretty cool, and it serves as an example of something that can be written in Idris but not in most other languages.

Thanks everyone for the comments so far. I'm going to leave this open for a bit and see what else comes on before I decide what to do.

Some interesting questions raised: what are the criteria for inclusion out of the box? What about testing the compiler via the libraries (this is on my mind, that contrib in particular is good for testing the compiler, so I'd keep doing this whatever).

I have mixed feelings about the stuff @nickdrozd has mentioned, what I like to call "formalised mathematics": proofs of algebra laws, Euclid's Algorithm and so on. On one hand, Nick is right that they serve as a nice test for the compiler and even better as an example of what Idris can do. On the other hand importance of this kind of code depends much on how one wants to use Idris.

If it is to be used mainly as a programming language for writing "practical" programs, then this kind of code may or may not be useful to some extent in proving correctness of these programs. That largely depends on the domain I guess.

However, if Idris is to be treated as a proof assistant, then this kind of code is extremely useful as it formalises very basic mathematics, which more complex theories will make use of all the time.

In my opinion the conclusion of this is that this kind of code belongs in a separate package which may or may not be shipped along with the compiler. As I personally mostly like to play with maths in Idris, I would opt for having it included, but it's my personal opinion only. I didn't do much with Idris lately, but nonetheless I could try to gather contrib modules falling into this category, make a package of them and commit myself to maintaining it.

My two cents as a user of Idris for writing maths software and for theorem proving: I always get worried when someone says Idris is a practical programming language and not a theorem prover for doing maths and so on, that it is going in a different direction from Agda etc. What I like about Idris is exactly that it is both. I believe there should be no distinction. A best programming language WILL be the best for maths also, it isn't a choice of either/or IMHO. That it is a practical language in my view makes it better than Coq, Agda and so on ALSO for theorem proving, and better than just about anything else as a high level practical programming language too, because it is a theorem prover. Or at least that's the potential IMHO. I think the proof assistant idea is old fashioned. Why not just do maths as programming? My usage is to define mathematical objects as type classes, and then prove stuff about the type classes. I claim that maths is programming in a practical programming language, and vice versa; that doing maths is just a hardcore usage of a practical programming language, a sort of "power user" use, in the same way as performance programming is just a "power use" of a language. As for library organisation, I think Idris should be organised in such a way that the core language is robust, bug free, consistent, fully featured, stable in terms of backwards compatibility, ultimately verified. I don't think it matters if maths libraries are built in or downloaded separately. What I mean is, to do maths, maths libraries are secondary to a far more important criteria. Besides users can write maths libraries for themselves. What is needed to enable such users to do maths in any sort of way they choose is for the core language to be rock solid and fully featured as a core language. So, for example, I am currently blocked by there not being cumulative type levels yet, and am also blocked by the fact that rewrite isn't stable yet or featured for dependent types. There are a number of bugs that make me feel unsure of what I am programming too. So I'd like to promote the idea of maths/theorem proving not as an obscure specialisation, but as a power use of a practical programming language like Idris. And that what is needed is in effect simply performance, soundness and features in that core language. Libraries then follow.

ohad commented

@reswatson , this post is fascinating, but I don't quite see what it has to do with the practical question: what do we do about all the stuff that's in contrib?

ohad commented

An undercurrent in this discussion is 'what's the alternative?'.

Being in contrib means your package is way more discoverable and installable than any other alternative. It also means that any changes to the language/compiler/other packages are immediately discovered in the CI process, and if you're especially lucky, fixed by someone else before you even hear about it.

So a few related issues/questions are:

  1. A/several official package manager(s).
  2. Better support for modularity in modules (eg, fixing this issue: #1793 and finer control over which package gets used during compilation)
  3. Package repositories, supporting the official package managers
  4. Out-of-the-box CI configurations that re-build a package with each release/update to HEAD to help maintenance.

Having good answers to these questions would mitigate the effect of being removed from contrib, so hopefully it would be easier to converge on very strict criteria for staying in contrib/migrating to base.

A/several official package manager(s).

Please don't, there are already enough language specific package managers. Just use an existing one like Guix or Nix, then you can just use the CI those already have.

Nix is (currently) notoriously not user friendly. It has a crazy learning curve, and requires a massive time investment in order to become productive. Given that Guix has a similar level of generality, I would assume it's a similar situation. I don't think requiring Idris programmers to learn these tools is a reasonable ask, and I say that as someone who's made the time investment for Nix already.

This issue is about splitting off contrib into parts that are meant to be included in
base and separate packages maintained by the community. Let us please not debate
the way in which these packages shall be shipped to users & built on their machines
as it is off topic.