SMLFamily/Successor-ML

Evolution vs. revolution

Opened this issue · 12 comments

One meta issue that we ought to discuss is how much breakage we are willing to tolerate. The Successor ML features that were discussed in the previous incarnation of this process (and which are documented in the Successor ML Definition) are mostly backward compatible (i.e., most existing SML programs won't break). I'm not opposed to greenfield (or is it brownfield?) language design, but that places a much bigger burden on implementors (and users).

It’s a good question that has tripped us up very often over the years. I guess what I think now is that a change in syntax can, in a sense, be “revolutionary” in that it breaks old code, but it’s “evolutionary” in the sense that it’s quite easy to change the compilers to comply with it. Some suggestions perhaps go further than just syntax, e.g., my suggestion earlier today about laziness. But this is still pretty simple to implement.

As a practical matter, I suggest evolution, but maybe in the sense of “punctuated equilibrium”. You can’t make an omelette, etc.

Bob

(In reply to John Reppy's opening post of Apr 4, 2016, 16:49.)

If I've learnt one lesson from my now 5 years working on both the implementation and the standardisation of a mainstream language, then it is to appreciate the importance of backwards compatibility. For a mature language, it's simply not an option to break users in significant ways. With every such breakage you'll lose two thirds of them. It is far, far more relevant not to break users than not to break implementations.

So the questions we should ask are: Do we consider SML a mature language? Given its less than healthy user base, how many can we afford to lose? Do we even particularly care about existing users or existing code bases?

In particular, if the answer to the last question is anything but a wholehearted "yes", then we are not talking about evolving SML, but about designing a new language. Intellectually, I'd like that as much as the next person. But whether it would have any chance of adoption, and whether SML would even be the best starting point for it, is rather non-obvious to me.

There have historically been several attempts at revolutionizing a language, and those attempts usually do not fare well. Perl and Python are good examples of this. On the other hand, slow but steady evolution of a language tends to support the old user base. As long as users can take their possibly very large code bases and get them working on the new version without too much effort, you have a chance of achieving the revolution over time through small evolutionary steps. From my 10 years of Erlang programming experience, you can usually make changes that break backward compatibility as long as each major release makes only a limited number of them. Programmers need time to adapt their code to the new features.

One strength Standard ML has over almost any other language out there is the formal level of its specification. So one could imagine an SML-to-SuccML translator capable of rewriting programs automatically. I know the Go language by Google did this before their 1.0 release, so when you encountered code written for an older version, you could mechanically rewrite it so it would work on newer versions of the compiler. We would only need one such rewriter to support every implementation, thanks to the formal specification.
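To make the idea of a mechanical rewriter concrete, here is a minimal sketch in Python (chosen only for brevity; a real tool would more plausibly be written in SML against a full parser). It assumes one purely hypothetical change, respelling the keywords `andalso`/`orelse` as `&&`/`||`, which is not an actual Successor ML proposal. The names `RENAMES`, `skip_comment`, `skip_string`, and `rewrite` are invented for this illustration. The sketch copies nested `(* ... *)` comments and string literals verbatim and only rewrites whole alphanumeric tokens; it does not handle string gap escapes (`\ ... \`) or any change that needs type information.

```python
import re

# Hypothetical mapping, for illustration only.
RENAMES = {"andalso": "&&", "orelse": "||"}

# An alphanumeric identifier or keyword, optionally with a leading prime
# so that type variables such as 'a are treated as single tokens.
IDENT = re.compile(r"'?[A-Za-z][A-Za-z0-9_']*")


def skip_comment(src, i):
    """Return the index just past the (possibly nested) comment starting at i."""
    depth = 0
    while i < len(src):
        if src.startswith("(*", i):
            depth += 1
            i += 2
        elif src.startswith("*)", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    return i  # unterminated comment: copy through to the end of the input


def skip_string(src, i):
    """Return the index just past the string literal whose opening quote is at i."""
    i += 1
    while i < len(src):
        if src[i] == "\\":
            i += 2  # skip the escaped character (covers \" and \\)
        elif src[i] == '"':
            return i + 1
        else:
            i += 1
    return i


def rewrite(src):
    """Rename whole tokens listed in RENAMES, leaving comments and strings alone."""
    out = []
    i = 0
    while i < len(src):
        if src.startswith("(*", i):
            j = skip_comment(src, i)
            out.append(src[i:j])
        elif src[i] == '"':
            j = skip_string(src, i)
            out.append(src[i:j])
        else:
            m = IDENT.match(src, i)
            if m:
                j = m.end()
                out.append(RENAMES.get(m.group(), m.group()))
            else:
                j = i + 1
                out.append(src[i])
        i = j
    return "".join(out)


if __name__ == "__main__":
    print(rewrite('val ok = p andalso q orelse r (* andalso stays here *)'))
```

Even this toy has to know about nested comments, string escapes, and token boundaries, so a plain textual search-and-replace would not be enough for the simplest keyword change.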

Another strength of having a language which has not changed much since 1997 is stability. Standard ML programs written almost 20 years ago tend to work to this day. The same cannot be said of many other programming languages, where old programs end up "rotting" so that they eventually have to be rewritten. For a small community such rewrites are devastating, since there is no way to throw more programmers at the problem.

The rewrite-by-tooling strategy worked for Go because (a) it's a fairly primitive language, (b) there were no old code bases yet, and (c) they always advocated gofmt, which basically locks down coding style to a single blessed format for everybody. For better or worse, SML's syntax and type system are far more tricky and existing coding styles very diverse, so I doubt this strategy could work well, even for trivial syntactic changes.

I very much agree with the goal of maximizing backward compatibility. The Successor ML changes that we've started to implement (as well as the proposed Basis Library changes) should not break very much code. I think that we can make near-term progress by cleaning up problems in the Definition and adding useful features (like the proposed record and pattern extensions).

Getting the community of users to see Standard ML as something that is being actively developed may give us the foundation for more radical changes in the future. I also like the idea of automating language upgrades; in addition to the example of Go, Apple has been using the same approach for Swift.

I've been using SML for personal projects for about two years now. I really like the language, but it feels more dead than anything I've seen before. There's no community, no libraries, no important projects written in it (I'd appreciate counterevidence). The most important SML projects I've seen are the compilers. If there are more, then they're probably from academia, which doesn't care/want to advertise their projects too much.

I also follow SML-related tags on StackOverflow and try to answer questions there. Most of these are really basic questions coming from, sometimes lazy, students. Also, it's the same 5-6 people that provide answers.

SML isn't Perl, Python, or any other language that is actually used a lot in industry. It can actually indulge in some breaking changes, IMHO.

Having said that, sufficiently large projects written in SML are probably using a build tool such as CM or MLton's ML Basis system. Both of them allow specifying directives to opt in to certain features. This is the path taken for the recent SuccML additions to both MLton and SML/NJ. Any future SuccML features will probably be introduced that way, so users can gradually migrate to them. And it's probably a good idea to let these features be used for a while before they get standardized.

If the changes are so big and intricate that an opt-in flag doesn't make sense or can't be implemented, then maybe SuccML is a new language altogether.

Opt-in flags work well for introducing new features that are mostly conservative extensions. They also work well for syntax changes.

They do not work so well for semantic changes, because those typically have non-local implications, especially if the static semantics is affected. You'll often run into tricky interoperability issues: e.g., are two modules compiled with and without a given flag compatible with each other?

Flags also do not work well for features that have non-trivial interactions with other new features. In those cases they can quickly become a pain for implementations, since worst-case implementation complexity will be quadratic in the number of feature flags.
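As a rough illustration of the arithmetic behind that claim, the sketch below counts what an implementation might have to consider for a handful of flags. It assumes that features interact at most pairwise, which is where a quadratic bound comes from; the flag names are hypothetical and do not correspond to actual compiler options, and `pairwise_interactions` and `configurations` are invented helper names.

```python
from itertools import combinations

# Hypothetical feature flags, named for illustration only.
FLAGS = ["or_patterns", "record_punning", "line_comments", "do_decls"]


def pairwise_interactions(flags):
    """Every unordered pair of flags whose interaction may need special handling."""
    return list(combinations(flags, 2))


def configurations(flags):
    """Number of distinct on/off settings a test suite could be run under."""
    return 2 ** len(flags)


if __name__ == "__main__":
    n = len(FLAGS)
    print(n, "flags")
    print(len(pairwise_interactions(FLAGS)), "pairwise interactions")  # n * (n - 1) / 2
    print(configurations(FLAGS), "on/off configurations")              # 2 ** n
```

With 4 flags there are 6 pairs to reason about, with 10 flags there are 45, and the number of on/off configurations that could in principle be tested grows as 2^n.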

So in general, in my experience, fine-grained flags are only viable as a temporary measure.

On Apr 9, 2016, at 1:11 PM, Ionuț G. Stan wrote:

I've been using SML for personal projects for about two years now. I really like the language, but it feels more dead than anything I've seen before. There's no community, no libraries, no important projects written in it (I'd appreciate counterevidence). The most important SML projects I've seen are the compilers. If there are more, then they're probably from academia, which doesn't care/want to advertise their projects too much.

We clearly need a centralized list of SML projects and libraries. Of course there’s http://www.mlton.org/Users and http://www.smlnj.org/links.html. But surely http://sml-family.org could use such a list.

hear, hear.

(In reply to Alley Stoughton's message of Apr 10, 2016, 11:21, above.)

Ud71p commented

A user's vote: revolution. Please be bold. Instead of trying to
keep 2/3 of the (truth be told) tiny user base, try thinking big:
what to do to get 10^6 users? At my university SML was used in
most courses. Yet people after all these courses abandon SML and
choose some ugly languages later on. Why is that? Well, I know my
reasons, and I shall soon open issues describing the shortcomings
I run into.

But let's look at one small concrete example: orelse/andalso. I
remember being a bit put off by this when I first saw it. 6/7
characters for one of the most commonly used language features,
while C-like langs use 2 characters. I would love it if Successor
broke backwards compatibility and required me to search-and-replace
these in all my files with something shorter (||/&&, or/and, or best
∨/∧). I'd cry with joy seeing my old files compatibility-broken,
but now short and readable.

So my general advice: revolutionize.

Point well taken. I am more inclined to make such changes than are others involved. The usual counterargument is: why break existing code for relatively trivial reasons? Having said that, I do find it frustrating that people think writing “&&” and “||” is somehow ordained by god and cannot possibly use “andalso” and “orelse” instead. On the other hand, I find “=” for assignment intolerable, so it’s not as though I don’t have my preferences as well.

(In reply to Ud71p's comment of Jun 3, 2016, 04:14, above.)

I would join a revolution too ;) But I think andalso is a tiny issue, especially because using booleans is so often bad style, indicative of newcomers from languages without pattern matching.
A far, far bigger problem, IMHO, and I suspect one that does discourage SML adoption, is the lack of type classes or something similarly convenient. Remember how many issues in this project complain about the verbosity of the module language, and now you have to do one or two such things every time you need a map between two new types.