Design import and publicity semantics

Question

jfecher opened this issue 2 years ago · 11 comments

Ante's current import system is far too barebones to be usable for most programs. Since it only allows importing every name from a module into scope, we already cannot import two modules that declare the same symbol names (like Vec and Future.HashMap or StringBuilder, which all define an empty function).
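The collision described above can be sketched with a small model. This is only an illustration in Rust, not Ante's actual name resolution: the `Module` struct, `glob_import_conflicts`, and every symbol other than `empty` are hypothetical names chosen for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical model: a module is a name plus the symbols it defines.
struct Module {
    name: &'static str,
    symbols: &'static [&'static str],
}

/// Import every symbol of every module into one flat scope and report
/// each symbol that is defined by more than one module.
fn glob_import_conflicts(modules: &[Module]) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut owners: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for module in modules {
        for &symbol in module.symbols {
            owners.entry(symbol).or_default().push(module.name);
        }
    }
    // Keep only the symbols claimed by two or more modules.
    owners.retain(|_, mods| mods.len() > 1);
    owners
}

/// Illustrative symbol lists; only `empty` is taken from the discussion.
fn example_modules() -> Vec<Module> {
    vec![
        Module { name: "Vec", symbols: &["empty", "push"] },
        Module { name: "HashMap", symbols: &["empty", "insert"] },
        Module { name: "StringBuilder", symbols: &["empty", "append"] },
    ]
}
```

Calling `glob_import_conflicts(&example_modules())` reports `empty` as claimed by all three modules, which is the situation an import-everything rule cannot resolve on its own.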

This is a good first issue for anyone interested to tackle. It has some design decisions but is mostly syntax.
I have some ideas as to what I'd like here to fix and expand the system:

Fully qualified names should be able to be used in any module without importing. So foo = Vec.empty () should work regardless of whether Vec is imported (at least for Vec's current position in stdlib/Vec.an). This syntax for . on modules is currently unimplemented.
We should likely move away from import Vec automatically bringing all symbols from Vec into scope. Instead, users should be able to bring them into scope one by one via e.g. import Vec.push. This would fix the above problem without requiring users to rename functions they may not use.
- We will need a new syntax for importing all symbols of a module. import Vec.* would match the syntax used by most other languages and seems to work well enough.
- Importing multiple items will also need a dedicated syntax, e.g. import Vec.push empty reserve. I'd like to avoid any syntax that necessitates editing multiple locations to change from importing one item to multiple. E.g. with rust's use foo::bar; if we want to import foo::baz as well we need to edit both before and after bar: use foo::{ bar, baz }; instead of just after.
- Renaming imports or excluding imports from a * import will also need special syntaxes. E.g. import Foo.(bar as baz), or import Foo.(baz = bar), and import Foo.* hiding qux. The former syntax would be nicer if we had a comma separator for importing multiple names instead of a space separator.
Publicity needs to be designed as well. One of my goals here is to balance ease of use with encouraging good defaults. I find the default private publicity of rust can be quite annoying when you need to go back and make it pub(crate) later. Perhaps Ante should default to public for the current project. This could potentially be the wrong default for larger projects. We also need to decide whether we should have publicity attached to the names or elsewhere in the file, e.g. at the top of the file in an exports list which may help IDEs.
Should we allow an analogue to rust's pub use?
- Related: publicity of struct fields. Certain types may have internal invariants they want to uphold and will want to prevent mutation to its fields. I don't think per-field visibility is too useful (but I may be wrong), the only API I know of that uses it in rust is Cranelift's Function struct. A useful keyword here may be opaque. E.g. type Foo = opaque ... or opaque type Foo = ... Alternatively we could swap out the type keyword: opaque Foo = ... but then there would be multiple keywords to declare types.
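The selective, renaming, and hiding imports proposed in the list above could resolve roughly as follows. This is a minimal sketch in Rust under stated assumptions: the `Import` enum and `resolve` function are hypothetical, and the model ignores nested modules and publicity entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical import forms, loosely mirroring the syntaxes discussed above.
enum Import<'a> {
    /// `import Foo.*`, optionally with a `hiding` list.
    All { hiding: &'a [&'a str] },
    /// Named imports; each entry is (exported name, local name),
    /// so a plain import uses the same name twice.
    Names(&'a [(&'a str, &'a str)]),
}

/// Returns a map from local name to the exported name it refers to,
/// or an error naming the first requested symbol the module lacks.
fn resolve<'a>(
    exports: &[&'a str],
    import: &Import<'a>,
) -> Result<BTreeMap<&'a str, &'a str>, String> {
    let mut scope = BTreeMap::new();
    match import {
        Import::All { hiding } => {
            for &name in exports {
                if !hiding.contains(&name) {
                    scope.insert(name, name);
                }
            }
        }
        Import::Names(names) => {
            for &(exported, local) in names.iter() {
                if !exports.contains(&exported) {
                    return Err(format!("`{exported}` is not exported"));
                }
                scope.insert(local, exported);
            }
        }
    }
    Ok(scope)
}
```

With exports `bar`, `baz`, and `qux`, `Import::All { hiding: &["qux"] }` brings only `bar` and `baz` into scope, while `Import::Names(&[("bar", "b")])` makes `bar` available under the local name `b`.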

Edit: Additional items to design.

Instead of a pub modifier to mark publicity, we could consider having an explicit export list at the top of a module instead. An export list at the top could be easier for IDEs to parse if there are syntax errors in the rest of the file, and it keeps the definition site itself somewhat cleaner at the cost of occasionally needing to jump around source code to add exports or reference definitions.
What should the default publicity of an item be? Rust's default is only visible within the same module. My vote would be the equivalent of pub(crate) as the default. Then users writing applications and new users can largely be spared from worrying about publicity for the things they create and know the internals of, while library authors still have the manual control of making things private (to the library) by default.
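For reference, the Rust visibility levels being compared against look like this. The module and function names are made up for the example; only the `pub(crate)` and `pub` modifiers and the private-by-default rule are Rust's own.

```rust
mod geometry {
    // No modifier: private, usable only inside `geometry` and its child modules.
    fn square(x: i32) -> i32 {
        x * x
    }

    // `pub(crate)`: usable anywhere in this crate, but not by downstream crates.
    pub(crate) fn sum_of_squares(a: i32, b: i32) -> i32 {
        square(a) + square(b)
    }

    // `pub`: the item is public, but other crates can reach it only if
    // the path to it (here the `geometry` module) is public as well.
    pub fn distance_squared(p: (i32, i32), q: (i32, i32)) -> i32 {
        sum_of_squares(p.0 - q.0, p.1 - q.1)
    }
}
```

Code elsewhere in the crate can call `geometry::sum_of_squares` and `geometry::distance_squared`, but a call to `geometry::square` is rejected by the compiler. Making `pub(crate)` the default would remove the first annotation step for code that never leaves the project.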

Answer 1 · 2022-06-23T18:14:09.000Z

I am working on this. We should avoid overloading the . operator. This makes every compiler stage harder to implement.

We can introduce :: namespace token similar to Rust and C++.

So it looks like:

import Vec::empty push

v = mut Vec::empty ()
Vec::push v 1

Answer 2 · 2022-06-23T19:20:15.000Z

I'd like to avoid a :: operator unfortunately for very subjective reasons. It is less ambiguous but looks much worse visually. Modules/Types were made to be syntactically separate from other values such that Capitalized.foo always refers to a path and lowercase.foo is always a field access so I don't think it is too ambiguous.
Modules are also theoretically compile-time structs with each definition as a separate field so I like the symmetry there with normal field access.
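The capitalization rule described above can be sketched as a tiny classifier. This is a hypothetical illustration in Rust, not Ante's parser: `DotKind` and `classify_dot` are invented names, and the sketch looks only at the first character of the left-hand side.

```rust
#[derive(Debug, PartialEq)]
enum DotKind {
    /// `Capitalized.foo`: a path into a module or type.
    ModulePath,
    /// `lowercase.foo`: a field access on a value.
    FieldAccess,
}

/// Classify `lhs.rhs` by the case of the first character of `lhs`.
/// Returns `None` for an empty left-hand side.
fn classify_dot(lhs: &str) -> Option<DotKind> {
    let first = lhs.chars().next()?;
    if first.is_uppercase() {
        Some(DotKind::ModulePath)
    } else {
        Some(DotKind::FieldAccess)
    }
}
```

Under this rule `Vec.empty` is treated as a path and `v.len` as a field access, so the decision needs no type information.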

Answer 3 · 2022-07-18T16:33:54.000Z

How about indentation?

import Foo
  bar as baz
  quux

Answer 4 · 2022-07-18T16:58:36.000Z

Indentation could work, but would currently require a keyword before the indentation to work around current technical limitations with how indentation is handled in the lexer. Namely ante's line continuation mechanism is based around certain keywords expecting indentation after them or not, and if no keyword is before an indent then the indent token isn't issued.

So we could pursue a similar syntax like:

import Foo with
    bar as baz
    quux

Also worth noting that several points from this proposal were implemented and merged in #114

Answer 5 · 2022-07-27T19:19:05.000Z

I'd suggest grabbing use from Unison.

import stdlib.{List, Vec}

// ...
use List.*
<function that makes heavy use of operations from hypothetical module List, referring to them just as "empty", "push", etc.>

// ...
use Vec.*
<function in the same file that refers to operations from Vec module, without explicitly specifying this at every call>

... so that use brings required functions into local scope.

Answer 6 · 2022-09-04T15:54:39.000Z

Two quick thoughts: imports anywhere but the top of the file feel pretty confusing to me. I'd highly recommend discouraging/forbidding it...

As for relative imports, I think python handles it nicely? I'd also think that importing everything into the global namespace should not be allowed? I just can't think of why that'd be a good idea compared to string.to_int type imports.

Answer 7 · 2023-02-22T02:40:01.000Z

Edited the original issue to add some additional features to design/consider. The list is quite large so perhaps I will break it out into separate issues.

Answer 8 · 2023-02-22T10:35:56.000Z

I'm fond of the idea of including an export statement at the top of the module. However, I would still suggest requiring it even in cases where everything in a module is meant to be public, while providing a simple shorthand for that case. This is somewhat like how it's done in Haskell.

In Haskell, if you want to implicitly export everything, you can just write it as such:

module Foo where

foo x = (+ x * x)

bar x = foo . foo x

But if you want export only specific functions you'd list them after the module name:

module Foo (bar) where

That way, making everything public requires 3 keywords at the top of the module, but is neither opt-in nor opt-out, since the developer has to make that decision explicitly. The syntax that I think would fit Ante is something like this:

export (..) with

foo x = _ + x * x

bar x = fn y -> foo ((foo x) y)

And if we wanted to export only specific functions, we would do something like so:

export (bar) with

It should also be possible to fold a multi-line export statement across lines, like this:

export (
  foo,
  bar,
  baz,
) with

I chose (..) instead of * since that makes the syntax for exporting everything and the syntax for exporting only specific things more similar to one another.

Answer 9 · 2023-02-22T11:37:54.000Z

If we were to adopt that route, I'd also suggest rewriting the current module import syntax to be analogous.
import Vec would import the module into the current scope under the name Vec so that you could use items that it exported like so: Vec.empty. However, if you wanted to import everything that module exports into the current module scope, you could do import Vec (..), or selectively import items by listing them between the parentheses.

// Items from this import would be present as ModA.foo
import ModA

// Items from this import would be present as ModB.SubMod.foo
import ModB.SubMod

// Items from this import would be present as MC.foo
import ModC as MC

// This would import the item foo from module ModD into the current module's namespace
import ModD (foo)

// This would import all items from the module ModE.SubMod into the current module's namespace
import ModE.SubMod (..)

Answer 10 · 2023-02-22T19:22:38.000Z

I'm leaning towards the separate export list approach as well. I'm actually not sure if we should provide a way to explicitly export everything though. The disadvantages of providing this would be that one can no longer read the export list for the full list of exported names, and thus the compiler also needs to scan the entire file for exports. The latter point is less important if the default visibility of a name is not module-private, however. If the default visibility is crate-private then we would need to scan each definition anyway (assuming export = completely public outside the crate).

export is also less granular than rust's pub(item) syntax where you can specify the exact degree of publicity you need. I think the basic visibility levels that would be needed are:

1. Visible only within the current module. Use case: helper functions and implementation details.
2. Visible only within the current crate. Use case: widely used modules like util modules or modules that define basic data structures used by the rest of the crate.
3. Visible to external users importing the program as a library. Use case: the public API of a library.

If we say that export is case 3, then we still need to decide how to differentiate cases 1 and 2, be it an additional modifier on export, or something else. One way could actually be to only have export, but make it so intermediate modules themselves need to be exported to be visible. For example, if we have a library structure of:

MyLib
|- lib.an     // MyLib
|- foo
   |- foo.an  // MyLib.Foo
   |- bar.an  // MyLib.Foo.Bar

Then to use a function MyLib.Foo.Bar.baz across the whole crate (if module-visible is the default) we would have to do:

// in bar.an
export baz

// in foo.an
export Bar

Then to change it from pub(crate) to pub we would need one more export to expose the Foo module in main:

// in lib.an
export Foo

This lets us achieve arbitrary level exports (beyond the basic 3 listed above), but has the downside that now all exported definitions in Foo or Bar would be completely public with no option for some to remain public to only the current crate.
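One way to read the chained-export scheme above is that each consecutive export, counted from the item outward, widens visibility by one enclosing module. The Rust sketch below models that reading only; `Reach` and `reach` are hypothetical names, and treating an unexported link as a hard stop is an assumption of this example rather than something the proposal spells out.

```rust
#[derive(Debug, PartialEq)]
enum Reach {
    /// Visible inside the named module and everything nested in it.
    Within(String),
    /// Visible to external users of the library.
    External,
}

/// `modules` lists the enclosing modules from the crate root inward,
/// e.g. ["MyLib", "Foo", "Bar"] for `MyLib.Foo.Bar.baz`.
/// `exports` lists export flags from the item outward:
/// [baz exported in bar.an, Bar exported in foo.an, Foo exported in lib.an].
fn reach(modules: &[&str], exports: &[bool]) -> Reach {
    // Count the unbroken run of exports starting at the item itself.
    let hops = exports.iter().take_while(|&&exported| exported).count();
    if hops >= modules.len() {
        Reach::External
    } else {
        // Each hop widens visibility by one enclosing module.
        Reach::Within(modules[modules.len() - 1 - hops].to_string())
    }
}
```

For `MyLib.Foo.Bar.baz`, flags `[true, true, false]` give `Within("MyLib")`, the `pub(crate)` analogue, and `[true, true, true]` give `External`. The model also makes the stated downside visible: the flags on `Bar` and `Foo` are shared by every item they contain, so no item exported from `Bar` can stay crate-only once `Foo` is exported.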

Answer 11 · 2023-10-02T22:26:52.000Z

From my limited experience with Elm, I think exporting/importing everything from a module can quickly become quite problematic.
Maybe Ante could take a middle ground, akin to something like what Erlang does.
Erlang has a compile flag -compile(export_all) that you can add in your file to just export everything, but then you also get a compile time warning to remind you to change it later, when you finalize your APIs.

Also, have you looked at Roc's syntax? (https://www.roc-lang.org/tutorial#app-module-header)
It's quite clever imo and even encodes some other metadata.
The provides [main] to pf especially could be nice to specify the visibility for the module/crate/extern level.
e.g.

export [push, pop] to crate
// or
export [push, pop] to module
// etc..