Shopify/packwerk

Support for treating package root module as public

Closed this issue · 11 comments

[Enhancement]

We are just beginning to use Packwerk and loving its power! However, it doesn't allow us to follow a common pattern in our existing packaged code. I would like this feature and wanted to see if you'd accept the PR or if we should think about forking.

We commonly follow this directory structure

app/
  domains/
    identity_domain/
      user.rb
      package.yml
    identity_domain.rb
package.yml

In this structure, the root module sharing its name with the package serves as the entry point to the domain:

# app/domains/identity_domain.rb
module IdentityDomain
  def self.get_user(id)
    IdentityDomain::User.find(id)
  end
end

This seems to conflict with Packwerk, because AFAICS there is no way to tell it to treat the root module as public. The best we can come up with is moving the methods to e.g. IdentityDomain::Public::API but this feels quite verbose. I envision solving this with a per package opt-in like:

# app/domains/identity_domain/package.yml
enforce_privacy: true
public_root: true

This seems to overlap with the conversation in #98 but with a much narrower focus. I'm not talking about allowing arbitrary public entrypoints, but rather the option to treat the root module as a public constant.

As I say, would you consider a PR for this? Do you consider this an anti-pattern perhaps?

ah, so this relates to how autoload roots and package roots interact, as the namespace path starts at the autoload root.

For us, most packages contain autoload roots, so that we don't have this problem. E.g. for the most of our packages, a file path may be components/sales/app/models/something.rb; the package would be components/sales and the autoload path would be components/sales/app/models.

We do have some packages within those packages, and packages within lib, that don't fit this pattern.

The solution you propose ("public root") would require a robust definition of what a root is; actually, following packwerk conventions, the identity_domain.rb file in your example wouldn't be contained in the app/domains/identity_domain package at all.

One way we work around that problem when it occurs is just more nesting, similar to your IdentityDomain::Public::API example. But many of those smaller packages don't enforce privacy in our case so that we don't need two additional levels of nesting.

I see that this is not fantastic and am open to other solutions.

One that I can think of is to smartly define autoload roots for each package, e.g. per making all packages rails engines.

Hey @exterm, thanks for engaging even though you don't need this functionality :)

actually, following packwerk conventions, the identity_domain.rb file in your example wouldn't be contained in the app/domains/identity_domain package at all.

Is this the case in practice? I assumed this, but when I actually referenced an app/domains/identity_domain.rb from my root package, I found that Packwerk correctly identified the constant belonged inside a package that, as you say, in reality the file is not in:

app/controllers/application_controller.rb:19:4
Privacy violation: '::IdentityDomain' is private to 'app/domains/identity_domain' but referenced from '.'.

When you refer to autoload roots, are you referring to https://github.com/fxn/zeitwerk#root-directories-and-root-namespaces i.e. Zeitwerk's? loader.push_dir? I don't totally follow what your "smartly define autoload roots" alternate solution would entail.

I see the issue around the public root being outside the Packwerk package root, as in up a level in the directory structure from it. Without knowing intimately Packwerks internals yet, I can't say if this is robust enough, but hows this algorithm:

  1. Given an application including a load path app/domains
    And a Packwerk root package package.yml
    And a Packwerk package app/domains/identity_domain/package.yml
  2. When discovering a constant referenced in one package from another (violating privacy), in this case IdentityDomain
  3. Capture the path the constant resolves to ( app/domains/identity_domain.rb)
    And check to see if there is a app/domains/identity_domain package (i.e. replace .rb with /package.yml)
    And check that package sets public_root: true
  4. Before considering the reference to IdentityDomain a violation

Does that seem robust enough to constitute a solution? From a user perspective the rule is:
"To use public_root:, exactly one Zeitwerk load path must be defined up the directory tree from your package(s), or your packages will not have a single root module in the first place. When public_root: true, the root module for the package (which sits alongside the package in the directory structure) is considered public and can therefore be referenced from outside the package".

Packwerk correctly identified the constant belonged inside a package that, as you say, in reality the file is not in

"Which file defines a constant" doesn't have a general unambiguous answer in Ruby, since one constant can be (and commonly is, for namespaces) defined in multiple files. I'm pretty sure each of the files in app/domains/identity_domain/**/*.rb defines the constant IdentityDomain.

IIRC Packwerk in this case tries to find the most specific information, meaning the longest path that can contain a definition.

When you refer to autoload roots, are you referring to fxn/zeitwerk#root-directories-and-root-namespaces i.e. Zeitwerk's? loader.push_dir? I don't totally follow what your "smartly define autoload roots" alternate solution would entail.

Yes, I am referring to Zeitwerk's autoload roots, as that defines which parts of the file path are parts of the namespace path.

To illustrate, our main monolith is set up like this:

components/
           sales/ # this is a rails engine, and a package
                 app/
                     models/ # this is an autoload root
                            sales/
                                  some_model.rb # Sales::SomeModel, part of the components/sales package

Actually, looking into this more deeply, I think IdentityDomain should be resolved to app/domains/identity_domain.rb. I'm going to add a test to https://github.com/Shopify/constant_resolver/ to make sure that's what's happening.

Yeah, what you described should not be happening. See Shopify/constant_resolver#23 for a test that seems to prove it doesn't.

Can you provide some info on your example so that I can try to reproduce it?

Hey @exterm, sorry about the long pause. You're absolutely right that Packwerk already behaves well mapping IdentityDomain to app/domains/identity_domain.rb.

I figured the best thing I could do to progress this is try and implement and document the behaviour we rely on, and demonstrate it in a dummy Rails app. I figure if I can't define the behaviour clearly in docs, its too complex to hope for an upstreaming.

I've had a go at it, and in terms of code the implementation is quite simple, if anything writing the docs to be easily understandable has proven the hardest part!

main...bessey:public-root

and demonstrated in a fresh rails app, to exercise the new behaviour.

https://github.com/bessey/packwerk-root-rails

The output of which is:

app/controllers/application_controller.rb:6:4
Privacy violation: '::AutoloadDomain' is private to 'app/domains/autoload_domain' but referenced from '.'.
Is there a public entrypoint in 'app/domains/autoload_domain/app/public/' that you can use instead?
Inference details: this is a reference to ::AutoloadDomain which seems to be defined in app/domains/autoload_domain/app/models/autoload_domain.rb.
To receive help interpreting or resolving this error message, see: https://github.com/Shopify/packwerk/blob/main/TROUBLESHOOT.md#Troubleshooting-violations


app/domains/broken_domain/private.rb:1:7
Dependency violation: ::BrokenDomain belongs to '.', but 'app/domains/broken_domain' does not specify a dependency on '.'.
Are we missing an abstraction?
Is the code making the reference, and the referenced constant, in the right packages?

Inference details: this is a reference to ::BrokenDomain which seems to be defined in app/models/broken_domain.rb.
To receive help interpreting or resolving this error message, see: https://github.com/Shopify/packwerk/blob/main/TROUBLESHOOT.md#Troubleshooting-violations


app/domains/identity_domain/private.rb:9:4
Dependency violation: ::BrokenDomain belongs to '.', but 'app/domains/identity_domain' does not specify a dependency on '.'.
Are we missing an abstraction?
Is the code making the reference, and the referenced constant, in the right packages?

Inference details: this is a reference to ::BrokenDomain which seems to be defined in app/models/broken_domain.rb.
To receive help interpreting or resolving this error message, see: https://github.com/Shopify/packwerk/blob/main/TROUBLESHOOT.md#Troubleshooting-violations


3 offenses detected

Do you think something like this would have a hope of getting merged?

Hey @bessey, Alex here from Gusto, where I work on the product infrastructure team. We love packwerk too!

I'm just popping in as I'm curious about your use case, and I'm wondering if something we created can help you while keep Packwerk assumptions simple (i.e. only files within a package belong to that package).

As Philip suggested:

One that I can think of is to smartly define autoload roots for each package, e.g. per making all packages rails engines.

Having each of your packs contain autoload paths could help resolve this (and also perhaps serve the benefit of colocating your public APIs with the packs in question).

The org where I work (Gusto) has (somewhat privately) published a tool that we use to do this automatically: https://github.com/Gusto/stimpack

By including this in your gemfile, packages are automatically rails engines. In our system, your file system would be structured as:

package.yml # root level pack
# Unpackaged code and code that cannot be in `packs`
app/
  models/
  ...
packs/
  identity_domain/
    app/
      public/
        identity_domain.rb
        identity_domain/
          identity_domain.rb
      services/
        identity_domain/
          some_private_class.rb

The first part of these paths, packs/identity_domain/app/public is autoloaded, and thus is only meant to group the code and not part of the namespace. After packs/*/app/*, zeitwerk considers subsequent folders to be part of the namespace. These autoload paths are automatically added by stimpack.

So, for example:

# packs/identity_domain/app/public/identity_domain.rb
module IdentityDomain
  def self.get_user(id)
    IdentityDomain::User.find(id)
  end
end
# packs/identity_domain/app/services/identity_domain/some_private_class.rb
module IdentityDomain
  class SomePrivateClass
  end
end

Let me know if this makes sense to ya. I'd also be down to zoom with you if you'd like to chat more!

Happy holidays,
Alex

Hey Alex, that's certainly a solution to the problem I'm trying to solve. My first reaction was that writing X/identity_domain/ 5 times just to get a public API for one model and service seems painful, but playing around with it I see these directories get collapsed in VSCode, and the Rails engine generator creates most of the directories. Presumably with Stimpack this is a fair bit of toil though, and you still have that odd public/ folder containing two identity_domain.rb's.

Prior to this conversation (which perhaps we should move to Discussions?) I thought my ideal solution was:

  app/
    controllers/
    graphql/
    policies/
    views/
    domains/             # Rails autoload path as usual, these are not Rails Engines
      identity.rb        # ::Identity public API
      identity/
        models/          # ignored thanks to Zeitwerk directory collapsing
          account.rb     # ::Identity::Account
        services/        # ignored thanks to Zeitwerk directory collapsing
          login.rb       # ::Identity::Login
        package.yml

Which gives us constant namespacing for all of the private constants within a domain, without polluting the directory tree, and a single entry point to call into the domain from outside.

To support that we wuold need to add these Packwerk features

  • Support public roots as per this issue
  • Support Zeitwerk directory collapsing

But now that I play with Rails engines a bit, its much easier to set up than I believed. Perhaps a better approach would be to lean into the Rails engine approach:

app/
  models/
  ...
domains/                 # each of these is a textbook Rails engine
  identity/
    app/
      services/
        identity/
          some_private_class.rb
    lib/
      identity.rb        # ::Identity public API, this is the Rails engine and gem default 
    package.yml

So we've still got the namespacing, but we are allowed that more complex directory structure, to abide by Rails conventions better. Now feature wise, Packwerk needs a way to mark ./lib/identity.rb as public, but not the rest of ./lib/. This would be covered by the proposal in #98, but since this is following ruby gem conventions perhaps a more focused approach would be to add a gem_public_entrypoint: true API that marks ./lib/{package_name}.rb public?

Oh and for context, the reason I want this top level public API so much is that we're using Packwerk to implement Domain Driven Design. In our architecture, only the domain layer is divided into Packwerk packages. We consider controllers, views, serializers, GraphQL schema, all to be in the higher application and UI layers of the architecture, which may depend on multiple domains in lower layers. Today all of the aforementioned objects are directly in the monolith. For us, having a single clear place that defines the entire public API for the package is very valuable for the same reason a single routes.rb is.

Down to Zoom, but not this side of Christmas :D feel free to message me. Happy holidays to you too

bessey@gmail.com

Thanks for sharing more Matt.

Instead of lib/identity.rb, could you have identity.rb live in domains/identity/app/public/identity.rb? domains/identity/app/public would be part of your autoload path, and by default, everything in that folder is considered public by Packwerk.

Big advocate of domain driven design! I've also been pushing for the single, top-level API location at the org where I work -- its a nice convention. Let me know if the above would work for you or if you have concerns/needs that aren't addressed with that approach.

I think it might well do, yes. Thanks Alex! I think ideally I would still prefer one didn't have to add autoload paths, since it breaks the widely understood rules of Rails around where to find constants of a given name. I suppose that's already broken when you're dealing with Rails engines / gems though, so its not too much additional overhead.

Bring on the language servers that make this sort of constant lookup a doddle in the future :D

Going to close this for now because I believe we have resolution on the issue! Please leave another comment or open another issue if you or anyone has an unresolved use case.