Shopify/packwerk

Packwerk doesn't support custom root namespaces for root directories in the load path

Closed this issue · 5 comments

In Zeitwerk, the load path is list of directories that it calls "root directories". They are "root" because any files that reside in them are expected to represent modules that are part of the root namespace. If an object is nested in some more specific namespace, it needs to be placed in a subdirectory of one of these roots. For example, in the following, lib is a "root directory" with the following structure and the ruby files have the noted expected constants:

.
├── lib
│   ├── my_namespace
│   │   └── my_file.rb            # MyNamespace::MyFile
│   └── my_other_file.rb          # MyOtherFile

The thing is, Zeitwerk also supports setting a "default namespace" to each root directory via push_dir which is called when the Zeitwerk loader is being configured. So in the above, it was assumed lib was added to the loader via loader.push_dir("lib") but it could also be added with loader.push_dir("lib", MyCompany) which means the constants inside would be expected to be declared as MyCompany::MyNamespace::MyFile and MyCompany::MyOtherFile.

At GitHub we rely on this feature to work around some extremely legacy directories in our load path that do not conform to Zeitwerk's expectations. Up until now, we have worked around them in Packwerk by skipping some and "hacking" at others. At this point, we'd like to just add support for custom namespaces to Packwerk so we can be confident in Packwerk's operation.

While I haven't dived into the Packwerk's source for this yet, I suspect this would mean moving Packwerk::ApplicationLoadPaths from config.autoload_paths, eager_load_paths, and autoload_once_paths to Rails.autoloaders to get our list of root directories as a hash of paths to modules. By default, these modules would all just be Object as they are in Zeitwerk, but it would allow us to override that default as needed for certain paths.

I realize this would also necessitate a change to ConstantResolver as well and have a WIP branch here. This branch uses strings instead of module instances for the namespaces in order to keep us as disentangled from a dependency standpoint as possible.

While I realize this is a change that is probably not needed for most codebases, I think that improving Packwerk's support for Zeitwerk's configuration is overall a positive.

So at this point, before I go ahead with a PR for Packwerk, what do folks think about adding such support?

Just to be clear, if there is no opposition to this change (which shouldn't affect anyone except those using the esoteric Zeitwerk::Loader#push_dir method), then I'm happy to start work on a branch to add this support.

I think this generally makes sense. I'm curious to see what the implementation would look like, but I don't expect any blockers there.

It looks like the bulk of the implementation for this would live in constant_resolver, so as a next step I'd be interested in seeing a test case that shows the new behavior on that level.

I'd also be curious to hear what @rafaelfranca and @alexevanczuk think about this change, in case I'm missing something. My impression is that this adds a small amount of complexity, in a pretty contained fashion--I don't expect many additions to packwerk itself--for the benefit of being more compatible with Rails.

I'm ok with this change, we already require Zeitwerk to be used in the application in order to Packwerk to check.

Agreed – my understanding is this is all about making sure that we can properly identify the file path where a constant is defined, and part of doing so is making sure that constant_resolver understands all of zeitwerks features.

Quick question... is there any reason we don't just ask zeitwerk itself for where a constant lives? Now that rails is loaded in the environment while parsing files, I wonder if we can just ask zeitwerk directly. I'm not sure that makes sense, but it might allow us to resolve all issues where constant_resolver doesn't implement a zeitwerk feature in one fell swoop. I'm not sure this API is even available in zeitwerk, or if it is, if it's reliable without eager loading the application. Just a thought!

is there any reason we don't just ask zeitwerk itself for where a constant lives?

We investigated that in the beginning, but for zeitwerk determining the location of a constant involves loading constants. IIRC, to resolve Sales::Order, zeitwerk would first load Sales, then register an autoload on it for Order on it, then execute that.

My memory is a little fuzzy but the takeaway is that for zeitwerk, resolving constants involves loading constants, and we deemed that to be too slow, especially seeing as loading one constant can create a whole cascade of other constants to load.