fxn/zeitwerk

Get all descendants of a class without eager loading the Rails app

adrianthedev opened this issue ยท 29 comments

Hello ๐Ÿ‘‹.

I'm building a gem that is configured through class inheritance.
To be more explicit, one can configure a resource like so class UserResource < Avo::BaseResource.

During the initialization of the gem I need to find all the descendants of the BaseResource class and generate some routes for them.
My issue is that they are available in ObjectSpace only after I run Rails.application.eager_load!.

The issue with that is that it reloads the whole app on every file reload. Even the files that weren't touched.

avo-hq/avo#1249 is the ticket that has all the reproduction steps (pretty easy to test out) if anyone wants to dive in.

My question is: How can I get the descendants of a class without running an eager_load! on the whole app? Or can I run eager_load! on a path?

Thank you for your help!

There's unfortunately no general solution for this. You can look at a proposal we did for Active Record a few years ago: rails/rails#36487

What you can do is redefined descendants in your base class that is expected to have descendants when in lazy loading mode, and use that to trigger the load of the subclasses, but that requires all your subclasses to follow a conventions. e.g. (pseudo code):

class Component
  unless Rails.autoloaders.main.eager_load
    class << self
      def descendants
        Dir["**/*.rb", base: "path/to/components/"].each do |component_path|
          component_path.sub(/\.rb$/, '').camelize.constantize
        end
        super
      end
    end
  end
end

I guess Zeitwerk could offer some kind of helpers to make this easier, but that's Xavier's call.

Thanks for this response. Even though this approach a bit "archaic", this could solve my issue.

Just spitballing here, I'd love to have something like eager_load(directory: ) or eager_load(namespace: ), or something similar.

Again, thanks for the reply!

I'd love to have something like eager_load(directory: ) or eager_load(namespace: ), or something similar.

I personally that would be a nice helper that would make the above pattern easier enough to implement that I'd consider it a solved problem.

fxn commented

Thinking about it.

fxn commented

I believe we are going to restore a better version of the tentative preload interface that existed in the first versions of the library. Usage I imagine would be:

loader.preload("/path/to/directory")    # eager loads the directory
loader.preload("/path/to/file.rb")      # eager loads the file
loader.preload("/path/to/**/*.pattern") # expand and apply

Preloading would happen after setup, you wouldn't need to distinguish lazy vs eager load in the caller, just configure these preloads unconditionally.

Implementing this needs some care, because it needs to be aware of custom namespaces, collapsing, and ignored directories/files, off the top of my head, but I think it is doable.

What do you think?

I think that works, but maybe it might make more sense to expose a constant or constant name based API rather than a path based one?

e.g. to re-use the Components:: namespace example, wouldn't it feel cleaner to do:

Zeitwerk.preload_namespace(Components)

The advantage being that it would work in situations where the namespace is shared between multiple loaders. That effectively decouple this concern from how files are laid out. Just a thought though.

fxn commented

But that approach (which could be complementary), does not solve this use case, because UserResource lives in Object, no? Here, you want to preload anything in app/resources.

Indeed. I confess I didn't look at OP's example in particular, I was answering in the general sense.

IMO both works with their own tradeoffs. In some way it's approaching the problem from two directions.

  • loader.preload("/path/to/**/*.pattern") offer a solution using a loader configuration.
  • Zeeitwerk.preload ... offer an "application API" solution.

I don't have a strong opinion either way, my only preference would be for the solution to reside in the parent class declaration rather than in the app config.

But that approach (which could be complementary), does not solve this use case, because UserResource lives in Object, no? Here, you want to preload anything in app/resources.

That's correct. I'd like to preload all descendants if a class or all files in one directory (and all its subdirectories).

fxn commented

It's two usage patterns, the way I see it.

One usage pattern is "I want this to be eager loaded always". That means, you want it to be eager loaded in lazy mode on boot, and on each reload. That belongs to configuration, you declare this need, and loading is automated for you.

If you want to eager load a namespace as in (improvised):

stack = [Namespace]
while mod = stack.pop
  mod.constants(false).each do |cname|
    value = mod.const_get(cname)
    stack.push(value) if value.is_a?(Module)
  end
end

This is certainly easy, and the code proves it is aligned with the Zeitwerk mental model, because you work at the constants level, not filesystem level. So, all internal logic and edge case support is baked in.

However, do we have a use case for this?

I'd like to preload all descendants if a class

Note that Zeitwerk doesn't know anything about who inherits who, it doesn't even really know whether something is a class or a module.

So the only options here are:

  • Everything under a path.
  • Everything under a namespace.

do we have a use case for this?

That class hierarchy might be slow to load and rarely used. I see value for only loading all this if you use it in big apps. Hence why I'm not a big fan of loader.preload(...) and prefer something declared inside the "entrypoint" namespace.

But for smaller use cases it probably doesn't matter much.

fxn commented

Oh yes, regarding subclasses, Zeitwerk only helps having things in memory. So your current ObjectSpace-based loop or ActiveSupport::DescendantsTracker would be needed.

Hence why I'm not a big fan of loader.preload(...) and prefer something declared inside the "entrypoint" namespace.

I don't follow you here :).

If you want to load something on boot and on each reload, you already opt-ed in to whatever time doing that work takes. You want to do that work! And you can do that for multiple types of arguments, potentially.

Even more, it would not bring any penalty to eager loading mode, because remember that eager loading is a recursive const_get following internal metadata. So whatever was already loaded would not be attempted to be eager loaded again.

Now, what I don't get is your proposal for eager loading a namespace manually:

loader.load(Admin)

this only runs once, you need to have the loader ready (to be able to refer to Admin as a constant), and I don't know of a use case for it.

fxn commented

@byroot ahhh, I see where are you coming from.

You mean that if your logic allows you to detect when something needs to be eager loaded on demand, it is more efficient to be able to do that on the spot, rather than paying the price upfront always regardless of whether that spot is going to be hit.

Absolutely.

I don't know if that is the case here.

fxn commented

@adrianthedev you have an ObjectSpace-based loop somewhere, and it is there that you need the resource classes in memory. So I guess you could then loader.eager_load("app/resources") at that point?

You mean that if your logic allows you to detect when something needs to be eager loaded on demand, it is more efficient to be able to do that on the spot, rather than paying the price upfront always regardless of whether that spot is going to be hit.

Exactly.

I don't know if that is the case here.

Yeah, It doesn't seem to be the case case in the linked, but @adrianthedev could decide to move all the BaseResource descendants in a Resources namespace or something like that to make it easier. And Shopify have a bunch of cases where this would be useful.

@adrianthedev you have an ObjectSpace-based loop somewhere, and it is there that you need the resource classes in memory. So I guess you could then loader.eager_load("app/resources") at that point?

Yes. That would work!

Yeah, It doesn't seem to be the case case in the linked, but @adrianthedev could decide to move all the BaseResource descendants in a Resources namespace or something like that to make it easier. And Shopify have a bunch of cases where this would be useful.

Unfortunately, that would be a pretty significant breaking change for our users. I'd rather have a way to plug in where I can and trigger the reload.

Thank you both for looking into this and for the help.

fxn commented

And Shopify have a bunch of cases where this would be useful.

Oh, then we do have use cases, I'll write this too.

However, the library can provide both features: You could eager load a namespace, and you could eager load a directory. Both are possible. If you eager load a directory, you won't eager load the whole namespace if it is scattered over multiple directories, but that is fine, you didn't ask for it.

In the case of this ticket, you want to eager load app/resources without eager loading the whole Object (which would mean eager loading the application in this context).

fxn commented

BTW, I still believe preload makes sense eh? I can perfectly imagine situations in which the programmer doesn't want to bury an eager_load deep in the code, and it is easier to work with the precondition that if the loader is set up, that code is in RAM. You might not even be in a Rails context.

However, preload is going to be implemented via eager_load anyway, the only thing that changes is who runs the code and when.

BTW, I still believe preload makes sense eh?

I think so too, it's too somewhat similar but still a bit different use cases.

fxn commented

Quick followup regarding this snippet:

stack = [Namespace]
while mod = stack.pop
  mod.constants(false).each do |cname|
    value = mod.const_get(cname)
    stack.push(value) if value.is_a?(Module)
  end
end

It honors collapses, custom namespaces, ignores, ..., but it does not honor eager load exclusions.

fxn commented

I'm working on this. Will followup when I have something.

Thank you for helping with this!

fxn commented

This is in main, implemented in 80f8e00.

I'll let it rest for a bit before releasing. @adrianthedev do you have a way to give it a try?

In Avo you can eager load resources the very last moment they are needed. But if you prefer not to bury accessing the loader deep in the source code and keep things simple, you can eager load on boot and on reload this way (untested):

# In an initializer of the engine.
initializer "eager load resources" do |app|
  app.config.to_prepare do
    Rails.autoloaders.main.eager_load_dir("#{app.root}/app/resources")
  end
end

Enable logging with Rails.autoloaders.log! in config/application.rb to see it working.

@byroot I think I can tackle eager loading namespaces next. I'll open a different ticket.

Thank you @fxn! Let me try it today and report back.

This work on my end, but I'm still waiting for confirmation from the OP to see if the issue persists on their side.

fxn commented

@adrianthedev probably won't matter, but since I wrote the original patch I changed my mind and now eager_load_dir is more strict. That is, if the argument is not managed by the loader the method raises. Docs have been updated.

I plan to do a release including this feature really soon.

Got it @fxn! Waiting for the new release.

Thanks for taking the time to work on this.

fxn commented

Version 2.6.2 is out, includes this feature.

๐Ÿ™Œ Thank you!