fxn/zeitwerk

Getting the source location of a constant while loading it

alxckn opened this issue · 12 comments

I would like to be able to determine the location of a class or module while it's being loaded.

We have some code that ends up doing something along these lines:

class Somewhere
  include ActiveModel::Model

  class_attribute :location

  def self.set_location
    klass_name = self.to_s

    self.location = Object.const_source_location(klass_name)&.first
  end

  set_location
end

Calling Somewhere.location gives us "/home/alex/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/zeitwerk-2.6.12/lib/zeitwerk/loader.rb". I assume this is normal behavior and that the file location is set later in the loading lifecycle.

Looking at Zeitwerk's README, there is an on_load callback that looks like it does exactly what I want:

class Somewhere
  include ActiveModel::Model

  class_attribute :location

  def self.set_location
    klass_name = self.to_s

    Rails.autoloaders.main.on_load(klass_name) do |klass, abspath|
      self.location = abspath
    end
  end

  set_location
end

From within a rails console in development mode, this will work: Somewhere.location gives us ".../app/models/somewhere.rb", however it breaks eager loading:

irb(main):001> Rails.application.eager_load!
/home/me/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/zeitwerk-2.6.12/lib/zeitwerk/loader/config.rb:255:in `synchronize': deadlock; recursive locking (ThreadError)

I am surely going about it the wrong way. Could you point me to a better approach, please?

fxn commented

The first option should work:

# app/models/foo/bar.rb
class Foo::Bar
  def self.location
    @location ||= Object.const_source_location(name)
  end
end

See:

% bin/rails r 'p Foo::Bar.location'
[".../app/models/foo/bar.rb", 1]

In the case of implicit namespaces, the location points to the loader because the loader is the one creating the autovivified module. In the example above, assuming there is no foo.rb, we'd get:

% bin/rails r 'Foo; p Object.const_source_location(:Foo)'
[".../gems/zeitwerk-2.6.12/lib/zeitwerk/loader/callbacks.rb", 56]

However, it has to be said that those APIs are useful for generic situations. If the method is being written in the source file itself as in the issue description, of course you could simply do this:

# app/models/foo/bar.rb
class Foo::Bar
  def self.location
    __FILE__
  end
end

but that depends on the details of the project. It would not work with inheritance, for example.

Does any of that help?

Thanks a lot for your answer. We are indeed using the first option you propose to get around the issue: load the code (so that Foo::Bar is loaded) and then call the location method. That way Object.const_source_location gives the right answer. I was looking for a way to avoid lazily computing the location and whatever we need to do with it (which might be costly).

The specific use cases we are having trouble with are indeed related to inheritance (so __FILE__ does not have the right value)
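To illustrate the inheritance point, here is a minimal sketch of the lazy lookup from the first option. LocatableBase and LocatableChild are hypothetical names used only for this example. Because the lookup goes through name, each subclass resolves its own definition site, whereas __FILE__ written in the parent would always report the parent's file.

```ruby
class LocatableBase
  # Memoized per class: @location lives on whichever class receives the call,
  # so a subclass does not reuse the parent's cached value.
  def self.location
    @location ||= Object.const_source_location(name)
  end
end

class LocatableChild < LocatableBase
end

# Each call reports [file, line] for the class it was invoked on.
p LocatableBase.location
p LocatableChild.location
```

The lookup runs on first use, after the class body has been fully evaluated, which is why it avoids the problem described in this issue.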

fxn commented

I see. I'll follow up with further details (on my phone).

Could you give me some more context?

  • What is the goal of this?
  • Why do you want it to be eager instead of lazy?
  • If inheritance is in place __FILE__ does not work, but calling set_location at the bottom of the class body does not work either. Which is the real setup?
  • One last thing: Is there a criterion that identifies the class or module objects that need their location to be set? Descendant of something? Responds to something? Something else?

We currently have two use cases in our codebase that I know of: loading JSON files that are placed beside the class referencing them on disk, and another one related to code owner resolution (getting the name of the team that owns a given file).
For both use cases, I would prefer to resolve the file path eagerly, so that JSON parsing or code owner lookup happens at boot time and, more importantly, so that the app fails fast during boot if something is missing or malformed.

Regarding the real setup, this is (again, equivalent) what we are doing (or would like to be doing):

module CodeOwners
  def self.extended(klass)
    klass.class_attribute(:team)
    set_codeowner(klass)
  end

  def self.set_codeowner(klass)
    klass_name = klass.to_s
    location = Object.const_source_location(klass_name)&.first

    klass.team = CodeOwners.find(location)
  end

  def self.find(location)
    return :backend
  end
end

# ---

class SomeClass
  extend CodeOwners
end

# ---

SomeClass.team # => :backend

Regarding the criteria for such classes, what we use:

  • descendant of some class
  • extends a module (e.g. CodeOwners)

One could argue this is a ruby bug / limitation.

# test.rb
autoload :Const, "const"

p Object.const_source_location(:Const) # test.rb
Const
p Object.const_source_location(:Const) # const.rb

# const.rb
module Const
  Object.const_source_location(:Const) # test.rb # IMO this should be `const.rb`
end

It seems to me that module Const should take precedence over the autoload as soon as it is opened, not once the autoload has completed.

fxn commented

@casperisfine agree.

@alxckn so the underlying issue is that Ruby is the one controlling the value returned by const_source_location, we cannot modify that externally (except by creating constants). Zeitwerk sets autoloads for your constants, and when those constants are autoloaded, logic is triggered, like callbacks, while the constant is being autoloaded.

OK, when an autoload is set, const_source_location returns the location of the autoload call. If autoloaded, the location of the constant becomes the expected one. However, that location is only updated when the autoload has finished. As @casperisfine said, this does not seem right, because in the module body you already created the definitive constant.
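The timing described above can be observed with a small standalone script. This is only a sketch: DemoConst and demo_const.rb are hypothetical names, and the value recorded during the load depends on whether the running Ruby includes the change discussed in this thread, so no output is claimed for that stage.

```ruby
require "tmpdir"

# Returns the file reported by Object.const_source_location before, during,
# and after autoloading a throwaway constant.
def autoload_locations
  Dir.mktmpdir do |dir|
    const_file = File.join(dir, "demo_const.rb")
    File.write(const_file, <<~RUBY)
      module DemoConst
        # Evaluated while the autoload is still in progress.
        DURING = Object.const_source_location(:DemoConst)&.first
      end
    RUBY

    Object.autoload(:DemoConst, const_file)
    before = Object.const_source_location(:DemoConst)&.first
    Object.const_get(:DemoConst) # triggers the autoload
    after = Object.const_source_location(:DemoConst)&.first

    { before: before, during: DemoConst::DURING, after: after }
  end
end

LOCATIONS = autoload_locations
LOCATIONS.each { |stage, file| puts "#{stage}: #{file}" }
```

Before the load, the reported file is the one containing the autoload call. After the load, it is demo_const.rb. The "during" line is the one affected by the behaviour under discussion.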

I'll send some ideas to make it work with that limitation.

I have a PR that seems to work: ruby/ruby#9549. I need to clean it up, add some specs, and open a Ruby ticket.

fxn commented

@alxckn so, when that behaviour is changed in Ruby, I believe your initial approach could be good.

If the code owner feature is only used on demand, I wonder if that one could use the lazy approach.
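A lazy variant of the CodeOwners module from earlier in the thread might look like the sketch below. It assumes that resolving the team on first access is acceptable. The find stub is the same placeholder used in the thread, not a real lookup.

```ruby
module CodeOwners
  # Resolved on the first call, by which time the class has finished loading
  # and Object.const_source_location reports the class's own file.
  def team
    @team ||= CodeOwners.find(Object.const_source_location(name)&.first)
  end

  # Placeholder lookup, as in the thread: a real one would map the file
  # path to a team.
  def self.find(_location)
    :backend
  end
end

class SomeClass
  extend CodeOwners
end

p SomeClass.team
```

Since team is memoized in an instance variable of the receiving class, subclasses of SomeClass resolve and cache their own value.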

For the JSON files, perhaps a verifier:

# config/initializers/json_companion_verifier.rb

if Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("3.4")
  abort <<-EOS
    Please, check if https://bugs.ruby-lang.org/issues/20188 shipped
    and in that case consider deleting this file and updating the technique
    as discussed in https://github.com/fxn/zeitwerk/issues/281.
  EOS
end

Rails.autoloaders.main.on_load do |cpath, value, abspath|
  # needs_json_companion?, json_companion_for, and valid_json_companion? are
  # placeholders for project-specific helpers.
  if needs_json_companion?(value)
    json_companion = json_companion_for(abspath)
    if valid_json_companion?(json_companion)
      value.json_companion = json_companion
    else
      abort "..."
    end
  end
end

That callback is going to be invoked for all constants managed by the main autoloader, but probably that won't be noticeable, it is easy to understand, and works in all execution modes.
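The helpers that such a callback relies on could be sketched in plain Ruby as follows. The naming convention (a .json file next to the .rb file, with the same basename), the JsonCompanionError class, and load_json_companion! are all assumptions made for this example, not something the thread specifies.

```ruby
require "json"

# Hypothetical error type for a missing or malformed companion file.
class JsonCompanionError < StandardError; end

# Assumed convention: app/models/foo.rb pairs with app/models/foo.json.
def json_companion_for(abspath)
  abspath.sub(/\.rb\z/, ".json")
end

# Returns the parsed JSON companion, or raises so that boot fails fast.
def load_json_companion!(abspath)
  path = json_companion_for(abspath)
  raise JsonCompanionError, "missing JSON companion: #{path}" unless File.file?(path)

  JSON.parse(File.read(path))
rescue JSON::ParserError => e
  raise JsonCompanionError, "malformed JSON companion #{path}: #{e.message}"
end
```

Inside the on_load block, abspath would be passed to load_json_companion!, and a raised JsonCompanionError could be turned into the abort shown above.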

What do you think?

@fxn @casperisfine Thanks a lot for your answers and swift handling of the issue!

I will dig into your initializer suggestion to have a working use case, this looks definitely like a good working alternative for our use cases 🙏

fxn commented

Hey! This seems to be well-understood and belongs to Ruby. There's a discussion and a patch in the works, so eventually some way or another there is going to be a resolution there.

Here, I believe we can close for now. However, if you need any further help with the workaround, please feel free to follow up!

I just merged the fix in ruby master. The bug is marked as needing backport, so it may be applied to future ruby patch releases (not guaranteed though).