drwl/annotaterb

Annotate models using Zeitwerk namespaces

Opened this issue ยท 21 comments

I tried to migrate from annotate gem, to see if this gem fixes an issue we had with the original in which it does not work with Zeitwerk namespaces.

Our project structure like this:

/app/models/
/app/namespace/models/
/spec/models/
/spec/fabricators/
/spec/namespace/models/
/spec/namespace/fabricators/

In an initializor we have:

Object.const_set('Namespace', Module.new) unless Object.const_defined?('Namespace')
Rails.autoloaders.main.push_dir(Rails.root.join("app/namespace/models"), namespace: Namespace)

In .annotate.yml I tried with this setup:

:model_dir:
- app/models
- app/namespace/models

When I run annotaterb models the models in app/models/ are annotated correctly, but I see these errors for models in app/namespace/models dir:

Unable to process app/namespace/models/foo.rb: file doesn't contain a valid model class
Unable to process app/namespace/models/bar.rb: file doesn't contain a valid model class

The file app/namespace/models/foo.rb is like

module Namespace
  class Foo < ApplicationRecord
  end
end

I also tried using root_dir config, but got same error:

:root_dir:
- 'app'
- 'app/namespace'

For the original annotate gem version 3.2.0 I have developed the following monkey patch to make this setup work in our project:

# frozen_string_literal: true

annotate_gem_specs = Gem.loaded_specs['annotate']

return unless annotate_gem_specs

module AnnotateModelsWithZeitwerkNamespaces
  module FilePatterns
    def test_files(root_directory)
      super + NAMESPACE_DIRS_WITH_MODELS.map do |namespace_dir|
        File.join(root_directory, 'spec', namespace_dir.to_s, 'models', '%MODEL_NAME_WITHOUT_NS%_spec.rb')
      end
    end

    def factory_files(root_directory)
      super + NAMESPACE_DIRS_WITH_MODELS.map do |namespace_dir|
        File.join(root_directory, 'spec', namespace_dir.to_s, 'fabricators', '%MODEL_NAME_WITHOUT_NS%_fabricator.rb')
      end
    end
  end

  def get_model_class(file)
    loader = Rails.autoloaders.main
    root_dirs = loader.dirs(namespaces: true)
    expanded_file = File.expand_path(file)
    root_dir, namespace = root_dirs.find do |dir, _|
      expanded_file.start_with?(dir)
    end
    _, filepath_relative_to_root_dir = expanded_file.split(root_dir)
    filepath_relative_to_root_dir = filepath_relative_to_root_dir[1..].sub(/\.rb$/, '')
    camelize = loader.inflector.camelize(filepath_relative_to_root_dir, nil)
    namespace.const_get(camelize)
  end

  def resolve_filename(filename_template, model_name, table_name)
    super.gsub('%MODEL_NAME_WITHOUT_NS%', model_name.split('/', 2).last)
  end
end

AnnotateModels.singleton_class.prepend(AnnotateModelsWithZeitwerkNamespaces)
AnnotateModels::FilePatterns.singleton_class.prepend(AnnotateModelsWithZeitwerkNamespaces::FilePatterns)
drwl commented

@guigs thanks for the detail issue. I'll dig into this now to understand the issue.

drwl commented

Just an update, I'm able to reproduce what you have: #83, if you want to take a look and make sure it matches.

I'm still learning more about how Zeitwerk works under the hood to better understand the problem at hand.

Thanks @drwl, yes, that representes the issue we have.

drwl commented

Got it. I spent some time today learning more about Zeitwerk works, and I think that a Zeitwerk compatible loader would make things easier given that Rails 7+ is Zeitwerk only.

It would solve the other issues around Zeitwerk collapsed directories too like in ctran/annotate_models#969.

drwl commented

@guigs I was able to get a proof of concept working locally. Could you try using the latest on this branch? #85

  • In my local dummyapp, I created an annotaterb config (bin/rails g annotate_rb:config)
  • Added - app/namespace/models line under :model_dir
  • Ran migrations
  • Ran bundle exec annotaterb models

And it was able to successfully annotate app/namespace/models/test_child.rb.

drwl commented

I didn't test any of the file patterns stuff, so that might not work.

@drwl Thank you. I tested #85 and it worked, ie. it correctly annotated the models in the namespace, specified in model_dir config.

I just can't annotate fabricators and tests. I tried this config:

:additional_file_patterns:
- spec/namespace/fabricators/*.rb
- spec/namespace/models/*.rb

But didn't work.

That was the part I had customized test_files and factory_files methods and the trick with MODEL_NAME_WITHOUT_NS and resolve_filename.

drwl commented

@guigs Thanks for testing it and getting back to me. I did see that in your patch in the main issue message looks to support namespaced fabricators. I'll work to also get that working soon. Will ping you again once it's ready.

Great, thank you so much.

drwl commented

@guigs I added some changes and force pushed to the branch, in #85. Can you give it another try and give feedback?

I used your file structure with these in the configuration:

:additional_file_patterns:
  - spec/namespace/models/%MODEL_NAME_WITHOUT_NS%_spec.rb
:model_dir:
- app/models
- app/namespace/models
:require: []
:root_dir:
- ''
drwl commented

Just another update, I landed changes mentioned in #85 into main and pushed up version 4.6.0 of the gem. If you can give it a try and report any errors that would be helpful ๐Ÿ™Œ

I also have this issue (kind of). For us we have model files in app/components/component1/models, app/components/component2/models, etc..

I currently added those model dirs manually to configuration model_dir: but since we may add new components maintaining that list in that configuration file is overhead. So it currently works as a workaround.

Thank you for the Zeitwerk improvements, we are using it & it works for us.

I'm hoping that I have some time to look into this and improve code such that this manual model_dir list is not needed. My plan is to solve it for Rails:

  1. Force Rails to eager load (in case in the configuration it's turned off) all classes. In the backend Rails uses Zeitwerk by default now, so I'm not sure the API is the same for those who have legacy classloader working for them. My focus would be to solve it only for Zeitwerk at first.
  2. Instead of going through file tree manually, find all classes that subclass ActiveRecord::Base, ie c.ancestors.include?(::ActiveRecord::Base)

EDIT: This is big part of the job: https://stackoverflow.com/a/58290028/925315 . Plus I found out that in Ruby 2.7+ you can do Module.const_source_location("Namespace::ModelName") to get path to all those models.

FYI: I started experimenting with the feature, but currently ran into block where I don't understand how dummyapp works for rspec, since I don't want to add feature without writing specs to it so that Zeitwerk is actually loaded. Probably this can be worked out with Aruba, but the whole stuff is quite new to me.

drwl commented

Hi @ilvez, I'll respond to bits inline

I also have this issue (kind of). For us we have model files in app/components/component1/models, app/components/component2/models, etc..

I currently added those model dirs manually to configuration model_dir: but since we may add new components maintaining that list in that configuration file is overhead. So it currently works as a workaround.

I can understand that maintaining the list would add a lot of overhead, especially with changing code. Let's see if we can add something to support this use case since it seems to be common.

Thank you for the Zeitwerk improvements, we are using it & it works for us.

Thank you for testing it out and letting me know ๐Ÿ˜„

drwl commented

I'm not sure I totally understand the use case here, could you elaborate? Here's what I understand:

In your Rails app you have the standard Rails project structure

- app/models/...
- app/controllers/...

Then also namespaced/modularized components, so in addition
- app/components/component_name_1/models/...
- app/components/component_name_1/controllers/...
- app/components/component_name_2/models/...
- app/components/component_name_2/services/...

It also sounds like you're using Zeitwerk. In this case, my questions are:

  1. How are classes namespaced in component_name_1? Is it Components::ComponentName1::ModelName? Is it ComponentName1::ModelName?
  2. How have you configured Zeitwerk to load the classes? (i.e. collapsed directories)

In your Rails app you have the standard Rails project structure

- app/models/...
- app/controllers/...

Then also namespaced/modularized components, so in addition
- app/components/component_name_1/models/...
- app/components/component_name_1/controllers/...
- app/components/component_name_2/models/...
- app/components/component_name_2/services/...

Yes, exactly. Those default Rails directories are legacy and only used for some very common shared stuff, most of the business logic lives in components.

It also sounds like you're using Zeitwerk. In this case, my questions are:

Yes, Zeitwerk. Switched to it during Rails 6.0 upgrade.

  1. How are classes namespaced in component_name_1? Is it Components::ComponentName1::ModelName? Is it ComponentName1::ModelName?

It's ComponentName1::ModelName. Rails is set up such that all directories under app/ are considered as "root" directories and namespaces are created based on sub-directories.

  1. How have you configured Zeitwerk to load the classes? (i.e. collapsed directories)

Nothing special, Rails default. No hacks, very basic stuff.

So how I currently understand how this gem works is that it tries to find model files directly from filesystem. Then for each file it tries to load this class and if it's a model, annotate them. My initial post here outlined (implicitly) different approach, which for me seems more reasonable given that Zeitwerk is already working and has done all the classloading work anyway. This means that with 3 simple commands we can omit all the guesswork what model classes are and where they are:

  1. Zeitwerk::Loader.eager_load_all -- We force Zeitwerk to load in all the classes in project. If we don't do this, Zeitwerk can do it lazily based on how it's configured and this means for next steps we may skip some existing models.
  2. ApplicationRecord.descendants or ActiveRecord::Base.descendants. With this we get list of all model classes, iterate over these and fetch their paths.
  3. Module.const_source_location("Namespace::ModelName") -- we get path for this specific model file which is needed for annotating.

It's pretty simple solution & I already working on it. I'm thinking about implementing this without tests at first (testing locally with my project) and then figure out tests.

EDIT: Note that my use-case is very simple -- I just want to annotate app model files during migrations. This gem supports annotating different types of classes, so support for those is bit more trickier, but I'm guessing similar tactics can be used to find all fixtures, FactoryBot factories and tests.

This is very dirty POC solution (did small refacto there, but can be avoided as well): https://github.com/drwl/annotaterb/compare/main...ilvez:models-with-rails-autoloader-poc?expand=1

drwl commented

Sorry for the delay, but I've had a chance to dive in a bit deeper. On the POC, I think the motivations make sense. If I'm understanding it correctly, it's something like "Why do we have to specify model files? Can't we just Rails/Zeitwerk load them and then we can look at the ancestor tree for ApplicationRecord?". Feel free to correct me if that's incorrect.

At a previous company, I was working in a fairly large Rails monolith (1+ million lines of Ruby code). The old gem defaulted to eager loading and that caused annotations to take a few minutes, and a lot of that time was waiting for the Rails app to fully load. I wonder if there may be an approach that we can still leverage the lazy loading of Zeitwerk but make it easier for custom project structures like your's (with the app/components/component_name/) or @guigs. I haven't thought too hard about this, but perhaps some sort of glob that could support matchers like **? Thoughts?

Fair point with really big monoliths, don't have experience with such big Rails project. If current way of doing things is considerably faster than eager loading, then obviously this would be preferred option for fetching list of all the models. It also justifies of not having alternative way of using eager loading & fetching list of models from ActiveRecord at all. So I'm dumping my branch, glad that you mentioned it, otherwise I would have put more time to wrap it up as PR ๐Ÿ‘

As a solution app/**/models/*.rb glob could work very well. Probably best if it would be configuration option, because I'm guessing there could be very different ways people structure their code ๐Ÿ˜„

Let me know if you need any help here.

drwl commented

So I'm dumping my branch, glad that you mentioned it, otherwise I would have put more time to wrap it up as PR ๐Ÿ‘

First off, I appreciate the effort as open source work can be hard. Hopefully my comments/thoughts don't leave a bad taste for contributing to this project, all input is appreciated.

Also, I'm not opposed to the idea of eager loading an entire project and reduce the need to specify model directories. If we wanted to add something like that, I think it could be done as a configuration option to give flexibility to the end user. So if that's still of interest I would be glad to work together on getting that into the project.

Some considerations that I'm not knowledgable about yet -- how Zeitwerk handles eager loading auxiliary ruby files like factories or fixture.

First off, I appreciate the effort as open source work can be hard. Hopefully my comments/thoughts don't leave a bad taste for contributing to this project, all input is appreciated.

Not at all. All cool.

I'm not even sure that the eager loading approach is needed at all, since there is technically superior way to handle this already existing. (even though this technically superior way is implementation-wise lots more code, verbose and harder to maintain).

By factories you mean FactoryBot factories? I'm assuming they don't end up in Zeitwerk at all, because Zeitwerk deals with classes (being a class-loader). FactoryBot factories are being defined through FactoryBot.define. Zeitwerk parses those files (if they contain any classes), but I'm guessing those FactoryBot.define blocks are being ignored.

About fixtures I'm not sure what you are referring to exactly, I haven't used annotating anything else than models.