/the_schema_is

ActiveRecord schema annotations done right

The schema is ...

the_schema_is is a model schema annotation DSL for ActiveRecord models, enforced by Rubocop.

Why annotate?

An important part of a class's public interface is which attributes objects of this class have. In ActiveRecord, attributes are inferred from DB columns and can only be seen in db/schema.rb, which is unfortunate.

We believe this information should be immediately available as part of the class definition. "It is drawn automatically from the DB" is kinda clever, but it does not help when reading the code. "Auto-deduction from the DB" could be used to compare the actual table contents to the definition in Ruby, but not to skip the definition.

Fun fact: most ORMs in other languages have chosen the "explicitly list attributes in the model" approach, for some reason! For example, Python's Django, Elixir's Ecto, Go's Beego and Gorm, Rust's Diesel, most of the popular Node.js options, and PHP's Symfony (but, to be honest, not Laravel).

Well then, why not the annotate gem?

The annotate gem provides a very powerful and configurable CLI/rake task that adds to your model (and factory/route/spec) files a comment that looks like this:

# == Schema Information
#
# Table name: users
#
#  id                             :integer          not null, primary key
#  email                          :string           default(""), not null
#  encrypted_password             :string           default(""), not null
#  last_sign_in_at                :datetime
#  last_sign_in_ip                :inet
#  created_at                     :datetime         not null
#  updated_at                     :datetime         not null
# ....

It kinda achieves the goal, but in our experience, it also brings some problems:

  • annotation regeneration is disruptive: it just replaces the whole block with a new one, which produces a lot of "false changes" (e.g. one field with a slightly longer name was added → the spacing of all fields changed);
  • if the column order or defaults in the dev DB differ between developers' machines, annotate also decides to rewrite all the annotations, sometimes adding tens of "changed" files to a PR;
  • regeneration makes it hard to use schema annotations for commenting on or explaining some fields: regeneration will lose those comments, and comments-between-comments are hard to distinguish anyway;
  • the syntax of the annotations is kinda ad hoc, which makes them harder to write by hand, so regeneration becomes the only way to add them.
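
The first problem can be illustrated with a small sketch. It is not taken from the annotate gem's source: render_annotation is a hypothetical helper that pads every column name to the width of the longest one, which is how an aligned comment table is usually produced. Adding one longer column changes the padding, so every existing line differs textually from its old version.

```ruby
# Hypothetical renderer: pads every name to the longest one, as aligned
# comment tables typically do. Not the annotate gem's actual code.
def render_annotation(columns)
  width = columns.keys.map(&:length).max
  columns.map { |name, type| "#  #{name.ljust(width)} :#{type}" }
end

before = { "id" => "integer", "email" => "string" }
after  = before.merge("encrypted_password" => "string")

old_lines = render_annotation(before)
new_lines = render_annotation(after)

# Lines of the old annotation that no longer appear verbatim in the new one.
rewritten = old_lines - new_lines

puts "#{rewritten.size} of #{old_lines.size} existing lines were rewritten"
```

In a diff, all of the old lines show up as changed even though only one column was added.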

So, how is your approach different?..

the_schema_is allows you to do this:

class User < ApplicationRecord
  the_schema_is "users" do |t|
    t.string "email", default: "", null: false
    t.string "encrypted_password", null: false
    t.datetime "last_sign_in_at"
    t.inet "last_sign_in_ip"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    # ...
  end
end

The idea is that this is exactly the same DSL that db/schema.rb uses, so:

  • it can be just copied from there (or written by hand in the usual migration syntax);
  • it is code, which can be supplemented with comments explaining what some column does, or why the defaults are this way; it can also be structured by reordering columns and adding extra blank lines.

So, in reality, your annotation may look like this:

class User < ApplicationRecord
  the_schema_is "users" do |t|
    t.string "email", default: "", null: false
    # We use RSA encryption currently.
    t.string "encrypted_password", null: false

    t.inet "last_sign_in_ip" # FIXME: Legacy, we don't use it anymore because GDPR

    t.datetime "last_sign_in_at"

    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    # ...
  end
end

Now, the the_schema_is gem consists of this DSL and custom Rubocop cops that check that the DSL in your model classes corresponds to your db/schema.rb (and can automatically fix any discrepancies found).

Using Rubocop's existing infrastructure brings several great benefits:

  • you can easily include the "are all annotations up to date?" check in your CI/pre-commit hooks;
  • you can preview the problems found, and then fix them automatically (with rubocop -a) or manually, however you see fit;
  • the changes made by auto-correct are very local (just adding/removing/changing the line for the relevant column), so your custom structuring, like separating groups of related columns with empty lines and comments, will be preserved;
  • rubocop is easy to run on a sub-folder, on one file, or on files matching some pattern; it is just as easy to exclude some file or folder permanently.

But what does the block itself do?

Nothing.

Ugh... What?

That's just how it is (at least for now) ¯\_(ツ)_/¯

The block isn't even evaluated at all (so it can potentially contain any code, and only Rubocop's cops will complain). In the future, it may do some useful things (like comparing the schema of the real DB with the declarations in the class when the app runs in the development environment), but for now, it is just a no-op declarative copy-paste of the schema.
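
A minimal sketch of what such a no-op macro could look like, assuming nothing about the gem's internals: the module name SchemaDeclarationSketch and the class PlainModel are made up for illustration, and the real gem hooks into ActiveRecord rather than into a plain class.

```ruby
# Hypothetical stand-in for the gem's macro; the module name is an assumption.
module SchemaDeclarationSketch
  # Accepts a table name and a block, and deliberately ignores both.
  # The block is never called, so nothing inside it is ever executed.
  def the_schema_is(_table_name = nil, &_block)
    nil
  end
end

class PlainModel
  extend SchemaDeclarationSketch

  the_schema_is "users" do |t|
    t.string "email", null: false
    raise "never reached: the block is not evaluated"
  end
end

puts "PlainModel loaded without evaluating the block"
```

Because the block is only parsed, not run, the declaration costs nothing at load time, and all the checking is left to static analysis.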

Usage

  1. Add gem 'the_schema_is' to your Gemfile and run bundle install.
  2. Add this to your .rubocop.yml:
require:
  - the_schema_is/cops
  3. Run rubocop and see what it now says about your models.
  4. Now you can add schema definitions manually, or let rubocop --auto-correct (or -a) do its job! NB: you can always use rubocop --auto-correct --only TheSchemaIs to auto-correct ONLY the schema annotations.

To make reporting cleaner, the checks are split into several cops:

  • Presence
  • WrongTableName
  • MissingColumn
  • UnknownColumn
  • WrongColumnDefinition

It is not advisable to turn them off selectively, but you may know better. For example, you might experiment with leaving just t.<type> '<name>' in models, without details about defaults and limits, and therefore turn off WrongColumnDefinition. All of it is pretty experimental!
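
The three column-level categories can be pictured with the sketch below. It is only an illustration of what the cop names suggest, not the gem's implementation (the real cops work on parsed source code, and classify_columns is a hypothetical helper). It assumes that MissingColumn means "present in the schema file but absent from the model", and UnknownColumn the reverse.

```ruby
# Hypothetical helper: classifies differences between two column lists.
# Each list maps a column name to its definition, e.g.
#   "email" => [:string, { null: false }]
def classify_columns(schema:, model:)
  shared = schema.keys & model.keys
  {
    missing_column: schema.keys - model.keys,
    unknown_column: model.keys - schema.keys,
    wrong_column_definition: shared.reject { |name| schema[name] == model[name] }
  }
end

# Columns as they would appear in the schema file.
schema = {
  "email" => [:string, { default: "", null: false }],
  "last_sign_in_ip" => [:inet, {}],
  "created_at" => [:datetime, { null: false }]
}

# Columns as declared in the model's annotation.
model = {
  "email" => [:string, { null: false }],
  "created_at" => [:datetime, { null: false }],
  "nickname" => [:string, {}]
}

report = classify_columns(schema: schema, model: model)
report.each { |category, names| puts "#{category}: #{names.join(', ')}" }
```

Here last_sign_in_ip is missing from the model, nickname is unknown to the schema, and email is declared with different options.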

Settings

the_schema_is cops support some configuration, which should be done at the namespace level in your .rubocop.yml, for example:

TheSchemaIs:
  Schema: db/other-schema-file.rb

Currently available settings are:

  • TablePrefix to help the_schema_is deduce the table name from the class name;
  • Schema to set the path to the schema file (by default db/schema.rb);
  • BaseClass to help the_schema_is guess which classes are models (by default ApplicationRecord and ActiveRecord::Base);
  • RemoveDefinitions: a list of definition keys to remove (for example, [index, foreign_key, limit]) when copying definitions into models; this might be desirable for leaner the_schema_is statements that display only field types and names.

So, if you have a custom-named base class, you should do:

TheSchemaIs:
  BaseClass: OurOwnBase
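
The effect of the RemoveDefinitions setting listed above can be sketched in plain Ruby. This is an assumption-laden illustration, not the gem's code: strip_definitions is a hypothetical helper, and column options are modelled as a simple hash.

```ruby
# Hypothetical helper mirroring the described effect of RemoveDefinitions:
# drop the listed keys from a column's options and keep the rest.
def strip_definitions(options, remove)
  options.reject { |key, _value| remove.include?(key.to_s) }
end

# Options as they might appear for a column in the schema file.
options = { limit: 255, null: false, index: true }

leaner = strip_definitions(options, %w[index foreign_key limit])
p leaner
```

With this list, a column copied into the model would keep its type, name and null: false, but lose limit and index.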

Note that Rubocop supports per-folder settings out of the box, which allows TheSchemaIs to support complicated configurations with multiple databases and engines.

For example, suppose your models are split into app/models/users/ and app/models/products/, which are stored in different databases; then you probably have different schemas and base classes for them. To configure this properly, you might want to put the following in app/models/users/.rubocop.yml:

# Don't forget this for all other cop settings to not be ignored
inherit_from: ../../../.rubocop.yml

TheSchemaIs:
  BaseClass: Users::BaseRecord
  Schema: db/users_schema.rb

Some Q&A

  • Q: It doesn't check the actual DB?
    • A: No, it does not! At the moment, our belief is that in a healthy Rails codebase schema.rb always corresponds to the DB state, so checking against it is enough. This approach makes the tooling much easier (thanks to Rubocop's existing ecosystem of parsers/offenses/configurations).
  • Q: What if I don't use Rubocop?
    • A: You may want to try it, at least? Did you know that you can disable or configure most of its checks to your liking? And auto-correct any code to your preferences?.. Or automatically create a "TODO" config file (which disables all the cops currently raising offenses, and lets you review them and set them up later, one by one)?.. It is much more than a "linter that makes your code comply with some rigid style guide".
  • Q: Cool, but I still don't want to.
    • A: ...OK, then you can disable all cops except for TheSchemaIs namespace :)
  • Q: How do I annotate my factories, model specs, routes, controllers, ... (which annotate allows)?
    • A: You don't. The same way you don't copy-paste the whole definition of a class into the spec file that tests it: the definition is in one place, and the tests and other code using this definition are in another. DRY!
  • Q: Rubocop is unhappy with the code TheSchemaIs generated.
    • A: There are two known things in generated the_schema_is blocks that Rubocop may complain about:
      • Usage of double quotes for strings, if your config insists on single quotes: that's because we just copy code objects from schema.rb. Rubocop's auto-correct will fix it :) (Even in one run: "fixing TheSchemaIs, then fixing quotes");
      • Blocks that are too long (if you have tables with dozens of columns, God forbid... as we do). This can be fixed by adding the following to .rubocop.yml:
    Metrics/BlockLength:
      ExcludedMethods:
        - the_schema_is

Author and License

Victor Shepelev aka "zverok", MIT.