kwrooijen/gungnir

Built-in migration system

kwrooijen opened this issue · 13 comments

It would be nice to have some sort of data-driven migration system built in. This doesn't mean that we have to build everything from scratch. We could also decide to use a migration library such as Ragtime and expand on that.

I'm not sure what we exactly want with this. But I think it would be nice to have a syntax for basic migration action such as

  • create / update / delete table
  • create / update / delete field
  • setting the type of the field
    ...

And if you need to do something very specific, grant the ability to fall back to raw SQL.

From Slack:

  • Use Ragtime for the heavy lifting - We can use Ragtime to actually apply migrations, and create an interface around that. That way we don't have to reinvente the wheel on that part.
  • Data driven - Something like a data driven syntax. Possibly a simplified HoneySQL? Or something that translates to HoneySQL? AFAIK HoneySQL doesn't support creating tables. But there's a PostgreSQL honeysql extension. I also like the idea of a similar syntax to Malli definitions. e.g.
;; 001-add-user-table.edn
{:up
 [:table/create "user"
  [:table.add/column {:primary-key true :type :uuid} "id"]
  [:table.add/column {:type :text} "email"]]
 :down
 [:table/drop "user"]}

;; 002-add-user-email-remove-password.edn
{:up
 [:table/modify "user"
  [:table.add/column {:type :text} "password"]
  [:table.remove/column "email"]]
 :down
 [:table/modify "user"
  [:table.remove/column "password"]
  [:table.add/column {:type :text} "email"]]}
  • Fallback to SQL string - In case the migration gets too complex, users can opt out and use raw SQL instead
  • Validation with Malli - Malli should check if the structure is actually valid, and return an error message

PS: Maybe we could even use Malli for building the migrations? Might be something interesting to look into.

PPS: Another nice thing would be if types (e.g. :text / :uuid) are extendable. Something along the lines of

(defmethod gungnir.migration/column-type :varchar [[_ n]]
  (format "VARCHAR(%s)" n))

I've done some investigation and so far the only minor issue I've found is that ragtime really wants to operate on it's own set of files (instead of say passing a string for it to slurp), so in the end I'd have to have our own files and then under the hood generate the ragtime migration files for it to process them normally.

For an early draft it's probably fine.

Are other migration libraries also worth considering?

Most migration libraries work with directories and X-files. (I assume) This is because they need the entire collection of migrations to determine what needs to be applied, and if migrations have changed. I'm not too familiar with other Clojure migration libraries, however I can only assume they all need a directory with specific types of files in them.

An interesting thing about Ragtime is that you can define your own types of files based on their extension. For example see the .edn extension

This is definitely worth examining. I believe generating files would be a very bad choice. We could instead create our own extension (or override .edn, but maybe that wouldn't be a good idea), and have it create migrations for us.

(defmethod load-files ".gedn" [files]
  (for [file files]
  ;; Create an `SqlMigration` record for each file   
  ))

What do you think @ma-Rac ?

Oh I didn't notice that while I was running around the ragtime code.

That's great! Other than the minor that it has to be a new format which might confuse IDE's a bit, It seems like a pretty nice solution.

I'll give it a try.

Oh yeah and HoneySQL indeed doesn't support creating tables directly (as far as my quick perusal could determine), but like you said it seems the Postgres extension that I believe you were referring to does.

We could use the postgres extension as a middleman for generating the queries I think. Gungnir Migration -> (Postgres) HoneySQL -> SQL, but I don't think we should write it directly since that would be verbose for very simple migration tasks. Having a fallback to Postgres HoneySQL / Raw SQL however should also be an option.

That's great! Other than the minor that it has to be a new format which might confuse IDE's a bit, It seems like a pretty nice solution.

I agree that it's unfortunate we'd have to create a new extension. We could still try and override the .edn match, but we do run the risk of the user manually requiring Ragtime and overriding it again. Not sure if there's a proper solution for that. Being able to simply use .edn would be the best case scenario I think.

Would making the migration files .clj be reasonable? IDEs would understand it, but I honestly don't know if that is dangerous in some way.

I don't know if there's anything dangerous about it. It might be possible. The unfortunate thing about .clj files is that it won't be pure data anymore, it will be code. One benefit is that we can compose migrations with functions, but I also like the "clean" feel of how Models work, and that you can add them to .edn files. All in all .clj files might be the best solution granted all the benefits (not overriding .edn multimethod and not breaking IDEs). As long as these .clj files aren't in the source-path it should probably be fine.

After doing a little bit of research, we don't need to supply files at all. This works fine:

  (def migrations
    [(ragtime.jdbc/sql-migration
      {:id (str :uuid)
       :up ["CREATE EXTENSION IF NOT EXISTS \"uuid-ossp\";"]
       :down ["DROP EXTENSION \"uuid-ossp\";"]})

     (ragtime.jdbc/sql-migration
      {:id (str :comment) ;; NOTE `:id` must be a string
       :up
       ["CREATE TABLE IF NOT EXISTS comment
        ( id uuid DEFAULT uuid_generate_v4 () PRIMARY KEY
        , content TEXT
        );"]
       :down
       ["DROP TABLE comment;"]})])

  (ragtime.core/migrate-all
   (ragtime.jdbc/sql-database
    {:datasource *database*})
   (ragtime.core/into-index {} migrations)
   migrations
   {:strategy ragtime.strategy/raise-error
    :reporter ragtime.reporter/print})

That's what I was trying to figure out how to do! My lack of experience is showing. 😅

This way is preferred, right?

This is probably the best option. It gives us the most flexibility and we can use .edn files without a problem.

FWIW I would like the possibility to retain control over the migration process, so hopefully this would be optional / possible to turn off.

The migration system should be a separate namespace, and shouldn't be required by any of the existing namespaces. So it will be optional yes.

Made a very early proof of concept (#13), it does the basic things that were detailed here and I followed the ideas as closely as I could.

It should provide a solid testing ground before a proper merge request is made.

Let me know what you think.