annotate

Annotate is a library for adding type annotations to functions and checking those types at runtime.

DEPRECATED

Purpose

  • Documenting function input and output types.
  • Catching bugs that occur when unexpected data is passed to or returned from a function.
  • Validating user data (e.g., data submitted via a form or passed via an API).
  • Providing a lingua franca for describing the shape of Clojure data.

Rationale

Currently there are two projects that offer similar functionality: core.typed and Schema.

Annotate is in the same category as Schema, that is, runtime data validation. We do, however, attempt to reuse many of the conventions from core.typed (e.g., all types begin with an uppercase letter) as well as specific names from core.typed (e.g., U for the union of two or more types).

Annotate was written to provide a consistent and rich out-of-the-box experience for developers interested in adding types to their functions. Another primary objective is to provide a means of checking types during development and testing, while generating code that incurs no performance penalty in production.

So how does annotate differ from Schema?

  • defn forms do not require a namespace prefix.
  • Comprehensive set of types, including: Keyword, Int, Symbol, Ratio, Atom, Date, UUID, Regex, Vec, Set, List, Option, Count, Empty, NonEmpty, Coll, Seq, LazySeq, Seqable, NilableColl, CanSeq, Queue, SortedSet, SortedMap, U (union), I (intersection), Pred (predicate), and more.
  • Multi-arity functions can have different return types per arity.
  • Type annotations for functions are not commingled with arglists.
  • Automatic truncation of collections and strings when generating error messages.
  • Vector literals represent vectors.
  • Keyword arguments are supported.
  • Types are displayed tersely by default. For example, String instead of java.lang.String.
  • Ability to annotate functions outside your control.

Installation

[com.roomkey/annotate "1.0.1"]

API Documentation

http://roomkey.github.io/annotate

How to use

Types in annotate are composed of Clojure data and can be combined at runtime to produce more complex (or less complex) types as needed. You should be able to express your data purely by composing the existing types provided by annotate. All types extend the Typeable protocol in annotate.core.

Let's take a look at some basic types. All the built-in types can be found in annotate.types. The check function, which is used internally, can be found in annotate.core. We'll need to refer that as well for these examples.

(use '[annotate core types])

Classes

The easiest way to check if a value is a string or keyword is to reference the class in your type. For example:

(check String "hello")
;; nil

(check String :billy)
;; (not (instance? String :billy))

(check Keyword :billy)
;; nil

The Keyword type is defined in annotate.types, and references clojure.lang.Keyword. The String type is a reference to java.lang.String, which is automatically imported in all Clojure code.

Scalar values

Scalar values indicate a type with exactly one value.

(check 1 1)
;; nil

(check 1 2)
;; (not= 1 2)

Any

The Any type can be used to describe any possible value.

(check Any 3)
;; nil

(check Any "hello")
;; nil

(check Any :billy)
;; nil

Maps

There are three different ways to express the shape of a map in annotate.

  1. A homogeneous map, that is a map whose keys all conform to a single type, and likewise, whose values all conform to a single type.
  2. A map with specific, named keys whose corresponding values conform to a specific type.
  3. An empty map. This type checks against an empty map, and only an empty map.

The first two patterns cannot be mixed, as that can lead to ambiguity in the types.

Homogenous Maps

Homogenous maps use the Clojure map literal syntax and must contain a single key and value.

The check function takes a type and some data to check the type against. If the shape of the data conforms to the type, then nil is returned. Otherwise, a type error will be returned.

The output from each function is printed on the line beneath the code, and shown as a Clojure comment.

(def M {Keyword String})

(check M {:name "Billy"})
;; nil

(check M {})
;; nil

(check M {:name :Billy})
;; {:name (not (instance? String :Billy))}

Notice that the empty map is valid for this type, but a map whose values are Clojure keywords is not. Also, the error that is returned is a Clojure data structure.

Named Maps

Named maps are maps where the keys are defined upfront. Keyword, symbol, and string keys are automatically assumed to be required, that is, they must be present. You can also indicate that a key is optional, that is, it does not have to be present, by wrapping the key in the optional-key function. Notice that this function is lowercase. That is because it is a component of a type, and not actually a type itself.

Any keys that are not present in the named map type will be ignored when checking types. This allows you to specify types only for the keys you care about. This is often referred to as Row Polymorphism in statically typed languages that support it.

(def User
  {:name String
   (optional-key :email) String
   :address {:city String
             :state String}})

(check User {:name "Billy"})
;; {:address key-not-found}

(check User {:name :Billy
             :email "billy@example.org"
             :address {:city "San Diego"
                       :state :CA}})

;; {:name (not (instance? String :Billy)),
;;  :address {:state (not (instance? String :CA))}}

Notice that the shape of the error is a map itself. In the case where a key is not found, the value for that key is replaced with the symbol key-not-found. When the value of a key fails to type check, a Clojure data structure that represents the error is substituted for the value.
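Because check returns nil on success and an error data structure on failure, it can back a small validation helper. The sketch below is only an illustration: Signup and validate are hypothetical names that are not part of annotate, and it assumes the annotate library is on the classpath. It also shows that keys missing from the type (here :newsletter) are ignored.

```clojure
(use '[annotate core types])

;; Hypothetical type for this example.
(def Signup
  {:name String
   (optional-key :email) String})

(defn validate
  "Hypothetical helper: wraps the result of check in a map."
  [t data]
  (if-let [errors (check t data)]
    {:valid? false :errors errors}
    {:valid? true :data data}))

;; :newsletter is not part of Signup, so it is ignored.
(validate Signup {:name "Billy" :newsletter true})
;; {:valid? true, :data {:name "Billy", :newsletter true}}

;; :name has the wrong type, so check returns a non-nil error map.
(:valid? (validate Signup {:name :Billy}))
;; false
```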

Empty Map

(check {} {})
;; nil

(check {} {:name "Billy"})
;; (not (empty? {:name "Billy"}))

Vectors

There are three different patterns for expressing the shape of a vector in annotate.

  1. Homogenous vectors, where the element represents the type for all elements in the vector. The empty vector is a valid value for all homogenous vectors.
  2. A two-tuple, three-tuple, etc. vector, where each element represents the type for that particular position within the fixed length vector.
  3. An empty vector, that type checks against the empty vector, and only the empty vector.

Homogenous Vectors

(check [Keyword] [:hi :there])
;; nil

(check [Int] [1 :2 3])
;; [nil (not (integer? :2)) nil]

(check [Int] (range 10))
;; (not (vector? (0 1 2 3 4 ...)))

(check [Int] [])
;; nil

Notice that a nil is returned in the position where a particular element type checked, in the case where the data as a whole did not. Collections are automatically truncated to minimize the possibility of exceptionally large error messages.

Vector Tuples

(check [[Keyword Int]] [[:joe 10] [:billy 9]])
;; nil

Empty Vector

(check [] [])
;; nil

(check [] [1 2 3])
;; (not (empty? [1 2 3]))

Lists

Lists have the same behavior as vectors.

Sets

There are two different patterns for expressing the shape of a set in annotate.

  1. Homogenous sets, where the element represents the type for all elements in the set. The empty set is a valid value for all homogenous sets.
  2. An empty set, that type checks against the empty set, and only the empty set.

(check #{Keyword} #{:hi :there})
;; nil

(check #{Keyword} #{})
;; nil

(check #{Keyword} #{"hello"})
;; #{(not (instance? Keyword "hello"))}

(check #{} #{})
;; nil

(check #{} #{1 2 3})
;; (not (empty? #{1 2 3}))

Sequences

There are many types that can be used to define a sequence of values. The most likely to be used is NilableColl, which represents a collection of some type or nil.

(check (NilableColl Int) [1 2 3])
;; nil

(check (NilableColl Int) (range 10))
;; nil

(check (NilableColl Int) #{1 2 3})
;; nil

(check (NilableColl Int) (list 1 2 3))
;; nil

(check (NilableColl Int) nil)
;; nil

(check (NilableColl Int) "Billy")
;; (and (not (coll? "Billy")) (not (nil? "Billy")))

(check (NilableColl String) ["Billy" :Bobby])
;; (and (nil (not (instance? String :Bobby))) (not (nil? ["Billy" :Bobby])))

Predicates

The Pred type allows for arbitrary logic when type checking. It takes a predicate function and type checks if the function returns a truthy value.

(check (Pred odd?) 3)
;; nil

(check (Pred odd?) 2)
;; (not (odd? 2))

Regular expressions

Regular expressions are valid types in annotate. The data being checked must be a string and match the pattern.

(check #"[a-z]+" "hi")
;; nil

(check #"[a-z]+" "hi there")
;; (not (re-matches #"[a-z]+" "hi there"))

(check #"[a-z]+" :billy)
;; (not (string? :billy))

Union of types

A union implies that the type is composed of one or more types, and that the data need only conform to at least one of the types. Types are checked in the order they are passed.

(check (U Keyword Symbol String) :billy)
;; nil

(check (U Keyword Symbol String) 'billy)
;; nil

(check (U Keyword Symbol String) "billy")
;; nil

(check (U Keyword Symbol String) 5)
;; (and (not (instance? Keyword 5)) (not (instance? Symbol 5)) (not (instance? String 5)))

There is a convenience type Option that represents the union of some type and nil.

(check (Option Int) 3)
;; nil

(check (Option Int) nil)
;; nil

(check (Option Int) "Billy")
;; (and (not (integer? "Billy")) (not (nil? "Billy")))

Intersection of types

An intersection implies that the type is composed of one or more types, and that the data must conform to all of the types. Types are checked in the order they are passed.

(check (I Int (Pred even?)) 2)
;; nil

(check (I Int (Pred even?)) 3)
;; (not (even? 3))

Composition of types

In the example below we define a type Blog that represents blog posts. Blog posts can have zero or more comments, represented as a vector of type Comment. Comments have a single author, represented as a type User.

Since all of our types are just Clojure data, we compose them just like we would any other Clojure data.

(def User
  {:username String
   :email #"[^@]+@[^@]+"})

(def Comment
  {:author User
   :comment String
   :posted Date})

(def Blog
  {:title String
   :content String
   :posted Date
   :comments [Comment]})

Below is some example data that conforms to our type.

(def blog123
  {:title "Clojure 101"
   :content "Clojure is a functional..."
   :posted #inst "2015-01-01"
   :comments [{:author {:username "funprog" :email "me@example.org"}
               :comment "Great post!"
               :posted #inst "2015-01-02T12:30:00"}]})

(check Blog blog123)
;; nil

We can display our type by calling display-type on it.

(display-type Blog)
;; {:comments [{:author {:email #"[^@]+@[^@]+", :username String},
;;              :comment String,
;;              :posted Date}],
;;  :content String,
;;  :posted Date,
;;  :title String}
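Since map types are ordinary Clojure maps, the usual map functions can also derive new types from existing ones. The following is a minimal sketch, assuming annotate is on the classpath. Reply, AnonymousReply, and RatedReply are hypothetical names used only for illustration.

```clojure
(use '[annotate core types])

;; Hypothetical base type.
(def Reply
  {:author String
   :comment String})

;; Derive a narrower type by dropping a key ...
(def AnonymousReply (dissoc Reply :author))

;; ... and a wider one by adding a key.
(def RatedReply (assoc Reply :rating Int))

(check AnonymousReply {:comment "Great post!"})
;; nil

(check RatedReply {:author "funprog" :comment "Great post!" :rating 5})
;; nil
```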

Functions

Annotate provides four variations of the defn macro: defn', defna, defnv and defn$.

defn' modifies the body of your function, adding a conditional switch that when enabled will check the types of the inputs and output against the type annotation. In addition, metadata is added to the var containing the type annotation. Finally, the type annotation is added to the doc string.

defna does not modify the body of your function in any way. It does add metadata and append to the doc string like defn', though.

defnv works like defn', only type checking is always enabled.

defn$ will generate an always type checked function when the system property annotate.typecheck is set to on. Otherwise, an annotated only function will be generated. Add :jvm-opts ["-Dannotate.typecheck=on"] to the dev profile of your project to enable type checking during development and when running tests.
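As an illustration of that setting, a Leiningen project.clj might look roughly like the fragment below. The project name, version, and Clojure version are placeholders, not values taken from annotate's documentation.

```clojure
;; project.clj (sketch; my-app and the version numbers are assumptions)
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [com.roomkey/annotate "1.0.1"]]
  ;; Type checking is switched on only for the dev profile,
  ;; so code built without that profile is annotated only.
  :profiles {:dev {:jvm-opts ["-Dannotate.typecheck=on"]}})
```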

Type annotations for fns must be wrapped in a vector or list. Lists indicate a multi-arity fn and should contain two or more vector forms.

NOTE: defn', defnv, and defn$ will remove pre/post conditions from the generated code. If you need to mimic the behavior of pre/post conditions use assert in the body of your function.
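A minimal sketch of that workaround, assuming annotate is on the classpath: the hypothetical withdraw function below uses assert in its body where a plain defn might have used a :pre condition.

```clojure
(use '[annotate core types fns])

;; withdraw is a hypothetical function for illustration.
(defnv withdraw [Int Int => Int]
  "Subtract amount from balance."
  [balance amount]
  ;; Stands in for {:pre [(<= amount balance)]}.
  (assert (<= amount balance) "Insufficient funds")
  (- balance amount))

(withdraw 100 30)
;; 70
```

Calling (withdraw 10 30) passes the type check but fails the assertion, so an AssertionError is thrown rather than an ExceptionInfo.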

All of the macros for creating type annotated functions are located in annotate.fns. Let's take a look at some examples.

(use 'annotate.fns)

;; Annotation only with no type checking or code modification.
;; Use with low-level, performance sensitive fns.
(defna append [String String => String]
  "Append a string to a string."
  [s1 s2]
  (str s1 s2))

(doc append)
;; user/append
;; ([s1 s2])
;;  [String String => String]
;;
;; Append a string to a string.

(annotation append)
;; [String String => String]

(canonical append)
;; [java.lang.String java.lang.String => java.lang.String]

Notice how we can use the annotation and canonical macros to retrieve the type annotation from the var. By default, the display of the type is as terse as possible. This is a core principle of annotate. Types should be as terse as possible, while providing the more verbose representation when needed.

;; Always type checked.
(defnv greeting ([=> String] [String => String])
  "Doc string"
  ([] (greeting "world"))
  ([msg] (str "Hello, " msg)))

(greeting "Bob")
;; "Hello, Bob"

(greeting :Bob)
;; ExceptionInfo Failed to type check user/greeting input(s): (not (instance? String :Bob))

Notice that an ExceptionInfo exception is thrown when the function fails to type check. The exception message will always contain the name of the var and its namespace, as well as whether an input or the output failed to type check. The data representation of the error can be extracted using ex-data, if needed.
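The sketch below shows one way to catch the exception and pull out its data with ex-data. It assumes annotate is on the classpath. safe-greeting is a hypothetical wrapper, and the exact shape of the map returned by ex-data is not documented here, so it is not shown.

```clojure
(use '[annotate core types fns])

(defnv greeting [String => String]
  [msg]
  (str "Hello, " msg))

(defn safe-greeting
  "Hypothetical wrapper: returns the greeting, or a map describing
  the type error instead of letting the exception propagate."
  [msg]
  (try
    (greeting msg)
    (catch clojure.lang.ExceptionInfo e
      {:message (.getMessage e)   ; human-readable message
       :data    (ex-data e)})))   ; error as Clojure data

(safe-greeting "Bob")
;; "Hello, Bob"

(safe-greeting :Bob)
;; a map with :message and :data keys
```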

;; Rest args
(defn' append* [String & (Seq String) => String]
  [s1 & args]
  (apply str s1 args))

(with-checking (append* "Hello " :there :friend))
;; ExceptionInfo Failed to type check user/append* input(s): ((not (instance? String :there)) (not (instance? String :friend)))

;; Keyword args
(defn' ping [String & (KwA :method Named :timeout Int) => String]
  [url & {:keys [method timeout]}]
  (str "Ping: " url " with: " method))

(with-checking (ping "localhost" :method :POST))
;; "Ping: localhost with: :POST"

(with-checking (ping "localhost" :method :POST :timeout 100.0))
;; ExceptionInfo Failed to type check user/ping input(s): (and {:timeout (not (integer? 100.0))} (not (nil? {:method :POST, :timeout 100.0})))

;; Higher order fns
(defn' map* [Fn (CanSeq) => (LazySeq)]
  [f coll]
  (map f coll))

(with-checking (map* inc (range 5)))
;; (1 2 3 4 5)

(with-checking (map* 3 (range 5)))
;; ExceptionInfo Failed to type check user/map* input(s): (not (ifn? 3))

Notice that defn' is only type checked when called within the with-checking macro. Also, notice the type for representing keyword arguments KwA.

(defn$ echo [String => String]
  [msg]
  msg)

(echo "Hello")
;; "Hello"

;; With the system property annotate.typecheck set to 'on'.
(echo 3)
;; ExceptionInfo Failed to type check user/echo input(s): (not (instance? String 3))

Which variation of defn should I use?

You should consider using defn$ first, as it provides type checking during development and testing without adding any runtime overhead in production. Use defna when you have a low-level function that needs to make use of loop/recur. Use defnv when you want to guarantee that a call to a function in production will throw an exception if given data with the wrong shape. Finally, defn' is not needed in most cases.
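As a small illustration of the defna case, the hypothetical count-down function below uses recur in tail position. Because defna leaves the function body unmodified, the recur compiles just as it would in an ordinary defn. This sketch assumes annotate is on the classpath.

```clojure
(use '[annotate core types fns])

;; count-down is a hypothetical function for illustration.
(defna count-down [Int => Int]
  "Count down from n to zero."
  [n]
  (if (pos? n)
    (recur (dec n))
    n))

(count-down 5)
;; 0
```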

Anonymous functions

Annotate provides four variations of fn: fn', fna, fnv, and fn$.

Their behavior is identical to the defn variations.

((fnv [String => String] [x] x) "Bob")
;; "Bob"

((fnv ([String => String] [String String => String]) ([x] x) ([x y] (str x y))) "Billy" "Bob")
;; "BillyBob"

((fnv [String => String] [x] 20) "Bob")
;; ExceptionInfo Failed to type check anonymous output: (not (instance? String 20))

Recursive types

Recursive types can be represented by referencing the var object of the type you are defining. Let's take a look at an example.

(def Element
  {:tag Keyword
   :attrs {Keyword String}
   :content (Seqable (U String #'Element))})

(check Element
       {:tag :a
        :attrs {:b "c"}
        :content ["d" "e" {:tag :f :attrs {} :content ["g"]}]})
;; nil

Wrap an existing function outside your control

Annotate allows you to wrap existing functions outside of your control in a type checking function. There are three variations: wrap', wrapv, and wrap$. If you just want to annotate a function without any type checking use ann.

(use 'annotate.wrap)

(defn ping
  "Make an HTTP request."
  [url & {:keys [method timeout] :or {method "GET"}}]
  (str "url: " url ", method: " (name method) ", timeout: " timeout))

(wrap' ping [String & (Pairs :method Named :timeout Int) => String])

(with-checking (ping "localhost" :method :POST))
;; "url: localhost, method: POST, timeout: "

(with-checking (ping "localhost" :timeout 100.0))
;; ExceptionInfo Failed to type check ping input(s): {:timeout (not (integer? 100.0))}

Notice the usage of the Pairs type to represent keyword arguments. This is due to a technical limitation, but provides the same behavior as using KwA for a function under your control.

Records

Annotate provides four variations of defrecord: defrecord', defrecorda, defrecordv, and defrecord$.

Their behavior is identical to the defn variations.

Note: The only type checking that occurs is on the generated constructor function.

(defrecordv User [String String]
  [first-name last-name])

(->User "Billy" "Bob")
;; #user.User{:first-name "Billy", :last-name "Bob"}

(->User :billy :bob)
;; ExceptionInfo Failed to type check user/->User input(s): (not (instance? String :billy)), (not (instance? String :bob))

Friendly errors

Annotate can be used to check user input and return user-friendly error messages, if needed. The label function is used to provide an error message and not found message for a particular map key. The friendly function recursively transforms a map of errors into a map of errors where the values are understandable error messages.

(use 'annotate.friendly)

(-> (check {:first-name (NonEmpty String)
            :age Int}
           {:first-name ""})
    (friendly {:first-name (label "First name is invalid" "First name is missing")
               :age (label "Age is invalid" "Age not found")}))
;; {:age "Age not found", :first-name "First name is invalid"}

Type validation

If you are curious whether a particular type is a valid type, you can call valid-type? on that type.

(valid-type? #{String})
;; true

(valid-type? #{String Keyword})
;; false

Notice that a set with two elements is not a valid type.

Presentation

To get you up to speed as quickly as possible, we've provided an interactive presentation that was given at Room Key.

Run lein gorilla then type Ctrl-g Ctrl-l in the worksheet to load 'presentation.clj'.