/binary-types

Read and write binary records for Common Lisp

Primary LanguageCommon LispOtherNOASSERTION

-*- mode: text; coding: utf-8-unix; -*-

######################################################################
##
##    Copyright (C) 2001,2000, 2003
##    Department of Computer Science, University of Tromsø, Norway
##
## Filename:      README
## Author:        Frode Vatvedt Fjeld <frodef@acm.org>
## Created at:    Wed Dec  8 15:35:53 1999
## Distribution:  See the accompanying file COPYING.
##
## $Id: README,v 1.1.1.1 2004/01/13 11:13:13 ffjeld Exp $
##
######################################################################

Binary-types is a Common Lisp package for reading and writing binary
files. Binary-types provides macros that are used to declare the
mapping between lisp objects and some binary (i.e. octet-based)
representation.

Supported kinds of binary types include:

 * Signed and unsigned integers of any octet-size, big-endian or
   little-endian. Maps to lisp integers.

 * Enumerated types based on any integer type. Maps to lisp symbols.

 * Complex bit-field types based on any integer type. Sub-fields can
   be numeric, enumerated, or bit-flags. Maps to lisp lists of symbols
   and integers.

 * Fixed-length and null-terminated strings. Maps to lisp strings.

 * Compound records of other binary types. Maps to lisp DEFCLASS
   classes or, when you prefer, DEFSTRUCT structs.

Typically, a complete binary record format/type can be specified in a
single (nested) declaration statement. Such compound records may then
be read and written with READ-BINARY and WRITE-BINARY.

Binary-types is *not* helpful in reading files with variable
bit-length code-words, such as most compressed file formats. It will
basically only work with file-formats based on 8-bit bytes
(octets). Also, at this time no floating-point types are supported out
of the box.

Binary types may now be declared with the DEFINE-BINARY-CLASS macro,
which has the same syntax (and semantics) as DEFCLASS, only there is
an additional slot-option (named :BINARY-TYPE) that declares that
slot's binary type. Note that the binary aspects of slots are *not*
inherited (the semantics of inheriting binary slots is unclear to me).

Another slot-option added by binary-types is :MAP-BINARY-WRITE, which
names a function (of two arguments) that is applied to the slot's
value and the name of the slot's binary-type in order to obtain the
value that is actually passed to WRITE-BINARY. Similarly,
:MAP-BINARY-READ takes a function that is to be applied to the binary
data and type-name when a record of that type is being read.  A
slightly modified version of :map-binary-read is
:MAP-BINARY-READ-DELAYED, which will do essentially the same thing as
:map-binary-read, only the mapping will be "on-demand": A slot-unbound
method will be created for this purpose.

A variation of the :BINARY-TYPE slot-option is :BINARY-LISP-TYPE,
which does everything :BINARY-TYPE does, but also passes on a :TYPE
slot-option to DEFCLASS (or DEFSTRUCT).  The type-spec is inferred
from the binary-type declaration. When using this mechanism, you
should be careful to always provide a legal value in the slot (as you
must always do when declaring slots' types). If you find this
confusing, just use :BINARY-TYPE.

Performance has not really been a concern for me while designing this
package. There's no obvious performance bottlenecks that I know of,
but keep in mind that all "binary" reads and writes are reduced to
individual 8-bit READ-BYTEs and WRITE-BYTEs. If you do identify
particular performance bottlenecks, let me know.

The included file "example.lisp" demonstrates how to use this
package. To give you a taste of what it looks like, the following
declarations are enough to read the header of an ELF executable file
with the form

   (let ((*endian* :big-endian))
     (read-binary 'elf-header stream)


;;; ELF basic type declarations
(define-unsigned word 4)
(define-signed sword  4)
(define-unsigned addr 4)
(define-unsigned off  4)
(define-unsigned half 2)

;;; ELF file header structure
(define-binary-class elf-header ()
  ((e-ident
    :binary-type (define-binary-struct e-ident ()
           (ei-magic nil :binary-type
                 (define-binary-struct ei-magic ()
                   (ei-mag0 0 :binary-type u8)
                   (ei-mag1 #\null :binary-type char8)
                   (ei-mag2 #\null :binary-type char8)
                   (ei-mag3 #\null :binary-type char8)))
           (ei-class nil :binary-type
                 (define-enum ei-class (u8)
                   elf-class-none 0
                   elf-class-32   1
                   elf-class-64   2))
           (ei-data nil :binary-type
                (define-enum ei-data (u8)
                  elf-data-none 0
                  elf-data-2lsb 1
                  elf-data-2msb 2))
           (ei-version 0 :binary-type u8)
           (padding nil :binary-type 1)
           (ei-name "" :binary-type
                (define-null-terminated-string ei-name 8))))
   (e-type
    :binary-type (define-enum e-type (half)
           et-none 0
           et-rel  1
           et-exec 2
           et-dyn  3
           et-core 4
           et-loproc #xff00
           et-hiproc #xffff))
   (e-machine
    :binary-type (define-enum e-machine (half)
           em-none  0
           em-m32   1
           em-sparc 2
           em-386   3
           em-68k   4
           em-88k   5
           em-860   7
           em-mips  8))
   (e-version   :binary-type word)
   (e-entry     :binary-type addr)
   (e-phoff     :binary-type off)
   (e-shoff     :binary-type off)
   (e-flags     :binary-type word)
   (e-ehsize    :binary-type half)
   (e-phentsize :binary-type half)
   (e-phnum     :binary-type half)
   (e-shentsize :binary-type half)
   (e-shnum     :binary-type half)
   (e-shstrndx  :binary-type half)))


For a second example, here's an approach to supporting floats:

  (define-bitfield ieee754-single-float (u32)
    (((:enum :byte (1 31))
       positive 0
       negative 1)
      ((:numeric exponent 8 23))
      ((:numeric significand 23 0))))




The postscript file "type-hierarchy.ps" shows the binary types
hierarchy.  It is generated using psgraph and closer-mop, which may be
loaded via Quicklisp as shown below:

    (ql:quickload "psgraph")
    (ql:quickload "closer-mop")

    (with-open-file (*standard-output* "type-hierarchy.ps"
                                       :direction :output
                                       :if-exists :supersede)
      (psgraph:psgraph *standard-output* 'binary-types::binary-type
                       (lambda (p)
                         (mapcar #'class-name
                                 (closer-mop:class-direct-subclasses
                                  (find-class p))))
                       (lambda (s) (list (symbol-name s)))
                       t))