/ghc-dump

A GHC plugin and library for analysing GHC Core

Primary LanguageHaskell

ghc-dump: A tool for analysing GHC Core

ghc-dump is a library, GHC plugin, and set of tools for recording and analysing GHC's Core representation. The plugin is compatible with GHC 7.10 through 9.0, exporting a consistent (albeit somewhat lossy) representation across these versions. The AST is encoded as CBOR, which is small and easy to deserialise.

Dumping Core from compilation

The GHC plugin GhcDump.Plugin provides a Core-to-Core plugin which dumps a representation of the Core AST to a file after every Core-to-Core pass. To use it on a cabal package add the following to the relevant stanza of your cabal file (e.g. the library stanza in the case of a library):

build-depends: ghc-dump-core
ghc-options: -fplugin GhcDump.Plugin

Now building the project will produce a number of .cbor files in the dist-newstyle directory, one for each step of GHC's Core-to-Core pipeline.

$ cabal build
$ ls **/*.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0000.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0001.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0002.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0003.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0004.cbor
dist-newstyle/build/x86_64-linux/ghc-8.10.4/aeson-1.5.6.0/build/attoparsec-iso8601/src/Data/Attoparsec/Time/Internal.pass-0005.cbor
...

Here we see a pass-N.cbor file was produced for each Core-to-Core pass.

We can produce a brief summary of these files using the summarize mode of the ghc-dump CLI tool:

Name                                Terms    Types    Coerc.   Previous phase
Data/Attoparsec/Time/Internal.pass-0000.cbor
                                    124      63       9        desugar
Data/Attoparsec/Time/Internal.pass-0001.cbor
                                    138      59       38       Simplifier: Max iterations = 4
                                                                           SimplMode {Phase = InitialPhase [Gentle],
                                                                                      inline,
                                                                                      rules,
                                                                                      eta-expand,
                                                                                      no case-of-case}
Data/Attoparsec/Time/Internal.pass-0002.cbor
                                    138      59       38       Specialise:
Data/Attoparsec/Time/Internal.pass-0003.cbor
                                    142      59       38       Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}):
Data/Attoparsec/Time/Internal.pass-0004.cbor
                                    118      41       38       Simplifier: Max iterations = 4
                                                                           SimplMode {Phase = 2 [main],
                                                                                      inline,
                                                                                      rules,
                                                                                      eta-expand,
                                                                                      case-of-case}
Data/Attoparsec/Time/Internal.pass-0005.cbor
                                    112      32       38       Simplifier: Max iterations = 4
                                                                           SimplMode {Phase = 1 [main],
                                                                                      inline,
                                                                                      rules,
                                                                                      eta-expand,
                                                                                      case-of-case}

Here we see for each dump file:

  • the file name
  • the size of the program measured in terms, types, and coercions
  • the name of the phase that produced the program

Analysis in GHCi

One can then load this into ghci for analysis,

$ ghci
GHCi, version 8.3.20170413: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/ben/.ghci
λ> import GhcDump.Repl as Dump
λ> mod <- readDump "Test.pass-0.cbor"
λ> pretty mod
module Main where

nsoln :: Int-> Int
{- Core Size{terms=98 types=66 cos=0 vbinds=0 jbinds=0} -}
nsoln =
  λ nq →
    let safe =
          λ x d ds →
            case ds of wild {
              [] → GHC.Types.True
              : q l →
                GHC.Classes.&&
...

Analysis with CLI tool

Alternatively, the ghc-dump utility can be used to render the representation in human-readable form. For instance, we can filter the dump to include only top-level binders containing main in its name,

$ ghc-dump show --filter='.*main.*' Test.pass-0.cbor

You can conveniently summarize the top-level bindings of the program,

$ ghc-dump list-bindings --sort=terms Test.pass-0.cbor
Name                 Terms  Types  Coerc. Type
nsoln                98     66     0      Int-> Int
main                 30     28     0      IO ()
$trModule            5      0      0      Module
main                 2      1      0      IO ()
...