c-cube/printbox

Representing DAGs (box diagrams). Representing plots.

lukstafi opened this issue · 16 comments

Hi! Would others benefit from such advanced functionality? If I work on it, should I upstream it? Plots could be external to PrintBox -- treating a box as a canvas. But DAGs would be easier to implement integral to PrintBox.

Gbury commented

That would certainly be interesting, but the layout algorithm (to decide where to place each box of the DAG) would probably be quite complicated I suppose ? Or maybe it's not needed ?

What kind of API would you imagine for that feature ? (just the signatures of the various functions)

A layout algorithm is not needed. In the first version I would not have any extra layout decisions, boxes would be just topologically sorted vertically. The API would be the same as for trees, except there would be a predicate to decide box identity. If multiple boxes are identical, only one would be printed: for consistency, either only the first one, or only the last one -- the default box identity would be physical identity, so this doesn't matter. Maybe physical identity is good enough, so then there would be just a binary flag whether a tree is without sharing (a proper tree), or with sharing (a DAG).

As a follow-up, I would add side-by-side layout where non-comparable narrow-enough boxes that are at the same Hasse diagram level are stacked horizontally. This layout would work for regular trees too, not just for DAGs.

Hmm, maybe I don't need dedicated support for DAGs. I can put Text IDs for repeated nodes. And I can transform Tree (n, bs) -> Vlist [n; Hlist bs] down to a few levels for a more concise layout.

I wrote a PR: #27

This issue is less important than #26 because here the PR's implementation can be copy-pasted and tuned in user code.

I have an initial version of DAGs and plotting that covers my current needs -- I'll add displaying axis labels next. It's nearly trivial code but it already looks nice.

Closing this issue, since what I needed turned out very simple. Feel free to reopen if you'd like me to write a PR.

I'd be curious to see some sample output :)

I'd be curious to see some sample output :)

Thanks! I think what I actually need that's non-trivial is a compact tree layout: allow stacking branches horizontally, but when one of the branches is short vertically, reuse its space further down.

For plotting:

type plot_spec =
  | Scatterplot of {points: (float * float) array; pixel: string}
  | Line_plot of {points: float array; pixel: string}
  | Boundary_map of {callback: (float * float) -> bool; pixel_true: string; pixel_false: string}

Sorry the plot is not ideal, the model is not training well yet so just a vertical decision split.

 1.063e+0 │**************************#************************************************
          │***********************#*##*#**#*******************************************
          │***************#*#*#*******#*#*#*#**#**************************************
          │************#***#**#******#***#*******#************************************
          │************#****************####**##*#************************************
          │***********##**#*********************##************************************
          │***********##*************************#**#*********************************
          │*********#***#***************************###*******************************
          │******#*#********************************#**#******************************
          │*******#********************************###********************************
          │*****#************************************#**##****************************
          │****#*************************************#*#******************************
          │******##************************************#**************************%***
          │*****#*******************%*****************#***************************%*%%
y         │***#********************%*%*****************####***************************
g         │*****#********************%*%*******************#**************************
r         │*##***********************%%********************#********************%%****
e         │##***********************%********************#**************************%*
k         │....##...................%.....................##...................%%%..%.
s         │.##.....................%.....................#..........................%.
          │...#.......................%.........................................%.%...
          │..#.......................%................................................
          │#.#.........................%....................#.........................
          │..........................%.%..%..............#.....................%.%....
          │.............................%%....................................%%%.....
          │.............................%%..%.............................%...........
          │..............................%..%............................%.%.%%.......
          │.............................%.%%.%............................%..%.%......
          │..................................%%.........................%.............
          │................................%%..%%....................%...%%...........
          │..................................%.....%..............%.....%.............
          │..........................................%......%......%%..%..............
          │.......................................%...%..%%..........%................
          │......................................%..%%%%...........%..................
 -5.934e-1│............................................%.%%%..........................
──────────┼───────────────────────────────────────────────────────────────────────────
          │-1.081e+0                                                          2.095e+0
          │                                   ixes

For this expression:

  let%nn_op c = "a" [-4] + "b" [2] in
  let%nn_op d = a *. b + b **. 3 in
  let%nn_op c = c + c + 1 in
  let%nn_op c = c + 1 + c + ~-a in
  let%nn_op d = d + d *. 2 + !/ (b + a) in
  let%nn_op d = d + 3 *. d + !/ (b - a) in
  let%nn_op e = c - d in
  let%nn_op f = e *. e in
  let%nn_op g = f /. 2 in
  let%nn_op g = g + 10. /. f in

I can print the following tree. But expressions are typically more imbalanced, then it looks (even) worse...


                                                                                    [49] +
                                                                                     2.47e+1
                                                                                    Gradient
                                                                                     1.00e+0
                                                                       [41] *.                                                                         │        [48] *.
                                                                        2.45e+1                                                                        │         2.04e-1
                                                                       Gradient                                                                        │        Gradient
                                                                        1.00e+0                                                                        │         1.00e+0
                                                            [37] *.                                                               │     [40] **.       │[42] 10  │   [44] **.
                                                             4.90e+1                                                              │      5.00e-1       │ 1.00e+1 │    2.04e-2
                                                            Gradient                                                              │     <void>         │<void>   │   Gradient
                                                             4.96e-1                                                              │[38] 2   │[39] -1   │         │    1.00e+1
                                                         [36] +                                                              │[36]│ 2.00e+0 │ -1.00e+0 │         │[37]│[43] -1
                                                          -7.00e+0                                                           │    │<void>   │<void>    │         │    │ -1.00e+0
                                                         Gradient                                                            │    │         │          │         │    │<void>
                                                          -6.94e+0                                                           │    │         │          │         │    │
                       [19] +                           │                             [35] *.                                │    │         │          │         │    │
                        -1.00e+0                        │                              -6.00e+0                              │    │         │          │         │    │
                       Gradient                         │                             Gradient                               │    │         │          │         │    │
                        -6.94e+0                        │                              -6.94e+0                              │    │         │          │         │    │
               [18] +                    │  [15] *.     │[34] -1   │                        [33] +                           │    │         │          │         │    │
                -5.00e+0                 │   4.00e+0    │ -1.00e+0 │                         6.00e+0                         │    │         │          │         │    │
               Gradient                  │  Gradient    │<void>    │                        Gradient                         │    │         │          │         │    │
                -6.94e+0                 │   -6.94e+0   │          │                         6.94e+0                         │    │         │          │         │    │
             [17] +                 │[13]│[14] -1   │[1]│          │               [32] +                  │    [29] r       │    │         │          │         │    │
              -2.00e+0              │    │ -1.00e+0 │   │          │                0.00e+0                │     6.00e+0     │    │         │          │         │    │
             Gradient               │    │<void>    │   │          │               Gradient                │    Gradient     │    │         │          │         │    │
              -6.94e+0              │    │          │   │          │                6.94e+0                │     6.94e+0     │    │         │          │         │    │
        [13] +            │[16] 1   │    │          │   │          │[25]│            [31] *.               │    [28] +       │    │         │          │         │    │
         -3.00e+0         │ 1.00e+0 │    │          │   │          │    │             0.00e+0              │     6.00e+0     │    │         │          │         │    │
        Gradient          │<void>   │    │          │   │          │    │            Gradient              │    Gradient     │    │         │          │         │    │
         -1.39e+1         │         │    │          │   │          │    │             6.94e+0              │     6.94e+0     │    │         │          │         │    │
[12] +          │[11] 1   │         │    │          │   │          │    │[30] 3   │[25] +                  │[2]│[27] *.      │    │         │          │         │    │
 -4.00e+0       │ 1.00e+0 │         │    │          │   │          │    │ 3.00e+0 │ 0.00e+0                │   │ 4.00e+0     │    │         │          │         │    │
Gradient        │<void>   │         │    │          │   │          │    │<void>   │Gradient                │   │Gradient     │    │         │          │         │    │
 -1.39e+1       │         │         │    │          │   │          │    │         │ 2.78e+1                │   │ 6.94e+0     │    │         │          │         │    │
├─[3] +         │         │         │    │          │   │          │    │         │├─[24] +                │   │├─[26] -1    │    │         │          │         │    │
│  -2.00e+0     │         │         │    │          │   │          │    │         ││  0.00e+0              │   ││  -1.00e+0  │    │         │          │         │    │
│ Gradient      │         │         │    │          │   │          │    │         ││ Gradient              │   ││ <void>     │    │         │          │         │    │
│  -2.78e+1     │         │         │    │          │   │          │    │         ││  2.78e+1              │   │└─[1]        │    │         │          │         │    │
│ ├─[1] a       │         │         │    │          │   │          │    │         ││ ├─[10]                │   │             │    │         │          │         │    │
│ │  -4.00e+0   │         │         │    │          │   │          │    │         ││ └─[23] *.             │   │             │    │         │          │         │    │
│ │ Gradient    │         │         │    │          │   │          │    │         ││    0.00e+0            │   │             │    │         │          │         │    │
│ │  1.39e+2    │         │         │    │          │   │          │    │         ││   Gradient            │   │             │    │         │          │         │    │
│ └─[2] b       │         │         │    │          │   │          │    │         ││    2.78e+1            │   │             │    │         │          │         │    │
│    2.00e+0    │         │         │    │          │   │          │    │         ││   ├─[10] +            │   │             │    │         │          │         │    │
│   Gradient    │         │         │    │          │   │          │    │         ││   │  0.00e+0          │   │             │    │         │          │         │    │
│    6.46e+2    │         │         │    │          │   │          │    │         ││   │ Gradient          │   │             │    │         │          │         │    │
└─[3]           │         │         │    │          │   │          │    │         ││   │  8.33e+1          │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │ ├─[9] *.          │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │ │  -8.00e+0       │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │ │ Gradient        │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │ │  8.33e+1        │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │ │ ├─[1]           │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │ │ └─[2]           │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │ └─[5] **.         │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │    8.00e+0        │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │   Gradient        │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │    8.33e+1        │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │   ├─[2]           │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │   └─[4] 3         │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │      3.00e+0      │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   │     <void>        │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││   └─[22] 2            │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││      2.00e+0          │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         ││     <void>            │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │└─[21] r                │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │   0.00e+0              │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │  Gradient              │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │   2.78e+1              │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │  └─[20] +              │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │     -2.00e+0           │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │    Gradient            │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │     0.00e+0            │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │    ├─[2]               │   │             │    │         │          │         │    │
                │         │         │    │          │   │          │    │         │    └─[1]               │   │             │    │         │          │         │    │

I like the plotting one, but I'm not sure that the DAG representation is very readable.

"I'm not sure that the DAG representation is very readable"
It's a matter of taste, but I like that I can see the effect on a node from its sub-nodes. Compare:

                            [25] +
                             2.00e+1
                            Gradient
                             1.00e+0
                       [24] +                          │[14] 5
                        1.50e+1                        │ 5.00e+0
                       Gradient                        │<void>
                        1.00e+0                        │
       [21] *.          │          [23] *.             │
        2.70e+1         │           -1.20e+1           │
       Gradient         │          Gradient            │
        1.00e+0         │           1.00e+0            │
[20] 3   │  [18] **.    │[22] -1   │    [16] *.        │
 3.00e+0 │   9.00e+0    │ -1.00e+0 │     1.20e+1       │
<void>   │  Gradient    │<void>    │    Gradient       │
         │   3.00e+0    │          │     -1.00e+0      │
         │[13]│[17] 2   │          │[15] 4   │[13] x   │
         │    │ 2.00e+0 │          │ 4.00e+0 │ 3.00e+0 │
         │    │<void>   │          │<void>   │Gradient │
         │    │         │          │         │ 1.40e+1 │

with:

[25] +
 2.00e+1
Gradient
 1.00e+0
├─[24] +
│  1.50e+1
│ Gradient
│  1.00e+0
│ ├─[21] *.
│ │  2.70e+1
│ │ Gradient
│ │  1.00e+0
│ │ ├─[20] 3
│ │ │  3.00e+0
│ │ │ <void>
│ │ └─[18] **.
│ │    9.00e+0
│ │   Gradient
│ │    3.00e+0
│ │   ├─[13]
│ │   └─[17] 2
│ │      2.00e+0
│ │     <void>
│ └─[23] *.
│    -1.20e+1
│   Gradient
│    1.00e+0
│   ├─[22] -1
│   │  -1.00e+0
│   │ <void>
│   └─[16] *.
│      1.20e+1
│     Gradient
│      -1.00e+0
│     ├─[15] 4
│     │  4.00e+0
│     │ <void>
│     └─[13] x
│        3.00e+0
│       Gradient
│        1.40e+1
└─[14] 5
   5.00e+0
  <void>

I plan to low-effort enhance this layout by having a vertical stack of boxes, to not put the reused subtrees directly in the tree, but in a box below the tree. (More-or-less where they would be in a DAG layout, without the connector.)

On my side I'm satisfied with the boxing transformation above, but maybe I can revisit the DAGs idea to implement "postfix trees", like in the example here:
What do you do when you have trouble grokking a recursive algorithm? #11

Since I brought it up again on Discuss, would you want to incorporate plotting functionality, and where in the API would you put it? The benefit of a stand-alone project is offering non-PrintBox backends, but there's enough plotting libraries already.

c-cube commented

Sounds useful, yeah. I think having a Plot of plot constructor in the main box type would be adequate, if it can be done in a reasonably compact way (with sub-constructors for plot)?

  • for text rendering, you seem to have what's needed
  • for html rendering, I'm not sure but maybe sth like https://github.com/imandra-ai/vega-lite + an embedded script?
  • for LaTeX rendering, well… that'd be tough. Oh well.