cuter-testing/cuter

Support Erlang maps

neoaggelos opened this issue · 3 comments

Summary

Add support for Erlang maps in cuter. Cuter already contains some of the required machinery needed for supporting maps, but it currently crashes when parsing code that uses maps.

TODO

This is a rough TODO of things that are required for maps support:

  • (already implemented) Support MAP type for ErlangTerms, as well as marshaling from/to erlang values
  • Add support for map() type in specs (in progress)
  • Support c_map and c_map_pair AST nodes in concolic execution, as well as pattern matching (ref https://github.com/erlang/otp/blob/master/lib/compiler/src/core_parse.hrl#L75-L84). This mainly revolves around extending eval_expr/ and pattern_match/. (in progress)
  • Support the functions of the standard maps library (e.g. maps:get, maps:put, ...).
  • Investigate how cuter can internally handle map values, describe new log entries that can guide the concolic execution, and implement as many as possible.

Notes

  • My Erlang background is weak, so please excuse (and correct) any misuse/misunderstanding of Erlang/cuter terms.
  • My hope is that this issue can help with documenting the implementation process and any design choices made along the way.
  • I think it is best to split this work in parts, with separate small PRs for each new functionality subset, so as to help with testing and review.

WIP

Some early work is in https://github.com/neoaggelos/cuter/tree/maps-support

Erlang maps AST

This comment intends to serve as a reference of how map expressions and patterns match to c_map and c_map_pair nodes in the AST tree.

Expressions (for eval_expr)

% erlang map expression
- spec myfunc(float()) -> map().
myfunc(A) ->
    #{A => value}.

The #{A => value} expression matches the following AST:

{c_map,
    [_Annotations],                         % annotations
    {c_literal, [_As], #{}},                % operator that introduces the map, always `#{}`
    [{c_map_pair,                           % list of key-value pairs
        [_Annotations],
        {c_literal, [], assoc},             % '=>' maps to 'assoc' 
        {c_var, ....},                      % key expression
        {c_literal, [], value}}],           % value expression
    false,                                  % whether this map is a pattern (false for expressions)
}

Suggested implementation for eval_expr

  • For map pairs (c_map_pair), evaluate symbolic and concrete values for the key and the value of the map pair separately (by recursively calling eval_expr on each, then return mk_result({symbolic.key, symbolic.value}, {concrete.key, concrete.value})
  • For maps (c_map), iterate over all map pairs, then return mk_result(maps:from_list(symbolic_keyvalue_pairs), maps:from_list(concrete_keyvalue_pairs)
  • Raise exception when any of the preconditions in the notes above do not hold (e.g. operator is not #{}, the map pair operator is not assoc etc), mainly to protect our future selves.

Pattern Matching (for pattern_match)

-spec myfunc(map()) -> float().
myfunc(A) ->
    case A of
        #{ mkey := X } = A -> X;
        _ -> 0
    end.

The #{ mkey := X } = A pattern matches the following AST (when A is bound):

{c_clause,
    [Annotations],
    [{c_map,
        [Annotations],
        {c_literal, [Annotations], #{}},            % the operator that introduces the map, always `#{}`
        [{c_map_pair,                               % list of key-value pairs
            [Annotations],
            {c_literal, [], exact},                 % ':=' maps to 'exact'
            {c_literal, [Annotations], mkey},       % key; must be literal or a bound variable
            {c_var, ...}}],                         % value; can be literal or variable
        true}                                       % this map is a pattern
    ],
    ....
}

Notes:

  • #{key1 := Y, key2 => value} syntax is not allowed by Erlang, #{key1 := Y, key2 := value} should be used instead -> this can be a rule to disallow assoc in map pairs when map is a pattern (and vice versa).
  • #{Z := value} = #{key => value} cannot be used to bind Z.

Suggested implementation for pattern_match:

(Rough idea) The map keys in the pattern are always bound, so call match_pattern for each map_pair_value with the value in maps:get(Map, map_pair_key)

Map value expressions

Support for generic map syntax (evaluate expressions and perform pattern matching) is implemented in #185, along with test coverage. The PR contains further details, but in general the implementation proposed in this issue was followed.

Map specs

Support for parsing specs including map types is implemented in #184, along with test coverage.

Support for maps standard library

I am having quite a lot of trouble adding support for the maps standard library. Further, the maps library is quite extensive, and quite a lot of the functions are built-ins that cannot be implemented in Erlang code (e.g. maps:iter, maps:next, ...).

As reference, I've been trying to add support for the ones where implementing in user-space code is possible, for example maps:from_list. However, I keep bumping into issues stemming from the handling of symbolic values within cuter. It is not straightforward to me how to best handle this. Work is being done in neoaggelos/cuter@map-eval-expr...neoaggelos:maps-from-list-problematic, which is based on top of #185.

The diff above contains a new test case using maps:from_list, which fails. The debugging statements have been added to showcase the issue:

Included here for clarity, the test case is:

        {
            "module": "map",
            "function": "fromlist",
            "args": "[10,11]",
            "depth": "30",
            "errors": true,
            "arity": 2,
            "nondeterministic": true,
            "xopts": "--disable-pmatch",
            "solutions": [
                "$2 == 0"
            ],
            "skip": false
        }

and the function definition is

-spec fromlist(atom(), integer()) -> float().
fromlist(X, Y) ->
    Z = maps:from_list([{X, Y}]),
    10 / maps:get(X, Z).
(venv) ubuntu@erlang:~/cuter$ make ftest no=111
ERLC src/cuter_symbolic.erl
ERLC test/ftest/src/map.erl

[Test #111]
/home/ubuntu/cuter/cuter map fromlist '[10,11]' -d 30 -p 2 -s 32 -pa /home/ubuntu/cuter/test/ftest/ebin --disable-pmatch -m --sorted-errors
map:fromlist(10, 11)
xxxx
OUTPUT MISMATCH
=================================================================
Testing map:fromlist/2 ...
*******************************
MAKE TUPLE
-- CONCRETE VALUE {10,11}
-- SYMBOLIC VALUES [{'__s',"0.1369584167.920125444.32051"},
                    {'__s',"0.1369584167.920125444.32052"}]
-- NEW SYMBOLIC VALUE {'__s',"0.1369584167.920125441.32302"}
*******************************
BREAK SYMBOLIC LIST/TUPLE
-- SYMBOLIC VALUE {'__s',"0.1369584167.920125444.32107"}
-- NEW SYMBOLIC VALUES [{'__s',"0.1369584167.920125444.32114"},
                        {'__s',"0.1369584167.920125444.32115"}]
*******************************
MATCH ATTEMPT
-- PATTERN {c_var,[938,{file,"src/cuter_erlang.erl"}],'Key'}
-- CONCRETE KEY 10
-- SYMBOLIC KEY {'__s',"0.1369584167.920125444.32051"}
-- CONCRETE VALUE #{10 => 11}
-- SYMBOLIC VALUE #{{'__s',"0.1369584167.920125444.32114"} =>
                        {'__s',"0.1369584167.920125444.32115"}}
-- CONCRETE ENV [{0,10},{1,#{10 => 11}},{'Key',10},{'Map',#{10 => 11}}]
-- SYMBOLIC ENV [{0,{'__s',"0.1369584167.920125444.32051"}},
                 {1,
                  #{{'__s',"0.1369584167.920125444.32114"} =>
                        {'__s',"0.1369584167.920125444.32115"}}},
                 {'Key',{'__s',"0.1369584167.920125444.32051"}},
                 {'Map',#{{'__s',"0.1369584167.920125444.32114"} =>
                              {'__s',"0.1369584167.920125444.32115"}}}]

*******************************
MAKE TUPLE
-- CONCRETE VALUE {badkey,10}
-- SYMBOLIC VALUES [badkey,{'__s',"0.1369584167.920125444.32051"}]
-- NEW SYMBOLIC VALUE {'__s',"0.1369584167.920125444.32126"}


=== Collected Metrics ===

[solver/status]
- UNSAT: 4
=== Inputs That Lead to Runtime Errors ===
#1 map:fromlist(10, 11)

I suppose the output is parsable by people familiar with cuter internals. Judging from it, it looks like creating and breaking a symbolic value in cuter does not return the original pair of symbolic values (which does make sense for cuter in general, it logs all necessary constraints for the Z3 solver to handle).

I would appreciate any pointers/help.