GaloisInc/what4

Extended/unprintable characters when using CVC4

finnteegen opened this issue · 2 comments

Hello (again),

when I use What4 with CVC4 and try to generate single characters using the built-in string support, it seems as if extended/unprintable characters are not correctly handled as the following error message indicates.

*** Exception: Solver reported an error:
  Parse Error: <stdin>:652.37: Extended/unprintable characters are not part of SMT-LIB, and they must be encoded as escape sequences
in response to command:
  (define-fun x!259 () Bool (= x!0 " "))

Cheers,
Finn

PS: For reference, I give my test module below.

{-# LANGUAGE LambdaCase          #-}
{-# LANGUAGE ScopedTypeVariables #-}

module What4Test where

import Data.Data
import Data.Parameterized
import Data.Parameterized.Nonce
import Data.Text

import System.IO.Unsafe

import What4.Config
import What4.Expr
import What4.Interface
import What4.Protocol.Online
import What4.Protocol.SMTLib2
import What4.Solver
import What4.Utils.StringLiteral

getChars :: [Char]
getChars = unsafePerformIO $ do
    Some (ng :: NonceGenerator IO x) <- newIONonceGenerator
    sym <- newExprBuilder FloatIEEERepr Proxy ng
    extendConfig cvc4Options (getConfiguration sym)
    solver <- startSolverProcess cvc4Features Nothing sym :: IO (SolverProcess x (Writer CVC4))
    let conn = solverConn solver
    -- Create string var and constrain its length to 1
    x <- freshConstant sym (safeSymbol "x") (BaseStringRepr UnicodeRepr)
    l <- stringLength sym x
    intLit sym 1 >>= isEq sym l >>= assume conn
    -- Recursively generate characters
    let getModelsRecursive =
          runCheckSat (Session conn (solverResponse solver)) $ \case
            Sat (ge, _) -> do
              v <- groundEval ge x
              -- Exclude value
              stringLit sym v >>= isEq sym x >>= notPred sym >>= assume conn
              return $ (Data.Text.head . fromUnicodeLit) v : unsafePerformIO getModelsRecursive
            _           -> shutdownSolverProcess solver >> return []
    getModelsRecursive

Thanks for the report. Originally, this code was written when only ASCII characters were supported, and it looks like we need to revisit the string escaping procedures.

I think the code in question is here. We probably need a more restrictive predicate than isPrint to decide when to apply escaping.

textToTerm :: Text -> Term
textToTerm bs = SMT2.T ("\"" <> Text.foldr f "\"" bs)
where
f c x
| '\"' == c = "\"\"" <> x
| isPrint c = Builder.singleton c <> x
| otherwise = "\\u{" <> Builder.fromString (showHex (fromEnum c) "}") <> x