angr/simuvex

fgets breaks with blank lines

Opened this issue · 1 comments

I'm trying to run a binary which uses fgets to read a file in. At this point, I'm running concretely with a concrete file system as I have not yet hit the unrelated function I am going to hook and make symbolic. The file I am reading with fgets has several blank lines in it and it would appear that this messes up the simuvex implementation of fgets. Importantly, this file does not start with a new line, and the first few blank lines don't seem to mess up anything. After several blank lines, the file pointer position (fp.pos) gets pushed way ahead of where it should be, because of a very large read distance argument and this causes my program to break.

I think the problem might be with these lines in simuvex/procedures/libc___so___6/fgets.py:

        # XXX: this is a HACK to determine if r is 0 because there is a newline at the first index or
        # if r is 0 because there cannot be any newline
        errored = False
        if not self.state.se.satisfiable(extra_constraints=(r > 0,)):
            errored = True
            if self.state.se.solution(mem.load(0, 1), self.state.se.BVV('\n')):
                errored = False

I think there's at least two problems:

  • I don't think constraining r > 0, does a whole lot because r is an offset into the file, not an offset relative to "pos" (file seek position). If r was relative to pos, checking it for 0 would make sense but... it's not.
  • Checking if the first character of the file is a newline is not a good test of if the current line (based on "pos" (file seek position)) is only a newline.

I may be completely wrong, but changing those lines to the following seems to work for me in my concrete case. I've had to add the extra conditional check as it seems even with the r > 0 changed to r > pos, there are still times where the satisfiable check fails when I don't believe it should. There could be more going on here, my understanding of this code is fairly limited.

        # XXX: this is a HACK to determine if r is 0 because there is a newline at the first index or
        # if r is 0 because there cannot be any newline
        errored = False
        
        # Not sure if this is the best way to do this... I'm really just trying to find if r==pos 
        # but I have to take into account the fact that these could be symoblic... 
        # The way it is things could go south if either value is symbolic  
        # This way works with concrete values
        if self.state.se.any_int(r) == self.state.se.any_int(pos):
            
            if not self.state.se.satisfiable(extra_constraints=(r > pos,)):
                
                errored = True
                if self.state.se.solution(mem.load(pos, 1), self.state.se.BVV('\n')):
                    errored = False

Intersting...

Would you mind sending along a PR so that we can play around with it?