sosy-lab/sv-benchmarks

Potential overflow in some busybox tasks

Closed this issue · 4 comments

Hi,

There are overflows in some of the busybox tasks. The problem originates from the the built-in function strerror that resolves an error id to an error string. This error string could in theory be very long, let's say of length INT_MAX-2 (neither the C standard nor GCC specify how these strings look like or what is the maximal length):

// in bb_perror_msg(const char *s, ...):
return_value_strerror$1=strerror(*bb_errno); // gives string of arbitrary length

Now in some busybox tasks, the length of such a string retrieved by a call to strerror is calculated using the built-in strlen function:

// in bb_verror_msg(const char *s, va_list p, const char *strerr):
if(!(strerr == ((const char *)((void *)0))))
{
  return_value_strlen$2=strlen(strerr); // strerr here is return_value_strerror$1 from above
  tmp_if_expr$3 = return_value_strlen$2;
}
else
  tmp_if_expr$3 = (unsigned long int)0;
strerr_len = (signed int)tmp_if_expr$3;

This length is then used in a signed integer addition where the overflow happens:

// in bb_perror_msg(const char *s, ...)
return_value_realloc$5=realloc(
    (void *)msg,
    (unsigned long int)(applet_len + used + strerr_len + msgeol_len + 3) // overflow!
);

I think this bug is legit though very unlikely to happen in reality (error messages should be much shorter than INT_MAX=2147483647). It looks like all files using bb__simple_perror_msg bb_verror_msg function to report errors are affected. These are the 15 36 files:

$ grep bb_verror_msg c/busybox-1.22.0/*.i -l  |tee /dev/stderr |wc -l
c/busybox-1.22.0/chgrp-incomplete_true-no-overflow_false-valid-memtrack.i
c/busybox-1.22.0/chroot-incomplete_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/cut_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/date_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/du_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/echo_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/expand_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/fold_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/head_false-valid-deref.i
c/busybox-1.22.0/head_true-no-overflow_true-valid-deref.i
c/busybox-1.22.0/logname_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/ls-incomplete_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/mkdir_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/mkfifo-incomplete_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/od_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/printf_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/readlink_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/realpath_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/rm_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/seq_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/sleep_false-valid-deref.i
c/busybox-1.22.0/sleep_true-no-overflow_true-valid-deref.i
c/busybox-1.22.0/stty_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/sync_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/tac_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/tee_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/test-incomplete_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/touch_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/uname_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/uniq_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/usleep_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/uudecode_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/wc_false-unreach-call_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/whoami-incomplete_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/who_true-no-overflow_true-valid-memsafety.i
c/busybox-1.22.0/yes_true-no-overflow_true-valid-memsafety.i
36

I could make a pull request to fix these files and also make a unchanged copy with the false-no-overflow label. Or I could just fix the labels. Or just fix the programs. We normally choose the first of these three options, but I saw that this did not happen for busybox tasks in the past when there were similar problems. What is the rationale behind this?

Also I anticipate some discussion about this, that is why I opened this issue.

Thanks for the detailed report.

I could make a pull request to fix these files and also make a unchanged copy with the false-no-overflow label.

This seems to be the most sensible option and consistent with the usual way newly detected bugs are dealt with.

I saw that this did not happen for busybox tasks in the past when there were similar problems. What is the rationale behind this?

That depends on the specific issue, but the most likely reason is that this was simply forgotten and slipped through. Or maybe the bug was deemed to be to trivial, although this is often a tough call to make.

I updated the number of affected files in my first post, all except 4 busybox tasks are affected.

There are already two tasks that only have the label false-valid-deref. What shall be done with those?
I guess invalid dereferencing is undefined behavior, so is an overflow. Does it make sense to fork those files into one with false-valid-deref where the overflow is fixed and one with false-valid-deref-false-no-overflow where the overflow still exists? I think it would depend on whether the overflow can be reached before the invalid dereference (and vice versa). This would be hard to prove. Maybe I am overthinking here. I will ignore those two files in my pull request until there is a solution for this.

Next question: There are also files in busybox-1.22.0-todo that contain the overflow. I could also fix them, but this would mean duplicating the files, what makes it harder to edit them later.
I could also just add the command that generates the pull request into the commit message. It currently looks like this:

for file in $(grep -l bb_verror_msg c/busybox-1.22.0/* | grep -Pv ".*?_false-valid-deref.*" );
# make copy of the files (except false-valid-deref ones) where all labels are removed and false-no-
overflow is added:
do cp $file $(echo $file | perl -pe "s/(.*?_)(.*)(\..*)/\1false-no-overflow\3/g");
# patch the uncopied files, so that the label true-no-overflow holds:
perl -0777pi -e "s/(applet_len;\n  )(signed int strerr_len;\n)/\1un\2/g" $file;
done

´´´
The actual problem as I see it is that we accumulate manual changes to the busybox files, instead of updating the shell script in the c/busybox-1.22.0/README.txt and regenerating the files automatically. This might soon become a maintainability nightmare.

There are already two tasks that only have the label false-valid-deref. What shall be done with those?
I guess invalid dereferencing is undefined behavior, so is an overflow. Does it make sense to fork those files into one with false-valid-deref where the overflow is fixed and one with false-valid-deref-false-no-overflow where the overflow still exists?

Just have one file with _false-valid-deref and one with _false-no-overflow. If possible, also add the corresponding _true labels.

The actual problem as I see it is that we accumulate manual changes to the busybox files, instead of updating the shell script in the c/busybox-1.22.0/README.txt and regenerating the files automatically. This might soon become a maintainability nightmare.

@tautschnig Any idea?