[master] Undefined behavior in `src/hash.c`
hartwork opened this issue ยท 19 comments
Symptom when compiled with UndefinedBehaviorSanitizer:
# src/dcfldd if=.github/workflows/test.txt of=/tmp/test2.txt hash=md5,sha1,sha256,sha384,sha512
hash.c:227:9: runtime error: call to function MD5Init through pointer to incorrect function type 'void (*)(void *)'
([..]/src/dcfldd+0x137df8): note: MD5Init defined here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior hash.c:227:9
I can offer a fix, but let's first get pull request #27 reviewed an merged.
@davidpolverari would you rather want me to go (a)
--- void SHA256_Init(SHA256_CTX *);
--- void SHA256_Update(SHA256_CTX*, const uint8_t*, size_t);
--- void SHA256_Final(SHA256_CTX *, uint8_t[SHA256_DIGEST_LENGTH]);
+++ void SHA256_Init(void*);
+++ void SHA256_Update(void*, void*, size_t);
+++ void SHA256_Final(void*, void*);
and cast inside the function body or (b) add a wrapper function doing the casting in between, and keep the original functions untouched? Other ideas?
Shouldn't we fix src/hash.c
?
hashtype_t hashops[] =
{
{"md5",
1,
&MD5_window_context,
&MD5_total_context,
&MD5_vwindow_context,
&MD5_vtotal_context,
(void (*)(void *)) MD5Init, /* if we set the parameter to "MD5_CTX *", as defined in `src/md5.h` */
(void (*)(void *, const void *, size_t)) MD5Update,
(void (*)(void *, void *)) MD5Final,
&MD5_hashstr[0],
sizeof (MD5_hashstr),
NULL},
See the comment I added to the above code snippet. What do you think?
Shouldn't we fix
src/hash.c
?
@davidpolverari that won't be enough. We cannot cast MD5Init
to (void (*)(void *))
with its current signature and the problem multiplies times 3 times 6 or so. The undefined behaviors is that we're calling a function pointer with a signature mismatch at runtime.
See the comment I added to the above code snippet. What do you think?
I diffed that against master
and the only change is adding the comment. I don't get the idea you have for a fix then. What's the idea?
I don't have another idea at the moment. I will have to take a deeper look on how it is implemented. I am thinking something along the lines of using preprocessor concatenation to try to solve it, if possible. But I can't be sure if it will be useful without having a deeper look, and I will not be able to do this right now.
@davidpolverari okay, I will pause my own work about it until I hear back from you on your own experiments. See you ๐ป
I diffed that against
master
and the only change is adding the comment. I don't get the idea you have for a fix then. What's the idea?
From what I glanced, I was thinking about changing all(void *)
parameters to(xxx_CTX *)
ones uponhashops
initialization. But it was just a thought, without having thought about how everything was working. I will really need to debug that one further to think about a solution.
I really prefer to avoid changing the functions signatures if we can.
@davidpolverari okay, I will pause my own work about it until I hear back from you on your own experiments. See you ๐ป
See you ๐ป
PS: when possible, please let me know which flags you used to build dcfldd with UBSan, and under which compiler. Using only -fsanitize=undefined
under GCC 13 I wasn't able to repro the runtime message.
@davidpolverari sure!
# clang-18 --version | head -n1
clang version 18.1.4
# lsb-release -i
Distributor ID: Gentoo
# ./autogen.sh && ./configure CC=clang-18 CFLAGS='-fsanitize=address,undefined,leak -fno-sanitize-recover=all -fno-omit-frame-pointer' LDFLAGS='-fsanitize=address,undefined,leak' && make clean all
[..]
# src/dcfldd if=.github/workflows/test.txt of=/tmp/test2.txt hash=md5,sha1,sha256,sha384,sha512
hash.c:227:9: runtime error: call to function MD5Init through pointer to incorrect function type 'void (*)(void *)'
(/home/sping/__playground/dcfldd-fork/src/dcfldd+0x137df8): note: MD5Init defined here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior hash.c:227:9
Thanks!
@davidpolverari would you rather want me to go (a)
--- void SHA256_Init(SHA256_CTX *); --- void SHA256_Update(SHA256_CTX*, const uint8_t*, size_t); --- void SHA256_Final(SHA256_CTX *, uint8_t[SHA256_DIGEST_LENGTH]); +++ void SHA256_Init(void*); +++ void SHA256_Update(void*, void*, size_t); +++ void SHA256_Final(void*, void*);and cast inside the function body or (b) add a wrapper function doing the casting in between, and keep the original functions untouched? Other ideas?
fwiu, the issue UBSan warns us about would only present practical problems if/when we build dcfldd with CFI (Control Flow Integrity), which distros are beginning to do in any case. In that case, I think adding wrapper functions would be better, something like what sqlite has done to solve this same issue.
What do you think about it?
- I'm good with going for adding a layer of wrap ๐
- Chimera Linux is already using CFI on a distro level today (but not across shared objects).
Would you rather be in the developer seat or in the review seat with this? I'm good with either, with a secret slight preference ๐
How would you like to continue?
Would you rather be in the developer seat or in the review seat with this? I'm good with either, with a secret slight preference ๐
How would you like to continue?
I definitely rather be in the review seat in this case ๐
Would you rather be in the developer seat or in the review seat with this? I'm good with either, with a secret slight preference ๐
How would you like to continue?I definitely rather be in the review seat in this case ๐
@davidpolverari here you go:
Wow. I thought it would take a while for you to send the fixes ๐. Anyways, I will probably take a little longer to have a look at this one, given the extent of the changes. But I will make some remarks about some of the changes in advance.
Wow. I thought it would take a while for you to send the fixes ๐.
It became smaller after I started, except the side show bug that I ran into.
Anyways, I will probably take a little longer to have a look at this one, given the extent of the changes. But I will make some remarks about some of the changes in advance.
Review on commit-level should be most fun. Through that lense, things become quite small, only the combined diff may seem a bitter bigger at first, I think.
I'm looking forward to the continuation of this.
It's fine for me that way. It really eases the reviewing process.