Find heap buffer overflow by running fuzz test
ziyangc97 opened this issue · 29 comments
hi, I am using the oss-fuzz google/oss-fuzz against libyaml and when I run libyaml_dumper_fuzzer I find a heap buffer overflow error in function: yaml_emitter_emit_flow_sequence_item.
the erro log is in this pic:
I also attach the full error log here:
fuzz_error_log.TXT
Due to my limited knowledge of fuzz test I don't know how to find the exact input yaml or string to reproduce this error, but I think the error log can help us to analysis and fix the error.
Code analysis:
It is obvious that in emitter.c line 761, we try to pop the element from STACK and get emitter->indents value. However, we didn't check whether STACK is empty and in this case, we try to dereference a pointer: (*(--(stack).top)) and stack.top maybe NULL and cause heap buffer overflow.
Fix:
I think it's necessary to add STACK_EMPTY before POP, the goal is to check whether stack.top has values and avoid dereferencing a NULL pointer.
I will create a PR to fix this problem. #259
It would be really good to find out how to reproduce.
Looking at the stacktrace
#0 0x572a3d in yaml_emitter_emit_flow_sequence_item /src/libyaml/src/emitter.c:761:27
#1 0x56ff2e in yaml_emitter_emit /src/libyaml/src/emitter.c:291:14
#2 0x56253a in yaml_emitter_close /src/libyaml/src/dumper.c:98:10
#3 0x5551eb in LLVMFuzzerTestOneInput /src/libyaml_dumper_fuzzer.c:268:3
So in yaml_emitter_close
, yaml_emitter_emit
is called to emit a stream end event.
However, we see that it goes into yaml_emitter_emit_flow_sequence_item
instead. And in the example YAML files I don't even find a flow sequence at all.
I think there's already something wrong before. It shouldn't even call yaml_emitter_emit_flow_sequence_item
, so I think #259 probably doesn't fix the problem.
I added this code in yaml_emitter_emit_flow_sequence_item
before the POP:
+ if (!STACK_EMPTY(emitter, emitter->indents)) {
+ fprintf(stderr, "????????? NOT EMPTY\n");
+ }
+ else {
+ fprintf(stderr, "????????? EMPTY\n");
+ }
and it never prints EMPTY.
I tried to reproduce it by following those instructions: https://google.github.io/oss-fuzz/advanced-topics/reproducing/#building-using-docker
Everything seems to be fine. I tried both testcases.
% python3 infra/helper.py build_fuzzers --sanitizer address --architecture x86_64 libyaml
...
% python3 infra/helper.py reproduce libyaml libyaml_fuzzer testcaselibyaml/libyaml/89bcd83f-f3f0-4155-a889-d0a64bff892b
INFO:__main__:Running: podman run --rm --privileged --shm-size=2g --platform linux/amd64 -i -e HELPER=True -e ARCHITECTURE=x86_64 -v /path/to/oss-fuzz/build/out/libyaml:/out -v /path/to/testcaselibyaml/libyaml/89bcd83f-f3f0-4155-a889-d0a64bff892b:/testcase -t gcr.io/oss-fuzz-base/base-runner reproduce libyaml_fuzzer -runs=100.
+ FUZZER=libyaml_fuzzer
+ shift
+ '[' '!' -v TESTCASE ']'
+ TESTCASE=/testcase
+ '[' '!' -f /testcase ']'
+ export RUN_FUZZER_MODE=interactive
+ RUN_FUZZER_MODE=interactive
+ export FUZZING_ENGINE=libfuzzer
+ FUZZING_ENGINE=libfuzzer
+ export SKIP_SEED_CORPUS=1
+ SKIP_SEED_CORPUS=1
+ run_fuzzer libyaml_fuzzer -runs=100 /testcase
sysctl: permission denied on key "vm.mmap_rnd_bits", ignoring
/out/libyaml_fuzzer -rss_limit_mb=2560 -timeout=25 -runs=100 /testcase -dict=yaml.dict < /dev/null
Dictionary: 17 entries
INFO: Seed: 4103730260
INFO: Loaded 1 modules (2414 inline 8-bit counters): 2414 [0x8cd870, 0x8ce1de),
INFO: Loaded 1 PC tables (2414 PCs): 2414 [0x667b70,0x671250),
/out/libyaml_fuzzer: Running 1 inputs 100 time(s) each.
Running: /testcase
Executed /testcase in 7 ms
***
*** NOTE: fuzzing was not performed, you have only
*** executed the target code on a fixed set of inputs.
***
Also note that I haven't been able to run libyaml_dumper_fuzzer
, only libyaml_fuzzer
:
% python3 infra/helper.py reproduce libyaml libyaml_dumper_fuzzer testcaselibyaml/libyaml/89bcd83f-f3f0-4155-a889-d0a64bff892b
ERROR:__main__:libyaml_dumper_fuzzer does not seem to exist. Please run build_fuzzers first.
Apparently there are people who can reproduce it.
I would be glad for help, but I don't know how to contact those person(s).
If you would be able to help and don't want to do this in a public issue, you can open a Private vulnerability under
https://github.com/yaml/libyaml/security
Yes, I think it is uncommon to publish the exact attack input.
I was able to reproduce it now with the help of a colleague. Apparently podman is doing something wrong. With docker it worked and built all the fuzzers.
The problem only appears when the emitter writes to a buffer:
https://github.com/google/oss-fuzz/blob/master/projects/libyaml/libyaml_dumper_fuzzer.c#L233
Replacing it with
yaml_emitter_set_output_file(&emitter, stderr);
"fixes" it.
#259 makes the symptom go away, but the function shouldn't be called in this case at all.
I counted the number of YAML_SEQUENCE_END_EVENTs, and it should be 132 in the one testcase, but when writing to the buffer, it is called 133 times -> of course that leads to an empty stack.
So I assume the memory is corrupted even before and #259 would only hide a symptom.
Not sure if the write handler from the oss-fuzz project https://github.com/google/oss-fuzz/blob/master/projects/libyaml/yaml_write_handler.h is the problem, or something in libyaml.
I should also note that the write_handler is only called once, after the 132 sequence end events, and when I look into the buffer content at that point, it is missing the last closing ]
number 132.
And only after that it calls yaml_emitter_emit
and the wrong 133rd yaml_emitter_emit_flow_sequence_item
.
I was able to reproduce it now with the help of a colleague. Apparently podman is doing something wrong. With docker it worked and built all the fuzzers.
The problem only appears when the emitter writes to a buffer: https://github.com/google/oss-fuzz/blob/master/projects/libyaml/libyaml_dumper_fuzzer.c#L233 Replacing it with
yaml_emitter_set_output_file(&emitter, stderr);
"fixes" it.
#259 makes the symptom go away, but the function shouldn't be called in this case at all. I counted the number of YAML_SEQUENCE_END_EVENTs, and it should be 132 in the one testcase, but when writing to the buffer, it is called 133 times -> of course that leads to an empty stack.
So I assume the memory is corrupted even before and #259 would only hide a symptom. Not sure if the write handler from the oss-fuzz project https://github.com/google/oss-fuzz/blob/master/projects/libyaml/yaml_write_handler.h is the problem, or something in libyaml.
you mean that this is a problem with the oss-fuzz test method?
you mean that this is a problem with the oss-fuzz test method?
no, I was just blindly guessing. but I found out that it starts to go wrong even before the write handler is called, in one of the PUT
calls. The tricky thing is now to find out which one. One of the last PUT
/WRITE
calls fail (and as a result it ends up in a code path where it shouldn't, with an already empty event and indents stack), but I assume that the buffer is already wrongly filled before.
It's confusing, and I don't think this is worth applying for a cve number.
Is it possible to remove the cve number? Or reject it, After all the code doesn't seem to be going here.
in addition, even if this is an external issue, I think the pr is ok
The macro does not do any out-of-bounds checking, so this is needed, or else do the out-of-bounds checking in the macro.
#define POP(context,stack) \
(*(--(stack).top))
It's true that it might be good to check before the POP. but it should return an error in this case because the code should not run into this state, and the resulting YAML will be invalid.
see #290 for a mitigation. It will still output invalid yaml because it's missing the last ]
, but it won't try to pop from an empty stack.
I was able to reproduce it now with the help of a colleague.
Can you provide a way to reproduce it?
The reproducer can be found in the CVE. you have to use the oss fuzzer.
Initially I used it with podman, but apparently podman is not supported. it worked with docker.
It's not so easy to reproduce with existing tools because you have to set canonical mode and use a buffer for the output.
Presumably it's also possible to find an according case without canonical mode.
see #290 for a mitigation. It will still output invalid yaml because it's missing the last
]
, but it won't try to pop from an empty stack.
so after the mitigation, can I assume there is still a bug considering the invalid yaml? After reproducing it, I found this text:
“”“
Ý - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? - - ? ?? ? ??? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? tag? ? ? ?| ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ??*? ? ? ? ? ? ?? ? ? ? ? kkk? ? ? ? = ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?w ? ? @{".3'[6\ufffffffffff\uffFffffbFffffbffffffffffFfff\uffff|b ppppppppppppppppppppppppppppp ? ? - $ [.2<8y37y-0{5',? :,y;e_y`j!--;:;2'ffffFfff\ufffffffbffff;:05',? 1.3e+9,? yyy05bb} "b1
”“”
I don't understand what this text has to do with a valid yaml?
But I think https://github.com/yaml/libyaml/pull/290 has fixed CVE-2024-3205. After all, no more heap overflow.
Should we open another issue to track invalid yaml issue?
(PS, I don't think this issue deserves a CVE number, just a normal bug. )
@znzjugod What do you mean with "I found this text"? Is this something that was emitted by libyaml?
edit: and, btw, this is valid YAML.
@znzjugod What do you mean with "I found this text"? Is this something that was emitted by libyaml? edit: and, btw, this is valid YAML.
This text is generated by the oss-fuzz and stored when the stack overflow occurs. I understand that this text is from the oss-fuzz. When an error occurs, the data generated by oss-fuzz is automatically saved in crash-xxxxxxxxxx.
I think this content should be this crash-e80579b2017e0b03db54a0f453671d596d5fa559
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==624==ABORTING
MS: 3 CMP-ChangeByte-ChangeByte- DE: "tag:yaml.org,2002:str"-; base unit: c3a352ec3ad7a0c0f075bc799345ac342f418db0
artifact_prefix='./'; Test unit written to ./crash-e80579b2017e0b03db54a0f453671d596d5fa559
stat::number_of_executed_units: 699855
stat::average_exec_per_sec: 11861
stat::new_units_added: 7960
stat::slowest_unit_time_sec: 0
stat::peak_rss_mb: 52
INFO: exiting: 1 time: 2931s
++ ETS_LOG_ERROR 'run libyaml libyaml_dumper_fuzzer fail'
++ log_error 'run libyaml libyaml_dumper_fuzzer fail'
Should we open another issue to track invalid yaml issue?
i open another issue to tracing failure of yaml_emitter_write_indicator
. #291
is there any update news?
I had a closer look at the code again. So far I was not familiar with that part of libyaml.
I also looked closer at the fuzzer code. I first assumed it must be correct as it has been running for a few years.
But I found two problems with it (that together lead to the overflow), and I believe now that the fuzzer is using libyaml in a wrong way.
1. The write handler returns 0
on success
https://github.com/google/oss-fuzz/blob/master/projects/libyaml/yaml_write_handler.h#L34
However, libyaml expects 1
on success:
https://github.com/yaml/libyaml/blob/master/src/writer.c#L53-L62
That means whenever the buffer is big enough to flush, the write handler is called, returns 0
and the code returns 0
all the way up to yaml_emitter_dump
also returning 0
.
Now, this might be on purpose, I don't know the reasons behind it. Maybe it was made so that it detects flaws in libyaml error handling. Which I first thought it did. But if the write handler returns a failure on purpose, I think it should be documented with a short comment.
2. The return value of yaml_emitter_dump
is not correctly handled by the fuzzer
There is code handling a failing yaml_emitter_dump
to break out of the loop:
https://github.com/google/oss-fuzz/blob/master/projects/libyaml/libyaml_dumper_fuzzer.c#L255
However, later yaml_emitter_close
is called, which is supposed to generate and emit the stream_end_event. When the dumper failed, this function should not be called, as the emitter is in a broken state.
https://github.com/google/oss-fuzz/blob/master/projects/libyaml/libyaml_dumper_fuzzer.c#L268
Conclusion
Of course you might argue that a) this is not documented and b) libyaml should detect that the emitter is in a broken state in yaml_emitter_close
. Yes, I agree.
However, I now think that a CVE is unwarranted.
If there is broken code like that out there that a) does not handle a failing yaml_emitter_dump
correctly b) reads and emits untrusted yaml/data c) and the write can be made to fail somehow, it might be exploitable, but the exploit is not trivial, and if it is not using canonical mode, I wouldn't know how to exploit.
I opened an issue for the fuzzer now: google/oss-fuzz#11811
I contacted cve.mitre.org and nvd.nist.gov to reject the CVE and still waiting for a reply.
I'm wondering if there has been any update on this? I saw that the fuzzer was at least partially updated more than a month ago, but no further updates since then. This issue has caused pyyaml to be flagged and we're stuck in limbo at the moment.
I was unsuccessful with contacting CVE authorities so far and I will try another contact method.
Thanks a lot. I appreciate it.
The CVE is now rejected: https://www.cve.org/CVERecord?id=CVE-2024-3205