[Analyzer Proposal]: Analyzer/fixer for converting Stream.Read calls to ReadAtLeast and ReadExactly
eerhardt opened this issue ยท 16 comments
With the addition of the new Stream ReadAtLeast
and ReadExactly
methods, there is a pattern of mistakes we can catch with an analyzer and suggest the caller use the new APIs. For example:
stream.Read(buffer, 0, buffer.Length);
where the return value isn't checked, and recommend it be changed to a call to ReadExactly
.
Or cases where the return value is checked but required to be the same as what was requested:
if (stream.Read(buffer, 0, buffer.Length) != buffer.Length) throw ...;
and similarly suggest it use ReadExactly
.
If the caller is using a length that is different than the length of the buffer being passed in, and not checking the return value, we can suggest to use ReadAtLeast
:
stream.Read(buffer, 0, count);
would become:
stream.ReadAtLeast(buffer, count);
However, note in the ReadAtLeast
case, the caller should still be checking the return value because the ReadAtLeast
call could be returning more than count
bytes, and this data would be missed if the caller only consumed count
bytes from buffer
.
One problem with this analyzer might be when we implement #34098. If the caller saw the "Do not discard results of methods marked as DoNotIgnoreReturnValue
" warning first and addressed it, they would lose out on this fixer.
cc @stephentoub
Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.
Issue Details
With the addition of the new Stream ReadAtLeast
and ReadExactly
methods, there is a pattern of mistakes we can catch with an analyzer and suggest the caller use the new APIs. For example:
stream.Read(buffer, 0, buffer.Length);
where the return value isn't checked, and recommend it be changed to a call to ReadExactly
.
Or cases where the return value is checked but required to be the same as what was requested:
if (stream.Read(buffer, 0, buffer.Length) != buffer.Length) throw ...;
and similarly suggest it use ReadExactly
.
If the caller is using a length that is different than the length of the buffer being passed in, and not checking the return value, we can suggest to use ReadAtLeast
:
stream.Read(buffer, 0, count);
would become:
stream.ReadAtLeast(buffer, count);
However, note in the ReadAtLeast
case, the caller should still be checking the return value because the ReadAtLeast
call could be returning more than count
bytes, and this data would be missed if the caller only consumed count
bytes from buffer
.
One problem with this analyzer might be when we implement #34098. If the caller saw the "Do not discard results of methods marked as DoNotIgnoreReturnValue
" warning first and addressed it, they would lose out on this fixer.
cc @stephentoub
Author: | eerhardt |
---|---|
Assignees: | - |
Labels: |
|
Milestone: | - |
One problem with this analyzer might be when we implement #34098. If the caller saw the "Do not discard results of methods marked as DoNotIgnoreReturnValue" warning first and addressed it, they would lose out on this fixer.
I don't think these conflict, as the methods aren't pure (changing state of the stream) and won't be marked as such?
[edit] I was pointed at the fact that the referenced issue is no longer about purity alone
@jeffhandley @stephentoub with 3.1 EOL in 3 months, and based on some of the issues posted here recently, we can expect a number of folks migrating before support expires. Some of them will hit this and get a hang in their app. It might take a while to connect to the breaking change notice because there isn't an obvious "smell" (the doc doesn't contain the word "hang" for example). Once they've made that connection, they may have many places in their codebase that need auditing for the bug and fixing.
An analyzer/fixer could help with both those hurdles. I wonder whether we should start now to front load getting one into the SDK ..
The intent had been that this be addressed in .NET 7. It unfortunately got pushed out due to lack of time. We should tackle it soon.
Note that another analyzer that would help in this case is #34098. Code that doesn't capture the result of Stream.Read
or BinaryReader.Read
would get flagged by that analyzer as well (along with other APIs that shouldn't have their results ignored).
However, the downfall of #34098 is that the necessary attribute didn't get added to net7.0, so we can't mark the APIs as "do not discard" yet.
Estimates:
- Analyzer: Medium
- Fixer: Medium
Looks good as proposed.
The diagnostic (possibly with a different diagnostic ID) should still fire on older TFMs to report that the call made to Read is unreliable as written.
Category: Reliability
Severity: Warning
I would like to be assigned to this one, please.
I would like to be assigned to this one, please.
Terrific. Done. Thanks!
What is the reason for suggesting to call ReadAtLeast
instead of ReadExactly
when the length is different from the buffer size?
Is it due to the difference in how the end of the stream is handled in ReadAtLeast
(but wouldn't this also apply to the first case)?
What is the reason for suggesting to call
ReadAtLeast
instead ofReadExactly
when the length is different from the buffer size? Is it due to the difference in how the end of the stream is handled inReadAtLeast
(but wouldn't this also apply to the first case)?
I'm actually not sure now... @eerhardt, do you remember what you had in mind there?
@stephentoub I think the initial post is based on a comment from you when discussing the implementation of ReadExactly
/ReadAtLeast
.
One of the reasons why I am following up on the second code fix is because I am not sure how we can enforce the following:
However, note in the
ReadAtLeast
case, the caller should still be checking the return value because theReadAtLeast
call could be returning more thancount
bytes, and this data would be missed if the caller only consumedcount
bytes frombuffer
.
The code is still unreliable (in a different way) if we just change the call to ReadAtLeast
. We could generate something like var bytesRead = stream.ReadAtLeast(buffer, count);
to encourage the user to use the return value.
in fact, if I read this right, if you change from Read to ReadAtLeast you may introduce additional bugs because now you can overwrite data past the original count being passed in? it might be best to always suggest ReadExactly when the return value isn't checked in the first place? maybe the description of the diagnostic can mention the other options
Thinking about this now, I think ReadExactly is the only correct suggestion here. It's the only method you can call without checking the return value, as it is the only method that ensures you got exactly count
bytes. When calling ReadAtLeast
, you still need to check the return value in case of EOS (when throwOnEndOfStream = false), or when more data is read than minimumBytes
.
Yeah, for the life of me I can't think of why I wrote ReadAtLeast in my original comment. It should be ReadExact.
Thank you all for helping to clarify ๐.