๐ Param string data buffer is reused, causing unexpected side effects
ViRb3 opened this issue ยท 13 comments
Fiber version/commit
v1.10.1
Issue description
The method fiber.Ctx.Params(key string) string
returns strings that share the same data buffer. This means that once a new string is returned, all previously returned strings will be changed to the new string's content, no matter how much they have been copied and passed around, since their data buffer is the same. This behavior is clearly seen in PoC below. unsafe
is used to parse the header of each string returned from the Params
method and then its content is printed. Notice how the data pointer is identical for all strings, meaning that they all use the same data buffer.
The only workaround right now is to explicitly copy the data buffer into a new buffer and create a string from the new buffer:
newBuffer := make([]byte, len(result))
copy(newBuffer, result)
newResult := string(newBuffer)
This behavior is absolutely unexpected, as strings in Go are expected to be immutable. I have not tracked down exactly what causes this issue, but I suspect that it may be a side effect from an inherited optimization by fasthttp. If this is true, I suspect that there are other methods that behave the same way as well. This behavior should be changed, or a clear documentation notice should be placed that indicates this side effect.
PoC
package main
import (
"fmt"
"github.com/gofiber/fiber"
"net/http"
"reflect"
"sync"
"unsafe"
)
var results []string
var wg sync.WaitGroup
func main() {
go server()
client()
wg.Wait()
}
func client() {
wg.Add(3)
http.Get("http://127.0.0.1:3000/aaa")
http.Get("http://127.0.0.1:3000/bbbb")
http.Get("http://127.0.0.1:3000/c")
}
func server() {
app := fiber.New()
app.Get("/:test", func(c *fiber.Ctx) {
result := c.Params("test")
hdr := (*reflect.StringHeader)(unsafe.Pointer(&result))
results = append(results, result)
fmt.Printf("Param string: %+v Value:%s\n", hdr, result)
fmt.Printf("Results: %+v\n", results)
wg.Done()
})
app.Listen(3000)
}
Output
Param string: &{Data:824633868529 Len:3} Value:aaa
Results: [aaa]
Param string: &{Data:824633868529 Len:4} Value:bbbb
Results: [bbb bbbb]
Param string: &{Data:824633868529 Len:1} Value:c
Results: [cbb cbbb c]
Thanks for opening your first issue here! ๐ Be sure to follow the issue template! If you need help or want to chat with us, join us on Discord https://gofiber.io/discord
We use the same method as Fasthttp to convert strings and bytes without memory allocation, but requires you to never change the original byte array, but that's a reasonable contract in many cases. To persist values after returning from the handler, you can use the build-in Immutable setting in Fiber.
Reference: #185
app := fiber.New(&fiber.Settings{
Immutable: true,
})
However, we could make an exception to allocate parameters when c.Params()
is called and keep 0 allocations on matching / hot paths. @ReneWerner87 thoughts?
Oh, I was not aware of the immutable setting. I wouldn't mind keeping the current behavior, but there should be clear documentation that states such side effects - perhaps a general warning in the start of the project, so people are aware that this is even a possible side effect, and maybe note down the scope of returned variables on each method's documentation, like fasthttp does it. It took me a full day to track down this issue just because I just didn't expect it to be coming from fiber.
You are right regarding the documentation, we could add some sort of warning/disclaimer regarding this issue and explain how it would be solvable using the Immutable
setting or allocate/copy specific variables by using the default string
or []byte
types.
If you have any suggestions on how we could phrase/summary this behavior, I will add it to all README's and API Docs right away.
Suggestion:
Returned value is valid until the handler returns, do not store references, make copies instead or use the Immutable
setting to do this for you. Keep in mind that this may result in memory allocation+copy.
My recommendation would be to make two changes.
First, add a dedicated section in the beginning of the docs that outlines this concept as a whole. I created a PR for that, mentioned above. Any feedback is very welcome.
Second, for each method that returns a non-immutable value, like the Params
method discussed in this issue, add a warning to its docs, like the one you proposed. I allowed myself to re-phrase it to make it a bit shorter:
Returned value is only valid within the handler. Do not store any references. Make copies or use the Immutable
setting instead.
@Fenny when you say not to keep references to it, is it ok to pass it to the service layer if the handler will wait for the service to return before it returns? example:
app.Get("/users/:id", func(c *fiber.Ctx) {
userID := c.Params("id")
// is this ok?
user, err := userService.FindByID(userID)
// ... rest of the code
})
Assuming above that userID is only used to make a query to the DB and not stored anywhere in state beyond that use.
Yeah should be okay, the string is there until the response is written.
I just got hit by this issue. I would like to suggest, (if it works) to return a new type something like below unsafestring.
This would make it explicit that the mutability MUST be considered when using the string. Go strings are always spoken of as immutable so it should probably be a new type. This way you could pass your unsafestring into functions and it would be clear that it requires special consideration. If you have fairy complicated program you could easily lose track of which strings are allowed to outlive the caller's scope.
Params, Query would return unsafestring instead of string
//Use the Copy method if you need it outside the scope
//of the fiber handler. It's ok to use UNSAFE to avoid a copy
//if you know that that the result won't outlive the fiber handler
type unsafestring string
func (u unsafestring) Copy() string {
return fiberUtils.ImmutableString(string(u))
}
func (u unsafestring) UNSAFE() string {
return string(u)
}
Also just got hit by this. This is insidious. Not only was it causing values to change underneath us, it was also changing the values that we were logging to trace through the system and try to find the issue.
I appreciate what you're doing here, but this should be an opt-in setting IMHO. It would be much more discoverable as a performance feature that you can enable that can be called out in documentation for people looking to improve performance. I do see that it is in the docs now as suggested above, but I suspect very few people look up and read to the bottom the documentation for ctx.Get for example.
@DiggidyDave This is highly explained in the First Page of the documentation. See here https://docs.gofiber.io/#zero-allocation
It also has an example of how to turn off this feature for the full framework.
Sure, anyone who actually starts with the docs will probably see it. Others, like me, may start with the readme on github and will only consult docs to fill gaps in an otherwise very intuitive API. Also, someone may see it and brush it off as an edge case that doesn't apply to them at the time and skim past it, only to be bitten by it later. If you are banking on folks seeing the docs, why not have the riskiest part of that assumption be opt-in and super discoverable in the docs?
It was just a suggestion, no major complaints here. Love the project, just hit this and thought I'd mention it. :-)
@DiggidyDave Thanks for the suggestion, I will update the README to mention this