gobuffalo/buffalo

Can not read request body multiple times

Closed this issue · 16 comments

Description

If this is a feature request, explain why it should be added. Specific use-cases are best.

This is part feature request and part bug report. In short, slack sends messages to slack apps via HTTP (good) and these are signed with HMAC-SHA256 (good). However, the body is contained the part that is signed (bad). They put a content-type of content_type=application/x-www-form-urlencoded.

Here is a sample body, which is form urlencoded, but a form's data is not sensitive to order and signature algorithms are:

token=rV3iSwpJXizkK66GQLm8NQef&team_id=T8DRVQJ4U&team_domain=slightdrizzle&channel_id=CEAN3G59D&channel_name=apptesting&user_id=UC6FBA788&user_name=iansmith&command=%2Fmaz&text=maz&response_url=https%3A%2F%2Fhooks.slack.com%2Fcommands%2FT8DRVQJ4U%2F493213364772%2FkxNOuEO0QMFgUeVAKf4MH9Sv&trigger_id=494432800071.285879834164.51e506e22646d57273f1b3df4a0201ad

Steps to Reproduce the Problem

Please describe in painful detail what you did (so others can play along with you) to get to this point. This includes things like the exact command(s) you used, or the curl command you used, that sort of thing.

If you want to reproduce the problem, just create a slack app and point a slash command at your machine (or ngrok tunnel). When the http request arrives in buffalo it will have an empty body, because the form data has been extracted.

Expected Behavior

What did you what to happen? Tell us a story. We love to read.

I tried a couple of approaches to try to work around this. I tried using my own middleware (clearing the old first) and I tried going into the gorilla mux layer as well. Neither of these prevented the consumption of the body before it reached my code.

Actual Behavior

In the happiest of happy places what should have happened?

I would say there are three things that could be done to fix it:

  1. Allow a mux to be passed in as part of app setup. It would be up to the person passing it in to prevent collisions in the URL space.

  2. Offer some type of "raw" middleware that offers no processing before the call to the middleware. This would seem to require avoiding or extending gorilla mux since it doesn't seem to offer this.

  3. Always offer a way that the request body can be replayed. This could be part of context or similar. The http.Request object offers a bit of help here with

GetBody func() (io.ReadCloser, error) // Go 1.8

Info

Please run buffalo info and paste the information below where it says "PASTE_HERE".

$ buffalo info
### Buffalo Version
v0.13.7

### App Information
Pwd=/Users/iansmith/mazarin.src/mazarin/go/src/github.com/iansmith/mazarin
Root=/Users/iansmith/mazarin.src/mazarin/go/src/github.com/iansmith/mazarin
GoPath=/Users/iansmith/mazarin.src/mazarin/go
Name=mazarin
Bin=bin/mazarin
PackagePkg=github.com/iansmith/mazarin
ActionsPkg=github.com/iansmith/mazarin/actions
ModelsPkg=github.com/iansmith/mazarin/models
GriftsPkg=github.com/iansmith/mazarin/grifts
VCS=
WithPop=true
WithSQLite=true
WithDep=false
WithWebpack=false
WithYarn=false
WithDocker=true
WithGrifts=true
WithModules=false

### Go Version
go version go1.11.2 darwin/amd64

### Go Env
GOARCH="amd64"
GOBIN="/Users/iansmith/mazarin.src/mazarin/go/bin"
GOCACHE="/Users/iansmith/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/iansmith/mazarin.src/mazarin/go"
GOPROXY=""
GORACE=""
GOROOT="/Users/iansmith/mazarin.src/go"
GOTMPDIR=""
GOTOOLDIR="/Users/iansmith/mazarin.src/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/w0/fw3qzvvx54b39b790yxq6v100000gn/T/go-build418835592=/tmp/go-build -gno-record-gcc-switches -fno-common"

### Node Version
Node Not Found

### NPM Version
NPM Not Found

### Yarn Version
Yarn Not Found

### PostgreSQL Version
PostgreSQL Not Found

### MySQL Version
MySQL Not Found

### SQLite Version
3.24.0 2018-06-04 14:10:15 95fbac39baaab1c3a84fdfc82ccb7f42398b2e92f18a2a57bce1d4a713cbaapl

### Dep Version
could not find a Gopkg.toml file

### Dep Status
could not find a Gopkg.toml file

### go.mod
module github.com/iansmith/mazarin


sio4 commented

I am not in front of my laptop so I can't test it. But I guess you can get each values (token, team_id, and so on) by bind() to structure (I think this is easist way) or by calling Request() and access raw values. Did you tried them?

Yes, the Request() isn't really raw. That is the key problem: context.Request().Body returns an empty writer because somebody has already read it and go's request only lets you read the body once under normal circumstances. I can get the values just fine from the parsed out form data, but I have no way to know what order they were sent in (and that matters for the signature).

sio4 commented

As I remeber the raw body is stored as some variable and request logger logging it as is. Isn't it?

  1. Allow a mux to be passed in as part of app setup. It would be up to the person passing it in to prevent collisions in the URL space.
  1. Offer some type of "raw" middleware that offers no processing before the call to the middleware. This would seem to require avoiding or extending gorilla mux since it doesn't seem to offer this.

Always offer a way that the request body can be replayed. This could be part of context or similar. The http.Request object offers a bit of help here with

This is a bug and should be fixed. Context.Request() should return an unmodified request.

@markbates I tried a couple of those yesterday and found that the result was the same. I will try to create a small test and show you what I saw yesterday.

On the subject of the Prehandler, I tried it. Here is the code (possibly wrong) that I used:

func App() *buffalo.App {
	if app == nil {
		app = buffalo.New(buffalo.Options{
			Env:         ENV,
			SessionName: "_mazarin_session",
			PreHandlers: []http.Handler{&SlackHandler{}},
		})

// ...

//SlackHandler is used for testing slack integration.
type SlackHandler struct{}

// this produces
// 2018/12/03 15:00:23 Request Body is empty (and size of form data is 11)

func (s *SlackHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	var buf bytes.Buffer
	n, err := io.Copy(&buf, r.Body)
	if err != nil {
		log.Fatalf("error during body copy: %v", err)
	}
	if n == 0 {
		log.Fatalf("Request Body is empty (and size of form data is %d)\n", len(r.Form))
	}
	HandleSlackSlashCommand(w, r)
}
sio4 commented

I suspected calling ParseForm() in code below, in the App.newContext(),

buffalo/context.go

Lines 55 to 61 in e362599

if err := req.ParseForm(); err == nil {
for k, v := range req.Form {
for _, vv := range v {
params.Set(k, vv)
}
}
}

but the first call of ParseForm() was generated by buffalo.MethodOverride(), by calling req.FormValue("_method"). I think those two calls should be checked with deep understanding of full flow of buffalo (that I don't have, sadly :-)

sio4 commented

Oops! mw-csrf.requestCSRFToken also calls PostFormValue() (and it calls ParsForm() finally).
It is somewhat big...

sio4 commented

I looked into this issue more deeply today.

Since request.Body is a stream and Request.ParseForm() consumes it without duplication, the original request body should be saved before calling it or similar functions. In Buffalo, the function is called by:

  1. App.newContext() which is called by handler functions include RouteInfo.ServeHTTP(),
  2. MethodOverride() which is called by App.ServeHTTP(),
  3. and defaultErrorHandler().

Since newContext() is called before all middlewares, making a middleware is not a solution for this issue without fixing the context generator. (the code below is that part. IMO anyway, it is not a good idea to copy those values and save them in memory twice.)

buffalo/context.go

Lines 55 to 61 in e362599

if err := req.ParseForm(); err == nil {
for k, v := range req.Form {
for _, vv := range v {
params.Set(k, vv)
}
}
}

For PreWares and PreHandlers, they are invoked after MethodOverride() which is calling ParseForm() indirectly in default App.ServeHTTP(). It means Pre* also not a chance for it. So if they should be invoked before buffalo doing nothing, the order of execution should be changed. (It seems to be a reasonable approach for this issue.)

Just for the test, I assigned dummy MethodOverride and remove above part from newContext() and tested with middleware below (inserted before mw-csrf since it also calls ParseForm) so I can access it from any handlers:

func bodySaver(next buffalo.Handler) buffalo.Handler {
	return func(c buffalo.Context) error {
		body, err := ioutil.ReadAll(c.Request().Body)
		if err != nil {
			c.Logger().Errorf("could not read request body: %v", err)
		}
		c.Set("request_body", body)
		c.Request().Body = ioutil.NopCloser(bytes.NewReader(body))
		return next(c)
	}
}

I didn't test but if I add this code block into the custom MethodOverride, without other changes, It should be possible to access the original body but it cannot be saved into the context so real handlers cannot access them. (Pre* approach also has same problem but just doing its own job in that place.)

So, I think, there are some possible approaches:

  1. Make something like buffalo.Request to wrap http.Request and add features including permanent Body. (universal way to give a static access to the original body)
  2. Fix App.ServeHTTP() and use PreHandlers/PreWares per application on users own demand. (details are in charge of users and the Body cannot be saved permanently for normal handlers. Also, it is cannot be used for a specific route and it works as App wide function. Not a good idea.)
  3. Remove Form to Params copying from context generator, remove or move MethodOverride() to safe place, and give a chance to use middleware and storing Body as context value. (maybe breaking change but I think this is the best solution.)
  4. ...or mixing some of above.

How do you think about?

@markbates I tried and was successful using buffalo's app as a http.Handler in my own mux. thanks!

My guess, although I don't have great proof, is that gorilla mux is doing this consumption of the body when the incoming request is form-encoded data. I am saying this because some simple experiments using only gorilla mux and it exhibited the "body already read" problem.

Great! I was going to suggest that as a work around until v0.14

closed as it is fixed in development.

sio4 commented

Hi @markbates,
Is this a temporary fix for the issue? Basically, access to the Body is not needed for most cases and I think this should be implemented as middleware or similar ways for the user who want this.

And current implementation uses packd.virtualFile but it seems that it requires the triple size of the original Body. Even though the request handler just runs during a few second and some more, I worry about memory consumption when it used for file upload or big content.

@sio4 I've done some work on packd.File today and packd.FileInfo to reduce the overhead and improve efficiency. Can you take a look at master when have a chance? Thanks.

sio4 commented

Hi, @markbates,

Of course, That's my pleasure! I looked into the changes and I think that's good enough!
Anyway, I just edited some parts of the code and made a PR. gobuffalo/packd#7

It is not directly related to this issue but I just fixed Seek() to support normal seeking, not limited to go to the starting point, and changed some codes for easy reading, and removing partially duplicated codes.
Please check the PR.

By the way, these packages including Genny and Packd are so interesting!