janko/tus-ruby-server

412 Pre-Condition Failed returns after whole body uploaded

gencer opened this issue ยท 19 comments

At current state, we upload (PATCH) a chunk or whole file. After that if headers are wrong or missing, 412 error returned. However, we should let client before we receive the file.

Currently, the whole body file uploaded regardless it has invalid or missing headers.

How we should handle errors before we accept file?

janko commented

The 412 Precondition Failed response status is returned when Tus-Resumable request header is missing or doesn't equal to 1.0.0. The tus-ruby-server validates the Tus-Resumable header pretty-much as soon as it receives the request (the route block is where the request starts): https://github.com/janko-m/tus-ruby-server/blob/b2626ef25d0e070ad2748dd469b80cbcc0d713cd/lib/tus/server.rb#L37-L47

This mean you're either using a web server which buffers the request body before forwarding it to the Rack application (Tus::Server). Either that or your frontend server (Nginx, Apache) is configured to buffer the request body. In this case there is nothing that Tus::Server can do, because the web server would call it only when the request body is fully uploaded.

Puma buffers the request body, while Goliath, Unicorn, and I think Passenger don't. If you're mounting the Tus::Server inside your main app, then it's recommended to have request buffering, as it prevents slow-write clients to impact the request throughput.

If you want the request body to be buffered, you should run the tus server as a standalone app on Goliath, as explained in the README. Goliath is recommended over Unicorn or Passenger because it's vulnerable to slow clients, meaning the client's upload/download speed doesn't affect the overall request throughput.

I agree that its processed as soon as headers received, However, my nginx configuration has request and response buffering off and I already use Goliath. Still have the same issue. Also, Let's say I start from an invalid offset; 409 Conflict error came after whole body uploaded. But the thing is if upload already finished I get that error before file uploaded.

I will investigate more and update here. I'm sure something is broken in somewhere (I still don't know yet)

janko commented

Thanks for providing more details. That is strange indeed then.

The Upload-Offset header is also validated before the request body is read (which returns the 409 Conflict error code), at least according to my static code analysis.

But the thing is if upload already finished I get that error before file uploaded.

Hmm, that means that streaming does work, so there might indeed be a bug somewhere.

I would suggest you try to override the #read method of the rack.input object before it reaches any Rack middleware, and have it raise an exception, and the backtrace should tell you where the first #read was called.

class DebugRead
  def initialize(app)
    @app = app
  end

  def call(env)
    env["rack.input"].instance_eval { def read(*); raise "#read is called"; end }
    @app.call(env)
  end
end

Tus::Server.use DebugRead
[43770:ERROR] 2018-05-19 19:08:51 :: tus.rb:19:in `read': #read is called (RuntimeError)
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/input.rb:21:in `read'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/input/unicorn.rb:8:in `read'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/storage/filesystem.rb:65:in `copy_stream'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/storage/filesystem.rb:65:in `block in patch_file'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/storage/filesystem.rb:65:in `open'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/storage/filesystem.rb:65:in `open'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/storage/filesystem.rb:65:in `patch_file'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/server.rb:141:in `block (3 levels) in <class:Server>'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:778:in `always'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:769:in `_verb'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda/plugins/all_verbs.rb:40:in `patch'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda/plugins/not_allowed.rb:113:in `patch'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/server.rb:125:in `block (2 levels) in <class:Server>'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda/plugins/not_allowed.rb:89:in `block in is'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:843:in `if_match'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:467:in `is'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda/plugins/not_allowed.rb:82:in `is'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/tus-server-2.1.0/lib/tus/server.rb:97:in `block in <class:Server>'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:275:in `instance_exec'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:275:in `block in call'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:273:in `catch'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:273:in `call'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:242:in `block in build_rack_app'
    from tus.rb:20:in `call'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/roda-3.8.0/lib/roda.rb:137:in `call'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/rack-2.0.5/lib/rack/urlmap.rb:68:in `block in call'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in `each'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in `call'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/rack-2.0.5/lib/rack/builder.rb:153:in `call'
    from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/goliath-rack_proxy-1.0.2/lib/goliath/rack_proxy.rb:74:in `block in fiber'

It's getting interesting. This traceback from fresh file being uploaded. When 412 error happens there is no read. However, it is still upload the whole file.

What I did;

I stopped TusServer (Ruby) and start the Tusd (GoLang). I sent the same request and 412 given before file uploaded. Same environment but different languages.

I'm suspecting that TusServer and/or Goliath doesn't intercept request until all received. It should break the connection somehow and return the error. That's what tusd does.

janko commented

Thanks a lot for checking that. I think the I found the issue:

The goliath-rack_proxy gem, the glue between Goliath and the Tus::Server, doesn't immediately return the response that the Rack app has early-returned. Instead, it still continues receiving the request body, which it just "black-holes", as it doesn't have anywhere to send it to (the Tus::Server.call has already finished).

I recall thinking about this when I was writing goliath-rack_proxy, but I don't remember what conclusion I reached back then. I'll have another look.

I'm more than happy to help you on this case. I was eating my brain since few hours :P

BTW;

I'm currently evaluating this (your gem) and tusd (not binary but directly using its sources and i already written a little sample that works for me.). The only thing that I consider Go in this due to performance considerations. I will use this gem on very heavy and large traffic environment. It should handle carefully. It seems TusRubyServer does it. Two thing makes me wonder. Should I go with Go? Or Ruby+EventMachine. But we have to solve this issue first.

janko commented

Thanks, I'm currently reading the Goliath source code to see how I can return a response early. The most important code is in goliath/api.rb and goliath/connection.rb, feel free to take a look if you want ๐Ÿ˜ƒ

janko commented

I'm currently evaluating this (your gem) and tusd (not binary but directly using its sources and i already written a little sample that works for me.). The only thing that I consider Go in this due to performance considerations. I will use this gem on very heavy and large traffic environment. It should handle carefully. It seems TusRubyServer does it. Two thing makes me wonder. Should I go with Go? Or Ruby+EventMachine. But we have to solve this issue first.

In this case I would recommend going with tusd, as it's been battle tested by Transloadit, Vimeo and other companies.

The Ruby implementation is useful mostly because you can mount it as a Rack endpoint into the existing app, i.e. you don't need to run it as a separate process. I also wanted to prove that it's possible to write something advanced like this in Ruby, with streaming and everything, and I think it has good performance (I was benchmarking it against tusd and didn't notice big differences in memory/CPU usage).

But it will never have such a big userbase as tusd, so I think that's a safer bet, in case you don't plan to extend anything and/or are comfortable with writing some Go should it be necessary. It would probably be good to mention tusd as the alternative in the README.

Thanks, I'm currently reading the Goliath source code to see how I can return a response early. The most important code is in goliath/api.rb and goliath/connection.rb, feel free to take a look if you want ๐Ÿ˜ƒ

I am looking at it and also to Goliath's all source code to understand whats + and -.

But it will never have such a big userbase as tusd, so I think that's a safer bet, in case you don't plan to extend anything and/or are comfortable with writing some Go should it be necessary. It would probably be good to mention tusd as the alternative in the README.

You'll never know. Maybe this will get more attention. However, For tusd, I added authorization, content (response body) support, phased restart, more hooks and many more and all I can say is, your gem does perfectly what it promises. If you look tusd, you will see that its actually very thin and it has very simple tasks. Apart from testing side, I can safely say that both are equal even yours has more features. I will definitely use your gem on my Rack projects. For the testing side, yes that's also an important key.

janko commented

Ok, I think I fixed it on the goliath-rack_proxy master branch. Would you mind testing it out?

Btw, I'm just curious, doesn't this issue affect only the scenario where the client sent out the wrong request, which wouldn't be something that happens in production?

You'll never know. Maybe this will get more attention. However, For tusd, I added authorization, content (response body) support, phased restart, more hooks and many more and all I can say is, your gem does perfectly what it promises. If you look tusd, you will see that its actually very thin and it has very simple tasks. Apart from testing side, I can safely say that both are equal even yours has more features. I will definitely use your gem on my Rack projects. For the testing side, yes that's also an important key.

Thanks a lot, that's really nice to hear โค๏ธ

janko commented

I am looking at it and also to Goliath's all source code to understand whats + and -.

You mean to say that evented code is difficult to wrap your head around? ๐Ÿ˜œ

I feel you, there was a lot of head banging on my side during writing goliath-rack_proxy, as I've never used EventMachine or Goliath before that. It's probably one of the most difficult codebases I had to navigate through (though I think the Rails codebase was still harder). But after a lot of looking and reading EventMachine's API documentation you do get a sense of how it all fits together ๐Ÿ˜ƒ

Btw, I'm just curious, doesn't this issue affect only the scenario where the client sent out the wrong request, which wouldn't be something that happens in production?

Yes, It is only that If client sends an invalid request. However, Imagine that even in production, client sends a request wrongly and added 10GB file and imagine after we say 412 after grabbing whole file. That is still a probable. And after that error I wouldn't dare to face with that client ๐Ÿ˜œ (Especially if we consider that client has a slow connection)

You mean to say that evented code is difficult to wrap your head around? ๐Ÿ˜œ

I'm always a little bit scared of them. (especially anything that's single threaded async i/o)

I had to navigate through (though I think the Rails codebase was still harder). But after a lot of looking and reading EventMachine's API documentation you do get a sense of how it all fits together

I still feel its complicated ๐Ÿ˜œ

Ok, I think I fixed it on the goliath-rack_proxy master branch. Would you mind testing it out?

Something is preventing me to fetch from your master branch.

Fetching https://github.com/janko-m/goliath-rack_proxy.git
Traceback (most recent call last):
	10: from /home/gencer/.rbenv/versions/2.5.1/bin/bundle:23:in `<main>'
	 9: from /home/gencer/.rbenv/versions/2.5.1/bin/bundle:23:in `load'
	 8: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/exe/bundle:22:in `<top (required)>'
	 7: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/friendly_errors.rb:121:in `with_friendly_errors'
	 6: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/friendly_errors.rb:126:in `rescue in with_friendly_errors'
	 5: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/friendly_errors.rb:17:in `log_error'
	 4: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/ui/shell.rb:45:in `error'
	 3: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/ui/shell.rb:105:in `tell_me'
	 2: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/vendor/thor/lib/thor/shell/basic.rb:88:in `say'
	 1: from /home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/vendor/thor/lib/thor/shell/basic.rb:88:in `!~'
/home/gencer/.rbenv/versions/2.5.1/lib/ruby/gems/2.5.0/gems/bundler-1.16.1/lib/bundler/vendor/thor/lib/thor/shell/basic.rb:88:in `=~': invalid byte sequence in US-ASCII (ArgumentError)
janko commented

Yes, It is only that If client sends an invalid request. However, Imagine that even in production, client sends a request wrongly and added 10GB file and imagine after we say 412 after grabbing whole file. That is still a probable. And after that error I wouldn't dare to face with that client ๐Ÿ˜œ (Especially if we consider that client has a slow connection)

You're right, it probably can happen, and yeah, in this case it's better to fail early. I was just asking because I'm always wary of making changes related to Goliath, because I feel like I'm playing with fire ๐Ÿ”ฅ, but I think this change should be ok.

Something is preventing me to fetch from your master branch.

Hmm, I'm not getting this error, this is what I'm using:

gem "goliath-rack_proxy", github: "janko-m/goliath-rack_proxy"

I'm using the same version of Ruby and Bundler as you, so I really don't know what gives. The only thing I could tell from the stack trace is that Bundler tried to print an error message to the console, but it contained invalid characters. I have no idea why, I don't think it's related to the recent changes.

I'm using the same version of Ruby and Bundler as you, so I really don't know what gives. The only thing I could tell from the stack trace is that Bundler tried to print an error message to the console, but it contained invalid characters. I have no idea why, I don't think it's related to the recent changes.

Nope, It wasn't. Updating bundler solved the problem. (I was very sleepy at that moment couldn't think of updating it...) Sorry for the false alarm.

However, It's still does not intercept my request but await for the whole body finish. I get 412 or 415 errors after file gets uploaded.

janko commented

Ok, I realized why it's not working. Goliath always waits for the request body to be received before writing the response data, I opened a ticket for that โ€“ postrank-labs/goliath#348.

For now I'll add a workaround for it in goliath-rack_proxy. I'll make it opt-in, because not all HTTP clients support early responses (e.g. http.rb doesn't at the moment).

janko commented

Actually, scratch that. If a HTTP client doesn't support early responses, then that client ought to be fixed. If tusd returns early responses, then so should tus-ruby-server.

janko commented

Ok, I pushed another commit to goliath-rack_proxy which should this time correctly send responses early, and learned a few more things about TCP sockets along the way. Let me know how it works for you.

False alarm. Discard my previous reply (deleted).

It just worked. It intercepts and returns with 412 error successfully.

๐Ÿ‘

janko commented

I released goliath-rack_proxy 1.1.0 with the latest changes.