File upload combinator
alpmestan opened this issue Β· 76 comments
Just creating this issue to put some code I have written up here for discussion.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeSynonymInstances #-}
module Files where
import Control.Monad.IO.Class
import Control.Monad.Trans.Either
import Control.Monad.Trans.Resource
import Data.ByteString.Lazy (ByteString)
import Network.Wai
import Network.Wai.Handler.Warp (run)
import Network.Wai.Parse
import Servant
import Servant.Server.Internal
-- Backends for file upload: in memory or in /tmp ?
data Mem
data Tmp
class KnownBackend b where
type Storage b :: *
withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r
instance KnownBackend Mem where
type Storage Mem = ByteString
withBackend Proxy f = f lbsBackEnd
instance KnownBackend Tmp where
type Storage Tmp = FilePath
withBackend Proxy f = runResourceT . withInternalState $ \s ->
f (tempFileBackEnd s)
-- * Files combinator, to get all of the uploaded files
data Files b
instance (KnownBackend b, HasServer api) => HasServer (Files b :> api) where
type ServerT (Files b :> api) m =
[File (Storage b)] -> ServerT api m
route Proxy subserver req respond = withBackend pb $ \b -> do
(_, files) <- parseRequestBody b req
route (Proxy :: Proxy api) (subserver files) req respond
where pb = Proxy :: Proxy b
type FilesMem = Files Mem
type FilesTmp = Files Tmp
-- test
type API = "files" :> FilesTmp :> Post '[JSON] ()
:<|> Raw
api :: Proxy API
api = Proxy
server :: Server API
server = filesHandler :<|> serveDirectory "."
where filesHandler :: [File FilePath] -> EitherT ServantErr IO ()
filesHandler = liftIO . mapM_ ppFile
ppFile :: File FilePath -> IO ()
ppFile (name, fileinfo) = do
putStrLn $ "Input name: " ++ show name
putStrLn $ "File name: " ++ show (fileName fileinfo)
putStrLn $ "Content type: " ++ show (fileContentType fileinfo)
putStrLn $ "------- Content --------"
readFile (fileContent fileinfo) >>= putStrLn
putStrLn $ "------------------------"
app :: Application
app = serve api server
f :: IO ()
f = run 8083 app
along with this HTML file:
<form action="/files" method="post" enctype="multipart/form-data">
Select a file: <input type="file" name="blah" />
Select another one: <input type="file" name="foo" />
<hr />
<input type="submit" value="Upload" />
</form>
served through serveDirectory
. Thoughts, comments?
This is great! π
One problem with this is that we can't target individual files by specifying the input name associated to them, with, say, a data File backend (inputname :: Symbol)
combinator, but this is kind of on purpose... Let me explain.
- With the current implementation I'm just calling out to some existing wai-code that reads the request body and decodes it. If we were to add a
File
combinator, we would either be decoding the request body and getting the file for everyFile
mentionned, or would have to do something similar to what we do withReqBody
and memoize the decoded body. This is getting much trickier to handle. - If we have a
File
using the memory backend and another using/tmp
, this would probably mess things up or do a lot of unnecessary work.
This is why I'm not sure we should include a File
combinator. Shall I just put a PR together with just Files
, using the code above and documenting it?
Here's a version that doesn't "forget" about the inputs that were sent along with the files in the request body (remember, this is multipart/form-data
):
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeSynonymInstances #-}
module Files where
import Control.Monad.IO.Class
import Control.Monad.Trans.Either
import Control.Monad.Trans.Resource
import Data.ByteString.Lazy (ByteString)
import Network.Wai
import Network.Wai.Handler.Warp (run)
import Network.Wai.Parse
import Servant
import Servant.Server.Internal
-- Backends for file upload: in memory or in /tmp ?
data Mem
data Tmp
class KnownBackend b where
type Storage b :: *
withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r
instance KnownBackend Mem where
type Storage Mem = ByteString
withBackend Proxy f = f lbsBackEnd
instance KnownBackend Tmp where
type Storage Tmp = FilePath
withBackend Proxy f = runResourceT . withInternalState $ \s ->
f (tempFileBackEnd s)
-- * Files combinator, to get all of the uploaded files
data Files b
type MultiPartData b = ([Param], [File (Storage b)])
instance (KnownBackend b, HasServer api) => HasServer (Files b :> api) where
type ServerT (Files b :> api) m =
MultiPartData b -> ServerT api m
route Proxy subserver req respond = withBackend pb $ \b -> do
dat <- parseRequestBody b req
route (Proxy :: Proxy api) (subserver dat) req respond
where pb = Proxy :: Proxy b
type FilesMem = Files Mem
type FilesTmp = Files Tmp
-- test
type API = "files" :> FilesTmp :> Post '[JSON] ()
:<|> Raw
api :: Proxy API
api = Proxy
server :: Server API
server = filesHandler :<|> serveDirectory "."
where filesHandler :: MultiPartData Tmp -> EitherT ServantErr IO ()
filesHandler (inputs, files) = do
liftIO $ mapM_ ppFile files
liftIO $ mapM_ print inputs
ppFile :: File FilePath -> IO ()
ppFile (name, fileinfo) = do
putStrLn $ "Input name: " ++ show name
putStrLn $ "File name: " ++ show (fileName fileinfo)
putStrLn $ "Content type: " ++ show (fileContentType fileinfo)
putStrLn $ "------- Content --------"
readFile (fileContent fileinfo) >>= putStrLn
putStrLn $ "------------------------"
app :: Application
app = serve api server
Now, this is in fact all equivalent to a new content type, for multipart/form-data
. However, the fact that wai-extra forces us to do some IO
to parse all of this (see parseRequestBody
) gets in the way.
Thoughts? @jkarni any idea to make this a content type?
fwiw, i'm using this now and while it works from curl, it doesn't in the browser for large files. Weirdly, if i use netcat to collect the request as raw text then use netcat to dump it into the server, it works fine - doing it normally gets 405 Method Not Allowed and some chunks of text from the input logged by Network.Wai.Middleware.RequestLogger.
Interesting. I can upload files just fine in some app I'm working on. Do you think you could share a minimal app (I assume the one you're working on is for work, hence closed-source?) that has this problem? I could start from there and investigate what is going on. So just the HTML of your upload form along with the haskell app that receives the files.
yep, i'll try to get that to you today.
i think this is actually a warp thing, i get it when i use scotty too.
3.1.0. testing with the latest now.
@mwotton there was an issue with 3.1.0. So try the lastest or downgrade to 3.0.*.
ah, thanks, @codedmart - i'll try that.
yep, 3.1.2 fixes it. wish i'd known that before i rewrote the app :)
Ah, good to know!
Perhaps you could release this as a package?
Well, we would have to write instances for servant-client, servant-docs etc if we want to release this properly. It's quite a task...
Make the version < 1. Something is better than nothing, IMHO. That is to say, this is awesome and I would like to use it.
In order to make this a content-type, we'd need to change the content-type machinery to allow IO. I think that makes sense.
And if we do that, we may not need instances for all the packages.
@3noch You can use this! Simply drop the code in a module in your project, and you can use it =) I've put this combinator to work in several apps for work this way.
@alpmestan Of course, I just wish I had found it on hackage instead of a GitHub issue. Also, I could submit a PR to a repo if it existed. π
Yeah we should probably put some page together with combinators like this one and some instances. On the github wiki or in the servant-examples
package or the site or something?
π
Is this still the best way to achieve file upload?
I have it working. I am trying to capture the text and upload at the same time. How do I do that:
:<|> "plants" :> Capture "plant_id" Text :> "pictures" :> Capture "picture_id" Text :> FilesTmp :> Post '[JSON] ()
? This does not seem to work.
@i-am-the-slime Did you look at this version: #133 (comment). It handles the form data and the files.
Thanks @codedmart it works. However for a test I now seem to need a HasClient
instance. How and where should I define that?
Is there a version of this that works with 0.5?
Probably something like:
-- Backends for file upload: in memory or in /tmp ?
data Mem
data Tmp
class KnownBackend b where
type Storage b :: *
withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r
instance KnownBackend Mem where
type Storage Mem = ByteString
withBackend Proxy f = f lbsBackEnd
instance KnownBackend Tmp where
type Storage Tmp = FilePath
withBackend Proxy f = runResourceT . withInternalState $ \s ->
f (tempFileBackEnd s)
-- * Files combinator, to get all of the uploaded files
data Files b
type MultiPartData b = ([Param], [File (Storage b)])
instance (KnownBackend b, HasServer api) => HasServer (Files b :> api) where
type ServerT (Files b :> api) m =
MultiPartData b -> ServerT api m
route Proxy context subserver = WithRequest WithRequest $ \ request ->
route (Proxy :: Proxy api) context (addBodyCheck subserver (bodyCheck request))
where
bodyCheck request = withBackend (Proxy :: Proxy b) $ \be -> do
let contentType = lookup hContentType $ requestHeaders request
-- fixme: make sure the multipart/form-data content type is
-- used and return the appropriate error if not
dat <- parseRequestBody be request
return (Route dat)
type FilesMem = Files Mem
type FilesTmp = Files Tmp
Not tested, but it shouldn't be too far from being functional. Note that an alternative solution is in the works -- @fizruk can tell you more about that but in the meantime you can take a look here.
This is a response to @alexanderkjeldaas, right? I would still really appreciate any pointers about how to create HasClient instances.
@alpmestan @alexanderkjeldaas Unfortunately that solution won't work because temporarily allocated resources (files or memory) would be destroyed by the time you would use their contents. That is because of the Delayed
thing that came up in 0.5
. To fix that you need to work with a CPS-d MultiPartData
:
instance (HasServer sublayout config) => HasServer (Files b :> sublayout) config where
type ServerT (Files b :> sublayout) m =
((MultiPartData b -> IO ()) -> IO ()) -> ServerT sublayout m
route Proxy config subserver = WithRequest $ \request ->
route (Proxy :: Proxy sublayout) config (addBodyCheck subserver (bodyCheck request))
where
bodyCheck request = return $ Route (\(f :: MultiPartData b -> IO ()) ->
runResourceT . withInternalState $ \s ->
parseRequestBody (tempFileBackEnd s) request >>= f)
Something like above should work. In fact @codedmart might have an exact working example (since we have discussed it with him). The usage example then would be like this:
handleFiles :: ((MultiPartData -> IO ()) -> IO ()) -> ExceptT ServantErr IO ()
handleFiles multipart = liftIO $ multipart $ \(params, files) -> do
print ("start" :: String)
mapM_ ppFile files
print params
print ("end" :: String)
where
ppFile :: File FilePath -> IO ()
ppFile (name', fileinfo) = do
putStrLn $ "Input name: " ++ show name'
putStrLn $ "File name: " ++ show (fileName fileinfo)
putStrLn $ "Content type: " ++ show (fileContentType fileinfo)
putStrLn "------- Content --------"
readFile (fileContent fileinfo) >>= putStrLn
putStrLn "------------------------"
@i-am-the-slime for a HasClient
instance you probably need Network.HTTP.Client.MultipartFormData
. I don't think I can guide you through this though.
@fizruk's example does work, and thanks to him for the help. I slightly modified it:
data Mem
data Tmp
class KnownBackend b where
type Storage b :: *
withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r
instance KnownBackend Mem where
type Storage Mem = ByteString
withBackend Proxy f = f lbsBackEnd
instance KnownBackend Tmp where
type Storage Tmp = FilePath
withBackend Proxy f = runResourceT . withInternalState $ \s ->
f (tempFileBackEnd s)
data Files b
type MultiPartData b = ([Param], [File (Storage b)])
type MultiPartDataT b = ((MultiPartData b -> IO (MultiPartData b)) -> IO (MultiPartData b))
type FilesMem = Files Mem
type FilesTmp = Files Tmp
instance (KnownBackend b, HasServer sublayout config) => HasServer (Files b :> sublayout) config where
type ServerT (Files b :> sublayout) m =
MultiPartDataT b -> ServerT sublayout m
route Proxy config subserver = WithRequest $ \request ->
route (Proxy :: Proxy sublayout) config (addBodyCheck subserver (bodyCheck request))
where
bodyCheck request = return $ Route (\f ->
withBackend (Proxy :: Proxy b) $ \pb -> parseRequestBody pb request >>= f
)
@fizruk Oh, right, I overlooked this problem. Thanks for correcting this.
@i-am-the-slime I'm not aware of any HasClient
instance for this, but @fizruk's link definitely looks like something you want to use for that. Maybe we should all put the effort into writing the missing interpretations for multipart and add that to servant? Or just let the "allow IO in content type code" PR take over and simply have multipart be a content type?
Apparently servant-0.7
broken something here. I believe we should update it with:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
module Files where
import Data.ByteString.Lazy (ByteString)
import Network.Wai.Parse
import Servant
import Servant.Server.Internal
import Control.Monad.Trans.Resource (runResourceT,withInternalState)
data Mem
data Tmp
class KnownBackend b where
type Storage b :: *
withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r
instance KnownBackend Mem where
type Storage Mem = ByteString
withBackend Proxy f = f lbsBackEnd
instance KnownBackend Tmp where
type Storage Tmp = FilePath
withBackend Proxy f = runResourceT . withInternalState $ \s ->
f (tempFileBackEnd s)
data Files b
type MultiPartData b = ([Param], [File (Storage b)])
type MultiPartDataT b = ((MultiPartData b -> IO (MultiPartData b)) -> IO (MultiPartData b))
type FilesMem = Files Mem
type FilesTmp = Files Tmp
instance (KnownBackend b, HasServer sublayout config) => HasServer (Files b :> sublayout) config where
type ServerT (Files b :> sublayout) m =
MultiPartDataT b -> ServerT sublayout m
route Proxy config subserver = route (Proxy :: Proxy sublayout) config (addBodyCheck subserver bodyCheck)
where
bodyCheck = withRequest $ \ request ->
return $ \f -> withBackend (Proxy :: Proxy b) $ \pb -> parseRequestBody pb request >>= f
Also as it wasn't trivial here is how to handle files with it:
handleFiles :: MultiPartDataT Mem -> IO ()
handleFiles multipart = void $ multipart $ \(params,files) -> do
putStrLn "start"
mapM_ ppFile files
print params
putStrLn "end"
return (params,files)
where
ppFile :: File BL.ByteString -> IO ()
ppFile (name',fileinfo) = do
putStrLn $ "Input name: " <> toS name'
putStrLn $ "File name: " <> toS (fileName fileinfo)
putStrLn $ "Content type: " <> toS (fileContentType fileinfo)
putStrLn (toS (fileContent fileinfo))
putStrLn "---------------------------------"
myHandler :: MultiPartDataT Mem -> ExceptT ServantErr IO a
myHandler multipart = do
liftIO $ handleFiles multipart
...
Any news about HasClient
instance? I've tried to use HTTP.Client.MultipartFormData
without success...
I think this functionality should have first class support
Anyone volunteers to bring this to the servant packages? I'm willing to help on this if needed.
I'd like to help too.
@alpmestan @yogsototh I can give it a try
(in the case you prefer to let someone else do it and only provide help if needed)
the only thing I would discuss first is: do you really want Mem / Tmp /SomeOtherBackend to appear at the api level ?
I think it would be better to hide them, and provide usefull functions to store the file on different backend (the disk / temp file / dropbox, etc.) from inside the handler instead
do you really want Mem / Tmp /SomeOtherBackend to appear at the api level ?
I'd be in favour of it not doing so, if possible. Even needing a new Files
combinator (rather than using ReqBody [MultiPart]
) seems like an unfortunate leak in our abstractions.
Note also @fizruk's approach here.
I think my own personal preference would be to go with something like that. However, I think since whether a content-type does IO or not is a static fact (only MuliPart
, really), I think we don't need to allow IO
for all mimeUnrender
. We can just have a type class that determines whether it does IO (and what it returns - with MuliPart
we probably want it to be a FilePath
).
@jkarni I moslty agree but I have a concern around:
with MuliPart we probably want it to be a FilePath
I just checked https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html and http://stackoverflow.com/questions/16958448/what-is-http-multipart-request
And I now think that FilePath should not directly be related to multipart.
I would prefer either:
- ReqBody [MultiPart a] with MultiPart to return a
- ReqBody [MultiPart] with MultiPart to return ByteString
- ReqBody '[MultiPart a, MultiPart b] with the whole stuff returning (a,b) - or some similar type level mechanism
with 1. we could then define
type UploadedFile = MultiPart FileContent
I won't have time anymore to work on that in the near future.
I hope that someone else will pick it up, this would be a great addition to servant
@yogsototh can you share full example? I have errors with this one
Couldn't match type βByteStringβ with β[Char]β
Expected type: Server API
Actual type: (MultiPartDataT Mem -> ExceptT ServantErr IO ())
:<|> Server Raw
In the expression: myHandler :<|> serveDirectory "/out"
In an equation for βserverβ:
server = myHandler :<|> serveDirectory "/out"
with api
type API = "files" :> FilesTmp :> Post '[PlainText] ()
:<|> "static" :> Raw
@mwotton OverloadedString already added. This is something different I can't figure, because types are equal () and ()
@sigrlami my bad there might be something wrong with my FilesTmp, I only tried FilesMem which should be closer to the proposed solution. From my handler example with FilesMem you could easily save the file to some directory.
@yogsototh thanks! That helped, It was obvious solution but I missed it
Hello,
is it possible to mix the file upload with traditonnal POST request?
Something like that:
type API = "files" :> ReqBody '[JSON] Foo :> FilesTmp :> Post '[JSON] ()
myHandler :: Foo -> MultiPartData Tmp -> EitherT ServantErr IO ()
myHandler foo (inputs, files) = ...
The idea is that I would like to use a normal body, decoded into Foo, together with some files.
I also need HasClient instance :)
Since this issue is still open, am I correct in assuming file upload still isn't part of the servant mainline?
@schell: Yes, that's correct.
I tried to update the combinator above for Servant 0.8. As far as I can tell, the change is not that simple as the route method in the HasServer class does not have access to the request object anymore.
Any suggestion as to how I might proceed ?
Nevermind, looking at Servant.Server.Internal gave me the answer. I still hope this combinator will be officially included in Servant, though !
Well, it's "just" a matter of writing all interpretations for it =) The annoying thing is that it basically just is a content type, but one that needs IO to decode from, and it doesn't feel right. Those are the 2 main reasons this isn't shipped in servant today.
There's no way around IO without avoiding parseRequestBody
though, right?
right. and we do want some IO because files get created using that functionand the /tmp backend.
Well, it's "just" a matter of writing all interpretations for it =)
I think this is the wrong approach. If we only allow new combinators to pop up when they provide instances for all core interpretations, we're basically asking people to write code that they don't even want to use themselves. Which we should strongly avoid.
@gaeldeest (or anyone else): Have you considered publishing this combinator as a separate package? Then you could just include the one (servant-server
) interpretation that you care about right now. And people wouldn't have to copy the combinator from this issue discussion into their code to use it.
Re: Allowing combinators to do IO: Even if most combinators don't want (and shouldn't) use IO
I think we should still allow IO
if there's one good use-case for that. (Which file-upload is.) So I'd be in favor of changing that.
+1, please publish this as a package or get it merged into servant and worry about the other instances later. Pretty please!
Having played around with @fizruk 's branch, I now think allowing unrender
to do IO
is the wrong approach, for the following reasons:
- We're skating on lazy-IO thin ice, since we have very little control or understanding of how much of the request body will be in memory at any point
- We can't use
wai-extra
, since even it's more internal functions expect an IO strict bytestring, and if we convert our lazy bytestring to strict I'm pretty sure we'd be putting everything into memory. - The content-type doesn't necessarily get to decide the storage (e.g., it should be possible to keep even mutlipart-formdata in memory, and other content-types in disk).
- Doing IO in
mimeUnrender
is anyhow ugly conceptually.
- We can't use
So I'm jumping ships and saying that the approach @alpmestan outlined is better.
@jkarni I am not sure which of @alpmestan's approaches you refer to.
It appears to me that in this thread he hoped #343 will be merged eventually.
What are the alternatives to IO-enabled mimeUnrender
?
The alternative is to just have a Files
combinator and not pretend we support the multipart content type. Basically provide just enough to enable users to have file upload. @jkarni correct me if I'm wrong.
Multipart needs to be supported though.
Right, I didn't mean that we shouldn't support multipart at some point. But I think people really just want the code shown in this ticket to be available on hackage and ready to use in any servant app, for now. Later on, when someone feels brave enough, we could have proper multipart support but this means reimplementing the code from wai-extra, even though there's still the question then of how to do it without IO
.
@alpmestan As long as I have the ability to still handle mutlipart myself then I am good with whatever.
Will the solution discussed here support the servant-client
case where I want to do:
$ curl -XPOST 'https://somewhere/foo' \
-i -L \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: audio/wav" \
--data-binary "@sample.wav"
Hi,
I've been playing with @alpmestan's approach for handling file uploads using the suggested Files
combinator. A minimal-complete example that is (partially) working for me on servant-8.1
is the following slight variation of the code snippets already posted in this thread:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE RecordWildCards #-}
module Main where
import Control.Monad.Trans.Resource
import Control.Monad.Except
import Data.Monoid
import qualified Data.ByteString.Lazy as LS
import Data.Text (Text)
import qualified Data.Text as Text
import qualified Data.Text.Encoding as Text
import Lucid
import Network.Wai.Parse
import Network.Wai.Handler.Warp hiding (FileInfo)
import Servant
import Servant.HTML.Lucid
import Servant.Server.Internal
import System.Directory
import System.FilePath
data Mem
data Tmp
class KnownBackend b where
type Storage b :: * -- associated type family
withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r
instance KnownBackend Mem where
type Storage Mem = LS.ByteString
withBackend Proxy f = f lbsBackEnd
instance KnownBackend Tmp where
type Storage Tmp = FilePath
withBackend Proxy f = runResourceT . withInternalState $ \s ->
f (tempFileBackEnd s)
data Files b
type MultiPartData b = ([Param], [File (Storage b)])
instance (KnownBackend b, HasServer sublayout config)
=> HasServer (Files b :> sublayout) config where
type ServerT (Files b :> sublayout) m =
IO (MultiPartData b) -> ServerT sublayout m
route Proxy config subserver =
route psub config (addBodyCheck subserver check)
where
psub = Proxy :: Proxy sublayout
pbak = Proxy :: Proxy b
check = withRequest $ \request -> return $ withBackend pbak $
\backend -> parseRequestBody backend request
type API = "form" :> Get '[HTML] (Html ())
:<|> "upload" :> Files Mem :> Post '[PlainText] Text
-- | Some form to upload an image and a video file.
formHandler :: ExceptT ServantErr IO (Html ())
formHandler = return $
html_ $ do
head_ (title_ "upload files test!")
body_ $ do
h1_ "upload files test"
form_ [ action_ "/upload"
, method_ "POST"
, enctype_ "multipart/form-data" ] $ do
input_ [ type_ "file", name_ "media" ]
input_ [ type_ "submit"
, name_ "send"
, value_ "Submit Data!" ]
-- | Handle the uploaded image and video files.
uploadHandler :: IO (MultiPartData Mem) -> ExceptT ServantErr IO Text
uploadHandler multipart = do
liftIO $ putStrLn "handling file upload..."
(params, files) <- liftIO multipart
-- when using `MultiPartData Mem`, use `wrFile` to write the files from memory to disk.
-- when using `MultiPartData Tmp`, use `cpFile` to copy the temporarily uploaded files
-- to some other location.
liftIO $ mapM_ wrFile files
return $
"params:\n" <> Text.intercalate "\n" (map ppParam params) <> "\n" <>
"files:\n" <> Text.intercalate "\n" (map ppFile files)
where
ppParam (name, val) =
" name = " <> Text.decodeUtf8 name <> "\n" <>
" value = " <> Text.decodeUtf8 val <> "\n"
ppFile (paramName, FileInfo{..}) =
" parameter name = " <> Text.decodeUtf8 paramName <> "\n" <>
" fileName = " <> Text.decodeUtf8 fileName <> "\n" <>
" fileContentType = " <> Text.decodeUtf8 fileContentType <> "\n" <>
" fileContent = " <> "..." <> "\n"
wrFile (_, FileInfo{..}) = LS.writeFile newFileName fileContent
where newFileName = (Text.unpack . Text.decodeUtf8) fileName <.> "copy"
cpFile (_, FileInfo{..}) = copyFile oldFileName newFileName
where
oldFileName = fileContent
newFileName = (Text.unpack . Text.decodeUtf8) fileName <.> "copy"
server :: Server API
server = formHandler :<|> uploadHandler
main :: IO ()
main = run 8888 (serve (Proxy :: Proxy API) server)
However, the above snippet exhibits the following two issues:
- While the code works for
MultiPartData Mem
(and using thewrFile
function inuploadHandler
), it doesn't seem to work forMultiPartData Tmp
(and using thecpFile
helper inuploadHandler
). For the latter, a file seems to get uploaded temporarily to the/tmp
directory, but by the timecpFile
inuploadHandler
wants to copy it to some other location, the file doesn't exist anymore, thus causingcpFile
to fail with asource file does not exist
exception. - Even in the
MultipartData Mem
setting, the above snippet seems only able to upload files up to a certain size. For example, uploading video files with sizes between 50-100MB just silently fails (not even hitting theuploadHandler
at all).
Has anyone observed either one of these issues? And if so, does anyone know what is happening here and, at best, how to resolve and overcome any of the issues?
Thoughts on this would be great!
In my first shot at file upload, withBackend
used to wrap the execution of the entire multipart decoding + disk writing + handler chain. If you take a look at the code you can see that the file is registered for deletion, and it's not exactly clear to me what the guarantees are wrt the lifetime of the uploaded file. Ideally, we would need a saner implementation of multipart that would be a good old content type or at least an implementation that either:
- lets the handler decide when/if the uploaded file from /tmp should be deleted
- deletes the file in /tmp at the end of the execution, if it's still there, in which case a user really should copy or move the said file inside the handler ; otherwise the content is lost.
In your case, AFAICT, we're only guaranteed that the tmp file is there during the decoding and writing to /tmp. I really wish we had a solid multipart encoding/decoding library; this would allow:
- proper support for /tmp backend that doesn't have the flaws I've mentionned above
- file upload support in servant-client! we could hand a few
FilePath
or(Filename, FileContent)
or something like that to the client function and it would do everything for us "automagically"
There's a possibility that I am misunderstanding the issue, but that looks like the most plausible explanation to me.
@alpmestan @bollmann The temp files are deleted at the end of runResourceT
, which is why the implementation in #133 (comment) uses the MultiPartDataT
type. I'm currently doing something similar.
@alpmestan, @rimmington: Thanks for your replies! Indeed, using the MultiPartDataT
data allows one to use the Tmp
backend as well, which solves my first problem. Furthermore, my second problem was unrelated to the Files
combinator, but rather had to to with my improper use of BasicAuth
, so nvm. :-)
For the record, I've been working on packaging up the multipart/form-data-powered upload. It does a little bit more than all the code we've written in this ticket and doesn't have the issue reported by @bollmann, without exposing a continuation. I still have a few things to add there and have to think about making this as nice and simple to use as possible, but I'll drop a comment here once it's ready. The repo's here.
Hopefully we'll soon be able to close one of (if not the) oldest open issues in the tracker =)
Any updates on this feature? I'd really like to have file uploads in servant!
Well, I'd appreciate feedback on https://github.com/haskell-servant/servant-multipart if anyone has got some time for looking at it. I might get back to it soon and add support for in-memory handling of file upload and cut a first release. It requires these patches for servant, which I yet have to wrap up and add tests for. If anyone wants to give some feedback or even help with these tasks, that'd be very much appreciated :)
For the record, servant got the necessary patches and https://github.com/haskell-servant/servant-multipart should be ready for use. It's missing support for in-memory uploads but could be released as it stands.
It looks good! I'd say release it! Though maybe it should be split into servant-multipart
and servant-multipart-server
(so that afterwards we can add -client
, -docs
, etc. in a backwards compatible way)?
As for the feedback - it's well documented and the API seems very nice! I'd mention something about ReqBody
not being usable in conjunction with this on the same path. Also, tests :)
Primarily, though, release it!