haskell-servant/servant

File upload combinator

alpmestan opened this issue Β· 76 comments

Just creating this issue to put some code I have written up here for discussion.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeSynonymInstances #-}
module Files where

import Control.Monad.IO.Class
import Control.Monad.Trans.Either
import Control.Monad.Trans.Resource
import Data.ByteString.Lazy (ByteString)
import Network.Wai
import Network.Wai.Handler.Warp (run)
import Network.Wai.Parse
import Servant
import Servant.Server.Internal

-- Backends for file upload: in memory or in /tmp ?

data Mem
data Tmp

class KnownBackend b where
  type Storage b :: *

  withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r

instance KnownBackend Mem where
  type Storage Mem = ByteString

  withBackend Proxy f = f lbsBackEnd

instance KnownBackend Tmp where
  type Storage Tmp = FilePath

  withBackend Proxy f = runResourceT . withInternalState $ \s ->
    f (tempFileBackEnd s)

-- * Files combinator, to get all of the uploaded files

data Files b

instance (KnownBackend b, HasServer api) => HasServer (Files b :> api) where
  type ServerT (Files b :> api) m =
    [File (Storage b)] -> ServerT api m

  route Proxy subserver req respond = withBackend pb $ \b -> do
    (_, files) <- parseRequestBody b req
    route (Proxy :: Proxy api) (subserver files) req respond

    where pb = Proxy :: Proxy b

type FilesMem = Files Mem
type FilesTmp = Files Tmp

-- test

type API = "files" :> FilesTmp :> Post '[JSON] ()
      :<|> Raw

api :: Proxy API
api = Proxy

server :: Server API
server = filesHandler :<|> serveDirectory "."

  where filesHandler :: [File FilePath] -> EitherT ServantErr IO ()
        filesHandler = liftIO . mapM_ ppFile

        ppFile :: File FilePath -> IO ()
        ppFile (name, fileinfo) = do
          putStrLn $ "Input name: " ++ show name
          putStrLn $ "File name: " ++ show (fileName fileinfo)
          putStrLn $ "Content type: " ++ show (fileContentType fileinfo)
          putStrLn $ "------- Content --------"
          readFile (fileContent fileinfo) >>= putStrLn
          putStrLn $ "------------------------"

app :: Application
app = serve api server

f :: IO ()
f = run 8083 app

along with this HTML file:

<form action="/files" method="post" enctype="multipart/form-data">
    Select a file: <input type="file" name="blah" />
    Select another one: <input type="file" name="foo" />
    <hr />
    <input type="submit" value="Upload" />
</form>

served through serveDirectory. Thoughts, comments?

This is great! πŸ‘

One problem with this is that we can't target individual files by specifying the input name associated to them, with, say, a data File backend (inputname :: Symbol) combinator, but this is kind of on purpose... Let me explain.

  • With the current implementation I'm just calling out to some existing wai-code that reads the request body and decodes it. If we were to add a File combinator, we would either be decoding the request body and getting the file for every File mentionned, or would have to do something similar to what we do with ReqBody and memoize the decoded body. This is getting much trickier to handle.
  • If we have a File using the memory backend and another using /tmp, this would probably mess things up or do a lot of unnecessary work.

This is why I'm not sure we should include a File combinator. Shall I just put a PR together with just Files, using the code above and documenting it?

Here's a version that doesn't "forget" about the inputs that were sent along with the files in the request body (remember, this is multipart/form-data):

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeSynonymInstances #-}
module Files where

import Control.Monad.IO.Class
import Control.Monad.Trans.Either
import Control.Monad.Trans.Resource
import Data.ByteString.Lazy (ByteString)
import Network.Wai
import Network.Wai.Handler.Warp (run)
import Network.Wai.Parse
import Servant
import Servant.Server.Internal

-- Backends for file upload: in memory or in /tmp ?

data Mem
data Tmp

class KnownBackend b where
  type Storage b :: *

  withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r

instance KnownBackend Mem where
  type Storage Mem = ByteString

  withBackend Proxy f = f lbsBackEnd

instance KnownBackend Tmp where
  type Storage Tmp = FilePath

  withBackend Proxy f = runResourceT . withInternalState $ \s ->
    f (tempFileBackEnd s)

-- * Files combinator, to get all of the uploaded files

data Files b

type MultiPartData b = ([Param], [File (Storage b)]) 

instance (KnownBackend b, HasServer api) => HasServer (Files b :> api) where
  type ServerT (Files b :> api) m =
    MultiPartData b -> ServerT api m

  route Proxy subserver req respond = withBackend pb $ \b -> do
    dat <- parseRequestBody b req
    route (Proxy :: Proxy api) (subserver dat) req respond

    where pb = Proxy :: Proxy b

type FilesMem = Files Mem
type FilesTmp = Files Tmp

-- test
type API = "files" :> FilesTmp :> Post '[JSON] ()
      :<|> Raw

api :: Proxy API
api = Proxy

server :: Server API
server = filesHandler :<|> serveDirectory "."

  where filesHandler :: MultiPartData Tmp -> EitherT ServantErr IO ()
        filesHandler (inputs, files) = do
          liftIO $ mapM_ ppFile files
          liftIO $ mapM_ print inputs

        ppFile :: File FilePath -> IO ()
        ppFile (name, fileinfo) = do
          putStrLn $ "Input name: " ++ show name
          putStrLn $ "File name: " ++ show (fileName fileinfo)
          putStrLn $ "Content type: " ++ show (fileContentType fileinfo)
          putStrLn $ "------- Content --------"
          readFile (fileContent fileinfo) >>= putStrLn
          putStrLn $ "------------------------"

app :: Application
app = serve api server

Now, this is in fact all equivalent to a new content type, for multipart/form-data. However, the fact that wai-extra forces us to do some IO to parse all of this (see parseRequestBody) gets in the way.

Thoughts? @jkarni any idea to make this a content type?

fwiw, i'm using this now and while it works from curl, it doesn't in the browser for large files. Weirdly, if i use netcat to collect the request as raw text then use netcat to dump it into the server, it works fine - doing it normally gets 405 Method Not Allowed and some chunks of text from the input logged by Network.Wai.Middleware.RequestLogger.

Interesting. I can upload files just fine in some app I'm working on. Do you think you could share a minimal app (I assume the one you're working on is for work, hence closed-source?) that has this problem? I could start from there and investigate what is going on. So just the HTML of your upload form along with the haskell app that receives the files.

yep, i'll try to get that to you today.

i think this is actually a warp thing, i get it when i use scotty too.

@mwotton what version of warp?

3.1.0. testing with the latest now.

@mwotton there was an issue with 3.1.0. So try the lastest or downgrade to 3.0.*.

ah, thanks, @codedmart - i'll try that.

yep, 3.1.2 fixes it. wish i'd known that before i rewrote the app :)

Ah, good to know!

3noch commented

Perhaps you could release this as a package?

Well, we would have to write instances for servant-client, servant-docs etc if we want to release this properly. It's quite a task...

3noch commented

Make the version < 1. Something is better than nothing, IMHO. That is to say, this is awesome and I would like to use it.

In order to make this a content-type, we'd need to change the content-type machinery to allow IO. I think that makes sense.

And if we do that, we may not need instances for all the packages.

@3noch You can use this! Simply drop the code in a module in your project, and you can use it =) I've put this combinator to work in several apps for work this way.

3noch commented

@alpmestan Of course, I just wish I had found it on hackage instead of a GitHub issue. Also, I could submit a PR to a repo if it existed. πŸ˜€

Yeah we should probably put some page together with combinators like this one and some instances. On the github wiki or in the servant-examples package or the site or something?

πŸ‘

Is this still the best way to achieve file upload?

I'm afraid so, but the work done by @fizruk here could probably be merged eventually, once the issues raised there are fixed. I'm sure if one or more people give this PR some love, that can happen :)

I have it working. I am trying to capture the text and upload at the same time. How do I do that:
:<|> "plants" :> Capture "plant_id" Text :> "pictures" :> Capture "picture_id" Text :> FilesTmp :> Post '[JSON] () ? This does not seem to work.

@i-am-the-slime Did you look at this version: #133 (comment). It handles the form data and the files.

Thanks @codedmart it works. However for a test I now seem to need a HasClient instance. How and where should I define that?

Is there a version of this that works with 0.5?

Probably something like:

-- Backends for file upload: in memory or in /tmp ?

data Mem
data Tmp

class KnownBackend b where
  type Storage b :: *

  withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r

instance KnownBackend Mem where
  type Storage Mem = ByteString

  withBackend Proxy f = f lbsBackEnd

instance KnownBackend Tmp where
  type Storage Tmp = FilePath

  withBackend Proxy f = runResourceT . withInternalState $ \s ->
    f (tempFileBackEnd s)

-- * Files combinator, to get all of the uploaded files

data Files b

type MultiPartData b = ([Param], [File (Storage b)]) 

instance (KnownBackend b, HasServer api) => HasServer (Files b :> api) where
  type ServerT (Files b :> api) m =
    MultiPartData b -> ServerT api m

  route Proxy context subserver = WithRequest  WithRequest $ \ request ->
    route (Proxy :: Proxy api) context (addBodyCheck subserver (bodyCheck request))
    where
      bodyCheck request = withBackend (Proxy :: Proxy b) $ \be -> do
        let contentType = lookup hContentType $ requestHeaders request
        -- fixme: make sure the multipart/form-data content type is
        -- used and return the appropriate error if not
        dat <- parseRequestBody be request
        return (Route dat)

type FilesMem = Files Mem
type FilesTmp = Files Tmp

Not tested, but it shouldn't be too far from being functional. Note that an alternative solution is in the works -- @fizruk can tell you more about that but in the meantime you can take a look here.

This is a response to @alexanderkjeldaas, right? I would still really appreciate any pointers about how to create HasClient instances.

@alpmestan @alexanderkjeldaas Unfortunately that solution won't work because temporarily allocated resources (files or memory) would be destroyed by the time you would use their contents. That is because of the Delayed thing that came up in 0.5. To fix that you need to work with a CPS-d MultiPartData:

instance (HasServer sublayout config) => HasServer (Files b :> sublayout) config where
  type ServerT (Files b :> sublayout) m =
    ((MultiPartData b -> IO ()) -> IO ()) -> ServerT sublayout m

  route Proxy config subserver = WithRequest $ \request ->
    route (Proxy :: Proxy sublayout) config (addBodyCheck subserver (bodyCheck request))
    where
      bodyCheck request = return $ Route (\(f :: MultiPartData b -> IO ()) ->
        runResourceT . withInternalState $ \s ->
          parseRequestBody (tempFileBackEnd s) request >>= f)

Something like above should work. In fact @codedmart might have an exact working example (since we have discussed it with him). The usage example then would be like this:

handleFiles :: ((MultiPartData -> IO ()) -> IO ()) -> ExceptT ServantErr IO ()
handleFiles multipart = liftIO $ multipart $ \(params, files) -> do
  print ("start" :: String)
  mapM_ ppFile files
  print params
  print ("end" :: String)
  where
    ppFile :: File FilePath -> IO ()
    ppFile (name', fileinfo) = do
      putStrLn $ "Input name: " ++ show name'
      putStrLn $ "File name: " ++ show (fileName fileinfo)
      putStrLn $ "Content type: " ++ show (fileContentType fileinfo)
      putStrLn "------- Content --------"
      readFile (fileContent fileinfo) >>= putStrLn
      putStrLn "------------------------"

@i-am-the-slime for a HasClient instance you probably need Network.HTTP.Client.MultipartFormData. I don't think I can guide you through this though.

@fizruk's example does work, and thanks to him for the help. I slightly modified it:

data Mem
data Tmp

class KnownBackend b where
  type Storage b :: *

  withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r

instance KnownBackend Mem where
  type Storage Mem = ByteString

  withBackend Proxy f = f lbsBackEnd

instance KnownBackend Tmp where
  type Storage Tmp = FilePath

  withBackend Proxy f = runResourceT . withInternalState $ \s ->
    f (tempFileBackEnd s)

data Files b

type MultiPartData b = ([Param], [File (Storage b)])
type MultiPartDataT b = ((MultiPartData b -> IO (MultiPartData b)) -> IO (MultiPartData b))

type FilesMem = Files Mem
type FilesTmp = Files Tmp

instance (KnownBackend b, HasServer sublayout config) => HasServer (Files b :> sublayout) config where
  type ServerT (Files b :> sublayout) m =
    MultiPartDataT b -> ServerT sublayout m

  route Proxy config subserver = WithRequest $ \request ->
    route (Proxy :: Proxy sublayout) config (addBodyCheck subserver (bodyCheck request))
    where
      bodyCheck request = return $ Route (\f ->
        withBackend (Proxy :: Proxy b) $ \pb -> parseRequestBody pb request >>= f
        )

@fizruk Oh, right, I overlooked this problem. Thanks for correcting this.

@i-am-the-slime I'm not aware of any HasClient instance for this, but @fizruk's link definitely looks like something you want to use for that. Maybe we should all put the effort into writing the missing interpretations for multipart and add that to servant? Or just let the "allow IO in content type code" PR take over and simply have multipart be a content type?

Apparently servant-0.7 broken something here. I believe we should update it with:

{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE FlexibleInstances    #-}
{-# LANGUAGE ScopedTypeVariables  #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE TypeOperators        #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
module Files where

import           Data.ByteString.Lazy         (ByteString)
import           Network.Wai.Parse
import           Servant
import           Servant.Server.Internal
import Control.Monad.Trans.Resource (runResourceT,withInternalState)

data Mem
data Tmp

class KnownBackend b where
  type Storage b :: *

  withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r

instance KnownBackend Mem where
  type Storage Mem = ByteString

  withBackend Proxy f = f lbsBackEnd

instance KnownBackend Tmp where
  type Storage Tmp = FilePath

  withBackend Proxy f = runResourceT . withInternalState $ \s ->
    f (tempFileBackEnd s)

data Files b

type MultiPartData b = ([Param], [File (Storage b)])
type MultiPartDataT b = ((MultiPartData b -> IO (MultiPartData b)) -> IO (MultiPartData b))

type FilesMem = Files Mem
type FilesTmp = Files Tmp

instance (KnownBackend b, HasServer sublayout config) => HasServer (Files b :> sublayout) config where
  type ServerT (Files b :> sublayout) m =
    MultiPartDataT b -> ServerT sublayout m

  route Proxy config subserver = route (Proxy :: Proxy sublayout) config (addBodyCheck subserver bodyCheck)
    where
      bodyCheck = withRequest $ \ request ->
             return $ \f -> withBackend (Proxy :: Proxy b) $ \pb -> parseRequestBody pb request >>= f

Also as it wasn't trivial here is how to handle files with it:

handleFiles :: MultiPartDataT Mem -> IO ()
handleFiles multipart = void $ multipart $ \(params,files) -> do
  putStrLn "start"
  mapM_ ppFile files
  print params
  putStrLn "end"
  return (params,files)
  where
    ppFile :: File BL.ByteString -> IO ()
    ppFile (name',fileinfo) = do
      putStrLn $ "Input name: " <> toS name'
      putStrLn $ "File name: " <> toS (fileName fileinfo)
      putStrLn $ "Content type: " <>  toS (fileContentType fileinfo)
      putStrLn (toS (fileContent fileinfo))
      putStrLn "---------------------------------"

myHandler :: MultiPartDataT Mem -> ExceptT ServantErr IO a
myHandler multipart = do
  liftIO $ handleFiles multipart
  ...

Any news about HasClient instance? I've tried to use HTTP.Client.MultipartFormData without success...

rvion commented

I think this functionality should have first class support

Anyone volunteers to bring this to the servant packages? I'm willing to help on this if needed.

I'd like to help too.

rvion commented

@alpmestan @yogsototh I can give it a try
(in the case you prefer to let someone else do it and only provide help if needed)

the only thing I would discuss first is: do you really want Mem / Tmp /SomeOtherBackend to appear at the api level ?
I think it would be better to hide them, and provide usefull functions to store the file on different backend (the disk / temp file / dropbox, etc.) from inside the handler instead

do you really want Mem / Tmp /SomeOtherBackend to appear at the api level ?

I'd be in favour of it not doing so, if possible. Even needing a new Files combinator (rather than using ReqBody [MultiPart]) seems like an unfortunate leak in our abstractions.

Note also @fizruk's approach here.

I think my own personal preference would be to go with something like that. However, I think since whether a content-type does IO or not is a static fact (only MuliPart, really), I think we don't need to allow IO for all mimeUnrender. We can just have a type class that determines whether it does IO (and what it returns - with MuliPart we probably want it to be a FilePath).

rvion commented

@jkarni I moslty agree but I have a concern around:

with MuliPart we probably want it to be a FilePath

I just checked https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html and http://stackoverflow.com/questions/16958448/what-is-http-multipart-request

And I now think that FilePath should not directly be related to multipart.

I would prefer either:

  1. ReqBody [MultiPart a] with MultiPart to return a
  2. ReqBody [MultiPart] with MultiPart to return ByteString
  3. ReqBody '[MultiPart a, MultiPart b] with the whole stuff returning (a,b) - or some similar type level mechanism

with 1. we could then define

type UploadedFile = MultiPart FileContent
rvion commented

I won't have time anymore to work on that in the near future.
I hope that someone else will pick it up, this would be a great addition to servant

@yogsototh can you share full example? I have errors with this one

    Couldn't match type β€˜ByteString’ with β€˜[Char]’
    Expected type: Server API
      Actual type: (MultiPartDataT Mem -> ExceptT ServantErr IO ())
                   :<|> Server Raw
    In the expression: myHandler :<|> serveDirectory "/out"
    In an equation for β€˜server’:
        server = myHandler :<|> serveDirectory "/out"

with api

type API = "files"   :> FilesTmp :> Post '[PlainText] ()
      :<|> "static"  :> Raw

@sigrlami try OverloadedStrings

@mwotton OverloadedString already added. This is something different I can't figure, because types are equal () and ()

@sigrlami my bad there might be something wrong with my FilesTmp, I only tried FilesMem which should be closer to the proposed solution. From my handler example with FilesMem you could easily save the file to some directory.

@yogsototh thanks! That helped, It was obvious solution but I missed it

Hello,
is it possible to mix the file upload with traditonnal POST request?
Something like that:

type API = "files"   :> ReqBody '[JSON] Foo :> FilesTmp :> Post '[JSON] ()

myHandler :: Foo -> MultiPartData Tmp -> EitherT ServantErr IO ()
myHandler foo (inputs, files) = ...

The idea is that I would like to use a normal body, decoded into Foo, together with some files.

I also need HasClient instance :)

Since this issue is still open, am I correct in assuming file upload still isn't part of the servant mainline?

@schell: Yes, that's correct.

I tried to update the combinator above for Servant 0.8. As far as I can tell, the change is not that simple as the route method in the HasServer class does not have access to the request object anymore.

Any suggestion as to how I might proceed ?

Nevermind, looking at Servant.Server.Internal gave me the answer. I still hope this combinator will be officially included in Servant, though !

Well, it's "just" a matter of writing all interpretations for it =) The annoying thing is that it basically just is a content type, but one that needs IO to decode from, and it doesn't feel right. Those are the 2 main reasons this isn't shipped in servant today.

There's no way around IO without avoiding parseRequestBody though, right?

right. and we do want some IO because files get created using that functionand the /tmp backend.

Well, it's "just" a matter of writing all interpretations for it =)

I think this is the wrong approach. If we only allow new combinators to pop up when they provide instances for all core interpretations, we're basically asking people to write code that they don't even want to use themselves. Which we should strongly avoid.

@gaeldeest (or anyone else): Have you considered publishing this combinator as a separate package? Then you could just include the one (servant-server) interpretation that you care about right now. And people wouldn't have to copy the combinator from this issue discussion into their code to use it.

Re: Allowing combinators to do IO: Even if most combinators don't want (and shouldn't) use IO I think we should still allow IO if there's one good use-case for that. (Which file-upload is.) So I'd be in favor of changing that.

+1, please publish this as a package or get it merged into servant and worry about the other instances later. Pretty please!

Having played around with @fizruk 's branch, I now think allowing unrender to do IO is the wrong approach, for the following reasons:

  • We're skating on lazy-IO thin ice, since we have very little control or understanding of how much of the request body will be in memory at any point
    • We can't use wai-extra, since even it's more internal functions expect an IO strict bytestring, and if we convert our lazy bytestring to strict I'm pretty sure we'd be putting everything into memory.
    • The content-type doesn't necessarily get to decide the storage (e.g., it should be possible to keep even mutlipart-formdata in memory, and other content-types in disk).
    • Doing IO in mimeUnrender is anyhow ugly conceptually.

So I'm jumping ships and saying that the approach @alpmestan outlined is better.

@jkarni I am not sure which of @alpmestan's approaches you refer to.
It appears to me that in this thread he hoped #343 will be merged eventually.

What are the alternatives to IO-enabled mimeUnrender?

@fizruk: I'm also slightly confused, but I think @jkarni is referring to the topmost comment here in this issue. I haven't tried it myself, but it looks like it works without #343.

The alternative is to just have a Files combinator and not pretend we support the multipart content type. Basically provide just enough to enable users to have file upload. @jkarni correct me if I'm wrong.

Multipart needs to be supported though.

Right, I didn't mean that we shouldn't support multipart at some point. But I think people really just want the code shown in this ticket to be available on hackage and ready to use in any servant app, for now. Later on, when someone feels brave enough, we could have proper multipart support but this means reimplementing the code from wai-extra, even though there's still the question then of how to do it without IO.

@alpmestan As long as I have the ability to still handle mutlipart myself then I am good with whatever.

Will the solution discussed here support the servant-client case where I want to do:

  $ curl -XPOST 'https://somewhere/foo' \
   -i -L \
   -H "Authorization: Bearer $TOKEN" \
   -H "Content-Type: audio/wav" \
   --data-binary "@sample.wav"

Hi,

I've been playing with @alpmestan's approach for handling file uploads using the suggested Files combinator. A minimal-complete example that is (partially) working for me on servant-8.1 is the following slight variation of the code snippets already posted in this thread:

{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE OverloadedStrings     #-}
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE TypeOperators         #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeFamilies          #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE RecordWildCards       #-}
module Main where

import           Control.Monad.Trans.Resource
import           Control.Monad.Except

import           Data.Monoid
import qualified Data.ByteString.Lazy as LS
import           Data.Text (Text)
import qualified Data.Text as Text
import qualified Data.Text.Encoding as Text

import           Lucid

import           Network.Wai.Parse
import           Network.Wai.Handler.Warp hiding (FileInfo)
import           Servant
import           Servant.HTML.Lucid
import           Servant.Server.Internal
import           System.Directory
import           System.FilePath

data Mem
data Tmp

class KnownBackend b where
  type Storage b :: * -- associated type family

  withBackend :: Proxy b -> (BackEnd (Storage b) -> IO r) -> IO r

instance KnownBackend Mem where
  type Storage Mem = LS.ByteString

  withBackend Proxy f = f lbsBackEnd

instance KnownBackend Tmp where
  type Storage Tmp = FilePath

  withBackend Proxy f = runResourceT . withInternalState $ \s ->
    f (tempFileBackEnd s)

data Files b

type MultiPartData b = ([Param], [File (Storage b)])

instance (KnownBackend b, HasServer sublayout config)
      => HasServer (Files b :> sublayout) config where

  type ServerT (Files b :> sublayout) m =
    IO (MultiPartData b) -> ServerT sublayout m

  route Proxy config subserver =
    route psub config (addBodyCheck subserver check)
    where
      psub  = Proxy :: Proxy sublayout
      pbak  = Proxy :: Proxy b
      check = withRequest $ \request -> return $ withBackend pbak $
        \backend -> parseRequestBody backend request

type API = "form" :> Get '[HTML] (Html ())
      :<|> "upload" :> Files Mem :> Post '[PlainText] Text

-- | Some form to upload an image and a video file.
formHandler :: ExceptT ServantErr IO (Html ())
formHandler = return $
  html_ $ do
    head_ (title_ "upload files test!")
    body_ $ do
      h1_ "upload files test"
      form_ [ action_ "/upload"
            , method_ "POST"
            , enctype_ "multipart/form-data" ] $ do
        input_ [ type_ "file", name_ "media" ]
        input_ [ type_ "submit"
               , name_ "send"
               , value_ "Submit Data!" ]

-- | Handle the uploaded image and video files.
uploadHandler :: IO (MultiPartData Mem) -> ExceptT ServantErr IO Text
uploadHandler multipart = do
  liftIO $ putStrLn "handling file upload..."
  (params, files) <- liftIO multipart

  -- when using `MultiPartData Mem`, use `wrFile` to write the files from memory to disk.
  -- when using `MultiPartData Tmp`, use `cpFile` to copy the temporarily uploaded files 
  -- to some other location.
  liftIO $ mapM_ wrFile files
  return $
    "params:\n" <> Text.intercalate "\n" (map ppParam params) <> "\n" <>
    "files:\n"  <> Text.intercalate "\n" (map ppFile files)
  where
    ppParam (name, val) =
      "  name  = " <> Text.decodeUtf8 name <> "\n" <>
      "  value = " <> Text.decodeUtf8 val  <> "\n"

    ppFile (paramName, FileInfo{..}) =
      "  parameter name = "  <> Text.decodeUtf8 paramName       <> "\n" <>
      "  fileName = "        <> Text.decodeUtf8 fileName        <> "\n" <>
      "  fileContentType = " <> Text.decodeUtf8 fileContentType <> "\n" <>
      "  fileContent = "     <> "..."                           <> "\n"
    wrFile (_, FileInfo{..}) = LS.writeFile newFileName fileContent
      where newFileName = (Text.unpack . Text.decodeUtf8) fileName <.> "copy"
    cpFile (_, FileInfo{..}) = copyFile oldFileName newFileName
      where
        oldFileName = fileContent
        newFileName = (Text.unpack . Text.decodeUtf8) fileName <.> "copy"

server :: Server API
server = formHandler :<|> uploadHandler

main :: IO ()
main = run 8888 (serve (Proxy :: Proxy API) server)

However, the above snippet exhibits the following two issues:

  1. While the code works for MultiPartData Mem (and using the wrFile function in uploadHandler), it doesn't seem to work for MultiPartData Tmp (and using the cpFile helper in uploadHandler). For the latter, a file seems to get uploaded temporarily to the /tmp directory, but by the time cpFile in uploadHandler wants to copy it to some other location, the file doesn't exist anymore, thus causing cpFile to fail with a source file does not exist exception.
  2. Even in the MultipartData Mem setting, the above snippet seems only able to upload files up to a certain size. For example, uploading video files with sizes between 50-100MB just silently fails (not even hitting the uploadHandler at all).

Has anyone observed either one of these issues? And if so, does anyone know what is happening here and, at best, how to resolve and overcome any of the issues?

Thoughts on this would be great!

In my first shot at file upload, withBackend used to wrap the execution of the entire multipart decoding + disk writing + handler chain. If you take a look at the code you can see that the file is registered for deletion, and it's not exactly clear to me what the guarantees are wrt the lifetime of the uploaded file. Ideally, we would need a saner implementation of multipart that would be a good old content type or at least an implementation that either:

  • lets the handler decide when/if the uploaded file from /tmp should be deleted
  • deletes the file in /tmp at the end of the execution, if it's still there, in which case a user really should copy or move the said file inside the handler ; otherwise the content is lost.

In your case, AFAICT, we're only guaranteed that the tmp file is there during the decoding and writing to /tmp. I really wish we had a solid multipart encoding/decoding library; this would allow:

  • proper support for /tmp backend that doesn't have the flaws I've mentionned above
  • file upload support in servant-client! we could hand a few FilePath or (Filename, FileContent) or something like that to the client function and it would do everything for us "automagically"

There's a possibility that I am misunderstanding the issue, but that looks like the most plausible explanation to me.

@alpmestan @bollmann The temp files are deleted at the end of runResourceT, which is why the implementation in #133 (comment) uses the MultiPartDataT type. I'm currently doing something similar.

@alpmestan, @rimmington: Thanks for your replies! Indeed, using the MultiPartDataT data allows one to use the Tmp backend as well, which solves my first problem. Furthermore, my second problem was unrelated to the Files combinator, but rather had to to with my improper use of BasicAuth, so nvm. :-)

For the record, I've been working on packaging up the multipart/form-data-powered upload. It does a little bit more than all the code we've written in this ticket and doesn't have the issue reported by @bollmann, without exposing a continuation. I still have a few things to add there and have to think about making this as nice and simple to use as possible, but I'll drop a comment here once it's ready. The repo's here.

Hopefully we'll soon be able to close one of (if not the) oldest open issues in the tracker =)

Any updates on this feature? I'd really like to have file uploads in servant!

Well, I'd appreciate feedback on https://github.com/haskell-servant/servant-multipart if anyone has got some time for looking at it. I might get back to it soon and add support for in-memory handling of file upload and cut a first release. It requires these patches for servant, which I yet have to wrap up and add tests for. If anyone wants to give some feedback or even help with these tasks, that'd be very much appreciated :)

For the record, servant got the necessary patches and https://github.com/haskell-servant/servant-multipart should be ready for use. It's missing support for in-memory uploads but could be released as it stands.

It looks good! I'd say release it! Though maybe it should be split into servant-multipart and servant-multipart-server (so that afterwards we can add -client, -docs, etc. in a backwards compatible way)?

As for the feedback - it's well documented and the API seems very nice! I'd mention something about ReqBody not being usable in conjunction with this on the same path. Also, tests :)

Primarily, though, release it!

Uploaded as it is on hackage! Here

I hope nobody here minds that I close this issue, after having started it almost 2 years ago. Any multipart/upload related discussion should now happen on the issue tracker for servant-multipart, over here. Thanks everyone!