codingjoe/django-s3file

tmp folder not appearing

Closed this issue · 19 comments

pdf9 commented

Hi, I've been desperately searching for something like this and would love to make use of it for a small project I'm spending way too much time on. I followed the insanely simple setup instructions :) but can't get the tmp folder to appear in my s3 bucket, so I'm assuming I could not get it to work. I want this to work so bad, please help!

Here are my AWS settings. Everything else is just like in the setup instructions: forms use the clearable file input, etc. I can share the repo with you if that'd help. It's just a simple app to upload photos and videos that uses a calendar to track them, and I'm going to give it to my family and friends. The problem is that it takes forever to upload videos or multiple images at once (one of the main reasons I'm so pumped about using this), and I'm afraid they won't use it because of that. It's also on Heroku, so the 30-second timeout is a problem! I'll stop rambling...

DEFAULT_FILE_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
AWS_ACCESS_KEY_ID = 'my key'
AWS_SECRET_ACCESS_KEY = 'secret access key'
AWS_STORAGE_BUCKET_NAME = 'bucket name'
AWS_S3_FILE_OVERWRITE = False
STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
AWS_DEFAULT_ACL = None
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
AWS_LOCATION = 'static'
AWS_S3_OBJECT_PARAMETERS = {
    'CacheControl': 'max-age=86400',
}
AWS_STORAGE_BUCKET_NAME_MEDIA = 'media bucket name'
AWS_S3_CUSTOM_DOMAIN_MEDIA = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME_MEDIA
AWS_LOCATION_MEDIA = 'assets'

STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')
STATIC_URL = "https://%s/%s/" % (AWS_S3_CUSTOM_DOMAIN, AWS_LOCATION)
STATICFILES_DIRS = [
    os.path.join(BASE_DIR, 'static')
]

MEDIA_ROOT = os.path.join(BASE_DIR, '/static/media')
MEDIA_URL = "https://%s/%s/" % (AWS_S3_CUSTOM_DOMAIN_MEDIA, AWS_LOCATION_MEDIA)

Hi @pdf9,

Thanks for reaching out. At first glance, everything looks correct. Then again, there isn't much to set up to make this package work.

Let's tackle this step by step:

  1. Does regular upload work without this package?
  2. Does the s3 mock upload work locally? You can check the browser's network logs.
  3. Did you check the folder static/tmp? I see that you use the AWS_LOCATION setting. This prefix is also applied to the S3file location.

Let me know if anything unexpected turns up while examining those points.

Best,
Joe

pdf9 commented

Hi Joe,

I tried the s3 mockup and I'm getting bad request notifications when I try to upload. The steps for setup are so simple I feel like I have to be missing something additional that's needed. The only steps I've taken are to install s3file, add it to installed apps and middleware in settings, and then add the CORS in s3 (then I spent a good long time trying to mess around and get my project to work with it lol). I'm able to upload files through the server to s3 with no issues aside from the timeout when I'm running it on Heroku. Are there additional things I need to be doing? My project is super simple: I have just one project directory and one app directory with a setup like you'd see in the beginner Django books, with only models, urls, views, forms, and filter files (I'm still pretty new to this).

image

Hi @pdf9,

Thanks for the additional information, but I need a little more. Can you please click on one of the failed requests and post the headers as well as the response body? The response body should contain some indication of the error. If you have a website that you want me to check out for myself (like a Heroku review app), feel free to send a URL to info@johanneshoppe.com.

Best, Joe

pdf9 commented

Hi Joe, here are headers and response body of one of the failed requests - thanks so much for your responses/help

Headers:
image
image

Response body:

Code: InvalidRequest
Message: The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.
RequestId: 60FC3091CA8B0ABF
HostId: m6GIJsE647sitsvKZQJCikvkhgZ8SFiJUeOssY8dM3PIa+JgbzurWSNqfQELNJGfx+XMwVt6LUE=

Thanks,
Pat
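For anyone debugging a similar failed request: S3 returns its errors as a small XML document, and the Code and Message elements usually name the problem. Here is a minimal sketch of pulling them out with the standard library, assuming the body follows S3's usual `<Error>` layout. The sample XML is reconstructed from the response above, and `parse_s3_error` is a hypothetical helper, not part of django-s3file.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample; the element layout is assumed from S3's usual error format.
SAMPLE_BODY = (
    "<Error>"
    "<Code>InvalidRequest</Code>"
    "<Message>The authorization mechanism you have provided is not supported. "
    "Please use AWS4-HMAC-SHA256.</Message>"
    "<RequestId>60FC3091CA8B0ABF</RequestId>"
    "</Error>"
)


def parse_s3_error(body):
    """Return the error code and message from an S3 XML error body."""
    root = ET.fromstring(body)
    return {
        "code": root.findtext("Code"),
        "message": root.findtext("Message"),
    }


error = parse_s3_error(SAMPLE_BODY)
print(error["code"], "-", error["message"])
```

A message that asks for AWS4-HMAC-SHA256 points at the request signature version rather than at the CORS setup or the form.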

pdf9 commented

After your direction to look at the response body I updated settings to include the below after finding a post on it here: jschneier/django-storages#687

This fixed it for me and it's now uploading! I see the tmp folder in my s3 bucket. Thank you so much....seriously you're the man...

# if using boto3 (what I am using)
AWS_S3_REGION_NAME = 'us-east-2'  # change to your region
AWS_S3_SIGNATURE_VERSION = 's3v4'

# if using boto (in case it helps anyone else)
AWS_S3_HOST = 'us-east-2'  # change to your region
S3_USE_SIGV4 = True

pdf9 commented

Hi Joe,

As stated before the files are now successfully uploading :) but as soon as they finish uploading my browser displays the below json response (looks like that's what should happen based on the view) but FILES are both empty, and the below error appears in my terminal. I think it's telling me there is an issue with actually creating the model records?

In Django admin I can see that nothing has been added to the model I created for the test (even though I've uploaded several test files). Does anything jump out at you as to why I'm not able to close the loop between uploading to s3 and also creating database records for those files? Or do you think that's even what's happening?

Browser message:

{"POST": {"csrfmiddlewaretoken": "m5wlsyuOxINjqOYG6A0KjImDQmeMjkCiqgquQGuSTztPHScEEejqrtXmcQthC1iH", "progress": "1", "save_continue": "continue_value", "s3file": "[\"file\",\"other_file\"]", "other_file": "[]", "file": "[\"static/tmp/s3file/BEmMX0_fQAix5E-JTr-N7Q/1.jpg\",\"static/tmp/s3file/BEmMX0_fQAix5E-JTr-N7Q/55.JPG\",\"static/tmp/s3file/BEmMX0_fQAix5E-JTr-N7Q/Capture3.PNG\"]"}, "FILES": {"file": [], "other_file": []}}

Django terminal error message:

File not found: statictmp/s3file/69PKsrZ3TsmkobXUwB1P9w/a123.mp4
Traceback (most recent call last):
  File "C:\my_project_path\venv\lib\site-packages\storages\backends\s3boto3.py", line 431, in _open
    f = S3Boto3StorageFile(name, mode, self)
  File "C:\my_project_path\venv\lib\site-packages\storages\backends\s3boto3.py", line 107, in __init__
    self.obj.load()
  File "C:\Users\my_project_path\venv\lib\site-packages\boto3\resources\factory.py", line 505, in do_action
    response = action(self, *args, **kwargs)
  File "C:\Users\my_project_path\venv\lib\site-packages\boto3\resources\action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(*args, **params)
  File "C:\my_project_path\venv\lib\site-packages\botocore\client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "C:\my_project_path\venv\lib\site-packages\botocore\client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\my_project_path\venv\lib\site-packages\s3file\middleware.py", line 34, in get_files_from_storage
    f = storage.open(path)
  File "C:\my_project_path\venv\lib\site-packages\django\core\files\storage.py", line 36, in open
    return self._open(name, mode)
  File "C:\my_project_path\venv\lib\site-packages\storages\backends\s3boto3.py", line 434, in _open
    raise FileNotFoundError('File does not exist: %s' % name)
FileNotFoundError: File does not exist: static/statictmp/s3file/69PKsrZ3TsmkobXUwB1P9w/a123.mp4

pdf9 commented

Hi Joe, ok so I think I have solved the error I was getting but am still not able to create new records for the model. Solved the issue causing the error by just adding a '/' to my AWS_LOCATION so it's 'static/'.

So a path like static/statictmp/s3file/69PKsrZ3TsmkobXUwB1P9w/a123.mp4 is now static/static/tmp/s3file/69PKsrZ3TsmkobXUwB1P9w/a123.mp4 (note the added slash). I no longer get the error, and I see something like this in the JSON browser response:

{"POST": {"csrfmiddlewaretoken": "MmGTCKMAUWKkIq6DiWk96y24GsJrd2iSQxA20SMEgNqQZukBQADPejDN2WYWwJYh", "progress": "1", "save_continue": "continue_value", "s3file": "[\"file\",\"other_file\"]", "file": "[]", "other_file": "[\"static/tmp/s3file/kWos2sd4QTC9db8lbO8aMw/4.jpg\"]"}, "FILES": {"file": [], "other_file": ["4.jpg"]}}

But I am still not seeing any model records created when I go into django admin after I submit a set of files.

Sorry to keep pestering you with these. I am only sending messages here after I've run up against the wall over and over...thanks again for all your help.

pdf9 commented

Hi Joe,

Can you help with this part from the instructions below? I cannot figure out how to save the correct reference to the url in the model instance so that after I run the upload I'm able to reference the urls of the files uploaded using s3file in my templates.

"S3File uploads to a single folder. Files are later moved by Django when they are saved to the upload_to location."

Thank you

Fyi I emailed you the site at info@johanneshoppe.com so you can get an idea of the end goal here.

After some investigation I actually did find a bug that is causing your problem. The library currently can't handle custom AWS_LOCATIONs without a trailing slash. I am working on a fix.
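To illustrate the kind of bug described here (a sketch of the failure mode, not the library's actual code): gluing a location prefix and the temp folder together as plain strings only works if the prefix ends in a slash, while `posixpath.join` inserts the separator when needed. The function names are hypothetical; the `tmp/s3file` folder is taken from the paths posted in this thread.

```python
import posixpath

UPLOAD_FOLDER = "tmp/s3file"  # temp folder seen in the paths in this thread


def naive_prefix(location):
    # Plain concatenation: correct only if `location` ends with "/".
    return location + UPLOAD_FOLDER


def joined_prefix(location):
    # posixpath.join adds a "/" between the parts only when one is missing.
    return posixpath.join(location, UPLOAD_FOLDER)


print(naive_prefix("static"))    # statictmp/s3file
print(naive_prefix("static/"))   # static/tmp/s3file
print(joined_prefix("static"))   # static/tmp/s3file
print(joined_prefix("static/"))  # static/tmp/s3file
```

Until a fix is released, setting AWS_LOCATION to 'static/' with the trailing slash, as pdf9 did above, avoids the problem.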

pdf9 commented

Hi Joe,

Thanks for that. Am hoping you can help with current item I'm having difficulty with -

When I use request.FILES in my views to save the model instance, the application continues to save the files through the server. The s3file upload is working, so for each upload there are two file locations appearing in s3: one in the tmp folder, and one in the folder from the upload_to='folder' parameter of the model.

So I changed the view to only save request.POST to the model instance and then I referenced that in the template. That stops the upload through the server, but the string that's saved has "[" and "]" on either end, which are being converted to "%5B" and "%5D" respectively.

Here's a snip of what I see in django admin and a sample link of what happens when I click it

image
https://my-bucket-name.s3.amazonaws.com/static/%5B%22static/tmp/s3file/VAAaB-MlSJmRdBSbS4iBzw/Capture3.PNG%22%5D

In the template these just appear as broken links. Do you know how I can fix this? Or what I can do differently in my view so I can save the model instance with the s3file and be able to reference it in the template with {{ file.url }} without continuing to upload through the server?

thank you
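A short sketch of why that link breaks, assuming the POST field holds a JSON-encoded list of S3 keys (which is what the browser output earlier in this thread suggests). Saving the raw string keeps the brackets and quotes, and they get percent-encoded in the URL; decoding the JSON first yields the plain keys. This only explains the broken link and is not a recommendation to bypass request.FILES.

```python
import json
from urllib.parse import quote

# Raw POST value as it appears in this thread (a JSON list encoded as a string).
raw_value = '["static/tmp/s3file/VAAaB-MlSJmRdBSbS4iBzw/Capture3.PNG"]'

# What ends up in the URL when the raw string is stored as the file name.
encoded = quote(raw_value, safe="/")
print(encoded)
# %5B%22static/tmp/s3file/VAAaB-MlSJmRdBSbS4iBzw/Capture3.PNG%22%5D

# Decoding the JSON first gives the individual S3 keys.
keys = json.loads(raw_value)
print(keys[0])
# static/tmp/s3file/VAAaB-MlSJmRdBSbS4iBzw/Capture3.PNG
```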

pdf9 commented

Just in case it helps, here's the form and view I'm working with. To clarify my previous comment: everything works, and the s3file uploads to the tmp folder. It's just that the view is also uploading to the upload_to location (from the FileField model) through the server (or at least it feels like it is, given the speed). I can see the tmp folder appear in s3 with everything very quickly, but the form spins and spins until finally the images also appear in the upload_to location in s3 as well. It takes about 10x as long for those to appear as it does for the s3file tmp files.

forms

if S3FileInputMixin not in forms.ClearableFileInput.__bases__:
    forms.ClearableFileInput.__bases__ = (
        S3FileInputMixin,
    ) + forms.ClearableFileInput.__bases__


class EventForm(ModelForm):
    class Meta:
        model = Event
        fields = ['title', 'description', 'start_time', 'end_time', 'embed', 'video']
        widgets = {
            'start_time': DateInput(attrs={'type': 'datetime-local'}, format='%Y-%m-%dT%H:%M'),
            'end_time': DateInput(attrs={'type': 'datetime-local'}, format='%Y-%m-%dT%H:%M'),
            'video': forms.FileField(widget=forms.ClearableFileInput, required=False)
        }
        exclude = ['user']
        labels = {'title': '*Title*', 'description': 'Description', 'start_time': '*Start Time*',
                  'end_time': 'End Time', 'embed': 'Embed Video Link', 'video': 'Video'}

    def __init__(self, *args, **kwargs):
        super(EventForm, self).__init__(*args, **kwargs)
        self.fields['start_time'].input_formats = ('%Y-%m-%dT%H:%M',)
        self.fields['end_time'].input_formats = ('%Y-%m-%dT%H:%M',)


class EventFullForm(EventForm):
    images = forms.FileField(widget=forms.ClearableFileInput(attrs={'multiple': True}), required=False)

    class Meta(EventForm.Meta):
        fields = EventForm.Meta.fields + ['images']

views

def add_event(request):
    form = EventFullForm(request.POST or None, request.FILES or None)
    files = request.FILES.getlist('images')
    if request.method == "POST":
        if form.is_valid():
            user = request.user
            title = form.cleaned_data['title']
            description = form.cleaned_data['description']
            start_time = form.cleaned_data['start_time']
            end_time = form.cleaned_data['end_time']
            embed = form.cleaned_data['embed']
            video = form.cleaned_data['video']
            event_obj = Event.objects.create(user=user, title=title, description=description, start_time=start_time,
                                             end_time=end_time, video=video, embed=embed)
            for f in files:
                Images.objects.create(event=event_obj, image=f)

            return redirect('calendarapp:event-detail', event_id=event_obj.id)
        else:
            form = EventFullForm()

    context = {'form': form, 'files': files}
    return render(request, 'calendarapp/new_event.html', context)

@pdf9 you need to pass request.FILES to your form. This package maintains the regular Django APIs.
The files are uploaded to the temp directory and are moved by your application to the upload_to location. Or rather, they should be moved; if they are copied instead, that might be due to your form. However, there is a simple solution to that: I would recommend setting up expiration for the upload directory, as mentioned in the readme. Best, Joe

pdf9 commented

Thank you Joe for the responses and help. Just have a couple final (I hope) questions so I know I'm understanding you and not going crazy as I try to work through this:

  1. Is there any way you can think of for me to validate whether my application is moving the files from tmp to upload_to like it's supposed to with s3file (or to see whether it's not doing that and is continuing to upload them through the server in addition to the s3file upload)?

  2. Is the simple solution you mentioned related to my form, the expiration setup, or is it something else? Can you please help me out with a little bit more detail?

Hi @pdf9,

Yes, all you need to do is set up expiration. If your files end up in the temp folder, they have been uploaded directly, not through the application. Both simultaneously is not possible ;)

How to set up expiration can be found in the readme. This is a good measure regardless of whether the file is moved or copied to its proper location. Let's say you have a form that fails because of some server-side validation. In that case the file is already uploaded, but not moved. The user will then upload a new file. To avoid orphaned files piling up, expiration is very helpful. A day is usually sufficient; 30 days is the maximum I'd recommend due to GDPR.
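As a rough sketch of what such an expiration rule looks like (the readme remains the authoritative source): S3 lifecycle rules are described by a small configuration document with a key prefix and an expiry in days. The helper below only builds that document. The name `tmp_expiration_rule`, the rule ID, and the `static/tmp/s3file/` prefix are assumptions based on the paths in this thread.

```python
def tmp_expiration_rule(prefix="static/tmp/s3file/", days=1):
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    return {
        "Rules": [
            {
                "ID": "expire-s3file-tmp",      # hypothetical rule name
                "Filter": {"Prefix": prefix},   # only objects under the temp folder
                "Status": "Enabled",
                "Expiration": {"Days": days},   # delete after this many days
            }
        ]
    }


config = tmp_expiration_rule()
print(config["Rules"][0]["Filter"]["Prefix"], config["Rules"][0]["Expiration"]["Days"])
```

With boto3 installed, a dictionary like this can be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration` call. Note that this call replaces the bucket's existing lifecycle configuration, so any rules already in place need to be included. The same rule can also be created in the S3 console.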

Best,
Joe

pdf9 commented

Hi Joe,

Thanks for help. I set up the expiration but still doesn't seem to be working. It appears they're uploading directly to tmp first, and then after that they're uploading through the application/server separately to upload_to.

The screenshots below show the same five files existing in both the tmp and the upload_to locations. The tmp files all appear at the same time; the upload_to files then start appearing a couple of seconds later and continue appearing about 1 second apart from each other (they're small files; normal images from a cell phone take way longer).

Any idea how to fix this?

tmp folder:

image

upload_to folder:

image

Example expiration details if it matters:

image

Hi @pdf9, as mentioned before, that is impossible. Files can only be uploaded one way or the other. Therefore, it must be your application that is copying the files to the upload_to destination. Again, that is not a bad thing, since your temp location will expire those files anyway.
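One rough way to check whether files were copied rather than moved is to compare the key listings of the two folders. The sketch below is a heuristic under stated assumptions: `present_in_both` is a hypothetical helper, the key lists are typed in by hand (they could equally come from the S3 console or a bucket listing), and `static/folder/` stands in for the real upload_to location. It compares file names only, so it will miss files that the storage backend renamed on save.

```python
import posixpath


def present_in_both(tmp_keys, final_keys):
    """Return the file names that exist in both the temp and final listings."""
    tmp_names = {posixpath.basename(key) for key in tmp_keys}
    final_names = {posixpath.basename(key) for key in final_keys}
    return sorted(tmp_names & final_names)


tmp_keys = ["static/tmp/s3file/kWos2sd4QTC9db8lbO8aMw/4.jpg"]
final_keys = ["static/folder/4.jpg", "static/folder/5.jpg"]  # hypothetical upload_to keys

copied = present_in_both(tmp_keys, final_keys)
print(copied)  # ['4.jpg']
```

Names that show up in both listings were copied; those temp copies are the ones the expiration rule cleans up.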

pdf9 commented

Hi Joe, got it, thank you. Do you have any ideas on what I need to do to the application to have it just move the files, as opposed to copying them?

Hm… no, not really. That highly depends on how you process the files. Usually Django would have the files in memory before they are saved to disk. We don't do this, as it would allow people to blow up the app server's memory. Let's say you upload a video, for example. You don't want that in memory. You probably only want the S3 location, to start a background rendering job. Best, Joe

pdf9 commented

Thanks Joe, yes I 100% don't want that, and that's why I want so badly to make this work. I just can't seem to get my application to work with the s3file upload so that I'm not timing out the Heroku server when I attempt to upload anything larger than about 30 MB.

I saw someone just post an item asking whether there's any way you could add a sample project to the repo. That would be awesome, as I feel like I'm missing something when I'm replicating the tests/testapp. Something with a traditional Django project structure would help people like me so much.

Either way thanks for trying to help. If I can't get this to work I'm looking at just teaching the small group of users I'm planning to give this to how to upload to youtube and add the embedded link to the form.