richardcornish/django-applepodcast

Download URL might not work

richardcornish opened this issue · 5 comments

The download URL for episode enclosures might not work. Construction of the URL should not assume that the enclosure file lives on the same domain as the current site. A different domain is not just possible but likely, given the proliferation of external storage services such as Amazon S3. Possibly related to pull request 1.

We can reference the MEDIA_URL setting to get the URL prefix for media files.

But I wonder if we should rather rework the whole /slug/download functionality instead. Do we actually need this view? Can we just output a link to the media file instead?

Django already provides an interface for working with files by virtue of using the FileField class in models, which references the Storage class. Basically, the answer is you get {{ episode.enclosure.file.url }} for free in the templates, which is already being used. Using {{ MEDIA_URL }} in a template is sort of finicky if you're doing any kind of special processing in settings or elsewhere.

The benefit of a separate URL (which references the same episode.enclosure.file.url in the view) is that the download link will force the browser to download it to your computer and not just link to it.

Django's add_domain function should actually resolve the correct URL, S3 or not. Will keep examining, but if not, removing the view isn't that big of a deal.
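To make the point about domains concrete, here is a minimal sketch of the rule the URL construction needs to follow. It is a simplified stand-in written for this thread, not Django's actual add_domain source, and the helper name absolute_url and all hostnames are hypothetical. The idea: a URL that is already absolute (as a remote storage backend would return) is passed through untouched, and only a relative path gets the current site's domain prepended.

```python
def absolute_url(domain, url, secure=False):
    """Return an absolute URL without assuming `url` lives on `domain`.

    Hypothetical helper for illustration; it only approximates the
    behaviour described above for Django's add_domain.
    """
    scheme = "https" if secure else "http"
    if url.startswith("//"):
        # Protocol-relative URL: only the scheme is missing.
        return f"{scheme}:{url}"
    if url.startswith(("http://", "https://", "mailto:")):
        # Already absolute (e.g. a file on external storage): leave it alone.
        return url
    # Relative path: the file is served from the current site.
    return f"{scheme}://{domain}{url}"


# Assumed example values, not taken from the project.
local = absolute_url("example.com", "/media/episodes/pilot.mp3")
remote = absolute_url(
    "example.com", "https://bucket.s3.amazonaws.com/episodes/pilot.mp3"
)
```

With this rule, an enclosure stored on the local filesystem and one stored on S3 both end up with a working absolute URL.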

Ah right, I thought episode.enclosure.file.url would give you the URL relative to MEDIA_ROOT, but I remembered that incorrectly, sorry. In fact, the URL is calculated by the storage backend.

One problem with EpisodeDownloadView is that it reads the file via HTTP from the media server and then sends it to the client, essentially killing the benefits of serving media from a separate server.

That's a good point. I hadn't considered the performance issues of opening a giant media file in memory and cramming it through Django. When generating alternate media on the fly (PDFs, CSVs, XLS, etc.), I can see forcing a download through a custom HTTP response being beneficial. For large media files, there is much less benefit. I will probably remove the view and just use {{ episode.enclosure.file.url }} instead.

I just wanted to update anyone following this thread: a new EpisodeDownloadView has been added back to download episodes.

The difference between this one and the previous one is that it's a subclass of RedirectView. There are three benefits to this approach:

  • The body of the response is not delivered by Django, but is a 302 redirect to the enclosure file's URL, which fixes the previous memory issue and ultimately lands at the media file.
  • The main benefit is that one can now subclass EpisodeDownloadView and add any kind of analytics or tracking to the URL. By outputting the straight file URL, we lose any ability to track downloads because of limited access to the file server. For example, Amazon S3 doesn't log file transfers by default. If one subclasses the view, it's important to add the super() call to finish delivering the response. The enclosure file's URL in the feed was changed as well.
  • The smaller benefit is that the enclosure URL is nice and clean, i.e. /podcast/<episode slug>/download/, but it also provides a somewhat more resilient way to link to the media file. Appending download/ is more reliable than linking to this month's choice of media server, which should be more of a dummy service.

If this method isn't preferred, the previous way of directly outputting the enclosure's URL is still possible, e.g. {{ episode.enclosure.file.url }} or {{ enclosure.file.url }}