Podcastindex-org/podcast-namespace

Proposal - podcast:complete tag

daveajones opened this issue · 28 comments

As we move forward with slowly providing replacement tags for itunes namespace tags, I had a thought: in replicating the <itunes:complete> tag, why not add an optional link to a full archive feed if it was a long running podcast and had many hundreds of episodes. The main feed might not want to carry all of those episodes. But, an archive link to a full feed would be nice.

<podcast:complete>


<podcast:complete
 archive="[feed url(string)]"
>
[yes|no(bool)]
</podcast:complete>

Channel

(optional | single)

This element allows a podcaster to signal to the world that this podcast will never publish another episode to this feed. In its basic form, it is a direct drop-in
replacement for <itunes:complete> and functions identically. The addition of the archive attribute allows for specifying the URL of a complete archival feed of the
podcast that contains every episode in case this feed does not.

Specifying a value of no is the same as this tag not being present.

Attributes

  • archive (optional) A url of a feed that contains every episode of this podcast.

Examples:

<!-- This podcast will never publish again -->
<podcast:complete>yes</podcast:complete>

<!-- Redundant usage and not recommended -->
<podcast:complete>no</podcast:complete>

<!-- This podcast is over, and a complete episode archive feed is "here" -->
<podcast:complete archive="https://example.org/rss/myarchive.xml">yes</podcast:complete>
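
As a rough illustration of how an app might read the proposed tag, here is a minimal Python sketch using only the standard library. The function name read_complete and the sample feed are hypothetical, and the case-insensitive comparison against "yes" is an assumption; the proposal above only lists yes and no as values.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Namespace URI of the podcast namespace.
PODCAST_NS = "https://podcastindex.org/namespace/1.0"


def read_complete(feed_xml: str) -> Tuple[bool, Optional[str]]:
    """Return (is_complete, archive_url) for the proposed <podcast:complete> tag."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return False, None
    tag = channel.find(f"{{{PODCAST_NS}}}complete")
    if tag is None:
        return False, None
    # Per the proposal, "no" behaves the same as the tag being absent.
    # Ignoring case and surrounding whitespace is an assumption of this sketch.
    is_complete = (tag.text or "").strip().lower() == "yes"
    archive = tag.get("archive") if is_complete else None
    return is_complete, archive


SAMPLE = """<rss version="2.0" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <title>Example</title>
    <podcast:complete archive="https://example.org/rss/myarchive.xml">yes</podcast:complete>
  </channel>
</rss>"""

print(read_complete(SAMPLE))
```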

I'm fine with replicating the tag, but I question the necessity of providing an archive feed when the master feed itself could contain all episodes.

Under what scenario would someone want to mark a feed complete, not let it include all the episodes, but still link to a feed that does have all episodes?

I can see people wanting to "complete" an entire show with all episodes.

I can also see people wanting to "complete" a show but keep only the latest episodes available.

I just can't see why they would want a hybrid.

Agree with DJL on the archive link. Though the opposite is interesting. To me, an archive feed is a complete feed. So if anything, an archive/complete feed should point back to a feed that is still being updated.

Use case: A 10-episode series on the history of prohibition that’s complete, but the host has an ongoing history podcast.

Good to see you here, @evoterra! (Or am I forgetting that you've already been here for a while?)

I like that idea for sure! And I think it would be better for the related/featured tag discussed elsewhere.

Also, I’d soften this:

… this podcast will never publish another episode to this feed.

A complete feed doesn’t necessarily mean a completely static, unchanging feed. To me, it signals the main content is complete, and a listener can feel confident that they can binge it all, w/o being forced to wait on new content.

But new content, like director's commentary (or, sure, episode drops for other shows) may still appear.

For app devs (I’m thinking Apollo, a fiction podcast app), the tag may be a good signal to help them flag shows and even segregate those not-part-of-the-show-but-part-of-the-feed episodes.

The actual Apple Podcasts spec says about <itunes:complete>:

The podcast update status.

If you will never publish another episode to your show, use this tag.

Specifying the <itunes:complete> tag with a Yes value indicates that a podcast is complete and you will not post any more episodes in the future.

Specifying any value other than Yes has no effect.

I wonder if, instead, we should merge these ideas into a new tag, like <podcast:frequency> that accepts frequency values like "daily, semiweekly, weekly, biweekly," and such, but then also accepts other terms like "hiatus, retired, ended."

"Retired" gives me the idea that it could still do something, but the main content is over.

Interesting ideas. The archival link is a no go, so I'll bail on that idea. I think we could connect this with another tag as Daniel mentioned in order to get the functionality that Evo is thinking about.

Hadn’t considered the frequency tag, but it could work nicely and has the added benefit of exposing the concept to more users.

If complete as a frequency choice were added, perhaps it would get more adoption?

Note: I worry a little about making too many choices (3x a week, each full moon, 2 days after each football game) for when episodes are publishing (and what anyone would do with that info). And I think I have the same concerns about various ways to indicate a “pause” of some sort. But neither are huge concerns.

I like the idea of podcast:frequency. In an older version of podlink, I wrote a function to estimate the cadence and predict when the next episode should drop. I'd suggest looking into using "RRULE" as defined in the RFC 2445, section 4.8.5.4 spec. This is how .ics and Google Calendar describe all recurring events, and it even allows for limiting by COUNT or UNTIL=$DATE if you're doing a limited run podcast and want to signal that in advance.

RRULE:FREQ=DAILY # daily
RRULE:FREQ=DAILY;UNTIL=20150919T063000Z # daily until 2015-09-19
RRULE:FREQ=WEEKLY;BYDAY=TH # every Thursday
RRULE:FREQ=WEEKLY;BYDAY=MO,WE,FR # every Mon, Wed and Fri
RRULE:FREQ=WEEKLY;BYDAY=TU;INTERVAL=2 # every other Tuesday
RRULE:FREQ=MONTHLY;BYDAY=-2FR # every month on the 2nd last Friday
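
To show how little is needed to read rules like these, here is a minimal Python sketch that splits an RRULE line into its parts. It is not a full RFC implementation: the function name parse_rrule is hypothetical, the trailing "# ..." notes are treated as illustrative comments (they are not part of the RRULE syntax), and only BYDAY is split into a list.

```python
def parse_rrule(line: str) -> dict:
    """Split an RRULE content line into a dict of upper-cased part names to values."""
    # Drop an illustrative trailing "# comment" and the optional "RRULE:" prefix.
    rule = line.split("#", 1)[0].strip()
    if rule.upper().startswith("RRULE:"):
        rule = rule[len("RRULE:"):]
    parts = {}
    for part in rule.split(";"):
        if not part:
            continue
        name, _, value = part.partition("=")
        name = name.strip().upper()
        value = value.strip()
        # BYDAY holds a comma-separated list such as MO,WE,FR or -2FR.
        parts[name] = value.split(",") if name == "BYDAY" else value
    return parts


print(parse_rrule("RRULE:FREQ=WEEKLY;BYDAY=MO,WE,FR # every Mon, Wed and Fri"))
```

A crawler or app would still need a real recurrence engine to expand such a rule into dates; this only covers reading the value out of a feed.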

Here's a JS library that hosts could implement to allow podcasters to enter a plain-text string and generate an RRULE for the podcast:frequency tag. Crawlers could use that to avoid over-scraping. If you include the timezone details, players could display a prediction based on the listener’s timezone.

Since April 2021, Apple has let publishers specify “Update Frequency” for their shows in Apple Podcasts Connect. This shows up in the API as one of these options:

  • Daily
  • Weekly
  • Semiweekly (twice per week)
  • Biweekly (every other week)
  • Monthly
  • Semimonthly (twice per month)
  • Bimonthly (every other month)
  • No Set Schedule

As of July 2021, Pacific Content found that only 14% of all shows in Apple Podcasts have set an update frequency. I'm curious what the adoption would look like today.

As of July 2021, Pacific Content found that only 14% of all shows in Apple Podcasts have set an update frequency.

That probably just means that 86% of podcasters haven't bothered to set it, since it can only be set in Apple Podcasts Connect, and Apple Podcasts Connect is a steaming pile of poo.

Getting it into a usable form that hosting companies could incorporate would significantly help. Bonus points for the hosting providers who proactively let their existing customers know that they have some new fields to fill out. (Does anyone do this?)

Prior art in Atom extensions: RFC 5005: Feed Paging and Archiving

  • Provides an element to indicate that a feed is complete:
<fh:complete xmlns:fh="http://purl.org/syndication/history/1.0"/>
  • Provides a set of link relations to implement paged feeds, thus allowing shorter individual feed resources while still offering a full feed of all items:
<link rel="self" href="http://example.org/index.atom?page=3"/>
<link rel="first" href="http://example.org/index.atom"/>
<link rel="previous" href="http://example.org/index.atom?page=2"/>
<link rel="next" href="http://example.org/index.atom?page=4"/>
<link rel="last" href="http://example.org/index.atom?page=10"/>
  • Provides a set of link relations specifically for archived feeds, plus an <fh:archive/> element. That somewhat duplicates paged feeds but has the semantic implication that the (optional) archive feed resources are stable and non-changing.
<link rel="prev-archive" href="http://example.org/2003/11/index.atom"/>
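
For readers who want to see what consuming these elements could look like, here is a minimal Python sketch that collects the feed-level link relations and the two RFC 5005 marker elements from an Atom document. The function name feed_history and the sample feed are hypothetical, and the sketch only looks at Atom feeds (in RSS the atom:link elements would sit inside the channel instead).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
FH_NS = "http://purl.org/syndication/history/1.0"


def feed_history(feed_xml: str) -> dict:
    """Collect feed-level link relations and the fh:complete / fh:archive flags."""
    root = ET.fromstring(feed_xml)
    links = {}
    # Only direct children of <feed> are inspected, so entry-level links are ignored.
    for link in root.findall(f"{{{ATOM_NS}}}link"):
        rel = link.get("rel", "alternate")  # Atom treats a missing rel as "alternate"
        href = link.get("href")
        if href:
            links.setdefault(rel, href)
    return {
        "complete": root.find(f"{{{FH_NS}}}complete") is not None,
        "archive": root.find(f"{{{FH_NS}}}archive") is not None,
        "links": links,
    }


SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:fh="http://purl.org/syndication/history/1.0">
  <title>Example</title>
  <link rel="self" href="http://example.org/index.atom?page=3"/>
  <link rel="previous" href="http://example.org/index.atom?page=2"/>
  <link rel="next" href="http://example.org/index.atom?page=4"/>
</feed>"""

print(feed_history(SAMPLE_ATOM))
```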

Seems that 5005 is referring to “complete” in the sense of “not fragmented by paging” instead of the Apple sense of the term meaning more like “finished” and “won’t post again”.

This does remind me however that 5005 needs to be on the podcast standards web site as general good practice.

Regarding RFC 5005: Feed Paging and Archiving, I like its solution for archival - stable documents that can be used to reconstruct the whole logical feed.

I'm not so fond of the paging functionality, as it is lossy (as per the description in the spec). As a matter of fact, I don't think it is a good fit for podcasts, and I am tempted to say, without good references to back me up, that hosts are using paging when they should be using archival (a feeling based on past experiences).
Some research would be interesting here on the feeds of the index to check how they are using RFC 5005: Feed Paging and Archiving.

Regarding frequency - as a listener, I don't really care about it, once I look at the list of recent episodes, I already have a good idea about a show's frequency.
As a podcaster, I don't think I would use it.
As an app developer, I'd rather have a good implementation of HTTP conditional requests across all hosts, and poll them frequently (or use podping), than rely on a frequency tag. If a podcaster decides to post something outside of the show's normal frequency, I want it as soon as possible.
Sometimes less is more, and I feel this is the case here (but I am open to hear about interesting use cases).
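
For context, the conditional-request approach mentioned above could look roughly like this in Python with the standard library. This is only a sketch: the function names conditional_headers and fetch_if_changed are hypothetical, and it assumes the host returns ETag and/or Last-Modified headers and honors them on later requests.

```python
import urllib.error
import urllib.request
from typing import Optional


def conditional_headers(etag: Optional[str] = None,
                        last_modified: Optional[str] = None) -> dict:
    """Build validator headers from values saved during the previous fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url: str, etag: Optional[str] = None,
                     last_modified: Optional[str] = None):
    """Return (body, etag, last_modified); body is None when the feed is unchanged."""
    request = urllib.request.Request(url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        # urllib reports "304 Not Modified" as an HTTPError.
        if err.code == 304:
            return None, etag, last_modified
        raise


print(conditional_headers('"abc123"', "Wed, 10 May 2023 12:00:00 GMT"))
```

When the host supports this properly, an unchanged feed costs one small 304 response per poll, which is why frequent polling becomes cheap.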

Regarding the complete tag - it would be nice for UX to show something on the app (also search results when looking for podcasts) indicating that it is complete.
I don't care much about signaling a hiatus, since the user can infer it from the pub date of the show's last episode (maybe using symbols and colors to indicate "age" in the apps would be an interesting idea).

Out of curiosity, I grabbed a random sample of 5000 RSS feeds, but I didn't encounter a single one that actually provides a link to an RFC 5005-conformant archive feed, despite there being a standard for it.

Although I do agree archive feeds would be really nice. For instance, there's a "short stories" podcast I'm subscribed to where I'd love to go back and re-listen to old stories that are no longer in the recents feed. And on other podcasts with interviews, there have been old interviews I wanted to go back for and grab a quote. So it's more a matter of convincing the hosting companies to start adopting it by default.

(Incidentally, if that endeavour is successful, a second order problem may arise for the podcast index, in which we would probably want some code to ensure that the indexer doesn't redundantly index all of these multiple feed URLs as separate entries (the main one plus the multiple pages and archives) when only the main one should be indexed.)

want some code to ensure that the indexer doesn't redundantly index all of these multiple feed URLs as separate entries (the main one plus the multiple pages and archives) when only the main one should be indexed.)

What about just duplicating the podcast:guid? Then it would show as a collision to all directories and could be worked out using another specifier somewhere that it’s an archive. Needs some thought.

It might be easier than that. If I'm interpreting RFC 5005 correctly, the archive feed should link back to the main feed via a "current" link.

Archive documents SHOULD indicate their associated subscription
documents using the "current" link relation.

So you could simply not index any RSS feed that contains a "current" link. Or to make it more robust, any RSS feed containing a "current" link that's different from the current URL or "self".
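
A minimal Python sketch of that check, assuming the feed carries its link relations as atom:link elements at the channel (or Atom feed) level. The function name is_archive_document and the sample feeds are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def is_archive_document(feed_xml: str, fetched_url: str) -> bool:
    """True when the feed has a "current" link pointing somewhere other than itself."""
    root = ET.fromstring(feed_xml)
    # RSS keeps feed-level links inside <channel>; Atom keeps them on the root.
    container = root.find("channel")
    if container is None:
        container = root
    rels = {}
    for link in container.findall(f"{{{ATOM_NS}}}link"):
        rel, href = link.get("rel"), link.get("href")
        if rel and href:
            rels.setdefault(rel, href)
    current = rels.get("current")
    if current is None:
        return False
    # The more robust variant: ignore a "current" link that points at this feed.
    return current not in {fetched_url, rels.get("self")}


ARCHIVE = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example (2003 archive)</title>
    <atom:link rel="self" href="http://example.org/2003/11/index.xml"/>
    <atom:link rel="current" href="http://example.org/index.xml"/>
  </channel>
</rss>"""

MAIN = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <atom:link rel="self" href="http://example.org/index.xml"/>
    <atom:link rel="current" href="http://example.org/index.xml"/>
  </channel>
</rss>"""

print(is_archive_document(ARCHIVE, "http://example.org/2003/11/index.xml"))
print(is_archive_document(MAIN, "http://example.org/index.xml"))
```

An indexer could skip any feed for which this returns True and index only the feed that the "current" link points to.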

Regarding frequency - as a listener, I don't really care about it, once I look at the list of recent episodes, I already have a good idea about a show's frequency.

I see the purpose as making the information more readily available without the person having to figure it out. We could say a description is unnecessary because someone could simply look at the episodes or listen to the podcast to know what the podcast is about.

Since April 2021, Apple has let publishers specify “Update Frequency” for their shows in Apple Podcasts Connect. This shows up in the API as one of these options:

  • Daily
  • Weekly
  • Semiweekly (twice per week)
  • Biweekly (every other week)
  • Monthly
  • Semimonthly (twice per month)
  • Bimonthly (every other month)
  • No Set Schedule

I was actually going to suggest exactly these things. Sure, "semiweekly" is often misunderstood (probably not as much as when midnight is!), but the UI can easily clarify that, as shown above. (Though I think "semiweekly" should cover 2–4 episodes per week and "semimonthly" should cover 1–3 episodes per month.)

PowerPress already supports <rawvoice:frequency> but uses a free-text value. And that makes me wonder what would actually be wrong with a free-text value plus some presets. For example, the UI could offer the above options, but the podcaster could also enter something like "3 episodes per week."

Do we see frequency as a separate proposal at this point, or still part of this one?

My gut is that it is a separate tag but I’d like to hear other views.

Do we see frequency as a separate proposal at this point, or still part of this one?

I'll defer to those with more experience in getting adoption, but it seems to me that more people would use frequency than would use complete. So rolling the two together, where complete is an option of frequency, might lead to more adoption.

But I'm willing to be wrong about that.

I'd be on board with that and think it makes the most sense as well.

I still support this, too.

I think semiweekly and semimonthly can also be more than twice in that period, but obviously not so much as to be "daily" or "weekly," respectively.

Seasons can also be complete, which is more nuanced. Clearly, I wouldn't want my entire show to be marked as complete if it was just a season.

I like this idea! Here's my suggestion for how to implement it on the last episode of a season:

<podcast:season name="Race for the Stars" status="complete">3</podcast:season>

I went for a "status" attribute for more flexibility than a "complete" boolean would offer. A season could have a status of "hiatus" on the last episode, or maybe some other kind of status.

I do wonder, though, how publishing systems and apps should handle this if the podcaster changes their mind, such as wanting to release bloopers or other bonus content for a season even though they already marked it as complete.

@daveajones asked me to write up a proposal, but life got hectic, and it slipped my mind.

My preference would be to mirror the purpose of the “Update Frequency” field in Apple Podcasts Connect by providing the podcaster a way to express their intention for future releases across all apps. But we can go further and give podcasters a concise, unambiguous way to communicate more nuanced information without any of the confusion of "what do you mean by biweekly?"

This data can be collected in plain text by hosts, converted to an RRULE string for the RSS feed, and apps could easily display a predicted date/time for the next episode’s release personalized to the listener’s timezone as well as complicated release schedules like "every Monday, Wednesday, and Friday."

Format

Using RRULE, podcasters can express the following:

  • frequency (daily/weekly/monthly/yearly)
  • start date (The start date of this frequency schedule.)
  • end date (The last date of this frequency schedule.)
  • timezone (useful for personalizing the expected next release to the listener)
  • count (How many occurrences will be generated.)
  • interval (every 2 weeks, every 3 months, etc.)
  • weekday (every Wednesday and Friday)
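
To make the "predicted next release" idea concrete, here is a deliberately small Python sketch for the simplest case only: a weekly schedule with an interval of 1 and a list of weekdays. The function name next_weekly_release is hypothetical, and the sketch ignores time of day, timezone, start/end dates and counts, which a real recurrence library would handle.

```python
from datetime import date, timedelta

# RRULE weekday codes in the same order as date.weekday() (Monday is 0).
WEEKDAYS = ["MO", "TU", "WE", "TH", "FR", "SA", "SU"]


def next_weekly_release(byday: list, after: date) -> date:
    """Return the first date strictly after `after` that falls on one of `byday`."""
    wanted = {WEEKDAYS.index(code) for code in byday}
    # Any weekday recurs within the next seven days.
    for offset in range(1, 8):
        candidate = after + timedelta(days=offset)
        if candidate.weekday() in wanted:
            return candidate
    raise ValueError("byday must contain at least one weekday code")


# "every Monday, Wednesday, and Friday", asked on Monday 2023-05-08
print(next_weekly_release(["MO", "WE", "FR"], date(2023, 5, 8)))
```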

Complete

If a show has set a limit (either end date or count) and we have passed that point, apps can infer that the podcaster has expressed no intention for further episodes. Apps could choose to indicate that a show is "On hiatus or finished."
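
One possible reading of that inference, sketched in Python. The function name schedule_concluded is hypothetical; the rule is assumed to arrive as a dict of RRULE parts, UNTIL is assumed to use the UTC form YYYYMMDDTHHMMSSZ, and episodes_released (the number of episodes published since the schedule started) is something the app would have to count itself.

```python
from datetime import datetime, timezone


def schedule_concluded(rule: dict, now: datetime, episodes_released: int = 0) -> bool:
    """True when the schedule's UNTIL date has passed or its COUNT has been reached."""
    until = rule.get("UNTIL")
    if until:
        # Assumes the UTC form, e.g. 20150919T063000Z.
        end = datetime.strptime(until, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)
        if now > end:
            return True
    count = rule.get("COUNT")
    if count and episodes_released >= int(count):
        return True
    return False


rule = {"FREQ": "DAILY", "UNTIL": "20150919T063000Z"}
print(schedule_concluded(rule, datetime(2015, 9, 20, tzinfo=timezone.utc)))
```

A schedule with neither UNTIL nor COUNT never concludes under this sketch, which matches an open-ended show.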

Seasonal Hiatus

If a show is not currently releasing episodes but intends to in the future, the podcaster could set a release schedule starting on a future date to avoid being listed as "On hiatus or finished." Hypothetically, a podcaster who plans to release 10 episodes, 1 per week, in 6 months' time could express that long before they have a trailer ready. A listener who discovers the show during the hiatus could see "Next episode: Wednesday, May 10 2023" in their podcast app.
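
A tiny Python sketch of how an app might produce that label, assuming it already knows the schedule's start date. The function name hiatus_label is hypothetical, and the weekday and month names come out in English only under Python's default locale.

```python
from datetime import date
from typing import Optional


def hiatus_label(schedule_start: Optional[date], today: date) -> Optional[str]:
    """Return a "Next episode" label when the schedule starts in the future."""
    if schedule_start is not None and schedule_start > today:
        # %A and %B give the weekday and month names; the day is added unpadded.
        return (f"Next episode: {schedule_start:%A, %B} "
                f"{schedule_start.day} {schedule_start.year}")
    return None


print(hiatus_label(date(2023, 5, 10), date(2022, 11, 10)))
```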

Bonus Content

If a show unexpectedly releases bonus content after its frequency schedule is concluded, there’s no need to adjust the schedule because it should always refer to their intention for future releases.

Other Needs

If any other use cases need to be addressed by this tag, let me know!

@evoterra

I see this tag as being more of something apps/directories should display …

I like that purpose. It also lets apps display that status clearly instead of having to analyze the feed and guess at it.

I also see this as being closely associated with the frequency data. A "completed" show doesn't have a frequency anymore, and a show with a frequency isn't "completed," yet. So I agree that "completed" could be part of the "frequency" tag (or whatever we call it).