Allows Alexa to play audio from YouTube videos

Primary language: JavaScript. License: MIT.

alexa-youtube-skill

DISCLAIMER: This skill is not officially supported by YouTube in any way and, as such, will never be published on Amazon. It is intended more as a proof of concept, but setup instructions are provided.

By default, Amazon Alexa does not support playing audio from YouTube. In fact, it only supports a limited number of third-party audio skills, such as Spotify. Otherwise, the default Alexa skills that use audio are tied almost exclusively to Amazon services.

alexa-youtube-skill contains the code for an unpublished skill that allows users to search and play audio from YouTube. For example, a user might say:

"Alexa, search YouTube for Frost Hyperventilate."

Alexa will then run a search, find the most relevant video that matches the query (in this case, https://www.youtube.com/watch?v=Ol592sakmZU), and play the audio of that video as an MP3.

Setup Process

  1. Go to https://developer.amazon.com/ and log in with a developer account. Navigate to the "Alexa" tab and click on "Alexa Skills Kit."
  2. Click on "Add Skill." You will be taken to a setup menu.
  3. Skill Information page: give the skill a name of your choice. For Invocation Name, enter "youtube" and, in the Global Fields section, indicate that the skill uses audio player directives.
  4. Interaction Model page: put the following under Intent Schema.
{
  "intents": [
    {
      "slots": [
        {
          "name": "VideoQuery",
          "type": "VIDEOS"
        }
      ],
      "intent": "GetVideoIntent"
    },
    {
      "intent": "AMAZON.PauseIntent"
    },
    {
      "intent": "AMAZON.ResumeIntent"
    },
    {
      "intent": "AMAZON.StopIntent"
    },
    {
      "intent": "AMAZON.RepeatIntent"
    },
    {
      "intent": "AMAZON.LoopOnIntent"
    },
    {
      "intent": "AMAZON.LoopOffIntent"
    }
  ]
}

Next, in the Sample Utterances section, enter the following:

GetVideoIntent search for {VideoQuery}
GetVideoIntent find {VideoQuery}
GetVideoIntent play {VideoQuery}
GetVideoIntent start playing {VideoQuery}
GetVideoIntent put on {VideoQuery}

4b. Note for German users: under Intent Schema, replace "GetVideoIntent" with "GetVideoGermanIntent", and use the following sample utterances in place of the English ones:

GetVideoGermanIntent suche nach {VideoQuery}
GetVideoGermanIntent suche {VideoQuery}
GetVideoGermanIntent finde {VideoQuery}
GetVideoGermanIntent spiele {VideoQuery}
GetVideoGermanIntent spiele {VideoQuery} ab
GetVideoGermanIntent gib {VideoQuery} wieder
GetVideoGermanIntent fange {VideoQuery} an zu spielen
GetVideoGermanIntent fange an {VideoQuery} zu spielen
GetVideoGermanIntent zeige {VideoQuery}
GetVideoGermanIntent starte die Wiedergabe von {VideoQuery}
  5. Add a custom slot type called VIDEOS. Under "Values", put:
prince
the fray
the rolling stones
toad the wet sprocket
KC and the sunshine band
john travolta and olivia newton john
DJ jazzy jeff and the fresh prince
lola
hello dolly
love me tender
fools gold
roberta flack killing me softly with his song
stevie wonder superstition
boston
full circle
dubstar
underworld
orbital
let me be your fantasy
pop will eat itself
ultra nate
4 hours Peaceful and Relaxing Instrumental Music
  6. Configuration page: under Endpoint, select AWS Lambda ARN (Amazon Resource Name) as the Service Endpoint Type. Select North America or Europe, depending on where you are. Leave the field that appears blank for now; we will come back to it once the skill has been uploaded to Lambda. Also, under Account Linking, make sure that "No" is selected.
  7. Now it's time to set up Lambda. Log on to your AWS account and select "Lambda" from the main console menu. Make sure your region is set to N. Virginia (North America) or EU-Ireland (Europe).
  8. Click on "Create a Lambda function" in the Lambda console menu. For the blueprint, select alexa-skills-kit-color-expert.
  9. Configure the function. Give it a name like "alexaYoutubeSkill" and fill in an appropriate description. Assign it to a role with at least S3 read permissions. Make sure that the function is using Node v4.3. Leave the remaining settings at their defaults for now.
  10. Download alexa-youtube-skill.zip, which contains all the code for the Lambda function.
  11. Now, go back to the Lambda function you just saved. Under "Code entry type," select "Upload a ZIP file." Then, upload alexa-youtube-skill.zip under "Function Package."
  12. You will now need to fill out the following environment variables:
      Key                    Value
      ALEXA_APPLICATION_ID   Found under Skill Information for your skill in Amazon Developer.
      HEROKU_APP_URL         Optional. The URL of the Heroku intermediary server. Defaults to https://dmhacker-youtube.herokuapp.com if this variable is not set. Otherwise, you can choose to set up and use your own server.
  13. Under "Configuration" -> "Advanced Settings" in your Lambda function, go to the "Timeout" section. Change the timeout duration from 3 seconds to at least 1 minute.
  14. The last step is linking your Lambda function to your Alexa skill. Go back to Alexa under Amazon Developer and find your skill. On the Configuration page, put the Lambda function's ARN in the field that you left blank earlier.
  15. Go to the Test page and set Enabled to true. The skill will now work exclusively on your devices.
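
To illustrate how the values configured in the steps above might surface at runtime, here is a minimal sketch of the checks a Lambda handler could perform. It is not the code shipped in alexa-youtube-skill.zip: the helper names (`getServerUrl`, `isExpectedApplication`, `getVideoQuery`) are hypothetical, and the event shape is assumed to follow the standard Alexa custom-skill request JSON.

```javascript
// Hypothetical helpers for illustration; not taken from alexa-youtube-skill.zip.
const DEFAULT_SERVER_URL = "https://dmhacker-youtube.herokuapp.com";
const VIDEO_INTENTS = ["GetVideoIntent", "GetVideoGermanIntent"];

// HEROKU_APP_URL is optional, so fall back to the documented default.
function getServerUrl(env) {
  return env.HEROKU_APP_URL || DEFAULT_SERVER_URL;
}

// Accept only requests sent by the skill named in ALEXA_APPLICATION_ID.
function isExpectedApplication(event, env) {
  const application = event.session && event.session.application;
  return Boolean(application) &&
    typeof env.ALEXA_APPLICATION_ID === "string" &&
    application.applicationId === env.ALEXA_APPLICATION_ID;
}

// Return the spoken search phrase from the VideoQuery slot, or null if absent.
function getVideoQuery(event) {
  const request = event.request || {};
  if (request.type !== "IntentRequest") return null;
  const intent = request.intent || {};
  if (VIDEO_INTENTS.indexOf(intent.name) === -1) return null;
  const slot = intent.slots && intent.slots.VideoQuery;
  return slot && slot.value ? slot.value : null;
}
```

In a real handler these would be called with the incoming `event` and `process.env`. For the example phrase from the introduction, `getVideoQuery` would be expected to yield the words Alexa heard after "search YouTube for".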

Technical Details

The way the skill searches for, downloads, and fetches the audio is fairly involved because it relies on several free utilities. The basic flow of information through the skill can be summarized as follows:

Request (1) -> AWS Lambda (2) -> Custom Heroku Server (3) -> User (4)

  1. The user makes a request mentioning the skill. See the introduction above for an example.
  2. The skill, which runs on AWS Lambda, receives the query.
  3. The skill passes that query to a custom Heroku server that I built. The Heroku server pulls up the most relevant video on YouTube, downloads the audio into a temporary public folder, and returns the link and metadata to the skill. The download is done asynchronously (the server returns immediately), so the skill blocks until the server notifies it that the download is complete.
  4. The skill then sends a play request (an AudioPlayer.Play directive) to the user's Alexa device with the link to the MP3 file.
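
Steps 2 through 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's actual implementation: the `server` object and its `search` and `isDownloaded` methods are hypothetical stand-ins for whatever calls the skill makes to the Heroku server, and waiting is modeled here as simple polling because the notification mechanism is not described above. The response shape follows the Alexa AudioPlayer.Play directive format.

```javascript
// Poll a caller-supplied check until it reports completion or attempts run out.
function waitUntilReady(isReady, attempts, delayMs) {
  return isReady().then(function (ready) {
    if (ready) return true;
    if (attempts <= 1) return false;
    return new Promise(function (resolve) {
      setTimeout(resolve, delayMs);
    }).then(function () {
      return waitUntilReady(isReady, attempts - 1, delayMs);
    });
  });
}

// Build a response that tells the device to start streaming the audio URL.
function buildPlayResponse(audioUrl, token) {
  return {
    version: "1.0",
    response: {
      shouldEndSession: true,
      directives: [{
        type: "AudioPlayer.Play",
        playBehavior: "REPLACE_ALL",
        audioItem: {
          stream: { url: audioUrl, token: token, offsetInMilliseconds: 0 }
        }
      }]
    }
  };
}

// Hand the query to the (hypothetical) server, wait for the download, then play.
function handleQuery(query, server) {
  return server.search(query).then(function (video) {
    return waitUntilReady(function () {
      return server.isDownloaded(video.id);
    }, 30, 1000).then(function (ready) {
      if (!ready) throw new Error("Download did not finish in time");
      return buildPlayResponse(video.link, video.id);
    });
  });
}
```

With 30 attempts spaced one second apart, the wait in this sketch stays within the one-minute Lambda timeout recommended in the setup steps. Note that the AudioPlayer interface expects the stream URL to be served over HTTPS.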