Around 2020, I started developing a new approach that uses a bunch of terminal tools over yonder: https://github.com/kristopolous/music-explorer . That thing still works but I don't use it a lot. This tool (in development since 2008!) is for quickly navigating and listening to large ad-hoc playlists from youtube:
Also, a large effort is being made to have a mobile-friendly version.
There's special attention to finding and adding:
- similar videos
- videos by the same uploader
- videos by the same "group"
For similar videos and group videos, a number of heuristic tests are applied to keep the crap out.
This isn't a fly-by-night project. In consistent use since 2008, it's been constantly updated and refined, going through many iterations and UI pivots over the years. I honestly use it every day.
The space for long-tail thematic grouping of music has certainly grown since this project's inception, but the existing solutions don't cut it because:
- They require too much effort on my part to curate and follow hundreds of people, then unfollow ones that post junk
- If things are 'automated', they aren't very good at guessing how I listen to music (I focus on things I haven't heard).
- Many systems rightly weigh a song's popularity as part of the likelihood that I want to listen to it. This is wrong. If it's popular, I've already heard it and moved on.
- These for-profit commercial services end up pitching promoted artists through some kind of revenue model in a way that disrupts the customer experience.
This software may not be for you. That's fine. Widespread general utility wasn't on my roadmap.
Demo? See here:
Last 5 years:
- 2020 Feb: I'm actively researching how to instrument bandcamp embeds. Currently I can leverage youtube-dl to generate links at runtime and then embed them into `<video>` elements. I'm exploring indirect navigation and referencing schemes to be able to go cross-domain seamlessly.
- 2020 Jan: Added the ability to merge multiple yt users' uploads into a single list
- 2019 Nov: Mobile has long since been done and I use it constantly. Also, I found a reference to a previous version from 2010. I believe this goes back further, maybe in svn or cvs repositories; I'll have to look somewhere.
- 2019 Feb: The modern mplayer, `mpv`, supports direct youtube urls, so you can do something like `curl localhost/ytmix/api/gettracks.txt/564/dopp | shuf | xargs -I %% mpv -vo null "https://youtube.com/watch?v=%%"`
- 2017 Jan: For PHP-7 make sure you have the mbstring extension installed.
- 2016 Aug: There's a tool now at `tools/shell-listen.sh` that accomplishes using this tool from the shell, since recent (2016-09-30) FF dev versions have been eating up lots of CPU for the HTML5 yt videos.
- 2016 April: I normalized the database to have a table of tracks. This table keeps track of whether a video can be played or not, how many times it's been played, and the total time listened throughout those plays. It also keeps track of the uploader and channel.
The eventual goal is to use this as data to discover more content. There's a number of floating dimensions in this analysis which can lead to bad inferences, but the general idea is that if content is frequented often, listened to almost completely, and by the same uploader, then more content from that uploader would probably be good.
The fundamental flaw in this analysis is the volume versus curation problem with any follow system. Pretend Alice and Bob upload videos. You like 100 of Alice's videos and 10 of Bob's.
The question is: given a video `Y` you haven't seen, from user `X`, what is the qualitative likelihood `p` that you will like it?
If both Alice and Bob upload a video say, tomorrow, which one are you more likely to enjoy?
The answer is actually indeterminate without additional info, because the total volume of assets hasn't been looked at.
If we add an additional field, say "number of videos uploaded":
|       | Liked | Uploaded | probability |
|-------|-------|----------|-------------|
| Alice | 100   | 10000    | 0.01        |
| Bob   | 10    | 20       | 0.50        |
All of a sudden Bob looks like a good candidate.
But wait, there's more! We haven't looked at individual exposure yet. Let's say I haven't seen all 10k uploads from Alice or 20 from Bob.
|       | Liked | Seen | Uploaded | probability |
|-------|-------|------|----------|-------------|
| Alice | 100   | 100  | 10000    | 1.00        |
| Bob   | 10    | 20   | 20       | 0.50        |
Now things have swapped yet again! And this only takes binary classification into consideration: I have to decide whether I think something is the greatest thing ever or completely intolerable. This classifier doesn't accommodate or differentiate the in-between.
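The shifting estimates above can be made concrete. Here's a minimal sketch of the two candidate estimators; the function name and objects are illustrative, not part of the codebase:

```javascript
// Estimate how likely a new upload from a user is to be liked.
// Naive estimate: liked / uploaded (ignores what you've actually seen).
// Exposure-aware estimate: liked / seen.
function likelihood(user) {
  return {
    naive: user.liked / user.uploaded,
    exposureAware: user.liked / user.seen
  };
}

const alice = { liked: 100, seen: 100, uploaded: 10000 };
const bob   = { liked: 10,  seen: 20,  uploaded: 20 };

// Alice: naive 0.01, exposure-aware 1.00 -- Bob: 0.50 either way.
console.log(likelihood(alice));
console.log(likelihood(bob));
```

The swap in the second table is exactly the gap between these two numbers for Alice.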
As I've suggested above, using the opinion of others collectively in aggregate is not a useful indicator; otherwise I'd just hit up the local clearchannel station KRAP and listen to that all day. What most people like is total garbage and nonsense. If that metric is to be used at all, it's to discard anything that's past a certain popularity threshold.
For a while that was necessary, as youtube would inject the latest garbage from some teen pop sensation into videos related to, say, John Coltrane. Ah yes, youtube, that's right; it's John Coltrane, Pharoah Sanders, Miles Davis, and Ke$ha. That's totally sensible.
The quickest way to create a new playlist is to use the `import/parse.js` script, which uses the YouTube data api v3 to create a list of the uploads of a specified user. Unfortunately, due to v3 bullshit, you need to get an auth-key to send off the requests.
- Go to registering an application.
- In the "developer console" you need to click on a few things.
When you finish step 3 a dialog will pop up:
After clicking you'll be directed to this form. Just leave everything blank.
After clicking you'll get a spinner and be asked to wait a few seconds.
You'll eventually get something that looks like this:
This string of letters and numbers, which we'll call the "auth key", has to be in a file located at `secrets/authkey`. To create the file, after you've pulled down the code, go to the git root directory and do the following:
(git root)$ echo "Your 'auth key'" > secrets/authkey
That means that the file contains just your key, no other code, format, or syntax ... it's the simplest format imaginable.
Phew, now you can import.
An example would be
$ node parse.js tangramten24
User: tangramten24
> createid{"id":"tangramten24"}
> update{"id":625,"name":"Uploads by tangramten24"}
> https://www.googleapis.com/youtube/v3/channels?part=contentDetails&forUsername=tangramten24&key=AIzaSyD3cxApQz9auBO79CAy
> https://www.googleapis.com/youtube/v3/playlistItems?part=snippet&maxResults=50&playlistId=UUQpIIWe6P3FN3G_Wx3ErrQw&key=A
> tracks{"id":"4WvAjndDs48,uqOXPuKLOUo,jt-HGIdRIv4,SRP78mA021I,CLi_m7Bwf1g,gk4OU6BJMAc,vP7-cNi4ZVc,mLw94okMaY0,wWCMZH4Hhts
> https://www.googleapis.com/youtube/v3/videos?part=contentDetails&id=4WvAjndDs48%2CuqOXPuKLOUo%2Cjt-HGIdRIv4%2CSRP78mA021
+++ adding 50
> addTracks{"id":625,"param":[[786,"Re-VoLt - MAGNETIC STORMS ON THE EVENT HORIZON.","4WvAjndDs48"],[232,"TANGERINE DREAM
> https://www.googleapis.com/youtube/v3/playlistItems?part=snippet&maxResults=50&playlistId=UUQpIIWe6P3FN3G_Wx3ErrQw&key=A
> tracks{"id":"EDDf34fqugs,R-S4916XWGA,CYqIA2sVgjA,YOBIc0eDXVI,P5QdLxxOWmI,O0HM-67IgkQ,FxmCNoUUjBg,JEj5lEEGvqk,9yRxR0bkz6I
> https://www.googleapis.com/youtube/v3/videos?part=contentDetails&id=EDDf34fqugs%2CR-S4916XWGA%2CCYqIA2sVgjA%2CYOBIc0eDXV
+++ adding 50
> addTracks{"id":625,"param":[[642,"KLAUS SCHULZE - FRANK HERBERT.","EDDf34fqugs"],[445,"TANGERINE DREAM - VERNAL RAPTURE.
...
$
There's a "quota" system on youtube (and each of these requests have a "cost"). I'm pretty conscious of this since I use this thing every day. I try my hardest to be as thrifty as feasible.
If you want to avoid all the quota and key nonsense, there's a dump at `db/mysql-dump.db.lzma` that is occasionally updated. Just decompress it and load it:

unlzma db/mysql-dump.db.lzma
mysql -uroot yt < db/mysql-dump.db
As for the mysql credentials, they are in `secrets/db.ini`. Oh dear, now you know my localhost doesn't have a password for root. Woe is me. Also, there are some notes in the parent directory's readme.
There are two tables ... one that has the playlist and one that has a normalized set of the tracks. The tracks table is used for:
- importing playlists so not to incur an api cost for getting the duration (thanks youtube!)
- tracking whether a track is 'active' or pulled from youtube
- storing when the track was added to the system
- storing the last time it was listened to
- keeping an internal view count of them.
- knowing how much of the track was listened to, as a metric of whether I actually liked the content or unfortunately just ran into it frequently.
Note: Youtube also has Google+ accounts which don't have any YouTube user ids... this is fine! Just pass the channel id. For example, the Berlin School and Ambient channel isn't associated with a youtube id. You can pass the channel id, in this case `UCqOkOS00m2lRud_sJSa96Mw`, and the tool will figure it out and do the import.
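One plausible way to "figure it out" is a shape check: channel ids are 24 characters starting with "UC", while usernames are free-form. This is a sketch of that heuristic, not the importer's actual code:

```javascript
// Heuristic: YouTube channel ids are "UC" followed by 22 characters of
// [0-9A-Za-z_-]; anything else gets treated as a username.
// (A sketch -- the real parse.js may decide differently.)
function looksLikeChannelId(arg) {
  return /^UC[0-9A-Za-z_-]{22}$/.test(arg);
}
```

So `looksLikeChannelId('UCqOkOS00m2lRud_sJSa96Mw')` is true, while a plain username like `tangramten24` falls through to the `forUsername` lookup.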
You can use youtube-dl with a FIFO-pipe and mplayer to play things over a low-bitrate connection like so:
$ mkfifo audio-pipe
$ while [ 0 ]; do mplayer -quiet audio-pipe; done
The -quiet option makes mplayer send fewer updates to the terminal. For me, it actually adds about 45 minutes to my battery life (~11hr 30 -> ~12hr 15) when it's on - you should see your X resources in top (or htop) drop a few percentage points with it, regardless of whether the terminal is iconified or not.
$ curl localhost/ytmix/api/gettracks.txt/(id)[/query] \
| shuf \
| xargs -n 1 youtube-dl -q -o audio-pipe -f 140 --
Format 140 is an audio-only format.
Also you can put an optional query on the end of it to get just those tracks that match it.
There's a much more obvious thing you can do to listen to your playlist offline. Again, using xargs and its parallel magic:

$ curl localhost/ytmix/api/gettracks.txt/(id) | xargs -n 1 -P 8 youtube-dl -q -f 140 --

This can go off and get you a large assortment of m4as.
All the following can be accessed by doing a GET (unless otherwise specified) to `api/[the thing listed]`.
The JSON results return
{ status: bool, result: json }
where `status` is either True or False, depending on whether the call succeeded or failed. If the call fails, then the result will hold an error string.
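A client-side sketch of handling that envelope (the function name is mine; only the `{ status, result }` shape comes from the API):

```javascript
// Unwrap the { status, result } envelope: on failure, result holds an
// error string, so surface it as an exception; on success, hand back
// the payload.
function unwrap(envelope) {
  if (!envelope.status) {
    throw new Error(envelope.result);
  }
  return envelope.result;
}
```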
There's also "extension" support for the api call which, if the result is an array, will return it as a list with newlines, appropriate for scripting. For instance, you could do `api/names` to get all the names of the playlists as json, or `api/names.txt` to get them as text. Then if you want to update the playlists you could do

curl localhost/api/names.txt | xargs -n 1 import/parse.js
There's a tool for inspecting this API. Since I made this tool, there's been a bunch of complex crazy kool kid things for looking at APIs that the framework people love to use. I'm sure you can do that if you want. Anyway, this one is only a 50 line html file and it works fine. It's in /test
here's a screen-shot:
List the supported functions (enumerated below).
Playlists keep an audit trail of how videos were added. This audit trail is called a method. It's an xref table. Each method has an id. Examples are things like:
The track was added by...

- `s:artist` - ...searching for the string "artist"
- `r:ytid` - ...getting videos related to the ytid
- `u:user` - ...adding videos by a specific user
- `l:url/string` - ...scraping the url, looking for , and inferring a track-listing by surrounding text
The result is an id to refer to the method by. So
GET /api/addMethod/123/s:someone
will return an id to refer to "s:someone" by.
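Given the `type:argument` shape of method strings, decoding one is a single split. A sketch (the function is illustrative, not from the codebase):

```javascript
// A method string is "<type>:<argument>", e.g. "s:someone" for a search.
// Split on the first colon only, since the argument of an "l:" method is
// a url that contains colons itself.
function parseMethod(str) {
  const i = str.indexOf(':');
  return { type: str.slice(0, i), arg: str.slice(i + 1) };
}
```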
Adds tracks to the playlist at `id`. The JSON should be an array of YouTube ids. This is used in the node.js importer.
REMOVES ALL TRACKS FROM `id`. Use with caution!
Depending on the form called:

- `createid` - Will return the next valid playlist id which can be populated.
- `createid/source` - Will look for a playlist with the source `source`. If found, will return the id; otherwise, create a new one.
- `createid/source/tracks` - Will do the same as above, but if the source does not exist, will seed a new playlist with the JSON of tracks.
Removes the `ytid` from the favorite list of `user`.
Generates the 2x2 preview window for playlist `id`. This includes things like track count and duration. It should be run after updating a tracklist.
Returns a `select * from playlist where id = <id>` in JSON format. This includes the `methods`, `blacklists` (of removed tracks), `types`, `source`, and `playlist`.
Gets all the favorite tracks of user `user`.
Returns the preview for a specific playlist `id` in the following format:

- `tracks`: A list of triplets in the format [track length in seconds, title, ytid]
- `length`: The length of the playlist in seconds.
- `count`: The number of tracks in the playlist.
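Since `length` and `count` are both derivable from the triplets, a client can reconstruct the whole preview from `tracks` alone. A sketch (the function name is illustrative):

```javascript
// Rebuild the preview object from a list of [seconds, title, ytid]
// triplets: total length is the sum of the first elements, count is
// just the list size.
function preview(tracks) {
  return {
    tracks: tracks,
    length: tracks.reduce((sum, t) => sum + t[0], 0),
    count: tracks.length
  };
}
```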
Returns just the youtube-id entries for a playlist, delimited by a newline.
If `id` is specified, returns the db entry for that user; otherwise creates a new one.
Searches for `string` on YouTube, returning 20 results. The format returned is as follows:

- `query` - the query string searched.
- `vidList` - An array of objects with the keys `title`, `ytid`, and `length`.
- `url` - The url used for the query.
A set of previews for recent playlists.
Returns a set of videos related to YouTube-id `id` in the following format:

- `ytid` - The ytid used in the query.
- `related` - An array of related video objects with the keys `title` and `ytid`.
- `url` - The url used for the query.
REMOVES A PLAYLIST from the database with id `id`.
Adds `ytid` to user `id`'s favorite list.
Updates the entry for id.
Returns a list of the tracks either system wide or specific to a playlist id.
An in-service way to look up snippet, details, statistics, etc. from a list of ytids.
There's a function for displaying the most popular artists in a `console.table` using the `stats()` function (located in Utils.js). The implementation is a pretty good example of my db.js being utilized. Here's an example output: