/baconizer

An interactive game based on the 6 Degrees of Kevin Bacon parlor game where players try to find the shortest path between actors

Primary LanguageJavaScript

Baconizer

Can you beat 6 degrees of separation? The goal of this game is to find the shortest path between any 2 actors by jumping between movies and actors.

Baconizer basics

Baconizer is based on the 6 Degrees of Kevin Bacon parlor game.

Baconizer win

Technologies

  • Backend: Node.JS, Express
  • Frontend: D3.js, JavaScript, HTML & CSS
  • Hosting: Heroku

Features

Massive Database

Baconizer has a local database of over 100,000 actors and over 200,000 movies, comprising about 100MB of raw data. This database was obtained by making millions of API calls to TMDb (not to be confused with IMDb) and then pruning and cleaning the responses after writing them all to local files. For example, movies with a single actor were removed as they would not provide a link between actors.

Every time the backend server starts up, it reads through this local database and holds references to every actor and movie as nodes in a large graph. This allows for highly responsive performance when traversing this data.

Manufacturing a proprietary database and ensuring that kind of speed and performance was necessary based on the scope of the operations we're using in the guided gameplay functionality described below.

Guided Gameplay through Hints

hint process

Hints are key to the Baconizer gameplay experience: they provide help when you get stuck and make the game playable for those who may not know a lot about movies and actors. But in order to get hints in the first place, we need to know all possible shortest paths to get from our start actor to our end actor, which seems like a massive undertaking.

Baconizer is able to gather these paths every time a player makes a move, recalculating the best paths based on the player's new position.

Here's how:

Bi-directional Breadth First Search

We need to work our way into the data starting from our center node and branching out one step at a time. We can improve upon the basic strategy by starting the same search from our endpoint as well and working both sides of the problem. The images below both represent the same graph. Notice how if we start from both sides, we waste a lot fewer resources checking dead ends before getting to the answer.

Bi-directional BFS

Here's simplified part of the BFS function we call when we have a current search rung of actors and we're trying to get to a new level of movies:

const actors2Movies = (queue, BFS1SeenMovies, BFS2SeenMovies, firstSteps) => {
  // If our search ends at this level, winnerIds will hold the answers
  let winnerIds = new Set();
  
  // Let's loop through our starting queue to get to all the movies they've collectively starred in
  for (let i = 0; i < queue.length; i++) {
    let origActorId = queue[i]
    
    // "actors" is the local object holding over 100,000 entries
    // Let's loop through all the movies one actor starred in
    let mIds = actors[origActorId].movie_ids
    for (let j = 0; j < mIds.length; j++) {
      let movieId = mIds[j]
      
      // If we've already processed this movie in this BFS, we don't need to do it again
      if (BFS1SeenMovies.has(movieId)) continue
      
      // firstSteps.actors holds keys of actor ids and values of Sets
      // These sets contain ids of every movie you can click on from our center to get to that actor the fastest
      if (!firstSteps.movies[movieId]) firstSteps.movies[movieId] = new Set()
      firstSteps.actors[origActorId].forEach(origFirstStep => {
        firstSteps.movies[movieId].add(origFirstStep)
      })
      
      // Now we can check to see if our sister BFS has already encountered this movie
      // If so, we've found a path!
      if (BFS2SeenMovies.has(movieId)) winnerIds.add(movieId)
    }
  }
  // truncated...
}

Using our data structure and this type of algorithm, we can return all possible hints in just milliseconds over 99% of the time. (Side note: It's so fast that if we let it run wild without an end goal, the low-end tier of Heroku's working memory typically runs out before the algorithm has run for 5 seconds 😬 )

D3.js Force-Directed Graph

To display all your next choices at any point, I used D3.js to make a force-directed graph with the center representing your current position and all surrounding nodes representing the next options.

force directed graph

D3 uses svg or canvas elements instead of normal javascript DOM manipulation to render a lot at once with comparatively better performance. It allows us to set different types of forces on our force-directed graphs to choose the best way to display our data:

const sim = d3.forceSimulation()
    .force("y", d3.forceY(this.height / 2).strength(.2))
    .force("charge", d3.forceManyBody().strength(-3000))
    .force("link", d3.forceLink().id(d => d.frameId))
    .force("collide", d3.forceCollide().radius(55))

You can also set d3-specific event handlers on your svg elements. These ones grow & shrink images on hover:

d3.select(".images")
    .on('mouseenter', function () {
        d3.select(this)
            .transition()
            .attr("x", function (d) { return -33; })
            .attr("y", function (d) { return -60; })
            .attr("height", 100)
            .attr("width", 66);
    })
    .on('mouseleave', function () {
        d3.select(this)
            .transition()
            .attr("x", function (d) { return -25; })
            .attr("y", function (d) { return -38; })
            .attr("height", 75)
            .attr("width", 50);
    });

Actor Search and Validation

Baconizer uses "input" event handlers on <input> elements and Regular Expression matching to search for and validate actor names in the database. As you type, it checks whether your entry matches any part of any actor's name and then returns the top 10 matching actors, sorted by TMDB's "popularity" metric.

actor search

Sorting by popularity comes in handy when you're searching through 100,000 names.

Here's the bulk of the regex matching function:

const suggest = (query) => {
    // Don't bother searching for exact matches or for anything matching only one character
    if (query.length < 2 || nameToId.hasOwnProperty(query)) return null;

    // The 'i' tag makes the search case-insensitive
    const reg = new RegExp(query, 'i')

    const results = [];
    for (let i = 0; i < noAccents.length; i++) {
        // noAccents is a collection of all names with any diacritical markings removed for easier comparison
        // For example, 'Malin Akerman' would now successfully match the actress 'Malin Ã…kerman'
        if (noAccents[i].match(reg)) results.push(allActorNames[i])
    }

    if (results.length === 0) return [];
    
    const sorted = results.sort((a,b) => {
        // Sort all matching results by our stored popularity metric
        if (nameToId[a].popularity > nameToId[b].popularity) return -1;
        return 1;
    })

    // Finally, return the 10 most popular actors matching the original query
    return sorted.slice(0,10)
}

As soon as the site recognizes an actor's full name in the input, that individual's photo appears in the box below the input and the border turns green to confirm that this actor is ready to go.

Node filtering

If you're ever overwhelmed by the amount of choices on the screen, you can use the filter feature to limit the nodes that show up on any page.

filter demo

Just like the actor validation functionality, filtering uses Regular Expressions to match the names or titles of the nodes on the screen

Future Updates

Baconizer's roadmap includes the following:

  • Revamping the database to exclude certain "cheaty" entries
    • "And the Oscar Goes To..." is a good example of a "cheaty" movie: It's basically a giant compilation of famous actors and how they won some Academy Award. It makes for a very effective link, but it doesn't stand on its own as a legitimate movie.
  • Alternate carousel or list view of all the center options with variable ordering