🎺 cspot

A Spotify Connect player written in C++, targeting, but not limited to, embedded devices (ESP32).

Currently in a state of rapid development.

Only to be used with Spotify Premium accounts.

Building

Prerequisites

Summary:

  • cmake (version 3.0 or higher)
  • gcc / clang for the CLI target
  • esp-idf for building for the esp32
  • portaudio for playback on macOS
  • downloaded submodules
  • golang (1.16)
  • protoc
  • on Linux you will additionally need:
    • libasound and libavahi-compat-libdnssd
  • mbedtls

This project uses submodules. Please make sure you clone with the --recursive flag, or run git submodule update --init --recursive after cloning.

MbedTLS is now the sole TLS option. You can either download and build it yourself or install it system-wide using your favorite package manager. See below for how to use a local build.

This library uses nanopb to generate C files from protobuf definitions. Nanopb itself is included via submodules, but it requires a few external Python libraries to run the generators.

To install them you can use pip:

$ sudo pip3 install protobuf grpcio-tools

(You probably should use venv, but I am no Python developer)

To install avahi and asound dependencies on Linux you can use:

$ sudo apt-get install libavahi-compat-libdnssd-dev libasound2-dev

Building for macOS/Linux & Windows

The cli target is used mainly for testing and development purposes. As of now, it has the same features as the esp32 target.

As MbedTLS is now used instead of OpenSSL, you need to either install it on your system or have a local build. If you have a system-wide install of MbedTLS, ignore what's below.

To use a local build, you have to specify the BELL_EXTERNAL_MBEDTLS and, if needed, MBEDTLS_RELEASE variables when running cmake. The first one points to the "./cmake" subdirectory of the MbedTLS build directory; the second optionally defines the name of the MbedTLS build (by default it is set to 'RELEASE' on Windows and 'NOCONFIG' elsewhere).

See running the CLI for information on how to run cspot on a desktop computer.

macOS/Linux

# navigate to the targets/cli directory
$ cd targets/cli

# create a build directory and navigate to it
$ mkdir -p build && cd build

# use cmake to generate build files, and select an audio sink
$ cmake .. -DUSE_PORTAUDIO=ON [-DBELL_EXTERNAL_MBEDTLS=<mbedtls_build_dir>/cmake] [-DMBEDTLS_RELEASE=<release_name>]

# compile
$ make 

Windows

# navigate to the targets/cli directory
$ cd targets/cli

# create a build directory and navigate to it
$ mkdir -p build && cd build

# use cmake to generate build files, and select an audio sink
$ cmake .. -A Win32|x64 -DUSE_PORTAUDIO=ON [-DBELL_EXTERNAL_MBEDTLS=<mbedtls_build_dir>/cmake] [-DMBEDTLS_RELEASE=<release_name>]

Open cspotcli.sln from the build directory in Visual Studio, or use msbuild from the command line.

Note that for now, only the Win32 build has been tested, not the x64 version. Under some Visual Studio releases, the protobuf sources might not be rebuilt automatically. If that happens, go to the "generate_proto_sources" project and press Ctrl+F7 on each *.pb.rule.

Building for Linux

The cli target is used mainly for testing and development purposes. As of now, it has the same features as the esp32 target.

# navigate to the targets/cli directory
$ cd targets/cli

# create a build directory and navigate to it
$ mkdir -p build && cd build

# use cmake to generate build files, and select an audio sink
$ cmake .. -DUSE_ALSA=ON

# compile
$ make 

See running the CLI for information on how to run cspot on a desktop computer.

Building for ESP32

The ESP32 target is built using the esp-idf toolchain.

# Follow the instructions for setting up esp-idf for your operating system, up to `. ./export.sh` or equivalent
# esp-idf has a Python virtualenv, install nanopb's dependencies in it
$ pip3 install protobuf grpcio-tools
# update submodules after each code pull to avoid build errors
$ git submodule update --init --recursive
# navigate to the targets/esp32 directory
$ cd targets/esp32
# run once after pulling the repo
$ idf.py set-target esp32

Configure CSPOT according to your hardware

# run visual config editor, when done press Q to save and exit
$ idf.py menuconfig

Navigate to Example Connection Configuration and provide your Wi-Fi connection details.

Navigate to CSPOT Configuration, where you can configure the device name, output device and audio quality.

Status LED

By default, LED indication is disabled, but you can use either a standard GPIO LED or an addressable LED to indicate cspot's current status. It uses different blinking patterns (and colors, in the case of addressable LEDs) to indicate Wi-Fi connectivity and the presence of a connected Spotify client.

Building and flashing

Build and upload the firmware

# compile
$ idf.py build

# upload
$ idf.py flash

The ESP32 will restart and begin running cspot. You can monitor it using a serial console.

Optionally run as single command

# compile, flash and attach monitor
$ idf.py build flash monitor

Running

The CLI version

After building the app, the only thing you need to do is run it from the command line.

$ ./cspotcli

If you run it with no parameters, it will use ZeroConf to advertise itself. This means that until at least one local Spotify Connect application has discovered it and connected to it, it will not be registered with Spotify's servers. As a consequence, Spotify's Web API will not be able to see it. If you want the player to be registered at start-up, you need to supply a username and password, either on every run or at least once to create a credentials file that you then reuse: run it with -u/-p/-c once, and then with -c only. See the command-line help for details.

Now open a real Spotify app and you should see a cspot device on your local network. Use it to play audio.

Architecture

External interface

cspot is meant to be used as a lightweight C++ library for playing back Spotify music and receiving control notifications from Spotify Connect. It exposes an interface for starting the communication with Spotify servers and expects the embedding program to provide an interface for playing back raw audio samples (AudioSink).

You can view the cspot-cli program for a reference on how to include cspot in your program. It provides a few audio sinks for various platforms and use cases:

  • ALSAAudioSink - Linux, requires libasound
  • PortAudioSink - macOS (PortAudio also supports more platforms, but we currently use it only on macOS), requires the PortAudio library
  • NamedPipeAudioSink - all platforms, writes to a file/FIFO pipe called outputFifo which can later be played back by FFmpeg. Used mainly for testing and development.

Additionally, the following audio sinks are implemented for the esp32 target:

  • ES9018AudioSink - provides playback via an ES9018 DAC connected to the ESP32
  • AC101AudioSink - provides playback via the AC101 DAC used in cheap ESP32 A1S audiokit boards, commonly found on AliExpress.
  • PCM5102AudioSink - provides playback via a PCM5102 DAC connected to the ESP32, commonly found in the shape of small purple modules at various online retailers. Wiring can be configured in the sink and defaults to:
    • SCK to Ground
    • BCK to GPIO27
    • DIN to GPIO25
    • LCK to GPIO32
    • GND to Ground
    • VIN to 3.3V (but supposedly 5V tolerant)
  • TODO: internal esp32 DAC for crappy quality testing.

You can also easily add support for your own DAC of choice by implementing your own audio sink. Each new audio sink must implement the void feedPCMFrames(std::vector<uint8_t> &data) method, which should accept stereo PCM audio data at 44100 Hz and 16 bits per sample. Please note that the sink should buffer the data in some way, because playing it back directly as it arrives may result in choppy audio.
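As a rough illustration of the buffering advice above, here is a minimal sketch of a sink that queues incoming PCM bytes so that a separate playback loop can consume them at its own pace. The AudioSink class below is only a stand-in that models the members mentioned in this README; the real cspot header may differ. BufferedAudioSink, drain and bufferedBytes are hypothetical names, and the use of a mutex assumes that feeding and playback happen on different threads.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// Stand-in for cspot's AudioSink base class (assumption: only the members
// described in this README are modelled; the real header may differ).
class AudioSink {
public:
    virtual ~AudioSink() = default;
    virtual void feedPCMFrames(std::vector<uint8_t> &data) = 0;
    virtual void volumeChanged(uint16_t /*volume*/) {}
    bool softwareVolumeControl = true;
};

// Hypothetical sink: stores incoming bytes in a queue instead of playing
// them immediately, so the output device can be fed at a steady rate.
class BufferedAudioSink : public AudioSink {
public:
    // Append the incoming PCM bytes to the queue.
    void feedPCMFrames(std::vector<uint8_t> &data) override {
        std::lock_guard<std::mutex> lock(mutex_);
        buffer_.insert(buffer_.end(), data.begin(), data.end());
    }

    // Remove and return up to maxBytes from the front of the queue.
    // A playback loop would call this and hand the bytes to the DAC driver.
    std::vector<uint8_t> drain(std::size_t maxBytes) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::size_t n = std::min(maxBytes, buffer_.size());
        const auto end = buffer_.begin() + static_cast<std::ptrdiff_t>(n);
        std::vector<uint8_t> out(buffer_.begin(), end);
        buffer_.erase(buffer_.begin(), end);
        return out;
    }

    // Number of bytes currently waiting to be played.
    std::size_t bufferedBytes() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return buffer_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<uint8_t> buffer_;
};
```

A real sink would also cap the queue size and decide what to do when it runs empty; both are left out here to keep the sketch short.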

An audio sink can optionally implement the void volumeChanged(uint16_t volume) method, which is called every time the user changes the volume (for example via Spotify Connect). If an audio sink implements it, it should set softwareVolumeControl to false in its constructor to let cspot know to disable the software volume adjustment. Properly implementing external volume control (for example via dedicated hardware) will result in better playback quality, since the full dynamic range is used to encode the samples.
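The following sketch shows what a sink with external volume control might look like. As before, the AudioSink class is a stand-in that models only what this README describes, and HardwareVolumeSink, lastVolume and bytesReceived are hypothetical names. Storing the volume in a member is a placeholder for whatever a real sink would do, such as writing to a DAC register.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for cspot's AudioSink base class (assumption: only the members
// described in this README are modelled; the real header may differ).
class AudioSink {
public:
    virtual ~AudioSink() = default;
    virtual void feedPCMFrames(std::vector<uint8_t> &data) = 0;
    virtual void volumeChanged(uint16_t /*volume*/) {}
    bool softwareVolumeControl = true;
};

// Hypothetical sink that handles volume itself.
class HardwareVolumeSink : public AudioSink {
public:
    HardwareVolumeSink() {
        // Tell cspot not to scale the samples in software.
        softwareVolumeControl = false;
    }

    void feedPCMFrames(std::vector<uint8_t> &data) override {
        // Placeholder: a real sink would pass the bytes to its output device.
        bytesReceived_ += data.size();
    }

    void volumeChanged(uint16_t volume) override {
        // Placeholder for a hardware write, e.g. a DAC volume register.
        lastVolume_ = volume;
    }

    uint16_t lastVolume() const { return lastVolume_; }
    std::size_t bytesReceived() const { return bytesReceived_; }

private:
    uint16_t lastVolume_ = 0;
    std::size_t bytesReceived_ = 0;
};
```

How the 16-bit volume value maps onto a particular DAC's volume scale is hardware-specific and is not covered by this sketch.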

The embedding program should also handle caching the authentication data, so that the user does not have to authenticate via the local network (Zeroconf) each time cspot is started. For a reference on how to do it, please refer to the cspot-cli target (it stores the data in authBlob.json).
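One simple way an embedding program could approach this caching is to treat the authentication data as an opaque string and keep it in a file. The sketch below does only that: saveBlob and loadCachedBlob are hypothetical helpers, not part of cspot, and nothing here assumes anything about the contents or format of the blob.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: write the blob to a file, replacing any old contents.
// Returns true if the data was written successfully.
bool saveBlob(const std::string &path, const std::string &blob) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;
    }
    out << blob;
    out.flush();
    return out.good();
}

// Hypothetical helper: read a previously saved blob.
// Returns false (leaving blob untouched) if the file cannot be opened,
// in which case the program would fall back to Zeroconf authentication.
bool loadCachedBlob(const std::string &path, std::string &blob) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    std::ostringstream contents;
    contents << in.rdbuf();
    blob = contents.str();
    return true;
}
```

At start-up the program would call loadCachedBlob first and only wait for a local Spotify client when it returns false, saving the blob once authentication succeeds.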

Internal details

The connection with Spotify servers to play music and receive control information is pretty complex. First of all, an access point address must be fetched from Spotify (ApResolve fetches the list from http://apresolve.spotify.com/). Then a PlainConnection with the selected Spotify access point must be established. It is then upgraded to an encrypted ShannonConnection.
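Before a connection can be opened, the selected access point address has to be split into a host and a port. The sketch below assumes the address is a single "host:port" string; that format, the AccessPoint struct and parseAccessPoint are assumptions made for illustration and are not taken from cspot's ApResolve code.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical holder for one access point.
struct AccessPoint {
    std::string host;
    uint16_t port;
};

// Hypothetical helper: split "host:port" at the last colon.
// Throws std::invalid_argument if the host or port part is missing or the
// port does not start with a number, and std::out_of_range if the port
// does not fit into 1..65535.
AccessPoint parseAccessPoint(const std::string &address) {
    const std::size_t colon = address.rfind(':');
    if (colon == std::string::npos || colon == 0 ||
        colon + 1 == address.size()) {
        throw std::invalid_argument("expected host:port");
    }
    const unsigned long port = std::stoul(address.substr(colon + 1));
    if (port == 0 || port > 65535) {
        throw std::out_of_range("port out of range");
    }
    return AccessPoint{address.substr(0, colon), static_cast<uint16_t>(port)};
}
```

The resulting host and port are what a plain TCP connection would be opened to before it is upgraded to the encrypted connection described above.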