ESP32 Audio Docks is a range of development boards (earlier, docks) that allow you to develop audio solutions based on ESP32 chips. They were created to make getting started with audio development as easy and inexpensive as possible.
HiFi-ESP | Loud-ESP | Louder-ESP |
---|---|---|
Work in progress, coming soon... |
- ESP32 Audio Docks and Louder ESP
I have spent the last few years developing different solutions based on ESP devices. It all started with the ESP8266, where CPU power is not really sufficient for real-time decoding, so you're limited to rather simple ding-dong business. Then the ESP32 came, bringing two much more capable cores, so you have a powerhouse that can handle communication and decoding at the same time. Perhaps most importantly, it also came with SPIRAM, so you can do decent buffering (essential for streamed content). Now new ESP32 C-Series and S-Series chips are entering the market, and their potential is mostly unrealized as of today.
I created these docks, and subsequently the development boards, to be able to quickly prototype across the whole range of ESP8266 and ESP32 chips, starting with the simplest finger-sized toys and going all the way up to full-sized speakers.
First generation docks
| | ESP Audio Solo | ESP Audio Duo | HiFi-ESP | Louder ESP |
|---|---|---|---|---|
| MCU | ESP8266, ESP32C3, ESP32S2 Mini modules | ESP32 Mini Module | ESP32 Mini Module | ESP32 Mini Module |
| DAC | Single I2S DAC (MAX98357) with built-in Class-D amp | Dual I2S DAC (MAX98357) with built-in Class-D amp | PCM5100A 32-bit stereo DAC, -100 dB typical noise level | Stereo I2S DAC (TAS5805M) with built-in Class-D amp |
| Output (4Ω) | 3W | 2x 3W | Non-amplified stereo output | 2x 32W (4Ω, 1% THD+N) |
| Output (8Ω) | 1.5W | 2x 1.5W | Non-amplified stereo output | 2x 22W (8Ω, 1% THD+N) |
| PSRAM | - | 8MB PSRAM (4MB usable) | 8MB PSRAM (4MB usable) | 8MB PSRAM (4MB usable) |
| Connectivity | WiFi (ESP8266, ESP32S2); WiFi + BT5.0 (ESP32C3) | WiFi + BT4.2 + BLE | WiFi + BT4.2 + BLE | WiFi + BT4.2 + BLE; Ethernet |

| | HiFi-ESP32 | HiFi-ESP32S3 | Loud-ESP32 | Loud-ESP32S3 | Louder-ESP32 | Louder-ESP32S3 |
|---|---|---|---|---|---|---|
| | Work in progress, coming soon... | Work in progress, coming soon... | | | | |
| MCU | ESP32-WROVER-N16R8 | ESP32-S3-WROOM-N16R8 | ESP32-WROVER-N16R8 | ESP32-S3-WROOM-N16R8 | ESP32-WROVER-N16R8 | ESP32-S3-WROOM-N16R8 |
| DAC | PCM5100A 32-bit stereo DAC, -100 dB typical noise level | PCM5100A 32-bit stereo DAC, -100 dB typical noise level | Dual I2S DAC (MAX98357) with built-in Class-D amp | Dual I2S DAC (MAX98357) with built-in Class-D amp | Stereo I2S DAC (TAS5805M) with built-in Class-D amp | Stereo I2S DAC (TAS5805M) with built-in Class-D amp |
| Output (4Ω) | Non-amplified stereo output, 2.1V RMS | Non-amplified stereo output, 2.1V RMS | 2x 5W | 2x 5W | 2x 32W (4Ω, 1% THD+N) | 2x 32W (4Ω, 1% THD+N) |
| Output (8Ω) | Non-amplified stereo output | Non-amplified stereo output | 2x 3W | 2x 3W | 2x 22W (8Ω, 1% THD+N) | 2x 22W (8Ω, 1% THD+N) |
| PSRAM | 8MB PSRAM (4MB usable) over 40MHz SPI | 8MB PSRAM over 80MHz QSPI | 8MB PSRAM (4MB usable) over 40MHz SPI | 8MB PSRAM over 80MHz QSPI | 8MB PSRAM (4MB usable) over 40MHz SPI | 8MB PSRAM over 80MHz QSPI |
| Power | 5V over USB-C; 2x LP5907 3.3 V Ultra-Low-Noise LDO for analog section | 5V over USB-C; 2x LP5907 3.3 V Ultra-Low-Noise LDO for analog section | 5V (up to 2.5A) from USB-C | 5V (up to 2.5A) from USB-C | Up to 26V from external PSU; 5V over USB-C with power limited to 2x5W | Up to 26V from external PSU; 5V over USB-C with power limited to 2x5W |
| Connectivity | WiFi + BT4.2 + BLE; W5500 Ethernet (optional module) | WiFi + BLE; W5500 Ethernet (optional module) | WiFi + BT4.2 + BLE; W5500 Ethernet (optional module) | WiFi + BLE; W5500 Ethernet (optional module) | WiFi + BT4.2 + BLE; W5500 Ethernet (optional module) | WiFi + BLE; W5500 Ethernet (optional module) |

Audio streaming requires proper buffering to work; even with the ESP32's 500K of RAM, it is a challenging task. For that reason, most projects will require WROVER modules that have onboard PSRAM chips. All ESP32 Audio boards have an 8MB PSRAM chip onboard, connected via a high-speed interface. Any code using PSRAM will just work out of the box.
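To get a feel for why the PSRAM matters, here is a back-of-the-envelope sketch. The stream parameters (a 128 kbit/s compressed stream and 44.1 kHz 16-bit stereo PCM) and the helper name `buffer_seconds` are assumptions picked for illustration; they are not taken from any of the firmware examples. The internal RAM figure is also optimistic, since the application itself needs most of it.

```python
# Rough estimate of how many seconds of audio fit into a buffer.
# Stream parameters below are illustrative assumptions, not firmware values.

def buffer_seconds(buffer_bytes: int, bytes_per_second: float) -> float:
    """Seconds of audio that a buffer of the given size can hold."""
    return buffer_bytes / bytes_per_second

INTERNAL_RAM = 500 * 1024        # ~500K of ESP32 RAM (upper bound, app code needs a share)
USABLE_PSRAM = 4 * 1024 * 1024   # 4MB usable PSRAM on the ESP32 boards

MP3_128K = 128_000 / 8           # assumed 128 kbit/s compressed stream, in bytes/s
PCM_CD = 44_100 * 2 * 2          # assumed 44.1 kHz, 16-bit, stereo PCM, in bytes/s

for name, size in (("internal RAM", INTERNAL_RAM), ("PSRAM", USABLE_PSRAM)):
    print(f"{name}: {buffer_seconds(size, MP3_128K):.1f} s of compressed stream, "
          f"{buffer_seconds(size, PCM_CD):.1f} s of decoded PCM")
```

Under these assumptions, the usable PSRAM holds roughly 24 seconds of decoded PCM, while the whole internal RAM would hold less than 3 seconds.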
First generation docks
| | I2S CLK | I2S DATA | I2S WS |
|---|---|---|---|
| ESP8266 | 15 | 3 | 2 |
| ESP32-C3 | 5 | 20 | 6 |
| ESP32-S2 | 12 | 37 | 16 |

| | I2S CLK | I2S DATA | I2S WS | PSRAM CE | PSRAM CLK |
|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16 | 17 |

| | I2S CLK | I2S DATA | I2S WS | PSRAM CE | PSRAM CLK |
|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16 | 17 |

| | I2S CLK | I2S DATA | I2S WS | PSRAM CE | PSRAM CLK | TAS5805 SDA | TAS5805 SCL | TAS5805 PWDN | TAS5805 FAULT |
|---|---|---|---|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16 | 17 | 21 | 27 | 33 | 34 |
| ESP32-S3 | 14 | 16 | 15 | - | - | 8 | 9 | 17 | 18 |

| | I2S CLK | I2S DATA | I2S WS | PSRAM RESERVED |
|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16, 17 |
| ESP32-S3 | 14 | 16 | 15 | 35, 36, 37 |

| | I2S CLK | I2S DATA | I2S WS | PSRAM RESERVED |
|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16, 17 |
| ESP32-S3 | 14 | 16 | 15 | 35, 36, 37 |

| | I2S CLK | I2S DATA | I2S WS | PSRAM RESERVED | TAS5805 SDA | TAS5805 SCL | TAS5805 PWDN | TAS5805 FAULT |
|---|---|---|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16, 17 | 21 | 27 | 33 | 34 |
| ESP32-S3 | 14 | 16 | 15 | 35, 36, 37 | 8 | 9 | 17 | 18 |

| | SPI CLK | SPI MOSI | SPI MISO | SPI CS | SPI HOST/SPEED | ETH INT | ETH RST |
|---|---|---|---|---|---|---|---|
| ESP32 | 18 | 23 | 19 | 5 | 2/20MHz | 35 | 14 |
| ESP32-S3 | 12 | 11 | 13 | 10 | SPI2/20MHz | 6 | 5 |

| | IR IN | RGB OUT | OLED SPI HOST/SPEED | OLED SPI CLK | OLED SPI MOSI | OLED SPI MISO | OLED SPI CS | OLED SPI DC | OLED RST |
|---|---|---|---|---|---|---|---|---|---|
| ESP32 | 39 | 12 | 2/20MHz | 18 | 23 | 19 | 15 | 4 | 32 |
| ESP32-S3 | 7 | 9 | SPI2/20MHz | 12 | 11 | 13 | 39 | (37) | 38 |

In the software section, several firmware examples are provided.
- esp32-i2s-bare is a bare I2S implementation built directly on ESP-IDF.
- esp32-i2s-esp8266audio is based on the excellent ESP8266Audio library (it works with the whole ESP range, don't be fooled by the name), providing a minimal code implementation.
- esp32-i2s-web-radio is based on the same library, providing a minimal web-radio stream player. It expects a playlist as input in the 'data' folder.
- Squeezelite-ESP32 - see more details below
All samples are provided as PlatformIO IDE projects. After installing it, open the sample project and select the proper environment based on your dock. Run the `Build` and `Upload` commands to install the necessary tools and libraries, then build and upload the project to the board. Communication and selection of the proper upload method will be handled by the IDE automatically.
Follow the ESP8266Audio library guide. Default settings will work out of the box with ESP8266 and ESP32 boards. For ESP32C3 and ESP32S2 boards, please adjust the pinout according to the section above.
Being an ESP32-based device, it can easily be integrated into your Home Assistant using ESPHome. Start with the ESPHome web installer, which will give you a base ESPHome install and WiFi configuration in minutes. Some S2/S3 boards have issues with the web installer; you may need to use the Adafruit flasher instead, with binaries pulled from HA.
Install instructions
Next, navigate to your Home Assistant (assuming you have the ESPHome integration installed) and adopt the newly created node.
ESPHome will give you ESPHome configs for Solo board running with ESP32-S2/S3, as well as Duo/HiFi-ESP and Louder ESP working with ESP32.
A few words of explanation:
- `media_player` publishes the media player into Home Assistant, so you can use it together with the native player or Music Assistant. You have a volume knob in HA as well.
- Volume is set to 50% on player start. Especially for Louder-ESP32, this is helpful :)
The true power of a native speaker in HA is the use of automations. Here is one example that I find useful: this simple automation speaks every hour between 8 AM and 9 PM. Another one is used to announce bedtime, you get the point...
Squeezelite-ESP32 is a multimedia software suite that started as a renderer (or player) for LMS (Logitech Media Server). It has since been extended with:
- Spotify over-the-air player using SpotifyConnect (thanks to cspot)
- AirPlay controller (iPhone, iTunes ...), with synchronized multi-room playback as well (although it's AirPlay 1 only)
- Traditional Bluetooth device (iPhone, Android)
And LMS itself:
- Streams your local music and connects to all major online music providers (Spotify, Deezer, Tidal, Qobuz) using Logitech Media Server - a.k.a LMS with multi-room audio synchronization.
- LMS can be extended by numerous plugins and can be controlled using a Web browser or dedicated applications (iPhone, Android).
- It can also send audio to UPnP, Sonos, Chromecast, and AirPlay speakers/devices.
All ESP32-based boards are tested with Squeezelite-ESP32 software, which can be flashed using nothing but a web browser. You can use Squeezelite-ESP32 installer for that purpose.
Use Installer for ESP Audio Dock to flash firmware first. It has been preconfigured to work with ESP Audio boards and will configure all hardware automatically.
Install instructions
- Select the correct device first.
- Connect the device to the USB port and select it from the list.
- Press Flash and wait around 2 minutes.
- (Optional) You may enter the serial console to get more information.
- The device is now in recovery mode. Connect to the squeezelite-299fac WiFi network with the password squeezelite (your network name suffix will be different).
- When redirected to the captive portal, let the device scan for WiFi networks and provide valid credentials.
- You can use the provided IP address (http://192.168.1.99/ on the screenshot) to access the settings page.
- (Optional) You may change device names to something close to your heart.
- Exit recovery.
You can use it now
Bluetooth | Spotify Connect | AirPlay | LMS Renderer |
---|---|---|---|
If you have the optional Ethernet module on the board, please put the config matching your MCU into the NVS settings.
For ESP32:
eth_config = model=w5500,cs=5,speed=20000000,intr=35,rst=14
spi_config = mosi=23,clk=18,host=2,miso=19
For ESP32-S3:
eth_config = model=w5500,cs=10,speed=20000000,intr=6,rst=5
spi_config = mosi=11,clk=12,host=2,miso=13
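The two configs follow directly from the Ethernet pinout table. As a small sketch of that mapping, the helper below rebuilds both strings from the pin numbers; the function name `nvs_ethernet_settings` and the dictionary layout are made up for illustration and are not part of Squeezelite-ESP32.

```python
# Rebuild the eth_config / spi_config NVS strings from the Ethernet pinout.
# Helper name and dict layout are hypothetical; only the output format
# mirrors the config lines quoted in this section.

ETH_PINS = {
    "ESP32":    {"clk": 18, "mosi": 23, "miso": 19, "cs": 5,  "intr": 35, "rst": 14},
    "ESP32-S3": {"clk": 12, "mosi": 11, "miso": 13, "cs": 10, "intr": 6,  "rst": 5},
}

def nvs_ethernet_settings(mcu: str, host: int = 2, speed_hz: int = 20_000_000) -> dict:
    """Return the two NVS values for a W5500 module on the given MCU."""
    p = ETH_PINS[mcu]
    return {
        "eth_config": f"model=w5500,cs={p['cs']},speed={speed_hz},intr={p['intr']},rst={p['rst']}",
        "spi_config": f"mosi={p['mosi']},clk={p['clk']},host={host},miso={p['miso']}",
    }

for mcu in ETH_PINS:
    for key, value in nvs_ethernet_settings(mcu).items():
        print(f"{mcu}: {key} = {value}")
```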
Please visit the hardware section for board schematics and PCB designs. Note that PCBs are shared as multi-layer PDFs.
First generation docks
| Image | Legend |
|---|---|
| | MAX98357 DAC; Speaker Terminal |

| Image | Legend |
|---|---|
| | MAX98357 DAC; Speaker Terminals; 8MB PSRAM IC |

| Image | Legend |
|---|---|
| | PCM5100A DAC; Speaker Terminals; 8MB PSRAM IC; Ultra-Low noise LDO; 3V3 Voltage regulator |

| Image | Legend |
|---|---|
| | TAS5805M DAC; Speaker Terminals; 8MB PSRAM IC; 3V3 Drop-Down voltage regulator (powers ESP32); Input Voltage terminal |
| (REV B, C, D) | TAS5805M DAC; Speaker Terminals; 8MB PSRAM IC (hidden under ESP32 module); 3V3 Drop-Down voltage regulator (powers ESP32, hidden under ESP32 module); Input Voltage terminal |

Image |
---|
coming soon...
Image |
---|
Every board has a header that lets you solder in a W5500 SPI Ethernet module, which is very easy to find. The only downside is that with the module installed, the board will not fit the case unless the case is cut to accommodate the extra height.
HiFi-ESP32(S3) | Louder-ESP32(S3) |
---|---|
image coming soon... |
The TAS5805M DAC allows 2 modes of operation: BTL (stereo) and PBTL (parallel, or mono). In mono mode, the amp will use a completely different modulation scheme and will basically fully synchronize the output drivers. Jumpers on the board allow both output drivers to be connected to the same speaker. The most important step is to inform the amp to change modulation in the first place via an I2C command. In the case of squeezelite, the dac_controlset value is the following:
dac_controlset: `{"init":[{"reg":3,"val":2},{"reg":3,"val":3},{"reg":2,"val":4}],"poweron":[{"reg":3,"val":3}],"poweroff":[{"reg":3,"val":0}]}`
compared to default:
dac_controlset: `{"init":[{"reg":3,"val":2},{"reg":3,"val":3}],"poweron":[{"reg":3,"val":3}],"poweroff":[{"reg":3,"val":0}]}`
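The only difference between the two values is one extra write in the `init` sequence (register 2, value 4). A minimal sketch of deriving the PBTL value from the default one, assuming you prefer to generate the JSON rather than edit it by hand; the helper names `with_pbtl` and `to_nvs` are hypothetical.

```python
import json

# Default squeezelite-esp32 control set for the TAS5805M, as quoted above.
DEFAULT = {
    "init": [{"reg": 3, "val": 2}, {"reg": 3, "val": 3}],
    "poweron": [{"reg": 3, "val": 3}],
    "poweroff": [{"reg": 3, "val": 0}],
}

def with_pbtl(controlset: dict) -> dict:
    """Return a copy with the extra init write that selects PBTL (mono) mode."""
    result = json.loads(json.dumps(controlset))  # simple deep copy of plain JSON data
    result["init"].append({"reg": 2, "val": 4})
    return result

def to_nvs(controlset: dict) -> str:
    """Serialize without whitespace, matching the compact form used above."""
    return json.dumps(controlset, separators=(",", ":"))

print(to_nvs(with_pbtl(DEFAULT)))
```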
One can test audio with a single speaker connected between L and R terminals (plus on one side and minus on the other). Optionally, jumpers on the board will effectively connect the second driver in parallel doubling the current capability.
An important point: this will send only one channel to the output; that’s just how the DAC works. True mono as (L+R)/2 is possible via more in-depth configuration (very poorly documented), but I haven’t managed to configure that on the test stand yet. I’m still working on that (along with a few more really cool DSP features that this DAC has, like EQ, subwoofer mode and tone compensation settings).
| | BTL | PBTL |
|---|---|---|
| Description | Bridge Tied Load, Stereo | Parallel Bridge Tied Load, Mono |
| Rated Power | 2×23W (8-Ω, 21 V, THD+N=1%) | 45W (4-Ω, 21 V, THD+N=1%) |
| Schematics | | |
| Speaker Connection | | |

Starting from Rev E, an additional header is exposed to allow datasheet-specced connectivity.
Image | Legend |
---|---|
Stereo Mode - leave open | |
Mono (PBTL) Mode, close horizontally |
The TAS5805M DAC has a very powerful DSP that allows a lot of data processing to be done on the silicon, which would otherwise take a considerable part of your CPU time. At the time of writing, it is a mostly undiscovered part of the DAC, since unfortunately TI is not making it very easy for developers. (A minute of complaint.) To be more specific: (A) you need to be a proven hardware manufacturer to get access to the configuration software, namely PurePath; (B) you need to apply for a personal license and go through an approval process, and after a few weeks of waiting you get access to the one DAC configuration you asked for; (C) you find out that it only works with TI's own evaluation board, which will set you back $250, if you are able to find one. Otherwise, all you have is a list of I2C commands that you need to transfer to the device on your own. No wonder no one knows how to use it.
But moaning aside, here is what you get afterwards:
- Flexible input mixer with gain corrections
- 15 EQ with numerous filter configurations
- 3-band Dynamic Range Compression with flexible curve configuration
- Automatic Gain Limiter with flexible configuration
- Soft clipper
- and a few other things
At this moment it is very experimental. In a perfect world, you should be able to adjust all of those settings to make your speaker-enclosure setup work the best it can, and even factor your room into the equation. But given the above disclaimer, I can only deliver a limited set of configurations corresponding to the most common use cases:
- Stereo mode with enabled DRC (Loudness) and AGL settings
- Full range Mono mode with DRC (Loudness) and AGL settings
- Subwoofer Mono mode with a few filter frequency options
- Bi-Amp configuration with a few crossover frequency options
All of the above are available right now for experimentation. I'm keen to hear your feedback while I move forward with porting this to other software options:
- Bare I2S TAS5805M library
- espragus-snapclient software
- squeezelite-esp32 (to do)
- flexible configurations with on-the-fly configuration changes
The barrel jack used has a 6mm hole and a 2mm pin, which typically corresponds to a 5.5/2.5mm plug on the male side.
The screw terminal is connected in parallel to the barrel jack, so you can use either interchangeably.
The power adapter specs depend on the speakers you're planning to use. DAC efficiency is close to 100%, so just take the power rating of your speakers (say 2x10W) and their impedance (say 8 ohms): you'd need at least 9 volts rated at 1.2 amps per channel, rounded up to 3 amps total.
It is not recommended to go beyond the voltage your speakers can take; otherwise, the amp will blow your speakers in no time.
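The rule of thumb above can be written down as a tiny calculator. This is a sketch under the same simplification (efficiency taken as 100%, so V = sqrt(P·R) and I = sqrt(P/R) per channel); the function name `psu_estimate` is hypothetical, and the result adds no headroom for signal peaks or losses, so treat it as a lower bound.

```python
import math

def psu_estimate(watts_per_channel: float, impedance_ohms: float, channels: int = 2):
    """Rule-of-thumb supply estimate, assuming ~100% amplifier efficiency.

    Returns (volts, amps_per_channel, total_amps_rounded_up).
    """
    volts = math.sqrt(watts_per_channel * impedance_ohms)             # V = sqrt(P * R)
    amps_per_channel = math.sqrt(watts_per_channel / impedance_ohms)  # I = sqrt(P / R)
    total_amps = math.ceil(amps_per_channel * channels)               # round up to whole amps
    return volts, amps_per_channel, total_amps

volts, amps, total = psu_estimate(10, 8)  # 2x10W speakers at 8 ohms
print(f"{volts:.1f} V, {amps:.2f} A per channel, {total} A total")
```

For the 2x10W, 8 ohm example this lands just under 9 volts and a little over 1.1 amps per channel, which rounds up to 3 amps total.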
HiFi-ESP32(S3), Loud-ESP32(S3) and Louder-ESP32(S3) are mechanically compatible with Raspberry Pi 3/4 cases (tested with transparent ones). Community members have also created a few 3D-printable designs that can be found here and here.
Hifi-ESP32 | Loud-ESP32 | Louder-ESP32 |
---|---|---|
Image coming soon... |
You may support my work by ordering these products at Tindie and Elecrow
- ESP Audio Dock at Tindie
- Louder ESP32 at Elecrow
- Louder-ESP32 and Louder-ESP32S3 at Tindie
- HiFi-ESP32 and HiFi-ESP32S3