Hadoop on AARCH64

This documents my process of deploying Hadoop and friends on an aarch64 system.


  1. Ubuntu 20.04
  2. Raspberry Pi 4 4GB

Package Versions

  1. Hadoop 2.10.0
  2. Hive 1.2.2
  3. OpenJDK 8



Building native extensions

  1. Dependencies
  2. Protobuf 2.5
  3. Hadoop

Install Dependencies

sudo apt-get install -y libsnappy-dev net-tools maven cmake unzip curl zip ssh vim-tiny pdsh openjdk-8-jdk zlib1g-dev libssl-dev autoconf automake libtool make g++ libsasl2-dev pkg-config

sudo apt-get install build-essential

Patch Protobuf 2.5

  1. Protobuf 2.5 does not support building on aarch64 out of the box.
  2. Apply the following patch to work around this before building:
  3. https://gist.github.com/BennettSmith/7111094

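The Protobuf steps above can be sketched roughly as follows. This is a minimal sketch, not the exact commands I ran: the release tarball URL layout, the local patch path, and the `-p1` strip level are all assumptions, and the patch has to be saved from the gist by hand first.

```shell
#!/usr/bin/env bash
# Sketch: build Protobuf 2.5.0 from source with the aarch64 patch applied.

protobuf_tarball_url() {
  # GitHub release tarball for a given version (assumed URL layout).
  local version="$1"
  echo "https://github.com/protocolbuffers/protobuf/releases/download/v${version}/protobuf-${version}.tar.gz"
}

build_protobuf() (
  # Runs in a subshell so the caller's working directory is left untouched.
  version="${1:-2.5.0}"
  patch_file="${2:-$HOME/protobuf-aarch64.patch}"   # hypothetical local copy of the gist patch

  curl -fL -o "protobuf-${version}.tar.gz" "$(protobuf_tarball_url "$version")" || exit 1
  tar xzf "protobuf-${version}.tar.gz" || exit 1
  cd "protobuf-${version}" || exit 1
  patch -p1 < "$patch_file" || exit 1               # strip level -p1 is an assumption
  ./configure --prefix=/usr/local && make && sudo make install && sudo ldconfig
)

# Usage (nothing is run automatically):
#   build_protobuf 2.5.0 ~/protobuf-aarch64.patch
#   protoc --version
```

If the patch does not apply cleanly, try a different strip level or compare the paths in the patch header with the unpacked source tree.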
Patch Hadoop 2.10.0

Hadoop 2.x does not build against OpenSSL 1.1.1. It only builds against OpenSSL 1.0.2, which is EOL and not installable on Ubuntu 20.04.

  1. https://issues.apache.org/jira/browse/HADOOP-16647
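A rough sketch of patching and building Hadoop with native libraries is shown below. The Maven invocation is the one described in Hadoop's BUILDING.txt for a native distribution. The source directory, the patch file name, and the `-p1` strip level are assumptions: fetch the patch from the issue above yourself and adjust the paths.

```shell
#!/usr/bin/env bash
# Sketch: apply the OpenSSL 1.1.1 patch and build Hadoop with native extensions.

hadoop_native_build_cmd() {
  # Maven command for a binary distribution with native code, tests skipped.
  echo "mvn package -Pdist,native -DskipTests -Dtar"
}

build_hadoop_native() (
  # Runs in a subshell so the caller's working directory is left untouched.
  src_dir="$1"      # unpacked Hadoop 2.10.0 source tree (path is an assumption)
  patch_file="$2"   # hypothetical local copy of the HADOOP-16647 patch

  cd "$src_dir" || exit 1
  patch -p1 < "$patch_file" || exit 1   # strip level -p1 is an assumption
  $(hadoop_native_build_cmd)            # word-splits into the mvn invocation above
)

# Usage (nothing is run automatically):
#   build_hadoop_native ~/hadoop-2.10.0-src ~/HADOOP-16647.patch
#   hadoop checknative -a    # after installing the build, lists which native libs loaded
```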


  1. Copy the configuration files into place
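Copying the configuration files might look like the sketch below. It assumes the edited `*.xml` files sit in a local directory of your choosing and that the target is `$HADOOP_HOME/etc/hadoop`, the default configuration directory in Hadoop 2.x. The function name and source path are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: copy prepared Hadoop *.xml configuration files into place.

install_hadoop_conf() {
  local src_dir="$1"                               # directory holding the edited *.xml files
  local conf_dir="${2:-$HADOOP_HOME/etc/hadoop}"   # Hadoop 2.x default config directory
  local f

  mkdir -p "$conf_dir" || return 1
  for f in "$src_dir"/*.xml; do
    [ -e "$f" ] || continue        # skip the literal pattern when nothing matches
    cp "$f" "$conf_dir/" || return 1
  done
}

# Usage (nothing is run automatically):
#   install_hadoop_conf ~/hadoop-conf
```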


  1. Hive needs MySQL 5.7 as its metastore database
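Pointing Hive at a MySQL metastore comes down to a few `javax.jdo.option.*` properties in `hive-site.xml`. The sketch below writes a minimal file. The host, database name, user, and password are placeholders, and the driver class assumes a MySQL Connector/J 5.x jar has been placed in `$HIVE_HOME/lib`.

```shell
#!/usr/bin/env bash
# Sketch: generate a minimal hive-site.xml for a MySQL-backed metastore.

write_hive_site() {
  local out_file="$1"
  local db_host="${2:-localhost}"    # placeholder values, adjust to your setup
  local db_name="${3:-metastore}"
  local db_user="${4:-hive}"
  local db_pass="${5:-changeme}"

  cat > "$out_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://${db_host}:3306/${db_name}?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>${db_user}</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>${db_pass}</value>
  </property>
</configuration>
EOF
}

# Usage (nothing is run automatically):
#   write_hive_site "$HIVE_HOME/conf/hive-site.xml" localhost metastore hive secret
#   schematool -dbType mysql -initSchema
```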

MySQL 5.7

  1. The Ubuntu 20.04 apt repositories only provide MySQL 8.0
  2. Install MySQL 5.7 via Docker instead
  3. https://scito.ch/content/mysql-57-docker-container-raspberry-pi-using-debian-sid
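Once an arm64 MySQL 5.7 image has been built along the lines of the article above, starting it could look like this sketch. The image tag and container name are hypothetical, and it assumes the image starts `mysqld` on port 3306 with its data under `/var/lib/mysql`. The helper only prints the command so it can be reviewed before running.

```shell
#!/usr/bin/env bash
# Sketch: print a docker run command for a locally built MySQL 5.7 image.

mysql57_docker_cmd() {
  local image="$1"                       # hypothetical tag of the locally built image
  local name="${2:-hive-metastore-db}"   # hypothetical container name
  # Detached container, port 3306 published, named volume for the data directory.
  echo "docker run -d --name ${name} -p 3306:3306 -v ${name}-data:/var/lib/mysql ${image}"
}

# Usage (nothing is run automatically):
#   mysql57_docker_cmd local/mysql57:arm64          # review the command
#   eval "$(mysql57_docker_cmd local/mysql57:arm64)"  # then run it
```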


  1. https://qiita.com/thashi/items/f8884b78df3faa5ff887
  2. https://scito.ch/content/mysql-57-docker-container-raspberry-pi-using-debian-sid
  3. https://www.linode.com/docs/databases/hadoop/how-to-install-and-set-up-hadoop-cluster/#architecture-of-a-hadoop-cluster
  4. https://github.com/dvoros/docker-hive
  5. https://dev.to/awwsmm/building-a-raspberry-pi-hadoop-spark-cluster-8b2