/speedy

A distributed Docker image storage system

Introduction

Speedy is a high-performance distributed Docker image storage solution written in Go and C. It scales out easily by adding more storage instances, and no data has to be moved between storage instances when it does.

Features

  • High-performance, efficient file storage engine written in C.
  • High availability through multiple copies of each storage instance and stateless front-end proxy image servers.
  • High controllability through a weak central master node. The upload/download process does not go through the master node.
  • High scalability by dynamically adding more storage instances and front-end proxy image servers.
  • Large files are divided into small chunks, which are uploaded and downloaded concurrently.
  • Built-in storage monitoring system.
  • Rich built-in operation tools.
  • The Docker registry 1.0 API is supported.
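
As an illustration of the chunking feature above, here is a minimal Python sketch of splitting a layer into fixed-size chunks and transferring them concurrently. It is not Speedy's actual driver code: `split_into_chunks`, `upload_concurrently`, `download_concurrently`, the 4-byte chunk size, and the in-memory `store` dictionary (standing in for requests to an imageserver) are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (index, chunk) pairs covering data in fixed-size pieces."""
    return [
        (offset // chunk_size, data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]


def upload_concurrently(data: bytes, put_chunk, workers: int = 4) -> int:
    """Send every chunk through put_chunk(index, chunk) using a thread pool.

    put_chunk is a hypothetical callable standing in for an upload request.
    Returns the number of chunks sent.
    """
    chunks = split_into_chunks(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all uploads and re-raises the first failure, if any.
        list(pool.map(lambda pair: put_chunk(*pair), chunks))
    return len(chunks)


def download_concurrently(chunk_count: int, get_chunk, workers: int = 4) -> bytes:
    """Fetch chunks 0..chunk_count-1 in parallel and reassemble them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the join is correct
        # even when chunks finish downloading out of order.
        return b"".join(pool.map(get_chunk, range(chunk_count)))


# In-memory stand-in for the storage backend (assumption for this sketch).
store = {}
layer = b"example docker layer"
chunk_count = upload_concurrently(layer, store.__setitem__)
restored = download_concurrently(chunk_count, store.__getitem__)
```

Because each chunk carries its index, the chunks can be transferred in any order and still be reassembled correctly.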

Upcoming Features

  • Online data transfer system.
  • Docker registry 2.0 API support.
  • More operation tools.

Architecture

[Figure: architecture diagram]

Components

  • docker-registry-speedy-driver
    Docker-registry backend storage driver for Speedy. It divides each Docker image layer into fixed-size chunks and uploads/downloads them concurrently.

  • imageserver
    A stateless front-end proxy server that provides a RESTful API for uploading and downloading Docker images. imageserver periodically gets chunkserver information and file IDs from chunkmaster, and uses that information to choose, on its own, a suitable chunkserver group in which to store each Docker image. Many imageservers can run at the same time to provide service, and docker-registry-speedy-driver can use any of them equally.

  • chunkmaster
    A central master node that maintains chunkserver information and allocates file IDs. chunkmaster stores chunkserver information in MySQL and keeps it in memory as a cache. When imageserver requests chunkserver information, chunkmaster sends it the in-memory copy. When imageserver requests file IDs, chunkmaster allocates a contiguous range of file IDs and sends it to imageserver.

  • chunkserver
    A storage engine highly optimized for performance and space efficiency. It appends small image files to large files and maintains the file index in memory, keeping I/O overhead to a minimum. Normally, a chunkserver group consists of 3 chunkservers. A write from imageserver to a chunkserver group succeeds only when the data has been stored successfully on every chunkserver in the group.

  • metaserver
    A separate distributed key-value store for image layer metadata. Since it is not open source yet, you can use MySQL instead to store the metadata.
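
To make the interaction between these components more concrete, here is a toy Python model of two behaviors described above: chunkmaster handing out contiguous file ID ranges, and a group write that succeeds only if every chunkserver in the group stores the data. It is a sketch under stated assumptions, not Speedy's implementation. The class names mirror the components, but every method name, the batch size, and the `healthy` flag are hypothetical. In this sketch imageserver fetches a new range when its current one runs out, and the group write is sequential for simplicity.

```python
class ChunkMaster:
    """Toy allocator that hands out contiguous, non-overlapping file ID ranges."""

    def __init__(self, batch_size: int = 100):
        self.batch_size = batch_size
        self.next_id = 1

    def allocate_range(self) -> range:
        start = self.next_id
        self.next_id += self.batch_size
        return range(start, self.next_id)


class ChunkServer:
    """Toy storage node; healthy=False simulates a node that rejects writes."""

    def __init__(self, healthy: bool = True):
        self.healthy = healthy
        self.files = {}

    def write(self, file_id: int, data: bytes) -> bool:
        if not self.healthy:
            return False
        self.files[file_id] = data
        return True


class ImageServer:
    """Toy proxy that draws file IDs from a locally cached range."""

    def __init__(self, master: ChunkMaster):
        self.master = master
        self.ids = iter(())  # empty until the first range is fetched

    def next_file_id(self) -> int:
        try:
            return next(self.ids)
        except StopIteration:
            # Local range exhausted: ask the master for a fresh one.
            self.ids = iter(self.master.allocate_range())
            return next(self.ids)

    def store(self, group, data: bytes) -> bool:
        """Write data to every server in the group; succeed only if all do."""
        file_id = self.next_file_id()
        results = [server.write(file_id, data) for server in group]
        return all(results)
```

Since each imageserver draws IDs from its own range, several imageservers can assign file IDs without contacting chunkmaster on every upload and without colliding with each other.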

How To Install

See INSTALL and USAGE.

Startup sequence

1. metaserver
2. chunkmaster
3. chunkserver
4. imageserver
5. docker-registry

After that you can push and pull docker images.

Performance Test

We ran a performance test of Speedy uploads and downloads. The test environment
consisted of 4 ordinary servers (CPU: 24 cores, 2 GHz; Memory: 16 GB; Disk: 300 GB
10k SAS; Ethernet: 1 Gigabit).

                       imageserver [node1]
                    /          |           \
                   /           |            \
    chunk-server[node2] chunk-server[node3] chunk-server [node4]

We used MySQL in place of our internal metaserver, and chunkserver ran in its
default buffered I/O mode.

  • Performance Results

concurrency  file size (MB)  file count  upload time (s)  download time (s)  upload speed (MB/s)  download speed (MB/s)
10           16              100         42.85            15.23              37.34                105.06
50           16              100         42.82            16.29              37.37                98.22
100          16              100         45.69            14.50              35.02                110.34
10           16              500         214.14           72.45              37.36                110.42
50           16              500         213.90           71.40              37.40                112.04
100          16              500         213.92           71.51              37.40                111.87
10           16              1000        427.97           147.78             37.39                108.27
50           16              1000        427.79           146.62             37.40                109.13
100          16              1000        427.80           142.81             37.40                109.13

The results show that download speed reaches the limit of the Ethernet link, about 110 MB/s.
Although upload speed appears to be only about 1/3 of download speed, uploads also
reach the Ethernet limit, because each upload is written to three chunkservers
concurrently.
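
The arithmetic behind this conclusion can be checked with a few lines of Python, using the first row of the results table. The sketch assumes that "M" means megabytes, that all three copies of each upload leave through the imageserver's single 1 Gigabit link, and that the link's nominal ceiling is 1000 / 8 = 125 MB/s before protocol overhead. The function and constant names are illustrative only.

```python
REPLICAS = 3               # chunkservers written per upload (from the text above)
LINK_MB_PER_S = 1000 / 8   # nominal 1 Gbit/s link: 125 MB/s before overhead


def throughput(file_size_mb: float, file_count: int, seconds: float) -> float:
    """Total payload transferred divided by elapsed time, in MB/s."""
    return file_size_mb * file_count / seconds


# First row of the table: 100 files of 16 MB each, 10 concurrent clients.
upload = throughput(16, 100, 42.85)      # about 37.34 MB/s
download = throughput(16, 100, 15.23)    # about 105.06 MB/s

# Each uploaded byte is sent to three chunkservers, so the traffic that
# actually crosses the link is three times the client-visible upload speed.
replicated_upload = upload * REPLICAS    # about 112 MB/s

print(f"upload {upload:.2f} MB/s, on the wire {replicated_upload:.2f} MB/s")
print(f"download {download:.2f} MB/s, link ceiling {LINK_MB_PER_S:.0f} MB/s")
```

Under these assumptions, the replicated upload traffic lands in the same range as the measured download speed, just below the nominal link ceiling.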