The Distributed File System was written as a project for the Innopolis University Distributed Systems course, Fall 2019.
- Run a server instance and install docker-compose on it.
- Clone the docker-compose file from the GitHub repository.
- Run `docker-compose up`.
- Run the required number of instances and install docker-compose on each of them.
- Clone the docker-compose file from the GitHub repository.
- In the `docker-compose.yml` file, on the `command` line, specify the private address of the naming server and the storage server's own private address, in that order.
- Run `docker-compose up`.
- Download `docker-compose.yml` for the Client.
- Run `docker-compose up -d` in the download directory.
- Run `docker ps` to get the container ID.
- Run `docker exec -ti <container ID> /bin/bash`.
- Inside the container, run `./Client <naming server IP>`.
- Here you go.
- Download the archive.
- Extract the JAR.
- Run `java -jar Client.jar <naming server IP>`.
- Here you go.
File system's users should be able to perform certain operations on files and directories.
On files: upload, download, create, remove, copy, move, get info.
On directories: create, remove, list.
All files are meant to be replicated on multiple storage servers so that the DFS becomes fault-tolerant: when a storage node fails or goes offline, the data is still accessible on the other storages that keep a replica of the file.
The system consists of a Client, a Naming Server and multiple Storage Servers.
The Naming Server keeps the filesTree: the structure of directories and files, plus the metadata of files. It assigns an IP to each file by which it can be accessed by Storages or the Client. The Naming Server also keeps the IP of each running Storage Server.
A Storage Server keeps files. Every 5 seconds it pings the Naming Server with a heartbeat. If no heartbeat arrives within 10 seconds, the Naming Server removes the Storage from the list of running Storages.
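The heartbeat bookkeeping described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the class `HeartbeatRegistry` and its methods are hypothetical names, and timestamps are passed in explicitly so that the example runs without a clock or a network.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: remembers when each Storage Server last sent a heartbeat.
class HeartbeatRegistry {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Record a heartbeat received from the given storage at the given time.
    void heartbeat(String storageIp, long nowMillis) {
        lastSeen.put(storageIp, nowMillis);
    }

    // Remove every storage whose last heartbeat is older than the timeout.
    void evictStale(long nowMillis) {
        lastSeen.entrySet().removeIf(e -> nowMillis - e.getValue() > timeoutMillis);
    }

    // Sorted snapshot of the storages currently considered running.
    Set<String> running() {
        return new TreeSet<>(lastSeen.keySet());
    }

    public static void main(String[] args) {
        HeartbeatRegistry registry = new HeartbeatRegistry(10_000); // 10 s timeout
        registry.heartbeat("10.0.0.2", 0);
        registry.heartbeat("10.0.0.3", 0);
        registry.heartbeat("10.0.0.2", 5_000); // next heartbeat arrives on time
        registry.evictStale(12_000);           // 10.0.0.3 has been silent for 12 s
        System.out.println(registry.running()); // prints [10.0.0.2]
    }
}
```

In a running server, `evictStale` would presumably be called periodically (for example from a scheduled task) with the current time.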
The Client is a console application that lets the user perform a set of operations on files and directories.
- `init` - clear all
- `touch <new_file_path>` - create empty file
- `get <remote path> (local path)` - download file
- `put <local path> (remote path)` - upload file
- `rm <path>` - delete file
- `info <path>` - file info
- `cp <from> <to>` - copy file
- `mv <from> <to>` - move file
- `cd <path>` - open directory
- `ls (path)` - read directory
- `mkdir (path)` - create directory
- `help` - list available commands
- `setdd <path>` - set a directory for downloads
- `getdd` - get current directory for downloads

`<arg>` - required argument, `(arg)` - optional argument
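As an illustration of the required/optional argument convention, here is a minimal sketch of how a console client could validate a command line before contacting the Naming Server. The class `CommandParser` is hypothetical and not taken from the project; the argument counts are copied from the list above, and paths containing spaces are assumed not to occur.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the console commands listed above.
class CommandParser {
    // Command name -> {number of required arguments, number of optional arguments}.
    private static final Map<String, int[]> ARITY = new HashMap<>();
    static {
        for (String c : new String[] {"init", "help", "getdd"}) ARITY.put(c, new int[] {0, 0});
        for (String c : new String[] {"touch", "rm", "info", "cd", "setdd"}) ARITY.put(c, new int[] {1, 0});
        for (String c : new String[] {"get", "put"}) ARITY.put(c, new int[] {1, 1});
        for (String c : new String[] {"cp", "mv"}) ARITY.put(c, new int[] {2, 0});
        for (String c : new String[] {"ls", "mkdir"}) ARITY.put(c, new int[] {0, 1});
    }

    // Splits the line on whitespace and checks the argument count.
    // Returns the tokens (command name first) or throws IllegalArgumentException.
    static List<String> parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        int[] arity = ARITY.get(tokens[0]);
        if (arity == null) {
            throw new IllegalArgumentException("Unknown command: " + tokens[0]);
        }
        int argc = tokens.length - 1;
        if (argc < arity[0] || argc > arity[0] + arity[1]) {
            throw new IllegalArgumentException("Wrong number of arguments for " + tokens[0]);
        }
        return Arrays.asList(tokens);
    }

    public static void main(String[] args) {
        System.out.println(parse("get /docs/a.txt"));       // prints [get, /docs/a.txt]
        System.out.println(parse("cp /docs/a.txt /b.txt")); // prints [cp, /docs/a.txt, /b.txt]
    }
}
```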
Possible Acknowledgements: OK
Possible Acknowledgements: OK, INCORRECT_NAME, FILE_OR_DIRECTORY_DOES_NOT_EXIST, FILE_OR_DIRECTORY_ALREADY_EXISTS
- storageIp - the Storage from which the Client should download the file
- fileId - globally unique file ID
Possible Response codes: OK, TOUCHED, INCORRECT_NAME, FILE_OR_DIRECTORY_DOES_NOT_EXIST, NO_NODES_AVAILABLE
- storageIP - the Storage to which the Client should upload the file
- fileId - globally unique file ID
- [replicasIPs] - IPs of Storages to which the file should be replicated
At step (4), after the file has been uploaded to the primary Storage, that Storage acknowledges the Client and the Naming Server. Right after that, the Storage starts sending the file to the replica storages; each of them notifies the Naming Server once its upload has finished.
Possible Response codes: OK, NO_NODES_AVAILABLE, INCORRECT_NAME, FILE_OR_DIRECTORY_DOES_NOT_EXIST, FILE_OR_DIRECTORY_ALREADY_EXISTS
Possible Acknowledgements: OK, CONFIRMATION_REQUIRED, FILE_OR_DIRECTORY_DOES_NOT_EXIST
Each Storage pings the Naming Server with a heartbeat every 5 seconds. Every 6th heartbeat is a fetchFile request. In response, the Naming Server sends the list of fileIPs that the Storage should keep (all other, excess files are removed). The Naming Server also sends a list of tuples {fileIP, storageIP}: the Storage should request the listed files from the corresponding storages and download them (this handles cases where replication failed during an upload from the Client).
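The reconciliation step described above can be sketched as below. This is an assumption-laden illustration rather than the project's implementation: `FetchFilesPlan`, `build` and `isFetchHeartbeat` are hypothetical names, file identifiers are plain strings, heartbeats are counted from 1, and files already present locally are assumed not to need downloading again.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of how a Storage Server could act on a fetchFile response.
class FetchFilesPlan {
    final Set<String> toDelete = new TreeSet<>();                 // excess local files
    final Map<String, String> toDownload = new LinkedHashMap<>(); // file -> source storage IP

    // localFiles: files stored on this node.
    // keep:       files the Naming Server says this node should keep.
    // sources:    {file, storageIP} pairs sent by the Naming Server.
    static FetchFilesPlan build(Set<String> localFiles, Set<String> keep,
                                Map<String, String> sources) {
        FetchFilesPlan plan = new FetchFilesPlan();
        for (String file : localFiles) {
            if (!keep.contains(file)) {
                plan.toDelete.add(file);
            }
        }
        for (Map.Entry<String, String> e : sources.entrySet()) {
            if (!localFiles.contains(e.getKey())) { // assumption: skip files we already hold
                plan.toDownload.put(e.getKey(), e.getValue());
            }
        }
        return plan;
    }

    // Every 6th heartbeat doubles as a fetchFile request (numbering starts at 1).
    static boolean isFetchHeartbeat(int heartbeatNumber) {
        return heartbeatNumber % 6 == 0;
    }

    public static void main(String[] args) {
        Set<String> local = new HashSet<>(Arrays.asList("f1", "f2"));
        Set<String> keep = new HashSet<>(Arrays.asList("f1", "f3"));
        Map<String, String> sources = new LinkedHashMap<>();
        sources.put("f3", "10.0.0.3");
        FetchFilesPlan plan = build(local, keep, sources);
        System.out.println(plan.toDelete);   // prints [f2]
        System.out.println(plan.toDownload); // prints {f3=10.0.0.3}
    }
}
```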
Possible Acknowledgements: OK, INCORRECT_NAME, FILE_OR_DIRECTORY_DOES_NOT_EXIST
The Naming Server keeps structures that map each unique fileID to the Storage paths by which the file can be reached. Thus, when copying, the Naming Server just adds a new path to the corresponding file's list.
Possible Acknowledgements: OK, INCORRECT_NAME, FILE_OR_DIRECTORY_DOES_NOT_EXIST, FILE_OR_DIRECTORY_ALREADY_EXISTS
The Naming Server keeps structures that map each unique fileID to the Storage paths by which the file can be reached. Thus, when moving, the Naming Server first adds the new path to the corresponding file's list and then deletes the fromPath from that list.
Possible Acknowledgements: OK, INCORRECT_NAME, FILE_OR_DIRECTORY_DOES_NOT_EXIST, FILE_OR_DIRECTORY_ALREADY_EXISTS
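The copy and move behaviour described above (paths attached to a fileID, with no file data touched) can be illustrated with the following sketch. `PathIndex` and its methods are hypothetical names chosen for this example, and the returned strings simply reuse the response code names from this document.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical index: each unique fileId maps to the set of paths that reach it.
class PathIndex {
    private final Map<String, Set<String>> pathsByFileId = new HashMap<>();

    // Attach a path to a file (used when a file is first created or uploaded).
    void register(String fileId, String path) {
        pathsByFileId.computeIfAbsent(fileId, k -> new LinkedHashSet<>()).add(path);
    }

    // Returns the fileId reachable at the given path, or null if there is none.
    String fileIdAt(String path) {
        for (Map.Entry<String, Set<String>> e : pathsByFileId.entrySet()) {
            if (e.getValue().contains(path)) {
                return e.getKey();
            }
        }
        return null;
    }

    // cp: only adds a new path to the file's list.
    String copy(String from, String to) {
        String fileId = fileIdAt(from);
        if (fileId == null) return "FILE_OR_DIRECTORY_DOES_NOT_EXIST";
        if (fileIdAt(to) != null) return "FILE_OR_DIRECTORY_ALREADY_EXISTS";
        pathsByFileId.get(fileId).add(to);
        return "OK";
    }

    // mv: adds the new path first, then removes the old one.
    String move(String from, String to) {
        String result = copy(from, to);
        if (result.equals("OK")) {
            pathsByFileId.get(fileIdAt(to)).remove(from);
        }
        return result;
    }

    public static void main(String[] args) {
        PathIndex index = new PathIndex();
        index.register("file-1", "/a.txt");
        System.out.println(index.copy("/a.txt", "/b.txt")); // prints OK
        System.out.println(index.move("/a.txt", "/c.txt")); // prints OK
        System.out.println(index.fileIdAt("/a.txt"));       // prints null
        System.out.println(index.fileIdAt("/c.txt"));       // prints file-1
    }
}
```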
Possible Acknowledgements: OK, FILE_OR_DIRECTORY_DOES_NOT_EXIST
The Naming Server checks whether the requested path exists in its fileTree and notifies the Client of success or failure. The current directory is displayed in the console.
Possible Acknowledgements: OK, FILE_OR_DIRECTORY_DOES_NOT_EXIST
Possible Acknowledgements: OK, FILE_OR_DIRECTORY_DOES_NOT_EXIST, FILE_OR_DIRECTORY_ALREADY_EXISTS
In our project, custom protocols based on Socket connections are used. Acknowledgements are sent as Objects through an ObjectOutputStream. These objects contain a Response Code and some optional fields, depending on the type of command.
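To make the acknowledgement format more concrete, here is a minimal sketch of a serializable acknowledgement object written with `ObjectOutputStream` and read back with `ObjectInputStream`. The class `Ack`, its fields and the helper methods are assumptions made for this example; only the response code names come from this document. A byte array stands in for the socket streams so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Response codes as named in this document.
enum ResponseCode {
    OK, NO_NODES_AVAILABLE, INCORRECT_NAME, FILE_OR_DIRECTORY_DOES_NOT_EXIST,
    FILE_OR_DIRECTORY_ALREADY_EXISTS, TOUCHED, CONFIRMATION_REQUIRED
}

// Hypothetical acknowledgement: a response code plus optional fields.
class Ack implements Serializable {
    private static final long serialVersionUID = 1L;

    final ResponseCode code;
    final String storageIp; // optional, may be null
    final String fileId;    // optional, may be null

    Ack(ResponseCode code, String storageIp, String fileId) {
        this.code = code;
        this.storageIp = storageIp;
        this.fileId = fileId;
    }

    // Serialize the acknowledgement; over a socket the stream would wrap
    // socket.getOutputStream() instead of a byte buffer.
    static byte[] toBytes(Ack ack) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(ack);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Deserialize an acknowledgement produced by toBytes.
    static Ack fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Ack) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Ack sent = new Ack(ResponseCode.OK, "10.0.0.2", "file-1");
        Ack received = fromBytes(toBytes(sent));
        System.out.println(received.code + " " + received.storageIp + " " + received.fileId);
        // prints OK 10.0.0.2 file-1
    }
}
```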
During file download/upload (get/put), the Client and Storage Server use the following TCP-like protocol:
Code | Meaning | Commands |
---|---|---|
OK | successful execution | all |
NO_NODES_AVAILABLE | no nodes available for uploading a file | put, get |
INCORRECT_NAME | requested path is "/" | put, info, touch, get, cp, mv |
FILE_OR_DIRECTORY_DOES_NOT_EXIST | the requested path does not exist | put, info, get, cp, mv, cd, ls, mkdir |
FILE_OR_DIRECTORY_ALREADY_EXISTS | the path requested for creation already exists | put, cp, mv, mkdir, rm |
TOUCHED | everything is OK, but the requested file has no content | get |
CONFIRMATION_REQUIRED | procedure requires confirmation for continuing | rm |
- Java
- Docker
- Gradle
CI is set up for the project!
Push to GitHub -> automatic Gradle build -> push to Docker Hub
- Elena Lukyanchikova, B17-SE-01 -- (client, documentation)
- Rim Rakhimov, B17-SB -- (naming server, deployment)
- Ruslan Shakirov, B17-SE-01 -- (storage server, deployment)