This repository focuses on creating shared libraries of the MapReduce project. The libraries would then be used within another project via explicit linking. For each class in this project create a base class interface (base class) that gets inherited by its corresponding subclass. IOn each case, after creating the .so file, it will be used in conjunction with the library file using dlopen API.
This folder will hold the following -
-
FileProcessorBase.hpp
Abstract implementation of the FileProcessor class. This class has two functions -
i) Create a memory object that represents all text files within a directory ii) Facilitate the loading of intermediate memory objects (Mapper, Reducer, Shuffler) to disk
-
FileProcessorInput.cpp The actual implementation that will load all raw data in a directory as a memory object. This is done by the runOperations virtual method.
-
FileProcessorMapOutput.cpp The actual implementation that will write the Mapper memory object to disk. This is done by the runOperations virtual method.
-
FileProcessorShufOutput.cpp The actual implementation that will write the Shuffler memory object to disk. This is done by the runOperations virtual method.
-
FileProcessorRedOutput.cpp The actual implementation that will write the Reducer memory object to disk. This is done by the runOperations virtual method.
The commands to build each implementation as a library are as follows -
Let's start by cd'ing to FileProcessor directory and creating the object file by running the g++ command
cd FileProcessor/
g++ -std=c++17 -c -o FileProcessorInput.o FileProcessorInput.cpp -fPIC
Now we will create a shared library using gcc
gcc -shared -o FileProcessorInput.so FileProcessorInput.o
You will see a file FileProcessorInput.so
created. We will move it to the libs folder.
mv FileProcessorInput.so libs/
Let's remove the object file
rm -f FileProcessorInput.o
Let's start by cd'ing to FileProcessor directory and creating the object file by running the g++ command
cd FileProcessor/
g++ -std=c++17 -c -o FileProcessorMapOutput.o FileProcessorMapOutput.cpp -fPIC
Now we will create a shared library using gcc
gcc -shared -o FileProcessorMapOutput.so FileProcessorMapOutput.o
You will see a file FileProcessorMapOutput.so
created. We will move it to the libs folder.
mv FileProcessorMapOutput.so libs/
Let's remove the object file
rm -f FileProcessorMapOutput.o
Let's start by cd'ing to FileProcessor directory and creating the object file by running the g++ command
cd FileProcessor/
g++ -std=c++17 -c -o FileProcessorShufOutput.o FileProcessorShufOutput.cpp -fPIC
Now we will create a shared library using gcc
gcc -shared -o FileProcessorShufOutput.so FileProcessorShufOutput.o
You will see a file FileProcessorShufOutput.so
created. We will move it to the libs folder.
mv FileProcessorShufOutput.so libs/
Let's remove the object file
rm -f FileProcessorShufOutput.o
Let's start by cd'ing to FileProcessor directory and creating the object file by running the g++ command
cd FileProcessor/
g++ -std=c++17 -c -o FileProcessorRedOutput.o FileProcessorRedOutput.cpp -fPIC
Now we will create a shared library using gcc
gcc -shared -o FileProcessorRedOutput.so FileProcessorRedOutput.o
You will see a file FileProcessorRedOutput.so
created. We will move it to the libs folder.
mv FileProcessorRedOutput.so libs/
Let's remove the object file
rm -f FileProcessorRedOutput.o
This folder will hold the following -
-
MapperBase.hpp
Abstract implementation of the Mapper class, describing interface details within MapperBase.hpp file. We have declared the base class implementation of MapperBase class. We have also declared class factory functions to generate objects. This will be used by calling programs later.
-
MapperImpl.cpp The implementation details for the subclass (Mapper) is provided with MapperImpl.cpp file. This will used by calling programs later.
The commands to build each implementation as a library are as follows -
Let's start by cd'ing to Mapper directory and creating the object file by running the g++ command
cd Mapper/
g++ -std=c++17 -c -o MapperImpl.o MapperImpl.cpp -fPIC
Now we will create a shared library using gcc
gcc -shared -o MapperImpl.so MapperImpl.o
You will see a file MapperImpl.so
created. We will move it to the libs folder.
mv MapperImpl.so libs/
Let's remove the object file
rm -f MapperImpl.o
This folder will hold the following -
-
ReducerBase.hpp
Abstract implementation of the Reducer class, describing interface details within ReducerBase.hpp file. We have declared the base class implementation of ReducerBase class. We have also declared class factory functions to generate objects. This will be used by calling programs later.
-
ReducerImpl.cpp The implementation details for the subclass (Reducer) are provided with ReducerImpl.cpp file. This will used by calling programs later.
The commands to build each implementation as a library are as follows -
Let's start by cd'ing to Reducer directory and creating the object file by running the g++ command
cd Reducer/
g++ -std=c++17 -c -o ReducerImpl.o ReducerImpl.cpp -fPIC
Now we will create a shared library using gcc
gcc -shared -o ReducerImpl.so ReducerImpl.o
You will see a file ReducerImpl.so
created. We will move it to the libs folder.
mv ReducerImpl.so libs/
Let's remove the object file
rm -f ReducerImpl.o
This folder will hold the following -
-
ShufflerBase.hpp
Abstract implementation of the Shuffler class, describing interface details within ShufflerBase.hpp file. We have declared the base class implementation of ShufflerBase class. We have also declared class factory functions to generate objects. This will be used by calling programs later.
-
ShufflerImpl.cpp The implementation details for the subclass (Shuffler) is provided with ShufflerImpl.cpp file. This will be used by calling programs later.
The commands to build each implementation as a library are as follows -
Let's start by cd'ing to Shuffler directory and creating the object file by running the g++ command
cd Shuffler/
g++ -std=c++17 -c -o ShufflerImpl.o ShufflerImpl.cpp -fPIC
Now we will create a shared library using gcc
gcc -shared -o ShufflerImpl.so ShufflerImpl.o
You will see a file ShufflerImpl.so
created. We will move it to the libs folder.
mv ShufflerImpl.so libs/
Let's remove the object file
rm -f ShufflerImpl.o
We rely on class factories and the extern keyword to extend the presence of class instances to downstream implementations.
E.g.
// Class factories - constructor
extern "C" FileProcessorBase* createInputObj(const std::string &operation, const std::string &path){
return new FileProcessorInput(operation, path);
}
// Class factories - destructor
extern "C" void removeInputObj(FileProcessorBase* fpObj){
delete fpObj;
}
We are function pointers (createInputObj
and removeInputObj
) that will internally create instances that points to BASE implementation.
These class factories are exposed using the extern keyword which specifies these symbols (createInputObj
and removeInputObj
) have external linkage.
In the base interface, there are type declarations which expose the class factory.
E.g.
// Signifies the Factory that will read input directories using createInputObj within FileProcessorInput.cpp implementation
typedef FileProcessorBase* create_t(const std::string &operation, const std::string &path);
// Signify the Factory that will remove the instance within FileProcessInput.cpp implementation
typedef void destroy_t(FileProcessorBase*);
These type declarations along with symbols (factory) are used to create symbolic links using the DLOPEN api. This is how we accomplish external linking.