/lsmv-leveldb

code analysis documents for https://github.com/google/leveldb

Primary LanguagePython

LevelDB code analysis

This is repository contains code analysis documents for google/leveldb.

Build Status

Table of Contents and Schedule

LevelDB Code Analysis

0. Using demo (~ end of May 2020)

C++ demos for making use of leveldb library. Write C++ code snippets as applications to call levelDB interfaces like Put, Get, Delete, Iterator, etc. This will create a simple database on local disk (and memory) with the directory structure described below and in here

1. Doxygen

I generated an HTML doc using doxygen to find the structure of the code. For example, the main interface lies in Classes -> Class List -> leveldb -> DB and we can see the 2 child classes of leveldb::DB, which are leveldb::DPImpl and leveldb::ModelDB. Also list of all members.

This part includes class definitions used by the following contents.

3. Write and Read Operations (~ Week 1, June 2020)

This part of doc will include

  • API for write and read
  • Overview of data structure involved, i.e. batch, log, memtable, sstable
  • The process of write and read
  • Snapshot mechanism

The main interfaces to the DB are declared in include/leveldb/db.h. A list of all members of DB class is here (Generated by doxygen). Virtual functions declared in the main interfaces listed:

virtual Status Put (const WriteOptions &options, const Slice &key, const Slice &value);
virtual Status Delete (const WriteOptions &options, const Slice &key);
virtual Status Write (const WriteOptions &options, WriteBatch *updates);
virtual Get (const ReadOptions &options, const Slice &key, std::string *value);
virtual Iterator * NewIterator (const ReadOptions &options);
virtual const Snapshot * GetSnapshot ();
virtual void ReleaseSnapshot (const Snapshot *snapshot);
virtual bool GetProperty (const Slice &property, std::string *value);
virtual void GetApproximateSizes (const Range *range, int n, uint64_t *sizes);
virtual void CompactRange (const Slice *begin, const Slice *end);

4. Implementation of in-memory table (~ Week 2, June 2020)

At present, the in-memory table of LevelDB is implemented as a skip-list. In this part I will look into the implementation of the skip-list.

Public member functions including:

MemTable (const InternalKeyComparator &comparator);
MemTable (const MemTable &)=delete;
MemTable & 	operator= (const MemTable &)=delete;
void Ref ();
void Unref ();
size_t ApproximateMemoryUsage ();
Iterator * NewIterator ();
void Add (SequenceNumber seq, ValueType type, const Slice &key, const Slice &value);
bool Get (const LookupKey &key, std::string *value, Status *s);

and there is a private member named table_, which is actually a skip-list, declared in memtable.h:

//memtable.h
// ...
  typedef SkipList<const char*, KeyComparator> Table;
  // ...
  Table table_;
// ...

5. Sorted tables and manifest (~ Week 2, June 2020)

On-disk file structures and their implementations and file operations in the source code (ref: doc/impl.md).

  • Log files (*.log)
  • Sorted tables (*.ldb)
  • Manifest (like index of the on-disk tables)
  • CURRENT: a simple text file that contains the name of the latest MANIFEST file.
  • Info logs (LOG and LOG.old)
  • Others (LOCK, *.dbtmp)

6. Compactions (~ Week 3, June 2020)

7. Bloom Filters and Other Utilities (~ Week 4, June 2020)

8. Recovery (~ Week 4, June 2020)

Multi-Version Algorithm Design