/flatjsondb

Flat file database for storing json and asciidoctor files.

Primary LanguageJava

FlatJsonDB

Main goals

  • Simple straight forward in process database

  • Store json and other text files hierarchical in the file system

  • Integrated apache lucene support

  • Mapping and handling relations between entities, including lazy loading

  • In process database

  • Support versioning via GIT(jgit), including support for submodules to get a distributed storage that can be synced

Quickstart

Create an entity

We create a new entity which has the following properties:

  • a name as natural id

  • a string attribute

  • simple auditing times like creationTime and updateTime

@Entity
public class ExampleEntity extends NamedEntity {
  private String attribute;

  protected ExampleEntity() {//jackson needs default constructor
    super(null);
  }

  public ExampleEntity(String name) {
    super(name);
  }

  public String getAttribute() {
    return attribute;
  }

  public ExampleEntity setAttribute(String attribute) {
    this.attribute = attribute;
    return this;
  }
}

Setup the repository and sessionfactory

//open a repository with a java.nio.Path
Repository repository = new Repository(Paths.get("myRepoPath"));

//open a new session factory
//the factory needs at least one repository and a package to scan or 1-n entity classes
//a factory needs to be closed again to release resources(mainly lucene index)
SessionFactory factory = new SessionFactory(repository, ExampleEntity.class);

/*
 * execute some code
 */

factory.close();//if we don't use a try-with-resource block we need to close the factory manually

Persist modify and delete

//persist an entity
factory.transactedSession(session -> {
  ExampleEntity entity = new ExampleEntity("Hello world!");
  assert entity.getId() == null;
  session.persist(entity);
  assert entity.getId() != null;//the unique id is written into the entity with the persist call
});

//modify entity
factory.transactedSession(session -> {
  ExampleEntity entity = session.findByNaturalId(ExampleEntity.class, "Hello world!");
  entity.setAttribute("Coffee");

  ExampleEntity entity2 = session.findByNaturalId(ExampleEntity.class, "Hello world!");
  assert entity==entity2;//same instance always returned
});

//read entity
ExampleEntity entity = factory.transactedSessionRead(session -> session.findByNaturalId(ExampleEntity.class, "Hello world!"));
assert "Coffee".equals(entity.getAttribute());//entity is now detached but can still be used.

//delete enity again
factory.transactedSession(session -> {
  ExampleEntity reloaded = session.findByNaturalId(ExampleEntity.class, "Hello world!");
  session.remove(reloaded);
});

The contents look like this:

scar@scar:~$ cd /tmp/myRepoPath/
scar@scar:/tmp/myRepoPath$ ls
ExampleEntity
scar@scar:/tmp/myRepoPath$ cd ExampleEntity/
scar@scar:/tmp/myRepoPath/ExampleEntity$ ls
Hello_world.json
scar@scar:/tmp/myRepoPath/ExampleEntity$ cat Hello_world.json
{
  "de.ks.flatadocdb.integration.ExampleEntity" : {
    "version" : 1,
    "id" : "720ce395f0a0a151f198877bd9122257db55cdbe",
    "creationTime" : [ 2015, 12, 6, 9, 28, 59, 50000000 ],
    "updateTime" : [ 2015, 12, 6, 9, 29, 9, 858000000 ],
    "name" : "Hello world!",
    "attribute" : "Coffee"
  }
}

Complete code

  @Test
  public void testExample() throws Exception {
    //open a repository with a java.nio.Path
    Repository repository = new Repository(myRepoPath);

    //open a new session factory
    //the factory needs at least one repository and a package to scan or 1-n entity classes
    //a factory needs to be closed again to release resources(mainly lucene index)
    try (SessionFactory factory = new SessionFactory(repository, ExampleEntity.class)) {
      factory.transactedSession(session -> {
        ExampleEntity entity = new ExampleEntity("Hello world!");
        assert entity.getId() == null;
        session.persist(entity);
        assert entity.getId() != null;//the unique id is written into the entity with the persist call
      });

      //modify entity
      factory.transactedSession(session -> {
        ExampleEntity entity = session.findByNaturalId(ExampleEntity.class, "Hello world!");
        entity.setAttribute("Coffee");

        ExampleEntity entity2 = session.findByNaturalId(ExampleEntity.class, "Hello world!");
        assert entity==entity2;//same instance always returned
      });

      //read entity
      ExampleEntity entity = factory.transactedSessionRead(session -> session.findByNaturalId(ExampleEntity.class, "Hello world!"));
      assert "Coffee".equals(entity.getAttribute());//entity is now detached but can still be used.

      //delete enity again
      factory.transactedSession(session -> {
        ExampleEntity reloaded = session.findByNaturalId(ExampleEntity.class, "Hello world!");
        session.remove(reloaded);
      });
    }
//    factory.close(); if we don't use a try-with-resource block we need to close the factory manually
  }

Components

The flatjsondb consists of one or more repositories.
All repository entries are entities which are registered at a global metamodel.
Each repository has its own indexes, including the lucene index for searching.

Repository

A repository is generally a folder on your filesystem containing a bunch of json/text or other files which are mapped as entities. It might look like the following:

scar@scar:/tmp/tempRepo$ pwd
/tmp/tempRepo

scar@scar:/tmp/tempRepo$ ls -lA
total 8
drwxr-xr-x 2 scar scar 4096 Dec  6 07:58 .lucene
drwxr-xr-x 2 scar scar 4096 Dec  6 07:58 .index
drwxr-xr-x 2 scar scar 4096 Dec  6 07:58 .git
drwxr-xr-x 2 scar scar 4096 Dec  6 07:58 TestEntity

scar@scar:/tmp/tempRepo$ cd TestEntity/
scar@scar:/tmp/tempRepo/TestEntity$ ls -lA
total 4
-rw-r--r-- 1 scar scar 262 Dec  6 07:58 blubber.json

scar@scar:/tmp/tempRepo/TestEntity$ cat blubber.json
{
  "de.ks.flatadocdb.metamodel.TestEntity" : {
    "version" : 1,
    "id" : "3708a8ca06b62afd2d3d9b1039702b5b61e59e40",
    "creationTime" : [ 2015, 12, 6, 7, 58, 27, 909000000 ],
    "updateTime" : null,
    "name" : "blubber",
    "attribute" : "Steak"
  }
}

In addition to the entities it contains the index files:

  • .git git repository

  • .lucene lucene index files

  • .index faltadocdb index used to prevent file system scanning and parsing at startup.

As a repository is manged by git you can use it in a distributed way:
For example you can have a clone on your local computer and a seperate one on a notebook and sync both via wlan.+ Or you can manage one main clone on a cloud storage and push to it from different machines.

Different repositories can be used eg. for private, family or work stuff.

Entity

An entity is any java class annotated with @Entity.

@Entity
public class TestEntity extends NamedEntity {
...

As you can see we already provide some base classes (NamedEntity and BaseEntity) you can extend from. Those are just suggestions, you can always use the annotations to map your own entities.

An entity has to fulfill the following requirements:

  • Annotated witht @Entity

  • 1 Verision field @Version long version;

  • 1 Id field @Id String version;

  • For jackson it needs to have a default constructor

Entity ID

The ID of an entity is the SHA1 checksum of the relative path in the repository(which is unique).

Lifecycle callbacks

Lifecycle callbacks are support via parameter-less methodds annotated with the following:

  • @PostLoad

  • @PostPersist

  • @PostRemove

  • @PostUpdate

  • @PrePersist

  • @PreRemove

  • @PreUpdate

Relations

An entity can contain relations to other entities. These relations are always mapped via 1-n IDs. If an entity contains another entity that is not annotated as a relation, this entity will be stored by the persister(like an embeddable). The following annotations exist:

  • @ToMany → maps a List or a Set of entities

  • @ToOne → maps a single entity

  • @Children → maps a List or a Set of entities

  • @Child → maps a single child entity

The child* annotations have a special meaning. They define that those entities are not stored in their usual folder, but in a subfolder of their parent’s directory.

Related entities are always persisted with their parent. However removal of the relation owner will not remove the related entities.

Customize Entity loading and saving

The @Entity annotation contains some field defining the behaviour on how and where the entity is loaded/saved;

  • FolderGenerator → Generates the target folder to store the entity in

  • FileGenerator → Generates the file name for the entity

  • EntityPersister → used to load/save an entity. Custom implementations can be used for eg. asciidoctor, xml whatever.

Session

The session is the main object to manipulate entities. It must only be used by a single thread. The following methods are important:

  • findById → finds an entity by the sha1 id

  • findByNaturalId → finds an entity by the natural id, eg. name

  • persist(Object) → stores an entity, multiple calls for the same entity will result in a NOOP

  • remove(Object) → deletes an entity

  • lucene(…​) → provides read access to an IndexSearcher of lucene

Updates of entities are done automatically by a dirty check. The dirty check implementation is quite simple right now. It generates the file contents it would write to the file system and compares the md5 sum with the previous md5 sum.

Combine the session with your own transactions

The session provides the usual methods you need to include it in your own transaction environment:

  • prepare

  • commit

  • rollback

However due to the mass of file operations the commit phase can break down and destroy you 2 phase commit.

LuceneIndex

Lucene is included to provide search access for all entities. Sadly lucene is not that fast when I commit the current state. Therefore the LuceneIndex is not transaction safe. I try to ensure that it is only updated after a successful commit, but this is can still go wrong.

By default all Enum, String or primitive fields and collections/arrays thereof are indexed. You can generate your own list of indexable fields by using an own imlementation of LuceneDocumentExtractor (See @Entity).