/DoctrineBatchUtils

:loop: A set of utilities to operate with Doctrine ORM's batch processing techniques

Primary LanguagePHPMIT LicenseMIT

DoctrineBatchUtils

This repository attempts to ease the pain of dealing with batch-processing in the context of Doctrine ORM transactions.

This repository is maintained by Patrick Reimers (PReimers).

License Current release Build Status

Installation

Supported installation method is via Composer:

composer require ocramius/doctrine-batch-utils

Current features

As it stands, the only implemented utility in this repository is an IteratorAggregate that wraps around a DB transaction and calls ObjectManager#flush() and ObjectManager#clear() on the given EntityManager.

Example (array iteration)

It can be used as following:

use DoctrineBatchUtils\BatchProcessing\SimpleBatchIteratorAggregate;

$object1  = $entityManager->find('Foo', 1);
$object2  = $entityManager->find('Bar', 2);

$iterable = SimpleBatchIteratorAggregate::fromArrayResult(
    [$object1, $object2], // items to iterate
    $entityManager,       // the entity manager to operate on
    100                   // items to traverse before flushing/clearing
);

foreach ($iterable as $record) {
    // operate on records here
}

$record freshness

Please note that the $record inside the loop will always be "fresh" (managed state), as the iterator re-fetches it on its own: this prevents you from having to manually call ObjectManager#find() on your own for every iteration.

Automatic flushing/clearing

In this example, the EntityManager will be flushed and cleared only once, but if there were more than 100 records, then it would flush (and clear) twice or more.

Example (query/iterators)

The previous example is still not memory efficient, as we are operating on a pre-loaded array of objects loaded by the ORM.

We can use queries instead:

use DoctrineBatchUtils\BatchProcessing\SimpleBatchIteratorAggregate;

$iterable = SimpleBatchIteratorAggregate::fromQuery(
    $entityManager->createQuery('SELECT f FROM Files f'),
    100 // flush/clear after 100 iterations
);

foreach ($iterable as $record) {
    // operate on records here
}

Or our own iterator/generator:

use DoctrineBatchUtils\BatchProcessing\SimpleBatchIteratorAggregate;

// This is where you'd persist/create/load your entities (a lot of them!)
$results = function () {
    for ($i = 0; $i < 100000000; $i += 1) {
        yield new MyEntity($i); // note: identifier must exist in the DB
    }
};
 
$iterable = SimpleBatchIteratorAggregate::fromTraversableResult(
    $results(),
    $entityManager,
    100 // flush/clear after 100 iterations
);

foreach ($iterable as $record) {
    // operate on records here
}

// eventually after all records have been processed, the assembled transaction will be committed to the database

Both of these approaches are much more memory efficient.