This PHP library offers a way to read data from, and write data to, a range of file formats and media. Additionally, it includes tools to manipulate your data.
- Read from and write to CSV files, Excel files, databases, and more.
- Convert between charsets, dates, strings and objects on the fly.
- Build reusable and extensible import workflows.
- Decoupled components that you can use on their own, such as a CSV reader and writer.
- Well-tested code.
This library is available on Packagist. The recommended way to install it is through Composer:
$ composer require ddeboer/data-import:@stable
For integration with Symfony2 projects, the DdeboerDataImportBundle is available.
Each data import revolves around the workflow and takes place along the following lines:
- Construct a reader.
- Construct a workflow and pass the reader to it. Add at least one writer to the workflow.
- Optionally, add filters, item converters and value converters to the workflow.
- Process the workflow. This will read the data from the reader, filter and convert the data, and write the output to each of the writers.
So, schematically:
use Ddeboer\DataImport\Workflow;
use Ddeboer\DataImport\Reader;
use Ddeboer\DataImport\Writer;
use Ddeboer\DataImport\Filter;
$reader = new Reader\...;
$workflow = new Workflow($reader);
$workflow->addWriter(new Writer\...());
->addWriter(new Writer\...())
->addFilter(new Filter\CallbackFilter(...))
->process();
This library includes a handful of readers:
- An
ArrayReader
for reading arrays (or testing your workflow). - A
CsvReader
for reading CSV files, optimized to use a little memory as possible:use Ddeboer\DataImport\Reader\CsvReader; $reader = new CsvReader(new \SplFileObject('/path/to/csv_file.csv'));
- A
DbalReader
to read data through Doctrine’s DBAL:use Ddeboer\DataImport\Reader\DbalReader; $reader = new DbalReader( $connection, // Instance of \Doctrine\DBAL\Connection 'SELECT u.id, u.username, g.name FROM `user` u INNER JOIN groups g ON u.group_id = g.id' );
- A
DoctrineReader
to read data through the Doctrine ORM:use Ddeboer\DataImport\Reader\DoctrineReader; $reader = new DoctrineReader($entityManager, 'Your\Namespace\Entity\User');
- An
ExcelReader
that acts as an adapter for the PHPExcel library:use Ddeboer\DataImport\Reader\ExcelReader; $reader = new ExcelReader(new \SplFileObject('/path/to/ecxel_file.xls'));
- You can create your own data reader by implementing the ReaderInterface.
After you’ve set up your reader, construct the workflow from it:
$workflow = new Workflow($reader);
Many of the data writers closely resemble their reader counterparts:
- An
ArrayWriter
. - A
CsvWriter
. - A
DoctrineWriter
.
Also available are:
- A
ConsoleProgressWriter
that displays import progress when you start the workflow from the command-line:use Ddeboer\DataImport\Writer\ConsoleProgressWriter; use Symfony\Component\Console\Output\ConsoleOutput; $output = new ConsoleOutput(...); $progressWriter = new ConsoleProgressWriter($output, $reader);
Build your own writer by implementing the WriterInterface.
If you want, you can use multiple writers:
$workflow->addWriter($progressWriter)
->addWriter($csvWriter);
A filter decides whether data input is accepted into the import process. The library currently ships with a CallbackFilter:
use Ddeboer\DataImport\Filter\CallbackFilter;
// Don’t import The Beatles
$filter = new CallbackFilter(function($data) {
if ('The Beatles' == $data['name']) {
return false;
} else {
return true;
}
});
$workflow->addFilter($filter);
Value converters are used to convert specific fields (e.g., columns in database).
-
A
DateTimeValueConverter
that converts a date representation in a format you specify into aDateTime
object: ```php use Ddeboer\DataImport\ValueConverter\DateTimeValueConverter;$converter = new DateTimeValueConverter('d/m/Y H:i:s'); $workflow->addValueConverter('my_date_field', $converter); ```
-
A
StringToObjectConverter
that looks up an object in the database based on a string value:use Ddeboer\DataImport\ValueConverter\StringToObjectConverter; $converter = new StringToObjectConverter($repository, 'name'); $workflow->addValueConverter('input_name', $converter);
use Ddeboer\DataImport\Workflow;
use Ddeboer\DataImport\Source\HttpSource;
use Ddeboer\DataImport\Source\Filter\Unzip;
use Ddeboer\DataImport\Reader\CsvReader;
use Ddeboer\DataImport\ValueConverter\DateTimeValueConverter;
(...)
// Create the source; here we use an HTTP one
$source = new HttpSource('http://www.opta.nl/download/registers/nummers_csv.zip');
// As the source file is zipped, we add an unzip filter
$source->addFilter(new Unzip('nummers.csv'));
// Retrieve the \SplFileObject from the source
$file = $source->getFile();
// Create and configure the reader
$csvReader = new CsvReader($file);
$csvReader->setHeaderRowNumber(0);
// Create the workflow
$workflow = new Workflow($csvReader);
$dateTimeConverter = new DateTimeValueConverter();
// Add converters to the workflow
$workflow
->addValueConverter('twn_datumbeschikking', $dateTimeConverter)
->addValueConverter('twn_datumeind', $dateTimeConverter)
->addValueConverter('datummutatie', $dateTimeConverter)
// You can also add closures as converters
->addValueConverter('twn_nummertm',
new \Ddeboer\DataImport\ValueConverter\CallbackValueConverter(
function($input) {
return str_replace('-', '', $input);
}
)
->addValueConverter('twn_nummervan',
new \Ddeboer\DataImport\ValueConverter\CallbackValueConverter(
function($input) {
return str_replace('-', '', $input);
}
)
// Use one of the writers supplied with this bundle, implement your own, or use
// a closure:
->addWriter(new \Ddeboer\DataImport\Writer\CallbackWriter(
function($csvLine) {
var_dump($csvLine);
}
));
// Process the workflow
$workflow->process();
The ArrayValueConverterMap is used to filter values of a multi-level array.
The converters defined in the list are applied on every data-item's value that match the defined array_keys.
//...
$data = array(
'products' => array(
0 => array(
'name' => 'some name',
'price' => '€12,16',
),
1 => array(
'name' => 'some name',
'price' => '€12,16',
),
)
);
// ...
// create the workflow and reader etc.
// ...
$workflow->addValueConverter(new ArrayValueConverterMap(array(
'name' => array(new CharsetValueConverter('UTF-8', 'UTF-16')), // encode to UTF-8
'price' => array(new CallbackValueConverter(function($input) { // remove € char
return str_replace('€', '', $intput);
}),
)));
// ..
// after filtering data looks as follows
$data = array(
'products' => array(
0 => array(
'name' => 'some name', // in UTF-8
'price' => '12,16',
),
1 => array(
'name' => 'some name',
'price' => '12,16',
),
)
);
The global-mapping allows you to define an array that is used to rename fields of an item.
Using global-mapping can be used to add renaming-rules for a multi-level array and is applied after the standard-mapping rules are applied.
//...
$data = array(
0 => array(
'foo' => 'bar',
'baz' => array(
'some' => 'value',
'some2' => 'value'
)
)
);
// ...
// create the workflow and reader etc.
// ...
// this defines a single mapping
$workflow->addMapping('baz', 'bazinga');
// this defines renaming global rules
$workflow->setGlobalMapping(array(
'foo' => 'fooloo',
'bazinga' => array( // we need to use the new name here because global mapping is applied after standard mapping
'some' => 'something',
'some2' => 'somethingelse'
)
));
// ..
// after filtering data looks as follows
$data = array(
0 => array(
'fooloo' => 'bar',
'bazinga' => array(
'something' => 'value',
'somethingelse' => 'value'
)
)
);