wizacode/php-etl

Add the possibility to chain extractors

ecourtial opened this issue · 2 comments

Imagine the following use case: we need data coming from two different sources, for instance two CSV files.
Both have a field in common, e.g. email.

Open them in Excel and sort them by email.
Edit the ETL so that Extractors can be chained to get the data from the two sources and merge it into one row. Since the two CSVs are sorted by email, we assume that each line in one file matches the line at the same position in the other file. We still throw an exception if the emails don't match.

Note that we must be able to specify the field in common between the two files.
Question: should we be able to merge n files, instead of only two?
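
To make the expected behaviour concrete, here is a minimal sketch of the merge step on rows that are already parsed and sorted. The function name `mergeSortedRows` is hypothetical and not part of php-etl. It takes the common field as a parameter, accepts any number of sources, and throws when the values of that field differ at a given position. Rejecting sources with different row counts is an extra assumption of this sketch.

```php
<?php
declare(strict_types=1);

/**
 * Merge rows from any number of sources that are already sorted by $key.
 * Hypothetical helper for illustration only; not part of php-etl.
 *
 * @param string $key        name of the field shared by all sources
 * @param array  ...$sources each source is a list of associative rows
 * @return array             list of merged rows
 */
function mergeSortedRows(string $key, array ...$sources): array
{
    // Re-index every source so that position $i means "the i-th row".
    $sources = array_map('array_values', $sources);

    // Assumption of this sketch: all sources must have the same length.
    $rowCounts = array_unique(array_map('count', $sources));
    if (count($rowCounts) > 1) {
        throw new RuntimeException('The sources do not have the same number of rows.');
    }

    $merged = [];
    $total = $sources === [] ? 0 : count($sources[0]);
    for ($i = 0; $i < $total; $i++) {
        $row = [];
        foreach ($sources as $source) {
            $current = $source[$i];
            if (!array_key_exists($key, $current)) {
                throw new RuntimeException("Row $i has no '$key' field.");
            }
            // Compare against the value kept from the previous sources.
            if ($row !== [] && $row[$key] !== $current[$key]) {
                throw new RuntimeException("Row $i: '$key' values do not match.");
            }
            $row = array_merge($row, $current);
        }
        $merged[] = $row;
    }

    return $merged;
}

// Usage with two in-memory sources, both sorted by email.
$contacts = [
    ['email' => 'a@example.com', 'name' => 'Ann'],
    ['email' => 'b@example.com', 'name' => 'Bob'],
];
$orders = [
    ['email' => 'a@example.com', 'total' => '10'],
    ['email' => 'b@example.com', 'total' => '25'],
];
$rows = mergeSortedRows('email', $contacts, $orders);
```

Because the sources are passed as a variadic argument, the same function covers the two-file case and the n-file case.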

The current structure of the pipeline does not allow chaining extractors. The pipeline only accepts one extractor, one transformer and one CSV. Supporting chained extractors would require a major refactoring of the pipeline object.

So the idea is instead to create a new MultifilesCsvExtractor, which itself relies on the common Csv Extractor.
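
A rough sketch of what such an extractor could look like, under stated assumptions: the constructor signature, the `extract()` method name and the private `readCsv()` helper are all hypothetical. `readCsv()` only stands in for the library's existing CSV extractor, whose real API is not shown in this issue. The files are read lazily and in lockstep, so the pipeline would still see a single extractor.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch; class shape and method names are assumptions,
 * not the actual php-etl API.
 */
final class MultifilesCsvExtractor
{
    /** @var string[] */
    private array $paths;
    private string $commonField;

    /** @param string[] $paths CSV files already sorted by $commonField */
    public function __construct(array $paths, string $commonField)
    {
        $this->paths = $paths;
        $this->commonField = $commonField;
    }

    /** Yield one merged row per line position across all files. */
    public function extract(): Generator
    {
        $readers = array_map(fn (string $path) => $this->readCsv($path), $this->paths);

        while (true) {
            $alive = array_filter($readers, fn (Generator $reader) => $reader->valid());
            if ($alive === []) {
                return; // every file is exhausted
            }
            if (count($alive) !== count($readers)) {
                throw new RuntimeException('The files do not have the same number of rows.');
            }

            $merged = [];
            foreach ($readers as $reader) {
                $row = $reader->current();
                if (!array_key_exists($this->commonField, $row)) {
                    throw new RuntimeException("Missing field '{$this->commonField}'.");
                }
                if ($merged !== [] && $merged[$this->commonField] !== $row[$this->commonField]) {
                    throw new RuntimeException("Field '{$this->commonField}' does not match.");
                }
                $merged = array_merge($merged, $row);
                $reader->next();
            }
            yield $merged;
        }
    }

    /** Stand-in for the single-file CSV extractor: first line is the header. */
    private function readCsv(string $path): Generator
    {
        $handle = fopen($path, 'r');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $path.");
        }
        try {
            $header = fgetcsv($handle, 0, ',', '"', '\\');
            if ($header === false || $header === [null]) {
                return; // empty file
            }
            while (($line = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
                if ($line === [null]) {
                    continue; // skip blank lines
                }
                if (count($line) !== count($header)) {
                    throw new RuntimeException("Malformed line in $path.");
                }
                yield array_combine($header, $line);
            }
        } finally {
            fclose($handle);
        }
    }
}
```

With this shape, the pipeline object stays untouched: it receives one extractor whose rows already contain the fields of every file.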

@ecourtial What do you think about something like #25?