AbsaOSS/pramen

Add a File Source that allows loading files based on a date pattern

Closed this issue · 0 comments

Background

Currently, the file source always loads all files from the specified directory. This works well for lookup tables and similar static data. However, to implement idempotent ingestion that supports reruns, it would be beneficial to encode the information date in file names and load only the files that correspond to the current information date.

Feature

Add a File Source that allows loading files based on a date pattern.

Example

pramen.sources = [
  {
    name = "file_source"
    factory.class = "za.co.absa.ingestionaas.pramen.components.sources.FilePatternSource"
  }
]

operations = [
  #...
  {
    #...
    input.path = "s3://my_bucket/path/TEST_{{yyyyMMdd}}.*"
  }
]
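
To make the intended behavior concrete, here is a minimal sketch of how a date pattern such as the one in `input.path` might be resolved for a given information date. It is an illustration only, not Pramen's implementation. The function names `resolve_path` and `select_files` are hypothetical. The sketch assumes that `{{...}}` wraps a Java-style date pattern (only `yyyy`, `MM`, and `dd` are handled) and that the rest of the path is treated as a glob.

```python
import re
from datetime import date
from fnmatch import fnmatchcase

# Assumption: only these Java-style pattern tokens are supported in this sketch.
_TOKENS = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d")]

# Matches a {{...}} placeholder and captures the date pattern inside it.
_PLACEHOLDER = re.compile(r"\{\{(.+?)\}\}")


def resolve_path(path_pattern: str, info_date: date) -> str:
    """Replace each {{...}} placeholder with the formatted information date."""

    def render(match: re.Match) -> str:
        # Escape literal percent signs before translating tokens to strftime.
        fmt = match.group(1).replace("%", "%%")
        for token, directive in _TOKENS:
            fmt = fmt.replace(token, directive)
        return info_date.strftime(fmt)

    return _PLACEHOLDER.sub(render, path_pattern)


def select_files(path_pattern: str, info_date: date, candidates: list[str]) -> list[str]:
    """Keep only the candidate paths that match the pattern for info_date."""
    glob = resolve_path(path_pattern, info_date)
    # Assumption: the resolved path is a glob, so '*' matches any characters.
    return [path for path in candidates if fnmatchcase(path, glob)]


if __name__ == "__main__":
    pattern = "s3://my_bucket/path/TEST_{{yyyyMMdd}}.*"
    listing = [
        "s3://my_bucket/path/TEST_20220115.csv",
        "s3://my_bucket/path/TEST_20220116.csv",
    ]
    # Only the file whose name encodes the information date is selected.
    print(select_files(pattern, date(2022, 1, 15), listing))
```

Because the selection depends only on the information date and the file names, rerunning the same date picks up the same files, which is the idempotency property described in the background.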