Garmelon/PFERD

Request: Make download summary data public

TheChristophe opened this issue · 1 comment

Hi,
I have a setup where my system automatically runs a script that runs PFERD to download ILIAS contents and then runs rclone to upload everything to the cloud. Since all of this is headless, I have set up my PFERD config to email me a list of new and changed files when it's done.
It's quite simple:

    summary = pferd._download_summary
    # Convert the Path objects to strings relative to the working directory
    new_files = [str(f.relative_to(cwd)) for f in summary.new_files]
    updated_files = [str(f.relative_to(cwd)) for f in summary.modified_files]
    mail.mail_update(new_files, updated_files)

Unfortunately, for this I have to use pferd._download_summary, which is private. Would it be possible to add a public method for retrieving a 'changelog' of sorts?

Upon successful completion, crawlers will leave a JSON file called .report in their output directory. This file includes information about file additions, deletions and changes and shouldn't be hard to parse/use.
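
Reading that file could look roughly like the sketch below. It is only a minimal example: the helper name `load_changes` is made up for illustration, and the top-level keys `added`, `changed` and `deleted` are assumptions about the report layout, so check a real .report file and adjust them if they differ.

```python
import json
from pathlib import Path


def load_changes(output_dir, keys=("added", "changed", "deleted")):
    """Return {key: [path strings]} read from the .report file in output_dir.

    The key names are assumptions about the report layout, not a
    documented interface; adjust them to match an actual .report file.
    """
    report_path = Path(output_dir) / ".report"
    with report_path.open(encoding="utf-8") as f:
        report = json.load(f)
    # Keys missing from the report become empty lists, so callers
    # do not have to special-case them.
    return {key: [str(p) for p in report.get(key, [])] for key in keys}
```

The returned lists can then be passed to whatever sends the notification, for example the `mail.mail_update(...)` call from the script above, without touching any private attributes.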