Support multiple configs with same filename

Question

mirague opened this issue 4 years ago · 6 comments

Describe the bug
In bank2ynab.conf, several configurations target the same file name, e.g. export.csv. Each of these configs writes the same fixed_ output file, so only the output of the last config survives (see the logs below).

What did you EXPECT to happen?
It should pick the config that manages to parse the first targeted file successfully. Given the below .csv files that should be SE Nordea.
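To illustrate the expected behaviour, here is a minimal sketch of "first config that parses successfully wins". It is not bank2ynab's actual code: `parse_with`, `first_matching_config`, and the delimiter and column settings in `CANDIDATES` are all made up for this example, and "successfully" is assumed to mean "at least one row with the expected number of columns".

```python
import csv
import io

# Hypothetical settings for illustration only; these are NOT the real
# bank2ynab.conf values for these banks.
CANDIDATES = [
    ("BE Keytrade Bank", {"delimiter": ";", "header_rows": 1, "columns": 7}),
    ("SE Nordea", {"delimiter": ",", "header_rows": 1, "columns": 5}),
]

SAMPLE = (
    "Datum,Transaktion,Kategori,Belopp,Saldo\n"
    '2020-05-05,Reservation Kortköp Spotify,,"-99,00",\n'
    '2020-05-05,Swish inbetalning JOHN DOE,,"3.000,00","8.778,76"\n'
)


def parse_with(config, text):
    """Hypothetical parser: keep rows whose column count matches the config."""
    reader = csv.reader(io.StringIO(text), delimiter=config["delimiter"])
    rows = list(reader)[config["header_rows"]:]
    return [row for row in rows if len(row) == config["columns"]]


def first_matching_config(candidates, text):
    """Return (name, rows) for the first config that yields any rows."""
    for name, config in candidates:
        rows = parse_with(config, text)
        if rows:
            return name, rows
    return None, []


name, rows = first_matching_config(CANDIDATES, SAMPLE)
print(name, len(rows))
```

With this approach, only the matching config would go on to write fixed_export.csv.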

What ACTUALLY happened?
Ends up with an incorrect empty fixed_export.csv file.

What did you DO? (steps to reproduce)
Steps to reproduce the behavior:

Create a file export.csv in your exports folder with the following content:

Datum,Transaktion,Kategori,Belopp,Saldo
2020-05-05,Reservation Kortköp Spotify,,"-99,00",
2020-05-03,Reservation Kortköp HEMKOP,,"-34,95",
2020-05-05,Swish inbetalning JOHN DOE,,"3.000,00","8.778,76"
2020-05-05,Kortköp 200999 YOU NEED A BUDGET,,"-50,19","5.778,76"
2020-05-05,Autogiro K*SomeStore,,"-943,00","5.828,95"
2020-05-04,Kortköp 200999 APPLE COM BILL,,"-9,00","6.779,95"
2020-05-04,Kortköp 200999 STEAMGAMES COM,,"-165,00","6.788,95"
2020-05-01,Vardagspaket Månadspris kort,,"-12,00","6.953,95"
2020-04-30,Autogiro K*HEMKOP,,"-279,36","6.965,95"
2020-04-29,Autogiro K*DISCORD* N,,"-50,60","7.245,31"

Then run bank2ynab.py
And then observe how the created fixed_export.csv is empty

What's your software environment?

Script language: Python 3.5
Operating system: MacOS
OS version: 10.15 Catalina

Can you provide other helpful information?
Logs:

INFO: 
Parsing input file:  /Users/mirage/Downloads/Exports/export.csv (format: BE Keytrade Bank)
INFO: Using encoding utf-8 with confidence 0.99
INFO: Using encoding utf-8 with confidence 0.99
INFO: Parsed 0 lines
INFO: Writing output file: /Users/mirage/Downloads/Exports/fixed_export.csv
INFO: 
Parsing input file:  /Users/mirage/Downloads/Exports/export.csv (format: DK Sparkassen Thy)
INFO: Using encoding utf-8 with confidence 0.99
INFO: Using encoding utf-8 with confidence 0.99
INFO: Parsed 0 lines
INFO: Writing output file: /Users/mirage/Downloads/Exports/fixed_export.csv
INFO: 
Parsing input file:  /Users/mirage/Downloads/Exports/export.csv (format: HU OTP)
INFO: Using encoding utf-8 with confidence 0.99
INFO: Using encoding utf-8 with confidence 0.99
INFO: Parsed 0 lines
INFO: Writing output file: /Users/mirage/Downloads/Exports/fixed_export.csv
INFO: 
Parsing input file:  /Users/mirage/Downloads/Exports/export.csv (format: SE Nordea)
INFO: Using encoding utf-8 with confidence 0.99
INFO: Using encoding utf-8 with confidence 0.99
INFO: Parsed 10 lines
INFO: Writing output file: /Users/mirage/Downloads/Exports/fixed_export.csv
INFO: 
Parsing input file:  /Users/mirage/Downloads/Exports/export.csv (format: SE Swedbank 2019)
INFO: Using encoding utf-8 with confidence 0.99
INFO: Using encoding utf-8 with confidence 0.99
INFO: Parsed 0 lines
INFO: Writing output file: /Users/mirage/Downloads/Exports/fixed_export.csv
INFO: 
Done! 5 files processed.

Answer 1 · 2020-05-05T20:35:04.000Z

Oh, this is interesting. The ideal outcome in this scenario would be to append a number after each fixed file, right? I was sure the script already did this! Hmm, I'll have to investigate.

Answer 2 · 2020-05-05T20:40:38.000Z

So, the while loop in the following section handles that.

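import logging
import os
from os.path import basename, dirname, join

def write_data(self, filename, data):
    """ write out the new CSV file
    :param filename: path to output file
    :param data: cleaned data ready to output
    """
    target_dir = dirname(filename)
    target_fname = basename(filename)[:-4]  # strip the ".csv" extension
    new_filename = "{}{}.csv".format(self.config["fixed_prefix"], target_fname)
    # initialise the counter once, outside the loop, so it can advance;
    # resetting it inside the loop would retry "_1" forever
    counter = 1
    # note: this checks the bare file name, which is resolved relative to
    # the current working directory rather than target_dir
    while os.path.isfile(new_filename):
        new_filename = "{}{}_{}.csv".format(
            self.config["fixed_prefix"], target_fname, counter
        )
        counter += 1
    target_filename = join(target_dir, new_filename)
    logging.info("Writing output file: {}".format(target_filename))
    # EncodingCsvWriter is defined elsewhere in bank2ynab
    with EncodingCsvWriter(target_filename) as writer:
        for row in data:
            writer.writerow(row)
    return target_filename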

Ah, I've misunderstood the issue - it's not that you're getting multiple output files, it's that you're seeing the wrong config "trigger" for your file. Fixing this would require a quality check of the data after parsing, I think.

Answer 3 · 2020-05-05T20:51:24.000Z

Exactly! In the above case it writes to the same file (fixed_export.csv) 5 times, even though the logs clearly show that only one format (SE Nordea) parsed any lines, 10 in total.

Answer 4 · 2020-05-05T21:00:37.000Z

This seems like a relatively easy fix based on the length of the output data. Let me see if I can put something together quickly.
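Roughly, a length-based check could look like the sketch below. This is only an illustration of the idea, not the code from the Pull Request: `should_write` and `process` are hypothetical names, and it assumes the parsed output always starts with one header row, so anything of that length or shorter counts as "nothing parsed".

```python
import logging


def should_write(parsed_rows, header_rows=1):
    """Return True only if the output holds data beyond the header."""
    return len(parsed_rows) > header_rows


def process(candidates, write):
    """Write output only for configs that actually parsed something.

    candidates: list of (config_name, parsed_rows) pairs
    write: callable taking (config_name, parsed_rows), returns a file name
    """
    written = []
    for name, rows in candidates:
        if not should_write(rows):
            logging.info("Skipping %s: no data rows parsed", name)
            continue
        written.append(write(name, rows))
    return written
```

For the export.csv case in this issue, the four configs that parsed 0 lines would be skipped and only the SE Nordea output would be written.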

Answer 5 · 2020-05-05T21:14:05.000Z

Have a look at the above Pull Request and see if it fixes this issue please.

Answer 6 · 2020-05-05T21:20:18.000Z

LGTM! 💯 Worked great.