redstreet/beancount_reds_importers

OFX Tag Issue - List Index Out of Range

Closed this issue · 9 comments

Hi Again,

One of my accounts (Wise), doesn't export OFX files, but there is a tool to convert the CSV files to OFX files ({ofxstatement}). Unfortunately, the Reds OFX importers don't seem to be able to read the converted files.

I currently use {bean count-import}, which is able to import these OFX files after removing an empty <DTASOF /> tag from the end of the file - see here and here for more details. However, I prefer the logic of your framework and like the idea of using your plugins, so I'm trying to make the switch, but this is a sticking point.

I get the error below.

❯ bean-extract my.import 2021_Wise-CAD.ofx
ERROR:root:Importer personal_importers.wise_cad.Importer.identify() raised an unexpected error: list index out of range
Traceback (most recent call last):
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount/ingest/identify.py", line 63, in find_imports
    matched = importer.identify(file)
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount_reds_importers/libreader/reader.py", line 25, in identify
    self.initialize_reader(file)
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount_reds_importers/libreader/ofxreader.py", line 20, in initialize_reader
    self.ofx = ofxparse.OfxParser.parse(open(file.name))
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/ofxparse/ofxparse.py", line 396, in parse
    ofx_file = OfxPreprocessedFile(file_handle)
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/ofxparse/ofxparse.py", line 183, in __init__
    tag_name = re.findall(r'(?i)<([a-z0-9_\.]+)>', token)[0]
IndexError: list index out of range
;; -*- mode: beancount -*-

where my importer has the following code

"""Wise ofx importer for beancount."""

from beancount_reds_importers.libreader import ofxreader
from beancount_reds_importers.libtransactionbuilder import banking


class Importer(banking.Importer, ofxreader.Importer):
    def custom_init(self):
        if not self.custom_init_run:
            self.max_rounding_error = 0.04
            self.account_number_field = 'account_id'
            self.filename_pattern_def = '.*[Ww]ise'
            self.custom_init_run = True

Based on the trace it looks like it might be an issue with the {ofxparse} package and possible empty tags, as well as this issue`, so I appreciate this may not be something that can be fixed here, but I have only moderate python experience so my ability to diagnose a problem and hack a solution is limited, and I figured I may be missing something.

If it would be useful, I can try and put together a sample OFX file for a reprex.

Thanks,

Allow me to go on a slight excursion here :). From experience running into this type of situation, I'd normally recommend biting the bullet and simply writing a csv importer for your CSV file.

Because even if you manage to get the csv -> ofx -> import route working now, between future breaking changes to the csv and missing features in ofxstatement, problems will invariably crop up in layers that make it very hard for you to debug. This is a big part of my motivation for writing beancount_reds_importers: to make writing importers as easy as possible, so I didn't have to pursue complex solutions that would invariably keep breaking.

Ofxstatement's last commit was Apr 2021. That's a big red flag to me in creating a dependency on that repo.

Now for the other side of the argument (favoring considering the csv->ofx->import route):

  1. I do see that this particular case is not going to be exactly trivial. I wrote an importer for Discover last week, and it literally took just a few minutes. It's here. I can't tell if Wise is going to be much more effort or not looking at this example. It may be just as simple; it may not. The main issue I see is, transfers across currency may not currently be directly supported by any of the transaction builders.

  2. Your stated python experience.

If it were me, I'd definitely go with writing my own csv importer and solving the problem once and for all, since (1) and (2) above are not a big factor for me. I don't have a strong recommendation for you on which path to choose, but if you go with writing your own importer, I'm here to help answer questions as they come up. And if at all you plan to write more importers, the experience will be very valuable and will save time in the future.

Now back to a direct answer to your question: I tend to have very limited time these days, and if you manage to send me a test case that definitely mirrors the issue, unless it's something I can fix instantly (< 5 mins), it'll probably sit in my inbox for an indefinite amount of time, unfortunately. You seem to have done due diligence here, so it's unlikely I'll find something you haven't found. But feel free to send me something, and I can send a couple minutes to see if something obvious stands out. Speaking of which, have you tried looking at all empty tags and removing them?

Thanks again for the quick and extensive response. I think I'll go the route of writing a CSV importer for the reasons you mentioned - I had written one for {beancount-import}.

I started to write one but have run into errors about the type. There is no obvious type, like in the Schwab CSV importer, so I tried to leave it blank like in your Discover example, but that cause this error.

❯ bean-extract my.import test-Wise-USD.csv
ERROR:root:Importer personal_importers.wise_csv.Importer.extract() raised an unexpected error: 'Importer' object has no attribute 'transaction_type_map'
Traceback (most recent call last):
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount/ingest/extract.py", line 182, in extract
    new_entries = extract_from_file(
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount/ingest/extract.py", line 67, in extract_from_file
    new_entries = importer.extract(file, **kwargs)
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount_reds_importers/libtransactionbuilder/banking.py", line 89, in extract
    self.read_file(file)
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount_reds_importers/libreader/csvreader.py", line 129, in read_file
    rdr = self.convert_columns(rdr)
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount_reds_importers/libreader/csvreader.py", line 88, in convert_columns
    rdr = rdr.convert('type', self.transaction_type_map)
AttributeError: 'Importer' object has no attribute 'transaction_type_map'
;; -*- mode: beancount -*-

This is my importer (based on the {ofxstatement} sample, which differs very slightly in a non-significant manner from my exports).

""" Wise CAD csv importer."""

from beancount_reds_importers.libreader import csvreader
from beancount_reds_importers.libtransactionbuilder import banking


class Importer(banking.Importer, csvreader.Importer):
    def custom_init(self):
        self.max_rounding_error = 0.04
        self.filename_pattern_def = '.*Wise*'
        

        # format specific to Wise CSV
        self.date_format = "%m/%d/%Y"
        self.skip_head_rows = 0

        self.header_identifier = "TransferWise ID,Date,Amount,Currency,Description,Payment Reference,Running Balance,Exchange From,Exchange To,Exchange Rate,Payer Name,Payee Name,Payee Account Number,Merchant,Total fees"

        # CSV column spec
        self.header_map = {
            "Date": "date",
            "Amount": "amount",
            "Currency": "currency",
            "Description": "memo",
            "Running Balance": "balance",
            "Payee Name": "payee",
            "Total fees": "fees",

        }
    
        # self.transaction_type_map = {
        #     "*" : ""
        #     }
        # self.skip_transaction_types = [""]

    def skip_transaction(self, ot):
        return False

    # Cleaning up CSV: add/remove columns
    def prepare_raw_columns(self, rdr):

        rdr = rdr.cut("Date", "Amount", "Currency", "Description", "Payee Name", "Total fees", "Running Balance")
        return rdr

If I uncomment the transaction_type_map lines, it produces the following error

❯ bean-extract my.import test-Wise-USD.csv
ERROR:root:Importer personal_importers.wise_csv.Importer.extract() raised an unexpected error: selection is not a field or valid field index: 'type'
Traceback (most recent call last):
  File "/Users/cfa5228/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/petl/transform/conversions.py", line 368, in iterfieldconvert
    k = flds.index(k)
ValueError: 'type' is not in list

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount/ingest/extract.py", line 182, in extract
    new_entries = extract_from_file(
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount/ingest/extract.py", line 67, in extract_from_file
    new_entries = importer.extract(file, **kwargs)
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount_reds_importers/libtransactionbuilder/banking.py", line 90, in extract
    for ot in self.get_transactions():
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/beancount_reds_importers/libreader/csvreader.py", line 135, in get_transactions
    for ot in self.rdr.namedtuples():
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/petl/util/base.py", line 525, in iternamedtuples
    for row in it:
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/petl/transform/conversions.py", line 440, in iterfieldconvert
    for row in it:
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/petl/transform/conversions.py", line 440, in iterfieldconvert
    for row in it:
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/petl/transform/conversions.py", line 440, in iterfieldconvert
    for row in it:
  [Previous line repeated 4 more times]
  File "/Users/usr/.pyenv/versions/3.10.4/envs/personal-finance/lib/python3.10/site-packages/petl/transform/conversions.py", line 370, in iterfieldconvert
    raise FieldSelectionError(k)
petl.errors.FieldSelectionError: selection is not a field or valid field index: 'type'
;; -*- mode: beancount -*-

On your point about transfers across currency, I was thinking about just putting them through an equity account, e.g. USD and CAD specific accounts, that shouldn't matter if they don't add up to 0, though if you have strong feelings about this please let me know!

You're welcome! The issue you ran into is because I'm not yet sure if Wise fits neatly into one of the existing transaction builders (same as what I mentioned above).

If you could send me an anonymized example of the csv, and the expected output (beancount transactions), I can probably speed up your importer writing by either providing guidance or adding the transaction builder code if that's necessary.

Use the ideal Beancount transaction you'd like to see, without worrying about using Equity accounts :).

Sorry for the delay in response. I've attached an anonymized CSV file here so it may be useful for others to read/look at the process (note that Wise uses dd/mm/yyyy date formats). Below is the expected output.

Red_Wise-sample.csv

2022-11-02 * "Wise Charges for: TRANSFER-111111111"
  Assets:Banks:USD:Wise  -0.39 USD
  Expenses:Banking:USD:Charges  0.39 USD

2022-11-02 * "Sent money to Callum Arnold"
  Assets:Banks:USD:Wise  -1 USD
  Assets:Banks:USD:OtherBank  1 USD ; Note that this would be present in the OtherBank file

2022-11-08 * "Received money from Jane Doe with reference"
  Assets:Banks:USD:Wise  200 USD
  Income:Gifts:USD  -200 USD

2022-11-17 * "Converted GBP to USD"
  Assets:Banks:USD:Wise  200 USD @ 1.17805 GBP
  Assets:Banks:GBP:Wise  -169.77 GBP ; Note that this would also have an associated charge within the GBP CSV file like the transfer at the top

2022-11-17 * "Converted GBP to USD"
  Assets:Banks:USD:Wise  400 USD @ 1.1781 GBP
  Assets:Banks:GBP:Wise  -339.53 GBP ; Note that this would also have an associated charge within the GBP CSV file like the transfer at the top

2022-11-18 * "Paid to AMEX EPAYMENT"
  Liabilities:USD:Amex  150 USD
  Assets:Banks:USD:Wise  -150 USD

2022-11-18 * "Paid to AMEX EPAYMENT"
  Liabilities:USD:Amex  400 USD
  Assets:Banks:USD:Wise  -400 USD

2022-11-18 balance Assets:Banks:USD:Wise                   255.95 USD

Thanks for your help!

This is straightforward and quite simple, and can be handled easily by banking.py with a couple small tweaks:

  1. You shouldn't get errors about type if you use the bleeding edge version from the git repo
  2. For the currency conversions, you need to add just a line or two to banking.py: replace this
    with something that looks like this, if ot.unit_price and ot.target_currency exist, and are non-empty
  3. Map your Exchange Rate column to unit_price
  4. Map your Exchange To (or From) for each line to target_currency. You will have to ensure your code handles conversions in both directions. It should just be a line or two.

Nothing else should be needed, as smart_importer will take care of the target accounts. Feel free to let me know if you have any questions at all!

@arnold-c : I just checked in cf89865 to handle (2) above, FYI.

@arnold-c for (3) and (4), here are two new importers that implement these:

Thanks very much @redstreet! I had written the following importer which seemed to work as hoped, and at first glance looks similar to your propositions, but when I get time to check in detail I'll update this issue with a final working copy.

For the time being I'll close the issue as I think we're sorted as far as this issue is concerned. Thanks very much for your patience and time in walking me through this process.

""" Wise csv importer."""

import itertools
from beancount.core import data
from beancount_reds_importers.libreader import csvreader
from beancount_reds_importers.libtransactionbuilder import (banking, common)


class Importer(banking.Importer, csvreader.Importer):
    def custom_init(self):
        self.max_rounding_error = 0.04
        self.filename_pattern_def = '.*statement*'

        # format specific to Wise CSV
        self.date_format = "%d-%m-%Y"
        self.skip_head_rows = 0

        self.header_identifier = '"TransferWise ID",Date,Amount,Currency,Description,"Payment Reference","Running Balance","Exchange From","Exchange To","Exchange Rate","Payer Name","Payee Name","Payee Account Number",Merchant,"Card Last Four Digits","Card Holder Full Name",Attachment,Note,"Total fees"'

        self.header_map = {
            "Date": "date",
            "Amount": "amount",
            "Currency": "currency",
            "Description": "memo",
            "Running Balance": "balance",
            "Payee Name": "payee",
            "Total fees": "fees",
            "Exchange Rate": "unit_price",
            "Exchange To": "target_currency",
            "Exchange From": "source_currency"
        }

    # Cleaning up CSV: add/remove columns
    def prepare_raw_columns(self, rdr):

        rdr = rdr.cut("Date", "Amount", "Currency", "Description", "Payee Name", "Total fees", "Running Balance", "Exchange Rate", "Exchange To", "Exchange From")
        return rdr

    def extract(self, file, existing_entries=None):
        """
        Overwrite extract method in banking.py to add exchange rate and target
        currency. Much of the code is the same, except for the if statement at
        the end.
        """
        self.initialize(file)
        counter = itertools.count()
        new_entries = []
        config = self.config

        self.read_file(file)
        for ot in self.get_transactions():
            if self.skip_transaction(ot):
                continue
            metadata = data.new_metadata(file.name, next(counter))
            # metadata['type'] = ot.type # Optional metadata, useful for debugging #TODO
            metadata.update(
                self.build_metadata(
                    file,
                    metatype='transaction',
                    data={'transaction': ot}
                )
            )

            # With Beancount, the grammar is (payee, narration). payee is optional, narration is
            # mandatory. This is a bit unintuitive. In addition, smart_importer relies on
            # narration, so keeping the order unchanged in the call below is important.

            # Build transaction entry
            entry = data.Transaction(
                metadata,
                ot.date.date(),
                self.FLAG,
                self.get_narration(ot),
                self.get_payee(ot),
                data.EMPTY_SET, data.EMPTY_SET, []
            )
            
            # Determines if currency conversion was done and outputs with exchange rate
            if ot.unit_price and ot.target_currency:

                common.create_simple_posting_with_cost_or_price(
                    entry,
                    config['main_account'],
                    ot.amount,
                    self.currency,
                    price_number=ot.unit_price,
                    price_currency=ot.source_currency,
                )

            else:
                data.create_simple_posting(
                    entry, config['main_account'], ot.amount, self.currency
                )

            # TODO: Commented out so smart_importer can fill this in
            # target_acct = self.get_target_acct(ot)
            # data.create_simple_posting(entry, target_acct, None, None)

            new_entries.append(entry)

        new_entries += self.extract_balance(file, counter)
        new_entries += self.extract_custom_entries(file, counter)

        return new_entries

You're welcome!

You shouldn't need the extract() method at all. The latest one in banking.py should do what you want, and helps keep things DRY. If not, feel free to send me a PR for it.

EDIT: ah, you probably wrote it before I updated banking.py. Feel free to check and remove your extract() method, and then send a PR---I'd be happy to include wise in this repo.