Deduping with Zerosum

Zerosum is a Beancount plugin that matches two halves of a transaction that appear in different accounts, or a matching pair of transactions that needs to be reconciled across time.

An example of the former is a transfer. Any transfer shows up in two accounts. For example, your checking account will have a transaction showing a received transfer, and your credit card account will have another transaction showing a payment for that same transfer.

An example of a matching pair needing reconciliation is a “reimbursements pending” account: reimbursements that have not yet been received are booked to it, as are the reimbursement payments when they arrive. This pair would need to be reconciled.
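To make the idea concrete, here is a minimal Python sketch of the bookkeeping behind a zerosum account. It is not the plugin's code. The account names and amounts are assumptions chosen for illustration (a credit card payment made from checking), and each half is booked against the same zerosum account so that the two halves cancel.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical account name, used only for this illustration.
ZEROSUM = "Assets:Zerosum:Transfers"

# Each transaction is a list of (account, amount) postings that sum to zero.
transactions = [
    # Half imported from the checking statement: money leaves checking.
    [("Assets:Checking", Decimal("-100.00")), (ZEROSUM, Decimal("100.00"))],
    # Half imported from the credit card statement: the payment arrives.
    [("Liabilities:CreditCard", Decimal("100.00")), (ZEROSUM, Decimal("-100.00"))],
]


def balances(txns):
    """Sum all postings per account."""
    totals = defaultdict(Decimal)
    for postings in txns:
        for account, amount in postings:
            totals[account] += amount
    return dict(totals)


result = balances(transactions)
print(result[ZEROSUM])
```

Once both halves are present, the zerosum account nets to zero, which is what makes the pair identifiable as one transfer.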

The Zerosum README.md explains in detail what it does and how to use it.

Zerosum for Uncleared Transactions

Zerosum also gives us the ability to see postings where one half has been imported, but the other half is still pending import. For example, a credit card payment that is still “in flight” at the time of import. Fava can be configured to auto-expand just the Assets:Zerosum:.* accounts to emphasize these, until they can be “cleared” in a future import. Meanwhile, since they appear in my account tree, my net worth is correct (e.g., a transfer that has not “landed” yet appears in Assets:Zerosum:Transfers until it lands on the other side).
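The net worth claim can be checked with a little arithmetic. The sketch below is only an illustration under assumed opening balances and hypothetical account names: it books just the checking half of a payment and shows that the amount sits in the zerosum account, so the total does not change.

```python
from decimal import Decimal

# Assumed opening balances, for illustration only.
opening = {
    "Assets:Checking": Decimal("1000.00"),
    "Liabilities:CreditCard": Decimal("-100.00"),
}


def net_worth(account_balances):
    """Net worth is the sum of all asset and liability balances."""
    return sum(account_balances.values(), Decimal("0"))


def apply_postings(account_balances, postings):
    """Return new balances after applying (account, amount) postings."""
    updated = dict(account_balances)
    for account, amount in postings:
        updated[account] = updated.get(account, Decimal("0")) + amount
    return updated


# Only the checking half has been imported; the card half is in flight.
first_half = [
    ("Assets:Checking", Decimal("-100.00")),
    ("Assets:Zerosum:Transfers", Decimal("100.00")),
]
after_import = apply_postings(opening, first_half)

# Any non-zero zerosum balance is something that has not cleared yet.
in_flight = {
    account: amount
    for account, amount in after_import.items()
    if account.startswith("Assets:Zerosum:") and amount != 0
}

print(net_worth(opening), net_worth(after_import), in_flight)
```

The non-zero entries in `in_flight` correspond to what the expanded Assets:Zerosum:.* accounts would show until the other half is imported.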

Performance

Zerosum uses an O(Rn) algorithm, where R is roughly the number of candidate transactions that fall within the configured date range. It only ever considers a single zerosum account at a time (by filtering) and then restricts the search to the date range you specify in the config. In practice, this ends up being something like an O(3n) algorithm, depending on how many transactions you have on average between two dates. As a reference point, the Zerosum plugin adds about 270ms for some 12 years' worth of data for me, which I can totally live with:

Set DEBUG=1 here to get this output:

Zerosum [0.3s]: 5551/5791 postings matched from 26193 transactions. 23 new accounts added.
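The date-window restriction described above can be sketched as follows. This is a simplified illustration, not the plugin's actual implementation: the function name, the `(date, amount)` tuple layout, and the sample data are all assumptions. Postings from one zerosum account are sorted by date, and each posting is compared only against later postings inside the window, which is where the small constant factor comes from.

```python
from datetime import date, timedelta
from decimal import Decimal


def match_within_window(postings, window_days):
    """Pair postings that cancel out and lie within window_days of each other.

    postings: list of (date, Decimal amount) from a single zerosum account.
    Returns (pairs, unmatched).
    """
    ordered = sorted(postings, key=lambda p: p[0])
    matched = [False] * len(ordered)
    window = timedelta(days=window_days)
    pairs = []
    for i, (date_i, amount_i) in enumerate(ordered):
        if matched[i]:
            continue
        for j in range(i + 1, len(ordered)):
            date_j, amount_j = ordered[j]
            if date_j - date_i > window:
                break  # sorted by date, so nothing later can be in range
            if not matched[j] and amount_j == -amount_i:
                matched[i] = matched[j] = True
                pairs.append((ordered[i], ordered[j]))
                break
    unmatched = [p for p, done in zip(ordered, matched) if not done]
    return pairs, unmatched


sample = [
    (date(2024, 1, 1), Decimal("100.00")),
    (date(2024, 1, 3), Decimal("-100.00")),
    (date(2024, 1, 5), Decimal("250.00")),
    (date(2024, 3, 1), Decimal("-250.00")),  # too far from its other half
]
pairs, unmatched = match_within_window(sample, window_days=30)
print(len(pairs), len(unmatched))
```

With a 30-day window, the first two postings pair up, while the last two stay unmatched because they are further apart than the window allows.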

To reduce the need for Zerosum to run as a plugin, a tool could be written to bake the matched transactions into the source. This is not a priority, though, given the low cost of running Zerosum dynamically. See this thread for more.
