Computing Taxes with Beancount

This article is written for users in the US.

Introduction

For several years, I have used Beancount in conjunction with python-taxes (from user davidcmoore) to generate a draft of my tax forms, including W2s, with very little effort each year. This article explains how.

Python-taxes includes several advanced federal tax forms (and California forms if you live there). I have a bridge script that queries and extracts numbers from Beancount and feeds them as input to python-taxes.

I don’t use this setup to file taxes, given the complexity of keeping up with forms and the law each year. Instead, I use it for:

  • estimating taxes (to make estimated payments, avoid penalties, check safe harbor, etc.)
  • tax planning during the year. Eg: charitable donations, IRA contributions, tax-loss harvesting, which may depend on knowing one’s capital gains and tax brackets
  • verification of taxes computed using other means
  • verification of W2s, 1099s, and other forms: to ensure forms are not missed while filing; to sanity check; and to catch the rare error in the forms

In the last few years, this process has taken very little time (think 15 minutes) and has yielded numbers that are the same as, or very close to, my actual tax figures. That is close enough to make it very useful for the above.

Tax filing in the US can be highly individual. This article details my setup, including ideas and gotchas, with the goal of saving you time if you are setting this up for yourself. The key goal is maintainability, achieved via simplicity. This minimizes the fiddling needed each year to get things right. Let’s jump in.

Approach

The overall idea is simple and straightforward. python-taxes takes in your input numbers, and calculates your taxes. The input numbers will come from these sources:

  1. Beancount queries (most data)
  2. Manual data input for data not stored in Beancount. Eg: foreign taxes paid via investments, qualified dividends
  3. Manually entered from prior years: carryovers
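The three sources above can be combined into a single set of inputs before anything is handed to the tax calculator. The following is a minimal sketch of that idea, not the bridge script itself: the function name, the keys, and the figures are all hypothetical, and the real input format expected by python-taxes is not shown here.

```python
def build_tax_inputs(queried, manual, carryovers):
    """Merge the three input sources into one dict.

    Hypothetical helper: a key supplied by more than one source is treated
    as a mistake, so a manual number can never silently override a queried one.
    """
    merged = {}
    sources = (("beancount", queried), ("manual", manual), ("carryover", carryovers))
    for source_name, source in sources:
        for key, value in source.items():
            if key in merged:
                raise ValueError(f"{key!r} supplied twice (again by {source_name})")
            merged[key] = value
    return merged


# Made-up figures, for illustration only.
inputs = build_tax_inputs(
    queried={"wages": 100000, "long_term_gains": 2500},
    manual={"qualified_dividends": 800, "foreign_tax_paid": 40},
    carryovers={"long_term_loss_carryover": 0},
)
```

Keeping the sources separate until this point makes it obvious, a year later, which numbers still have to be typed in by hand.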

What do you need?

To make this work well, you need:

  1. Data: If you maintain your entire financial picture in Beancount, including your investments, loans, income, and paycheck in full detail, you should be set.

  2. A reliable way to construct queries from your data that works year after year with no fiddling needed. To accomplish this, I design my account hierarchies and use account metadata so that the queries stay simple. My account hierarchy looks like this (Midelity is used as an example brokerage):

    Assets:Banks:BankOfUSA
    Assets:Investments:
                    ├─ HSA:
                    ├─ Tax-Free:
                    ├─ Tax-Deferred:
                    └─ Taxable:
                            ├─ Midelity:AAPL
                            └─ ...
    Income:Investments:
                    ├─ Tax-Deferred:
                    └─ Taxable:
                          ├─ Capital-Gains:
                          │             ├─ Long:[Midelity:AAPL, ...]
                          │             └─ Short:
                          ├─ Capital-Gains-Distributions:
                          │                           ├─ Long:
                          │                           └─ Short:
                          ├─ Dividends:[Midelity:ORNG, ...]
                          └─ Interest:
    

    Designing taxability into your hierarchy helps readily answer most relevant questions:

    • what is my total taxable investment income for the year?
    • what are my capital gains (both long term and short term)?
    • what income was from each person (if multiple people share a ledger)?

    For more on this topic, see Organizing Your Account Hierarchy for Tax Analysis

  3. A way to maintain the hierarchy above with zero effort: see “Maintain the Hierarchy Above with Zero Effort” in Organizing Your Account Hierarchy for Tax Analysis

  4. Booking tax payments, estimated payments, and refunds in a helpful way. I use this hierarchy:

    Expenses:Federal-Income-Tax:
                             ├─ Payments:
                             │        ├─ Estimated:
                             │        └─ Filing:
                             ├─ Refund:
                             └─ Withheld:
    

    I use my effective_date plugin to book every single Expense posting related to a tax year, in that tax year. Doing so is key to keeping queries simple and maintainable. Eg:

    2020-05-15 * "2019 Federal refund received"
      Assets:Bank  10 USD
      Expenses:Federal-Income-Tax:Refund
        effective_date: 2019-12-31
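To illustrate why attributing each tax posting to its tax year keeps queries simple, here is a small sketch in plain Python. It does not reproduce what the effective_date plugin does internally. It only shows the attribution rule, using tuples as stand-ins for ledger postings. The tuple layout, the function name, and the amounts are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

TAX_ROOT = "Expenses:Federal-Income-Tax"

# Stand-in postings: (booking date, effective date or None, account, USD amount).
# Amounts are made up. The refund is negative because it reduces the expense.
postings = [
    (date(2019, 6, 15), None,
     "Expenses:Federal-Income-Tax:Payments:Estimated", Decimal("500")),
    (date(2019, 12, 31), None,
     "Expenses:Federal-Income-Tax:Withheld", Decimal("2000")),
    (date(2020, 5, 15), date(2019, 12, 31),
     "Expenses:Federal-Income-Tax:Refund", Decimal("-10")),
]


def net_tax_by_year(postings, root=TAX_ROOT):
    """Sum postings under `root`, keyed by tax year.

    The effective date, when present, decides the year; otherwise the
    booking date does.
    """
    totals = defaultdict(Decimal)
    for booked, effective, account, amount in postings:
        if account == root or account.startswith(root + ":"):
            totals[(effective or booked).year] += amount
    return dict(totals)


print(net_tax_by_year(postings))
```

With the refund carrying a 2019 effective date, all three postings land in 2019 and net to 2490 USD, so a single "sum everything under this account for year X" query is enough.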
    

Implementation

Let’s dive into the implementation. This is necessarily code that is highly custom to your tax filing situation. So I recommend taking inspiration, ideas, and snippets from my code and building your own code. The code is minimal.

My code is spread across three files, each simple and self-explanatory:

  1. I use a small library to build and execute a bunch of BQL queries

  2. My annual data is collected here

  3. And this is the main script. It first generates a W2 from BQL queries, and then feeds the W2 and more BQL queries to python-taxes.
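As a rough idea of what a small query-building layer can look like, the sketch below generates one BQL query string per tax input from the account hierarchy described earlier. It is not the library mentioned above: the function names and dictionary keys are hypothetical. Executing the queries is deliberately left out, since the way to run BQL from Python depends on your Beancount version.

```python
# Hypothetical mapping from a tax input name to the account subtree feeding it.
QUERY_ACCOUNTS = {
    "long_term_gains": "Income:Investments:Taxable:Capital-Gains:Long",
    "short_term_gains": "Income:Investments:Taxable:Capital-Gains:Short",
    "dividends": "Income:Investments:Taxable:Dividends",
    "interest": "Income:Investments:Taxable:Interest",
}


def sum_query(account_prefix, year):
    """Build a BQL query summing every posting below `account_prefix` in `year`.

    The trailing ':' in the regex keeps 'Capital-Gains' from also matching
    'Capital-Gains-Distributions'.
    """
    return (
        f"SELECT sum(position) "
        f"WHERE account ~ '^{account_prefix}:' AND year = {year}"
    )


def build_queries(year):
    """Return {input name: BQL query string} for one tax year."""
    return {name: sum_query(prefix, year) for name, prefix in QUERY_ACCOUNTS.items()}


for name, query in build_queries(2023).items():
    print(f"{name}: {query}")
```

Because the year is the only parameter, the same queries work every year without edits. Note that income accounts carry negative balances in Beancount, so the results usually need a sign flip before being used as tax inputs.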

Here are some finer points:

  • A couple of the queries above use account metadata to figure out what accounts to include/exclude in one’s tax computations (eg: W2 box-1). This makes it possible to make such declarations:

    1821-01-01 open Income:Employment:Benefits:Employer-401k USD
        taxable-income-box1: "False"
    
  • Classifying capital gains:
    • Long-term vs short-term capital gains: see my long_short plugin: It classifies and books gains as long or short term based on length of time held.
    • Gains vs losses: although not necessary, it is helpful to classify these to aid debugging. See my gain_loss plugin.
  • Some data items are not available in Beancount. Eg: foreign taxes paid via investments, qualified dividends. I input these numbers (usually a couple to a handful) from tax forms into (2) above. To make estimates by Dec 31, before tax forms are available, I find that a quick approximation based on past years works surprisingly well. This, of course, is highly dependent on your individual investing style, income variability, and such.

  • Year to year carryovers: A few numbers may carry over from year to year. Eg: capital losses (long, short term), AMT. I enter these manually in (2) above. I found automating them by looking up the prior year’s python-taxes output to be prone to frequent breakage as tax form lines can change due to tax laws changing.
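The metadata-driven exclusion mentioned in the first point above can be pictured as follows. This is a simplified sketch in plain Python, with dicts and tuples standing in for the ledger. The `Income:Employment:Salary` account, the function name, and the amounts are invented for the example, and a real W2 box 1 figure involves more adjustments than this.

```python
from decimal import Decimal

EMPLOYMENT_ROOT = "Income:Employment"

# Stand-in for metadata on `open` directives: account name -> metadata dict.
account_meta = {
    "Income:Employment:Salary": {},
    "Income:Employment:Benefits:Employer-401k": {"taxable-income-box1": "False"},
}

# Stand-in postings: (account, USD amount). Income is negative in Beancount.
postings = [
    ("Income:Employment:Salary", Decimal("-5000")),
    ("Income:Employment:Benefits:Employer-401k", Decimal("-200")),
    ("Income:Investments:Taxable:Interest", Decimal("-30")),
]


def box1_wages(postings, account_meta):
    """Sum employment income, skipping accounts flagged as not taxable in box 1."""
    total = Decimal("0")
    for account, amount in postings:
        if not account.startswith(EMPLOYMENT_ROOT + ":"):
            continue  # not employment income
        meta = account_meta.get(account, {})
        if meta.get("taxable-income-box1") == "False":
            continue  # excluded via account metadata
        total += amount
    return -total  # flip the sign so wages come out positive


print(box1_wages(postings, account_meta))
```

The advantage of this design is that the exclusion lives next to the account declaration, so the query code does not need a hard-coded list of accounts to skip.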

End Result

Leveraging your Beancount data to feed into python-taxes enables you to:

  • compute your taxes instantly at any point throughout the year
    • side benefit: the account hierarchy above lets me keep track of realized, taxable gains across brokerages with no effort, which makes tax planning (eg: tax-loss harvesting decisions) easier
  • use this computation for estimation and verification.

Tax and financial laws vary geographically. This article is written for users in the US. Of course, I bear absolutely no liability if you read this article and make errors in your taxes or finances. This is not advice of any sort, legal, financial, or otherwise, yada yada.
