Nothing is certain except death and taxes, the saying goes — but how to make taxes fair and beneficial remains an open question. New research aims to answer it.
What’s new: Stephan Zheng and colleagues at Salesforce built a tax planning model called AI Economist. It observes reinforcement learning agents in an economic simulation and sets tax rates that promote their general welfare.
Key insight: Economic simulations often use pre-programmed agents to keep the computation manageable, but hard-coding makes it difficult to study the impact of tax rates on agent behavior. A reinforcement learning (RL) system that accommodates different types of agents can enable worker agents to optimize their own outcomes in response to tax rates, while a policy-maker agent adjusts tax rates in response to the workers’ actions. This dual optimization setup can find a balanced optimum between the interests of individual workers and the policy maker.
How it works: Four workers inhabited a two-dimensional map, 25 squares per side. One episode spanned 10 tax periods, each lasting 100 time steps. The policy maker changed tax rates after each period. Each worker sought high income and low labor for itself, while the policy maker pursued social welfare, defined as the product of income equality (which falls as the average difference between workers’ incomes grows) and productivity (the sum of all incomes).
- The workers and policy maker were convolutional LSTMs trained using proximal policy optimization (PPO).
- Workers learned whether to move, gather building materials, sell them to each other for coins, build houses, sell houses, or do nothing. Each action consumed a certain amount of effort and accrued a certain amount of income. Their choices were influenced by their neighborhood, wealth, gathering skill (productivity in collecting materials), building skill (which determined the market value of a house), market prices (based on asking prices, bids, and past transactions), and tax rates.
- The policy maker set the tax rates for seven income brackets based on current prices, tax rates, and worker wealth. It distributed tax revenue equally among workers.
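To make the tax-and-redistribute step in the last bullet concrete, here is a minimal Python sketch. It assumes marginal brackets (each slice of income is taxed at its own bracket's rate), which the summary above does not state explicitly. The bracket floors, the function names `tax_owed` and `settle_period`, and the example rates are hypothetical, chosen only for illustration; the one detail taken from the text is that there are seven brackets and that revenue is split equally among workers.

```python
from typing import List, Sequence

# Hypothetical bracket lower bounds, in coins. The article says only that
# there were seven brackets; these cutoffs are made up for illustration.
BRACKET_FLOORS: List[float] = [0, 10, 40, 85, 160, 205, 510]


def tax_owed(income: float,
             rates: Sequence[float],
             floors: Sequence[float] = BRACKET_FLOORS) -> float:
    """Tax under marginal brackets (an assumption): each slice of income
    is taxed at the rate of the bracket it falls into."""
    if len(rates) != len(floors):
        raise ValueError("need exactly one rate per bracket")
    owed = 0.0
    for i, (floor, rate) in enumerate(zip(floors, rates)):
        if income <= floor:
            break
        # The top bracket has no upper bound.
        ceiling = floors[i + 1] if i + 1 < len(floors) else float("inf")
        owed += (min(income, ceiling) - floor) * rate
    return owed


def settle_period(incomes: Sequence[float],
                  rates: Sequence[float],
                  floors: Sequence[float] = BRACKET_FLOORS) -> List[float]:
    """Collect taxes for one period, then hand the revenue back in equal
    shares, as the policy maker in the article does."""
    taxes = [tax_owed(x, rates, floors) for x in incomes]
    rebate = sum(taxes) / len(incomes)
    return [x - t + rebate for x, t in zip(incomes, taxes)]


if __name__ == "__main__":
    example_rates = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]  # hypothetical
    print(settle_period([0, 20, 50, 200], example_rates))
```

Because all revenue is returned, total income is unchanged by `settle_period`; only its distribution shifts. In the system described above, the policy maker would pick a fresh set of seven rates after each 100-step period.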
Results: The authors observed several realistic phenomena. Workers specialized: Skilled builders constructed houses while others gathered materials. There was a tradeoff between productivity and quality; that is, more-productive builders produced houses of lower quality. And workers developed strategies to game the system by, say, delaying a house sale to a later period when the tax rate might be lower. When it came to promoting general welfare — measured as the product of income equality and productivity — AI Economist achieved 1,664, outperforming three benchmarks: a widely studied tax framework called the Saez formula (1,435), the U.S. Federal Income Tax schedule (1,261), and no taxes (1,278). Its policy also outperformed those baselines when human players stood in for the RL workers.
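The welfare measure in the results (income equality times productivity) can be sketched in a few lines of Python. The article does not say how equality is computed; the version below assumes one common choice, one minus a normalized Gini index, so that equality is 1 when all incomes match and 0 when a single worker holds everything. The function names are hypothetical. The `REPORTED` scores are the figures quoted above, and `relative_gain` simply compares them.

```python
from typing import Dict, Sequence


def equality(incomes: Sequence[float]) -> float:
    """Assumed definition: 1 minus the Gini index, rescaled so that
    0 means one worker holds all income and 1 means perfect equality."""
    n = len(incomes)
    total = sum(incomes)
    if n < 2 or total == 0:
        return 1.0  # nothing to compare, treat as perfectly equal
    # Sum of absolute differences over all ordered pairs of workers.
    pairwise = sum(abs(a - b) for a in incomes for b in incomes)
    gini = pairwise / (2 * n * total)
    return 1.0 - gini * n / (n - 1)


def productivity(incomes: Sequence[float]) -> float:
    """Productivity is the sum of all incomes."""
    return float(sum(incomes))


def social_welfare(incomes: Sequence[float]) -> float:
    """Equality times productivity, the measure named in the article."""
    return equality(incomes) * productivity(incomes)


# Scores as reported in the article.
REPORTED: Dict[str, float] = {
    "AI Economist": 1664,
    "Saez formula": 1435,
    "U.S. Federal Income Tax": 1261,
    "No taxes": 1278,
}


def relative_gain(score: float, baseline: float) -> float:
    """Fractional improvement of score over baseline."""
    return (score - baseline) / baseline


if __name__ == "__main__":
    best = REPORTED["AI Economist"]
    for name, value in REPORTED.items():
        if name != "AI Economist":
            print(f"{name}: {relative_gain(best, value):+.1%}")
```

Using the reported scores, AI Economist comes out roughly 16 percent ahead of the Saez formula, 32 percent ahead of the U.S. Federal schedule, and 30 percent ahead of no taxes.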
Why it matters: Reinforcement learning with heterogeneous agents can automate the modeling of incentives in interactions between different parties such as teachers and students, employers and employees, or police and criminals.
We’re thinking: Simulations of this nature make many assumptions about incentives, rate of learning, cost of various actions, and so on. They offer a powerful way to model and make decisions, but validating their conclusions is a key step in mapping them to the real world.