- Researchers used a reinforcement learning model to learn more about why we habituate and compare, despite it making us unhappy.
- They found that habituating and comparing may play an important role in adaptive behavior that aids learning.
- They concluded that their findings could help shed light on why we always tend to want more.
Happiness is one of the most sought-after human emotions. Sustaining it over the long term, however, proves elusive for many.
This is because happiness depends on shifting expectations, which lead people to habituate quickly to ‘new reasons to be happy’. It also depends on whether people compare what they have with what others have or with what they wish they could have.
Understanding the costs and benefits of habituation and comparison could help researchers develop policies and large-scale interventions to tackle these mental biases.
Recently, researchers used a computational framework known as reinforcement learning to model the effects of different levels of habituation and comparison-making.
They found that while making comparisons reduces happiness, it speeds up learning.
“You might think that building a robot that can choose among different options is easy: you just give everything a score and choose the best. But actually figuring out how to set up that score to get your robot to make good choices is surprisingly tricky. This paper looks at human happiness from this perspective.”
“[In particular, the researchers answer the question:] why is the same outcome delightful today but boring tomorrow? They show it has advantages—if we are never satisfied, we are constantly driven to find better outcomes—but also disadvantages, as this comes at the expense of constantly devaluing what we already have achieved, which the authors suggest might, taken to extremes, relate to depression.”
— Dr. Nathaniel Daw
The study was published in PLOS Computational Biology.
The researchers used a reinforcement learning framework. Rachit Dubey, a fifth-year Ph.D. student at Princeton University and lead author of the study, told MNT:
“Reinforcement learning methods focus on training an agent (for example, a robot) so that the agent learns how to map situations to actions, such as learning how to play chess. The guiding principle of these methods is that they train agents using rewards: they provide positive rewards to desired behaviors and/or negative rewards to undesired ones.”
“This is similar to the way we learn from rewards: we are more likely to take those actions which give us positive rewards like money, praise, etc., and we avoid actions that give us negative rewards like pain, sadness, etc.,” he added.
For the study, the researchers trained an agent by giving it a ‘reward’ each time it exceeded its previous expectation and the performance of other agents. They then conducted various experiments in different environments with the agents.
In doing so, they found that agents rewarded for habituation and comparison learned significantly faster than standard reward-based agents, although they were less happy.
This means that habituation and comparison might promote adaptive behavior by serving as a powerful learning signal.
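To make the idea concrete, here is a minimal Python sketch of a reward signal that mixes an objective payoff with a habituation term and a comparison term. It is an illustration only, not the study's actual model: the function names, the weights, and the update rate are all assumptions chosen for readability.

```python
def subjective_reward(objective, expectation, comparison,
                      w_objective=1.0, w_expectation=0.5, w_comparison=0.5):
    """Hypothetical reward: objective payoff plus two relative terms.

    The weights are illustrative assumptions, not values from the study.
    """
    # Positive only when the outcome beats what the agent had come to expect.
    habituation_term = objective - expectation
    # Positive only when the outcome beats the reference point it is compared with.
    comparison_term = objective - comparison
    return (w_objective * objective
            + w_expectation * habituation_term
            + w_comparison * comparison_term)


def update_expectation(expectation, objective, rate=0.5):
    """Move the expectation toward the latest outcome (assumed habituation rule)."""
    return expectation + rate * (objective - expectation)


# The same outcome, received four times in a row, feels less rewarding each time.
expectation = 0.0
felt = []
for _ in range(4):
    felt.append(subjective_reward(objective=1.0,
                                  expectation=expectation,
                                  comparison=1.0))
    expectation = update_expectation(expectation, objective=1.0)

print(felt)  # [1.5, 1.25, 1.125, 1.0625]
```

In this toy setup the objective outcome never changes, yet the felt reward shrinks as the expectation catches up. That shrinking signal is the kind of pressure that could push an agent to look for something new.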
They also found that making comparisons sped up learning as it provided an exploration incentive to agents, and that proper expectations served as useful aids for comparison, especially in environments with sparse rewards.
They further noted, however, that agents were unhappy and performed sub-optimally when comparisons were left unchecked and when there were too many similar options to choose from.
This, noted Dubey, means that when faced with many choices, we should try to make decisions without relying on comparisons.
When asked how the reward function may make agents learn faster but be ‘less happy’, Dubey said:
“Habituation and comparisons induce unhappiness because the agent derives no positive rewards from known scenarios—as it rapidly habituates to good things. It also derives very few positive rewards from good scenarios, as it compares itself to something even better. However, this helps an agent learn faster because they motivate the agent to try new actions and new scenarios, so it may quickly escape unpleasant ones.”
“To illustrate this, suppose that I get 90% on a test, and this makes me very happy. But then I see that a classmate who generally performs worse than me gets 96% on the test. This will make me quite unhappy even though my test results have not changed. This resultant unhappiness might then motivate me to study harder and achieve more in the next test,” he explained.
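The test-score example can be worked through with simple arithmetic. The short Python sketch below assumes that the "felt" outcome is just the gap between one's own score and a reference point. The helper name `felt_score` and the prior expectation of 80% are hypothetical; only the 90% and 96% figures come from the example.

```python
def felt_score(own, reference):
    """Hypothetical 'felt' outcome: own result relative to a reference point."""
    return own - reference


own = 90         # score from the example
expected = 80    # assumed prior expectation; not stated in the article
classmate = 96   # classmate's score from the example

before = felt_score(own, expected)   # judged against one's own expectation
after = felt_score(own, classmate)   # judged against the classmate instead

print(before, after)  # 10 -6
```

The score itself stays at 90 in both cases. Only the reference point moves, and that alone flips the felt outcome from positive to negative.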
The researchers concluded that their results help explain why we are prone to being trapped in an endless cycle of wants and desires. They added that their results may “shed light on psychopathologies such as depression, materialism, and overconsumption.”
The researchers noted that because their experiments were conducted in oversimplified environments, the results may not generalize to real-world scenarios.
They added that future work should also consider other aspects of happiness such as guilt and jealousy and how these interact with affective states, including anxiety and boredom.
“Given how advantageous habituation and comparisons are in promoting adaptive behavior, it could be possible that these features are deeply rooted biases in our minds. This might explain our modern obsession with ‘growth at all costs’ and why our consumption levels have increased so dramatically and are not showing any signs of slowing down.”
— Rachit Dubey, lead author
When asked about the research’s implications for the future, Dubey said:
“If we want to seriously tackle the extremely pressing issue of overconsumption—which is resulting in rapid deterioration of our planet and severely threatens future generations—we need to develop concrete policies and large-scale interventions to tackle these biases of the human mind.”