Habit-forming Products: How Variable Rewards Modify Behaviour

How often, in a few moments of downtime, do you feel the urge to connect with your friends, check what’s happening in their lives, or play a game on your smartphone?

Before smartphones, you’d likely have passed the time some other way. Today, app makers use effective techniques to capture your attention so that you engage with their apps whenever you have downtime. This is a classic trigger-response example. App makers and product developers are leveraging a basic insight into human behavior: operant conditioning, in which the consequences of a response to a particular trigger determine the probability of that response being repeated. Two things drive human actions: necessities and rewards.

What is Operant Conditioning (Reinforcement Behavior)

In 1930, B.F. Skinner, a psychologist at Harvard University, built a box and placed a hungry rat inside it. The box had a lever on one side. As the rat moved about, it would accidentally knock the lever, and when it did, a food pellet would drop into the box. After being put in the box a few times, the rat learned to go straight to the lever and press it: the reward reinforced the behavior. Skinner believed that the best way to understand behavior is to look at the causes of an action and its consequences.

Skinner’s work was based on Thorndike’s (1898) law of effect. According to this principle, behavior that is followed by pleasant consequences is likely to be repeated, and behavior that is followed by unpleasant consequences is less likely to be repeated. Skinner introduced a new term into the law of effect: reinforcement.

Positive reinforcement – strengthens a behavior by providing a consequence an individual finds rewarding.
Negative reinforcement – strengthens behavior because it stops or removes an unpleasant experience.
Punishment is the opposite of reinforcement since it is designed to weaken or eliminate a response rather than increase it.

Types & Schedules of Reinforcement

For a habit to form, that is, for a user to repeat a particular action in order to be rewarded again and again, the pattern of rewards matters. Behaviorists discovered that different patterns (or schedules) of reinforcement had different effects on the speed of learning and of extinction. In the rat example, two measures describe the rat’s behavior (and, by extension, the behavior expected of humans):

1. The Response Rate – The rate at which the rat pressed the lever (i.e., how hard the rat worked).
2. The Extinction Rate – The rate at which lever pressing dies out (i.e., how soon the rat gave up).

The rat’s behavior is modified (reinforced) using the following approaches.

Different types of Reinforcement

(A) Continuous Reinforcement
The behavior is positively reinforced every time it occurs.

Response rate is SLOW
Extinction rate is FAST

(B) Fixed Ratio Reinforcement
Behavior is reinforced only after it occurs a specified number of times, e.g., one reinforcement is given after every 5th correct response.

Response rate is FAST
Extinction rate is MEDIUM

(C) Fixed Interval Reinforcement
One reinforcement is given after a fixed time interval, provided at least one correct response has been made. An example is being paid by the hour.

Response rate is MEDIUM
Extinction rate is MEDIUM

(D) Variable Ratio Reinforcement
Behavior is reinforced after an unpredictable number of responses. Examples are gambling and fishing.

Response rate is FAST
Extinction rate is SLOW (very hard to extinguish because of unpredictability)

(E) Variable Interval Reinforcement
Provided one correct response has been made, reinforcement is given after an unpredictable amount of time has passed, e.g., on average every 5 minutes. An example is a self-employed person being paid at unpredictable times.

Response rate is FAST
Extinction rate is SLOW
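
The five schedules above can be written down as small decision rules: given a response at some moment, does it earn a reward? The sketch below is only an illustration under stated assumptions. The class names, the `respond` method, the `simulate` helper, and the uniform distributions used for the "unpredictable" schedules are all hypothetical choices, not something taken from Skinner's experiments or from any particular app.

```python
import random


class FixedRatio:
    """Reward every n-th response. With n=1 this is continuous reinforcement."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reward after an unpredictable number of responses, mean_n on average."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.target = self._draw()

    def _draw(self):
        # Assumption: uniform on 1..(2*mean_n - 1), whose mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, now):
        self.count += 1
        if self.count >= self.target:
            self.count = 0
            self.target = self._draw()
            return True
        return False


class FixedInterval:
    """Reward the first response made once `interval` has passed since the last reward."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reward = 0.0

    def respond(self, now):
        if now - self.last_reward >= self.interval:
            self.last_reward = now
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the required wait is redrawn after every reward."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last_reward = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on [0, 2*mean_interval], whose mean is mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now):
        if now - self.last_reward >= self.wait:
            self.last_reward = now
            self.wait = self._draw()
            return True
        return False


def simulate(schedule, n_responses, dt=1.0):
    """Make one response every dt time units; return whether each was rewarded."""
    return [schedule.respond((i + 1) * dt) for i in range(n_responses)]


rng = random.Random(0)  # seeded so that the run is repeatable
schedules = [
    ("continuous", FixedRatio(1)),
    ("fixed ratio", FixedRatio(5)),
    ("fixed interval", FixedInterval(3.0)),
    ("variable ratio", VariableRatio(5, rng)),
    ("variable interval", VariableInterval(5.0, rng)),
]
for name, schedule in schedules:
    row = "".join("R" if rewarded else "." for rewarded in simulate(schedule, 20))
    print(f"{name:18s}{row}")
```

Each printed row marks rewarded responses with R and unrewarded ones with a dot. The fixed schedules produce a regular pattern that is easy to anticipate, while the variable ones do not, which is the property the rest of this article relies on.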

Skinner argued that the principles of operant conditioning can be used to produce extremely complex behavior if rewards and punishments are delivered in a way that moves a human or animal closer and closer to the desired behavior each time. To do this, the conditions (or contingencies) required to receive the reward should shift each time the human or animal moves a step closer to the desired behavior.
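
The shifting condition described above can be sketched in a few lines. This is a deliberately simplified illustration: it assumes that each attempt and the desired behavior can be expressed as numbers, so that "closer" means a smaller distance. The function name `shape` and that numeric encoding are hypothetical.

```python
def shape(attempts, target):
    """Reward an attempt only if it is closer to `target` than any earlier one.

    After every reward the bar moves: the next reward requires getting
    closer still, so the condition for a reward shifts with each step.
    """
    best_distance = float("inf")
    rewards = []
    for attempt in attempts:
        distance = abs(attempt - target)
        rewarded = distance < best_distance
        if rewarded:
            best_distance = distance  # tighten the condition for next time
        rewards.append(rewarded)
    return rewards


# Attempts drift toward the target 0; repeats and regressions earn nothing.
print(shape([10, 8, 9, 5, 5, 2, 0], target=0))
# → [True, True, False, True, False, True, True]
```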

Many product developers and app makers use variable reinforcement inside their apps so that the end user is encouraged to move closer and closer to the desired action each time.

Why Variable Rewards Work

To increase the response rate and reduce the extinction rate of the desired user behavior, a variable schedule of rewards (ratio or interval) should shift every time the user moves a step closer to that behavior. A variable schedule of rewards (or variable rewards) is the fuel that powers the habit-forming model and drives user engagement. Variable rewards attract users through unpredictability, which is why the extinction rate is very slow.

The objective of a variable reward is to keep the brain occupied as it searches for patterns. How do our brains compute the value of a reward, and how is that translated into action? The answer lies in the brain circuitry known as the “reward system.” Neurons in the different regions of the brain that make up the reward system communicate using dopamine. Neurons that release dopamine are activated when we expect to receive a reward. It’s not the reward itself but the expectation of a reward that most powerfully influences emotional reactions and memories. If a reward is greater than anticipated, dopamine signaling increases. If a reward is less than expected, dopamine signaling decreases. In contrast, correctly predicting a reward does not alter dopamine signaling, because we aren’t learning anything new. So unpredictability is the key: the urge to search endlessly, never satisfied, is what creates habitual behavior around many new technologies.
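
The relationship between expectation and dopamine signaling described above is often modeled as a "prediction error": the received reward minus the expected reward. The sketch below uses a simple textbook delta-rule update as a stand-in. It is an assumption for illustration, not a model given in this article's sources, and the function name, the learning rate, and the reward values are hypothetical.

```python
def prediction_errors(rewards, learning_rate=0.2, initial_expectation=0.0):
    """Return the prediction error (reward minus expectation) for each reward.

    After each reward the expectation moves a fraction `learning_rate`
    of the way toward what was actually received.
    """
    expectation = initial_expectation
    errors = []
    for reward in rewards:
        error = reward - expectation  # > 0: better than expected, < 0: worse
        errors.append(error)
        expectation += learning_rate * error
    return errors


predictable = prediction_errors([1] * 50)    # the same reward every time
variable = prediction_errors([1, 0] * 25)    # reward present, then absent

# With a predictable reward the error fades toward zero: nothing new to learn.
print(abs(predictable[-1]) < 0.01)  # → True
# With a variable reward the error never settles.
print(abs(variable[-1]) > 0.3)      # → True
```

In this toy model a perfectly predictable reward soon stops producing any signal, while a reward that keeps changing keeps producing one, which mirrors the claim that unpredictability sustains the behavior.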

Different types of rewards for different types of audience

Product developers, app makers, and tech companies understand what causes reward-seeking spikes in the brain, and they build into their products techniques that lure us in and turn usage into a habit. Variable rewards trigger the release of a set of hormones in our brain and body that make us happy and keep us coming back for that reward again and again. Nir Eyal, the author of “Hooked”, describes three types of variable rewards based on human behavior. I’ll also describe how these behaviors are linked to the hormones released for each kind of reward sought.

1. Rewards of the hunt (compulsion to collect)

Hunting and gathering, humans’ most basic instinct, occupied more than 90% of human history. Cavemen spent 15-20 hours a week hunting and gathering the food they needed to survive. Humans have long had a compulsion to collect. The need to acquire physical things, such as food, money, and supplies, is part of the brain’s operating system, and we clearly wouldn’t have survived without this impulse. Where once we hunted for food, today we hunt for deals and knowledge. Let’s understand this with a few examples:

Knowledge: It is a basic human need. We have been fascinated with knowledge since antiquity, and Google answers that need like no one else. Every day, we make queries on Google that have never been asked before. More than anything else, we want to know. Serotonin, the original happy hormone, is essential for digestion, sleep, brain function, and circadian rhythm. Serotonin is a reward for hunters that promotes survival in the state of nature.

Money: We need to acquire resources to survive, and money is needed for that. The hunt for money releases dopamine, the reward-seeking hormone that keeps you alive and alert. This hormone regulates motor and cognitive functions and underpins the brain’s motivation and decision-making systems. All products related to earning money fall into this category. The hunt for money releases dopamine, whereas receiving money releases serotonin. For almost all human beings, it’s a never-ending loop.

Resources for survival: Food and clothing shopping apps make us search for what we need and get it on time. Work-related emails push us to complete our jobs and earn money. Self-improvement apps engage us to upskill so that we can secure our survival in today’s competitive world. Dating apps appeal to our mating needs. There is an endless list of products based on variable rewards, all designed to keep us hunting for our next discovery.

2. Rewards of the tribe (attention = survival)

After survival, we need to feel safe, to belong, to be validated, and then to feel important (Maslow’s hierarchy of needs). We have specially adapted neurons that help us feel what others feel, which suggests that we survive through our empathy for one another. We’re meant to be part of a tribe, so our brains seek out rewards that make us feel accepted, important, attractive, and included. Let’s understand this with a few examples:

Acceptance: Our brain equates attention with survival because we’re born helpless. Our brain seeks importance because that promotes survival in the state of nature. Our brain feels satiated once serotonin is released (we feel it above the neck). We also crave safety and belonging; oxytocin is released when we feel bonded with someone or trust someone. Social media apps like Facebook, WhatsApp, and LinkedIn provide each of us with attention and a sense of belonging. These attention-seeking apps also act as mood stabilizers and deliver happiness.

Power: Once we feel accepted, we crave to be different and to be in control. We seek power. When you achieve a long-sought goal or take a step toward one, when your efforts are rewarded, or when you invest effort and expect it to be rewarded, you feel a surge of dopamine inside your body. Apps like LinkedIn and Twitter let power-motivated people show their influence and authority. Continuous feedback (social rewards) from others validates their social influence and motivates them to participate and create further.

3. Rewards of the self (belong & differentiate)

As we move further up Maslow’s hierarchy of needs, we seek personal gratification. In a society, every human being has two primary motivations: to belong to a group and to stand out from it. Humans focus on mastery and consistency in their field of work and try to achieve a purposeful life. These people have felt enough dopamine surging inside their bodies and seek more by being competent, but they also seek endorphins, the body’s natural morphine. Endorphins are also known as the ‘runner’s high’ hormone that makes us happy when we exceed our limits. We compete with ourselves and always try to aim higher and higher. Let’s understand this with a few examples:

Mastery: Mastery is the desire to improve. If you are motivated by mastery, you’ll likely see your potential as unlimited, and you’ll constantly seek to improve your skills through learning and practice. An athlete who is motivated by mastery might want to run as fast as they possibly can. Any medals they receive are less important than the process of continuous improvement. Apps or products that push us to learn how to exceed our limits, e.g. a fitness app (Strava), a book-reading app (Bookcademy), or certain productivity apps, relieve the itch to keep improving at something that’s important to us.

Purpose: Human beings have an innate drive to be autonomous and self-determined. When that drive is liberated, people achieve more, live richer lives, and ensure that their time, their technique, and their team form a pathway to a purposeful destination. Purpose is the sense that what we do produces something transcendent or serves something meaningful beyond ourselves. People motivated by purpose seek to get involved in a “good cause” that they are passionate about.

Choose your Rewards

How do product developers, app makers, and tech companies take advantage of this hormone-driven learning strategy? Many products implement a reward pattern optimized to keep you engaged as much as possible. Such a pattern rewards us for beneficial behaviors and motivates us to repeat them. Countless apps create short-term, dopamine-driven feedback loops. The true drivers of our attachment to these products or apps are the hyper-social environments they provide. Understanding what drives us toward a particular behavior allows us to build products that are aligned with users’ interests.

Infinite Scrolling: Let’s get to the bottom

What is the link between Infinite Scrolling & Variable Rewards?

References:
sitn.hms.harvard.edu/flash/2018/dopamine-smartphones-battle-time/
simplypsychology.org/operant-conditioning.html
brainfacts.org/thinking-sensing-and-behaving/learning-and-memory/2018/motivation-why-you-do-the-things-you-do-082818