In his book Misbehaving, Nobel Prize-winning economist Richard Thaler describes an incident that occurred while he was hosting some guests. While they waited for dinner, Thaler brought out a large bowl of cashew nuts, half of which he and his guests devoured within minutes. This prompted Thaler to remove the bowl. As a result, in Thaler’s words, “everyone is happy.” To us, there is a clear logic to this sequence of events. After all, Thaler and his guests would not want their appetites ruined before dinner was served. To an economist like Thaler, however, what transpired was highly peculiar. The standard economic model, rational choice theory, holds that people evaluate all available options and select the best one. On that view, more options can only help, so maximising the number of options available is most desirable. With the bowl of nuts present, whether they chose to eat or not would of course perfectly reflect their preferences. With the nuts removed, the choice to eat is precluded. Why on earth would that make them happier?
Thaler concluded that failure of self-control was to blame. We humans more often find ourselves mastered by our impulses than masters of them, a simple fact that many economists, in the grip of theory-induced blindness, had failed to grasp. With immediate gratification on offer, even Thaler, versed as he was in the arts of utility maximisation, let alone his unsuspecting guests, could not make the choice he really wanted: to save his appetite for the delayed but greater pleasure of dinner.
Living for the Moment
Every human has on some occasion (or many, let’s be real) thought or uttered aloud I should… and proceeded to do the opposite. Given two behaviours that produce outcomes of different values, we often choose the behaviour that produces the less valuable outcome. We choose strawberry ice-cream over strawberries, scroll through TikTok videos at 2 AM on the eve of an important presentation, splurge on new clothes we don’t need, or, you know, climb some really big walls without protection.
Perhaps the most famous (and possibly most entertaining) example of failures of self-control is the marshmallow test, led by Walter Mischel at Stanford University. The basic test is simple: young children were offered a choice between eating a marshmallow immediately or waiting 15 minutes to receive two marshmallows instead. Here is a replication of the study:
Just as in the video, the success rate was abysmal. Perhaps the more surprising finding was that some children succeeded, often in agonising fashion. These findings were intriguing, but it was a series of follow-up studies spanning four decades that elevated Mischel from an accomplished academic to a bona fide superstar. In those studies, Mischel and colleagues found that the ability to delay gratification correlated with a diverse range of favourable outcomes, including higher SAT scores, a higher sense of self-worth, lower BMI, and lower drug use1.
Common among these examples is a systematic devaluation of delayed consequences. When given a choice between a smaller-sooner reward and a larger-later reward, we gravitate towards the former. Conversely, when given a choice between a smaller-sooner aversive outcome and a larger-later aversive outcome, we gravitate towards the latter. In technical terms, we call this delay discounting. Colloquially, impulsivity or failure of self-control. Delay discounting has been observed in numerous studies, in various animals, humans included, across a variety of choice paradigms. Everyone exhibits delay discounting2. It is as close to a principle of behaviour as one can get. But wait, you say, none of this is anything new. Those sages of old, from Aristotle to Marcus Aurelius, knew it, and so did my grandma. In broad strokes, you are certainly not wrong.
We know that at the heart of many of the problems we face today, from the trivial to the existential, lie choices between smaller-sooner consequences and larger-later consequences. However, to have any hope of addressing these problems, we must go beyond knowing as your grandma does (no disrespect to her wisdom), to knowing in a deep sense of the word, based on carefully accumulated evidence. In the following sections, we will outline the current state of knowledge of delay discounting, explore some of its possible functions, and finally, consider some possible applications of this knowledge. Let’s dive in.
What We Know of Delay Discounting
First, a brief summary of how delay discounting is typically studied in humans using hypothetical choice tasks3. Imagine you are offered a choice between $1,000 in a month and $1,000 today. Which will you choose? The latter, duh. What if the choice is between $1,000 in a month and $990 today? What about $980 today, or $970? Through a series of such choices, we arrive at a value (let’s say $950) at which you are equally likely to accept the $950 today or the $1,000 in a month. This value is called the indifference point. In simple terms, the 1-month delay has reduced the subjective value of the reward by $50. The choice task is then repeated at different time delays (e.g., 3 months later vs. today; 6 months later vs. today), and a picture emerges of how you value rewards at different times in the future. As a final note, you can replace the rewards with losses (losing $1,000 in a month vs. losing $990 today), or with other stimuli (e.g., food, cigarettes).
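The back-and-forth of offers described above can be sketched as a simple search. This is only an illustration, not the procedure any particular study uses: the bisection strategy, the function names, and the simulated participant (who happens to value $1,000 in a month at $950) are all assumptions made for the example.

```python
def find_indifference_point(prefers_immediate, later_amount, tolerance=1.0):
    """Narrow down the immediate offer at which the chooser switches preference.

    `prefers_immediate(offer)` should return True when the chooser would take
    `offer` today rather than wait for `later_amount`.
    """
    low, high = 0.0, float(later_amount)
    while high - low > tolerance:
        offer = (low + high) / 2
        if prefers_immediate(offer):
            high = offer  # offer accepted, so the switch point is at or below it
        else:
            low = offer   # offer rejected, so the switch point is above it
    return (low + high) / 2


def simulated_participant(offer):
    """Hypothetical chooser who takes any immediate offer above $950."""
    return offer > 950


point = find_indifference_point(simulated_participant, later_amount=1000)
print(f"Estimated indifference point: ${point:.0f}")
```

Real participants are noisier than this simulated one, which is why the indifference point is described as the value where the two options are equally likely to be chosen rather than a sharp threshold.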
Now, let us look at some important findings that have emerged from these studies:
1. Delay discounting follows a hyperbolic function
Once we have obtained several indifference points, we can plot them against the delay, which looks something like this:
The graph typically follows a hyperbolic function. In simple terms, the subjective value of an outcome falls quickly over short delays but more gradually over longer ones. If your discounting rate follows the graph above, the subjective value of the reward after a 3-month delay is $770. Remember, this means you are equally likely to choose $770 today (T0) or $1,000 in 3 months (T3). This represents a steep 23% drop in subjective value! However, the difference in subjective value between the indifference points at T30 and T33, which are also 3 months apart, is much smaller ($250 vs. $233, a drop of less than 10%).
For the math-inclined folks, the above graph can be obtained by applying the following formula:
\[ V_T = \frac{A}{1 + kT} \]
V_T is the subjective value, A is the reward amount, k is the discounting rate (0.1 for the above graph), and T is the time delay (in months for the above graph)4.
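As a quick check on the numbers quoted earlier, here is a minimal sketch that assumes the standard hyperbolic form, subjective value = A / (1 + kT), with A = $1,000, k = 0.1, and T measured in months. The function and variable names are mine, chosen for illustration.

```python
def hyperbolic_value(amount, k, delay):
    """Subjective value of `amount` after `delay`, assuming V = A / (1 + k * T)."""
    return amount / (1 + k * delay)


AMOUNT = 1000  # reward in dollars
K = 0.1        # discounting rate per month, as quoted for the graph

values = {months: hyperbolic_value(AMOUNT, K, months) for months in (0, 3, 30, 33)}
for months, value in values.items():
    print(f"T{months}: ${value:.0f}")  # T0: $1000, T3: $769, T30: $250, T33: $233

early_drop = 1 - values[3] / values[0]    # drop over the first 3 months
late_drop = 1 - values[33] / values[30]   # drop over 3 months, 30 months out
print(f"{early_drop:.0%} vs. {late_drop:.0%}")  # 23% vs. 7%
```

The same 3-month wait costs roughly 23% of the value when it starts today but only about 7% when it starts 30 months from now, which is the "steep early, flat later" shape in a nutshell.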
2. Hyperbolic delay discounting predicts preference reversals
In a perfectly rational world populated by economists, everyone’s preferences would remain consistent over time. Alcoholics, gambling addicts, and binge eaters would never experience regret. There would be no morning-after clarity following a night of excessive indulgence, and no solemn vows of never again, all swiftly forgotten as the next opportunity to sin presents itself. In the real world, all of these are commonplace. What gives?
Consider these two discrete choices:
A choice between $100 today and $110 in 1 year
A choice between $100 in 10 years and $110 in 11 years
If, like many, you find yourself choosing $100 today in the former and $110 in 11 years in the latter, you’ve expressed a preference reversal. This is odd! After all, the choices are of identical values ($100 vs. $110), as is the time window between choices (1 year).
Preference reversals are predicted by hyperbolic delay discounting. Recall that the decrease in subjective value of an outcome is steeper in the beginning, and flattens out over time. Compared to the present, $110 in a year’s time is discounted steeply such that it is valued less than $100 today. Or so the thinking goes: I can buy a lot of popcorn with $100 today. Compared to a decade from now, $110 in 11 years’ time is discounted much less steeply such that it is still valued more than $100 in a decade’s time. I’m already waiting 10 years, what’s another year?
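To see the reversal fall out of the arithmetic, here is a small sketch that scores the two pairs of options under the hyperbolic form A / (1 + kT), with T in years. The yearly rate of 0.2 is a hypothetical value picked for illustration, and the exponential function is my own addition, included only as a contrast to show a discounter whose preferences stay consistent.

```python
import math


def hyperbolic_value(amount, k, delay):
    """Hyperbolic subjective value: A / (1 + k * T)."""
    return amount / (1 + k * delay)


def exponential_value(amount, k, delay):
    """Exponential subjective value: A * exp(-k * T), shown only as a contrast."""
    return amount * math.exp(-k * delay)


def preferred(sooner, later, k, value_fn):
    """Return whichever (amount, delay_in_years) option has the higher subjective value."""
    return max(sooner, later, key=lambda option: value_fn(option[0], k, option[1]))


K = 0.2  # hypothetical yearly discounting rate, for illustration only

near = preferred((100, 0), (110, 1), K, hyperbolic_value)
far = preferred((100, 10), (110, 11), K, hyperbolic_value)
print(near, far)  # (100, 0) (110, 11): the preference reverses

near_exp = preferred((100, 0), (110, 1), K, exponential_value)
far_exp = preferred((100, 10), (110, 11), K, exponential_value)
print(near_exp, far_exp)  # (100, 0) (100, 10): no reversal
```

Under the hyperbolic form, the exact rate matters little here: any yearly rate above 0.1 makes $100 today beat $110 in a year, while $110 in 11 years beats $100 in 10 years for every positive rate.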
3. We discount less steeply when the stakes are high
Consider these two discrete choices:
A choice between $5 today and $10 in 1 year
A choice between $5,000 today and $10,000 in 1 year
You might find yourself choosing $5 today in the former and $10,000 in 1 year in the latter, even though the durations and comparative difference in values (a 2x return) are identical. Up to a point, as the stakes get higher, we tend to discount less steeply—a case of might as well take the $5, it’s just pocket change vs. $10,000 is A LOT of money.
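One way to read these two choices is to ask what discounting rate each one implies. Assuming the hyperbolic form A / (1 + kT) and an immediate sooner option, you are indifferent when sooner = later / (1 + kT), which gives a break-even rate of k = (later / sooner - 1) / T. The helper name below is hypothetical, and the sketch is only a back-of-the-envelope reading of the example, not an estimate from data.

```python
def break_even_rate(sooner_amount, later_amount, delay):
    """Rate k at which an immediate `sooner_amount` and a delayed `later_amount`
    have equal subjective value, assuming V = A / (1 + k * T)."""
    return (later_amount / sooner_amount - 1) / delay


small_stakes = break_even_rate(5, 10, delay=1)          # $5 today vs. $10 in 1 year
large_stakes = break_even_rate(5000, 10000, delay=1)    # $5,000 today vs. $10,000 in 1 year
print(small_stakes, large_stakes)  # 1.0 1.0
```

Both choices share the same break-even rate of 1.0 per year. Taking the $5 today implies a rate above 1.0 for small stakes, while waiting for the $10,000 implies a rate below 1.0 for large stakes, so a single fixed rate cannot describe both choices. The rate itself has to shrink as the amount grows.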
4. We discount losses less steeply than gains
Consider these two discrete choices:
A choice between gaining $500 today and $600 in 1 year
A choice between losing $500 today and $600 in 1 year
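You might find yourself choosing the $500 gain today in the former and the $600 loss in 1 year in the latter, even though the values and time window are identical. The second choice is a case of losing $500 is too painful to think about, let’s kick the can down the road. Both choices reflect discounting: the delayed $600, whether gained or lost, looms smaller than the immediate $500. The two are not discounted equally, however. We tend to discount losses less steeply than gains, so a delayed loss keeps more of its sting than a delayed gain keeps of its appeal, and the pull towards deferring a loss is typically weaker than the pull towards taking a gain now. This is known as gain-loss asymmetry or the sign effect. Again, adding an equivalent and sufficient delay to both loss outcomes (e.g., losing $500 in 10 years vs. $600 in 11 years) can produce a preference reversal.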
5. Delay discounting ≠ impulsivity
It is tempting to equate our preference for smaller-sooner over larger-later options with impulsivity. Some scholars of delay discounting have done just that.
However, we should note that impulsivity describes a cluster of behavioural tendencies, while delay discounting describes one possible underlying mechanism that gives rise to such tendencies. There are other processes at work. To illustrate one, we go back to Mischel’s original marshmallow tests. In one of the experiments, the children were again presented with a marshmallow and the same choices, but with a slight tweak. For some of the children, the marshmallow was hidden under an opaque tin. Those children were able to wait longer, with many successfully waiting for the full 15 minutes. Many of us can relate to similar experiences of our own. It turns out that temptations have their greatest influence not within reach, but within sight5.
Also, delay discounting does not just describe those I should scenarios. A good number of smokers take no issue with their smoking habits. Most people do not consider the environmental impact of their single-use plastics usage, or have briefly considered it and moved on with life. Many nations and organisations seem to have no qualms about making present-centred choices that can lead to catastrophic delayed outcomes.
Clearly, we must go beyond equating delay discounting with something those other weak-willed folks do, to understanding that it pervades many aspects of our lives.
Why We Discount the Future
There are several posited reasons why smaller-sooner outcomes dominate larger-later outcomes. Here are some:
First, unlike in most delay discounting experiments, real-world delayed outcomes are rarely guaranteed. A promised future return on an investment does not always materialise. Not all smokers develop lung cancer. Insofar as uncertainty is involved, the less certain and less probable the future outcome, the more heavily it should be discounted. This potentially explains why low-income populations, with cloudy prospects, seem to exhibit higher rates of discounting (more on this in the future). Sometimes, a bird in the hand is truly worth two in the bush.
Many larger-later outcomes result from the aggregation of smaller-sooner choices. One cigarette does not give you lung cancer, nor does one porterhouse steak herald the climate apocalypse. And so we slide down that slippery slope.
Choosing larger-later outcomes can sometimes carry an opportunity cost. If a chimpanzee chooses to receive food earlier rather than wait, it can invest this energy to put itself at a competitive advantage for acquiring food later. An individual who chooses the sooner monetary reward can invest it in tools or education that can create value greater than the delayed outcome.
Perhaps unique to humans, our complex social organisations can sometimes result in contingencies that favour smaller-sooner outcomes at the group level. Often, far-sighted CEOs who choose to invest in long-term initiatives rather than maximising immediate profits and shareholder value find themselves swiftly replaced, as investors’ patience wears thin. Governments that push painful but necessary policies (e.g., austerity vs. further deficit spending) find themselves voted out come the next election. Solving these challenges requires the exceedingly difficult task of restructuring complex group-level contingencies.
What We Can Do
Many popular books have been written about improving our self-control and managing our lives better, including dealing with smaller-sooner vs. larger-later choices. Instead of rehashing the many brilliant ideas offered in these books, let’s turn our attention instead to how our newfound knowledge of delay discounting can address some societal level problems. I will use the problem of obesity (in particular, our relationship with food) as a motivating example, but you can start thinking about other kinds of problems where these strategies can apply.
1. Create near-term consequences for choices
Recall how outcomes lose their subjective value over time such that sizeable but delayed outcomes (e.g., type 2 diabetes, heart disease) have little influence relative to immediate choices. It makes sense then that engineering near-term consequences can have a far greater impact on choices. A simple yet robust intervention that has emerged from addictions treatment research, called contingency management, is to provide near-term (and often small) rewards for making desirable choices.
As an example, here in Singapore, the Health Promotion Board has rolled out a healthier choice program, where grocery purchases that carry a healthier choice symbol (including foods lower in salt, sugar, and fat) allow the buyer to earn points that can be exchanged for cash vouchers.
Conversely, one can also provide near-term aversive outcomes for making undesirable choices. One such example is stickK, a platform for users to commit to a well-defined near-term goal (e.g., cook my own meals twice a week) and stick with it. Users pledge a sum of money towards attaining their goal, and failure to attain it results in the money being forfeited to a charitable organisation of their choice. One of the most wicked and therefore probably most effective features is the option for the forfeited money to be donated to an anti-charity, or a cause that the user abhors (e.g., the NRA Foundation for a gun control advocate).
2. Create effective messaging
As a sucker for all things chocolate, when presented with an opportunity to indulge in a delectable, gooey brownie, I can assure you I often do so without even a second thought. Sometimes, despite our ardent New Year resolution to forsake gluttony, the delayed outcome choice doesn’t even enter our consciousness. We fall victim not to a failure of willpower, but to a failure of attention.
In such moments, cues that draw our attention to the delayed outcome (and our goal) can be useful, the key being that the cue must be present at the moment of choosing. Other than the healthier choice symbol, the Health Promotion Board has rolled out another nutrition cue for beverages, called Nutri-Grade. Beverages are assigned a grade, printed on their packaging, corresponding with their sugar and saturated fat levels. The goal of such labelling is to increase the salience of the delayed outcome, and hopefully also create an immediate, aversive condition for choosing beverages with poor grades.
As discussed earlier, we discount losses less steeply than gains. A loss influences our behaviours more than an equivalent gain. Consider two frames of identical values:
If you maintain a healthy weight, you can expect to live up to 10.3 years longer
Being obese can reduce your lifespan by up to 10.3 years
For many of us, the second message will have a greater impact on our food choices. Again, the proximity of the message to the moment of choosing is key in determining its effectiveness.
3. Be on the right side of preference reversals
One common policy that accounts for preference reversal is the implementation of cooling-off (or free-look) periods, typically for big-ticket purchases. A cooling-off period is a time window following a purchase when the consumer can choose to cancel the purchase and receive a refund. Lured by the siren call, consumers are prone to making big purchases only to experience a gut-wrenching preference reversal when they wake up the next morning (I know this feeling too well). Cooling-off periods thus serve to protect consumers from the aftermath of their fundamentally unsound choices.
For food, cooling-off periods are not feasible. You can’t uneat food. Creating conditions for being on the right side of preference reversals is trickier. So, here’s a little thought experiment; indulge me for a moment.
Imagine being at McDonald’s and choosing between two food options, a side of French fries (the unhealthy option) and a cup of corn (the healthy option). Recall how, all else being equal, we sometimes prefer the French fries and other times the corn, depending on the delay to the choices. Specifically, the immediate gratification of salty, fatty goodness has an outsized influence on our choices. One solution is to introduce a delay between our choices and their outcomes. The longer the delay, the greater the effectiveness6.
Let’s assume that the average waiting time at a McDonald’s outlet is 5 minutes. Here in Singapore, there is an array of McDonald’s meal options categorised as Eat Light Under 500 Calories. Now imagine a separate priority queue for choosing one of these options compared to other menu options. As the kitchens divert their resources toward fulfilling eat light orders, the average waiting time for the non-priority queue orders creeps up to 10 minutes. As a result, more diners switch to eat light options, creating a self-reinforcing loop. The average waiting time for the non-priority queue creeps up further to 15 minutes. Ah what the heck, you mumble to yourself, might as well get the healthier option. I really should eat healthier anyway.
In this scenario, we’ve not only introduced a delay between the unhealthy choice and its outcome, but also reduced the delay between the healthy choice and its outcome, creating a compelling set of circumstances to engineer a preference reversal. Of course, such ideas must be tested for effectiveness and feasibility before making any conclusive claims.
Conclusions
By now, we should begin to recognise that many choices we face in our everyday life involve trade-offs between smaller-sooner and larger-later outcomes. In the many cases where smaller-sooner and larger-later face off, smaller-sooner tends to have the prettier footwork and sharper jab. We’ve learnt a little about why this is so, and some things we can do to give larger-later a decent fighting chance.
Thanks for reading! Keep fighting the good fight.
It should be noted that recent studies (including by Mischel himself) call into question the validity of these findings.
There is a rich body of knowledge pertaining to delay discounting in non-humans, but we won’t discuss that. Here’s an excellent summary.
Some argue that a hyperboloid function better describes delay discounting than a hyperbolic function.
Or to be more precise, temptation is greatly increased upon sensory contact.
At an individual level, one hack you can try if you know in advance that you will be visiting McDonald’s is to pre-commit to your food options. It is much easier to decide on corn when McDonald’s is tomorrow than when you’re at the counter, with the aroma and sizzle of fries cooking overpowering your senses.