In the late 1970s, the physicist and Nobel laureate Richard Feynman sat down for lunch at Indra, an unassuming Thai restaurant in Glendale, California. Across from him sat his friend, drumming partner, and frequent co-conspirator, Ralph Leighton. Leighton stared at the menu, caught in a classic mid-day paralysis. He loved the ginger chicken. It was a known quantity—consistently satisfying, familiar, and safe. Yet his eyes kept wandering to other, unexplored dishes on the menu. They had a chance of being even better, but they also risked being a culinary disaster.
Leighton vocalized his dilemma: Should he order his running favorite, or roll the dice on something new?
For most diners, this is a fleeting moment of trivial indecision. But Feynman was not most diners. To the man who had spent his life mapping the subatomic paths of electrons, the ginger chicken dilemma was not a minor lunch-time annoyance. It was an elegant mathematical optimization problem waiting to be solved. Feynman grabbed a pen, pulled a scrap of paper across the table, and began scribbling a series of equations, integrals, and decreasing thresholds. On that day, he solved the problem of optimal dining.
He never published his findings. The napkin—or scrap of paper—disappeared into Leighton’s private archives, its dense, highly idiosyncratic mathematical shorthand remaining a mystery for nearly half a century.
That mystery has finally been solved. In a study published in the Proceedings of the National Academy of Sciences (PNAS), a team of behavioral and cognitive scientists report that they have fully deciphered Feynman’s long-lost notes. Co-authored by computational cognitive scientist Brian Christian, Princeton University psychology professor Tom Griffiths, and cognitive scientist Evan M. Russek, the paper reconstructs Feynman’s math, proves that his strategy is optimal, and tests how real human beings measure up against the physicist's ideal dining strategy.
The resulting framework, which the researchers refer to as the definitive formulation of Feynman’s restaurant math, not only provides a blueprint for how to choose a restaurant or a meal, but also offers profound insights into how the human mind navigates the fundamental, daily tension between the familiar and the unknown.
The Cold Case in the Archives
The journey to decoding the physics legend's culinary calculations began not in a high-tech laboratory, but in the paper-stuffed drawers of Ralph Leighton. Leighton, who co-authored the legendary Feynman anecdote books Surely You're Joking, Mr. Feynman! and What Do You Care What Other People Think?, had kept the scrap of paper from that Glendale lunch as a memento of his friend’s relentless curiosity.
"Ralph Leighton kept almost everything from his conversations with Feynman," says Brian Christian, the study's lead author and a visiting scholar at the University of California, Berkeley. "But this paper was different. It wasn’t a story or a joke. It was a page of pure mathematics, written in Feynman’s notoriously fast, sometimes messy, and highly personalized hand. For decades, it was completely inscrutable."
[ A Restored Glimpse of Feynman’s Note ]
====================================================
Let rating of dish = x ~ U(0, 1)
If we have n meals left, and current best is y:
Compare y to threshold t_n...
Indifference equation:
n * y = E[x] + E[V_{n-1}(max(y, x))]
Solve for t_n -> Non-linear decay curve.
====================================================
Christian and Griffiths first caught wind of the existence of these notes more than a decade ago while researching their book Algorithms to Live By: The Computer Science of Human Decisions. The notes, however, were a tangled web of variable swaps, unlabelled integrals, and half-finished proofs. Feynman had worked out the math "on the fly" over curry, jumping between steps with the intuitive leaps for which he was famous, leaving no explanatory text.
"To anyone else, it looked like random physics scratchings," Griffiths explains. "There were no words like 'restaurant' or 'chicken' on the page. But if you knew what to look for, you could see he was setting up a very specific kind of optimization problem. It was a stopping problem, but with a unique twist that separated it from the standard mathematical canon."
The breakthrough came when Russek, Christian, and Griffiths sat down to systematically rebuild Feynman’s mathematical assumptions from first principles. By mapping his idiosyncratic notation onto modern probability theory, they realized Feynman had developed a highly elegant, recursive equation to solve the "explore-exploit" dilemma under a very specific, real-world constraint: the ability to return to your favorite choice.
Exploring vs. Exploiting: The Anatomy of a Dinner Dilemma
To understand why Feynman’s math is so elegant, one must first understand the broader class of mathematical puzzles to which it belongs. In computer science and decision theory, this is known as the explore-exploit tradeoff.
Every day, we face choices that force us to balance these two states:
- Exploration: Gathering new information by trying an unknown option (e.g., trying a brand-new restaurant in town).
- Exploitation: Using the information we already have to maximize our immediate reward (e.g., returning to the restaurant we already know is fantastic).
The classic mathematical model for this is the Multi-Armed Bandit Problem, named for the slot machines (the "one-armed bandits") that a hypothetical gambler faces in a row. The gambler must decide which arms to pull, how many times, and in what order, to find the machine with the best payout while minimizing losses on low-paying machines.
But as Christian points out, the standard Multi-Armed Bandit model has a major flaw when applied to real life. "In a classic bandit problem, the feedback you get is noisy," Christian says. "If you pull a slot machine lever and get nothing, you don't know if that's because the machine is terrible, or if it's a great machine that just had a bad spin on that particular turn. You have to keep pulling the lever to get a statistically reliable sample."
DECISION-MAKING TAXONOMY
Are observations noisy?
- YES: [Multi-Armed Bandit]
- NO: Can you return to previous options?
  - YES: [Feynman's Restaurant]
  - NO: [The Secretary Problem]
In the context of choosing a restaurant, however, the observation is often much cleaner. "If you go to a restaurant and eat a meal, you generally know immediately whether the food is excellent, mediocre, or terrible," says Griffiths. "You don't need to eat there twenty times to establish a statistical confidence interval. Feynman’s restaurant math assumes 'perfect observation'—once you sample an option, you know its exact value."
This places Feynman's problem in a category closely related to the famous Secretary Problem (also known as the marriage problem), a classic optimal stopping problem. In the Secretary Problem, an interviewer sees a sequence of candidates one at a time, observing their exact quality relative to one another. The catch? Once you reject a candidate, you can never go back and hire them. The optimal mathematical solution is the famous "37% Rule": interview the first 37% of the candidates without hiring anyone to establish a baseline, and then hire the very next candidate who is better than everyone in that baseline group.
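The 37% Rule is easy to check by simulation. The sketch below is illustrative only: the function name `hires_the_best`, the pool size of 100, and the use of uniform random scores are assumptions made for the example, and the cutoff is taken as the pool size divided by e (about 37%).

```python
import math
import random

def hires_the_best(n, rng):
    """Run one hiring round; return True if the rule picks the single best of n candidates."""
    scores = [rng.random() for _ in range(n)]
    cutoff = int(n / math.e)                      # observe roughly the first 37% without hiring
    baseline = max(scores[:cutoff], default=float("-inf"))
    for score in scores[cutoff:]:
        if score > baseline:                      # first candidate to beat the baseline is hired
            return score == max(scores)
    return False                                  # nobody beat the baseline, so the best was passed over

rng = random.Random(0)
trials = 10000
wins = sum(hires_the_best(100, rng) for _ in range(trials))
rate = wins / trials
print(f"best candidate hired in {rate:.1%} of rounds")
```

The success rate should land close to 37%, which is the sense in which the rule is optimal: no other strategy picks the single best candidate more often when rejected candidates cannot be recalled.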
"But that doesn't fit vacation dining or menu selection either," Christian notes. "If you visit a city for ten days, and you find an amazing Italian bistro on night three, you absolutely can go back there on nights four, five, and six. You have perfect recall. This element of return changes the mathematics completely."
By removing observation noise but allowing perfect recall, Feynman created a pure, highly structured version of the explore-exploit tradeoff. He formalised it as follows:
- You are staying in a city for $M$ nights (or eating $M$ meals at a restaurant).
- There are at least as many restaurants in the city (or dishes on the menu) as nights in your stay ($N \ge M$), so you never run out of new options.
- Each restaurant has a fixed, true quality value $x$ between $0$ and $1$ (with $1$ being the ultimate culinary masterpiece and $0$ being completely unpalatable).
- Before you try a restaurant, you don't know its quality, but you know that the qualities of all restaurants are distributed according to a known probability distribution.
- Every night, you must decide whether to try a new, unknown restaurant (explore) or return to the best restaurant you have found so far (exploit).
- Your goal is to maximize the sum of the qualities of the meals you eat over your entire stay.
The Mathematics of the Deciphered Feynman Equation
How do you solve this? Feynman’s genius was in realizing that the optimal strategy does not involve looking forward and trying to calculate every possible future combination of meals. Instead, the problem must be solved backward, a technique known in modern mathematics as dynamic programming, or backward induction.
Let's look at the mathematical mechanics of Feynman's restaurant math, as the research team decoded it from the scribbled notes.
THE DYNAMIC DECISION LOOP
[ Start Night ] -> What is your current best (y)? -> Compare y to the night's threshold (t_n)
- y > t_n: [EXPLOIT] Stick with y for the rest of the trip.
- y <= t_n: [EXPLORE] Try a new place; update your y; move to the next night.
The optimal policy is dictated by a sequence of decreasing quality thresholds, denoted as $t_n$, where $n$ represents the number of nights remaining on your trip.
On any given night, you look at the quality value of the best restaurant you have found so far, which we can call $y$. You then compare $y$ to the threshold $t_n$ for that specific night.
- If your current favorite $y$ is greater than the threshold $t_n$ ($y > t_n$), you stop exploring entirely. You "exploit" by returning to that favorite restaurant for every single remaining night of your trip.
- If your current favorite $y$ is less than or equal to the threshold $t_n$ ($y \le t_n$), you "explore." You try a brand-new restaurant, observe its quality $x$, update your running favorite to $\max(y, x)$, and move to the next night.
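The rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `simulate_trip`, the `thresholds` dictionary keyed by nights remaining, and the flat demonstration threshold of 0.5 are all assumptions made for the example.

```python
import random

def simulate_trip(total_nights, thresholds, rng):
    """Play one trip under a threshold policy; return the summed meal quality.

    thresholds[n] is the bar t_n applied when n nights remain (hypothetical layout).
    Qualities are drawn uniformly on [0, 1).
    """
    best = 0.0     # y: best restaurant found so far (nothing tried yet)
    total = 0.0
    for n in range(total_nights, 0, -1):    # n = nights remaining, counting tonight
        if best > thresholds[n]:
            # Exploit: as long as the thresholds never rise as n shrinks,
            # the favorite keeps clearing the bar, so stay for all n nights.
            total += n * best
            break
        # Explore: eat somewhere new tonight and update the running favorite.
        x = rng.random()
        total += x
        best = max(best, x)
    return total

# A flat bar of 0.5 on every night, purely for demonstration.
flat = {n: 0.5 for n in range(1, 11)}
print(simulate_trip(10, flat, random.Random(0)))
```

Swapping `flat` for a different set of thresholds changes only the policy, not the simulation, which makes it straightforward to compare strategies on the same footing.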
The core of the mathematical mystery was finding the exact formula for these thresholds $t_n$.
Step 1: The Final Night ($n = 1$)
Let us start at the end of the trip. Suppose you have only one night left ($n = 1$).
If you choose to explore a new restaurant, you will experience its quality $x$ exactly once. You have zero remaining nights to exploit that information. Therefore, the expected value of exploring on the very last night is simply the average (mean) quality of all restaurants in the city.
Assuming a Uniform Distribution—where any quality score between $0$ and $1$ is equally likely—the average quality is exactly $0.5$ (or 50 on a 100-point scale).
If your current favorite restaurant $y$ is better than this average ($y > 0.5$), you should obviously return to it. If it is no better than average ($y \le 0.5$), you should take a gamble on a new place, because a random draw has an expected value at least as high as your disappointing favorite.
Thus, the final-night threshold is always equal to the mean of the quality distribution:
$$t_1 = 0.5$$
Step 2: Working Backward ($n = 2$)
Now, suppose you have two nights remaining ($n = 2$).
If you decide to exploit your current favorite $y$, you will eat there for the next two nights, earning a total guaranteed value of:
$$\text{Value of Exploiting} = 2y$$
If you decide to explore a new restaurant, you will eat at a new place tonight, observing its quality $x$. Tomorrow, on your final night ($n = 1$), you will act optimally based on whether this new restaurant $x$ (or your previous favorite $y$) is better than the final-night threshold $t_1$.
Feynman set up an indifference equation. The threshold $t_2$ is the exact quality value of a running favorite $y$ that makes you completely indifferent between exploring and exploiting.
The expected value of exploring when your current favorite is exactly at the threshold $t_2$ can be written as:
$$\text{Expected Value of Exploring} = E[X] + E[\max(t_2, X, t_1)]$$
Where:
- $E[X]$ is the expected quality of the restaurant you try tonight (which is $0.5$).
- $\max(t_2, X, t_1)$ is the value of the choice you will make on the final night. Since $t_1 = 0.5$ and we are finding $t_2$ (which we know intuitively must be greater than $t_1$), the $t_1$ term can never be the largest, so this simplifies to $\max(t_2, X)$.
By setting the value of exploiting for two nights ($2 \cdot t_2$) equal to the expected value of exploring, Feynman derived the indifference equation for $t_2$:
$$2 t_2 = \int_{0}^{1} x \, dx + \int_{0}^{1} \max(t_2, x) \, dx$$
Since $t_2$ is a fraction between $0$ and $1$, we can split the second integral at the boundary $t_2$:
$$\int_{0}^{1} \max(t_2, x) \, dx = \int_{0}^{t_2} t_2 \, dx + \int_{t_2}^{1} x \, dx$$
$$\int_{0}^{1} \max(t_2, x) \, dx = t_2^2 + \left[ \frac{x^2}{2} \right]_{t_2}^{1} = t_2^2 + \frac{1}{2} - \frac{t_2^2}{2} = \frac{1}{2} + \frac{t_2^2}{2}$$
Now, plug this back into the indifference equation:
$$2 t_2 = \frac{1}{2} + \left( \frac{1}{2} + \frac{t_2^2}{2} \right)$$
$$2 t_2 = 1 + \frac{t_2^2}{2}$$
Multiply the entire equation by $2$ to clear the fraction:
$$4 t_2 = 2 + t_2^2$$
$$t_2^2 - 4 t_2 + 2 = 0$$
Using the quadratic formula to solve for $t_2$:
$$t_2 = \frac{4 \pm \sqrt{(-4)^2 - 4(1)(2)}}{2} = \frac{4 \pm \sqrt{16 - 8}}{2} = \frac{4 \pm \sqrt{8}}{2} = 2 \pm \sqrt{2}$$
Since $t_2$ must be between $0$ and $1$, we discard the root $2 + \sqrt{2} \approx 3.41$ and keep the valid root:
$$t_2 = 2 - \sqrt{2} \approx 0.5858$$
This means that if you have two nights left, you should only return to your favorite restaurant if its quality rating is higher than $58.6\%$. If your favorite is rated, say, $55\%$, it is mathematically optimal to discard it and try a new place, even though $55\%$ is technically "above average." The future value of potentially finding a $90\%$ restaurant that you can return to outweighs the immediate, modest benefit of your $55\%$ favorite.
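The value of $t_2$ can be double-checked numerically. The sketch below solves the same indifference equation by bisection; the function name `explore_minus_exploit` is a hypothetical helper introduced for this example.

```python
import math

def explore_minus_exploit(y):
    """Value of exploring minus value of exploiting with two nights left (uniform qualities)."""
    explore = 0.5 + (0.5 + y * y / 2)    # E[X] + E[max(y, X)]
    exploit = 2 * y                      # eat at the favorite twice
    return explore - exploit

lo, hi = 0.0, 1.0     # the gap is positive at y = 0 and negative at y = 1
for _ in range(60):
    mid = (lo + hi) / 2
    if explore_minus_exploit(mid) > 0:
        lo = mid      # exploring still wins, so the threshold lies higher
    else:
        hi = mid

print(round(lo, 4), round(2 - math.sqrt(2), 4))   # 0.5858 0.5858
```

Evaluating `explore_minus_exploit(0.55)` gives a positive gap of about 0.051, which is the 55% example in numbers: exploring beats returning to a 55% favorite when two nights remain.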
Step 3: Generalizing to $n$ Nights Left
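As you work backward to $n$ nights remaining, the calculation becomes more complex. Feynman solved this by defining the general recurrence relation for the threshold $t_n$ under a Uniform distribution. If your favorite sits exactly at the threshold and you explore once, the better of the two restaurants, $\max(y, X)$, already clears every later (lower) threshold, so you eat there for all $n-1$ remaining nights:
$$t_n = \text{the value of } y \text{ such that } n \cdot y = \frac{1}{2} + (n-1) \, E[\max(y, X)]$$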
When solved, this yields a non-linear decay curve. The threshold $t_n$ starts very high when you have many nights remaining ($n$ is large), because the "upside" of finding an extraordinary restaurant is massive—you get to enjoy it for many nights. As your remaining nights ($n$) dwindle, the value of information drops, and the threshold $t_n$ falls more and more rapidly, cascading down to $0.5$ on the final night.
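For the uniform case, the indifference condition can be solved in closed form by repeating the $n = 2$ steps with $n - 1$ remaining nights in place of one. The derivation below is a sketch under that one-step look-ahead assumption (explore once, then stay with the better restaurant); it is not quoted from Feynman's notes or from the paper. Using $E[\max(y, X)] = \frac{1}{2} + \frac{y^2}{2}$:

$$n \, t_n = \frac{1}{2} + (n-1)\left( \frac{1}{2} + \frac{t_n^2}{2} \right)$$
$$(n-1) \, t_n^2 - 2n \, t_n + n = 0$$
$$t_n = \frac{n - \sqrt{n}}{n - 1} = \frac{\sqrt{n}}{\sqrt{n} + 1} \quad (n \ge 2)$$

The last expression also returns $0.5$ at $n = 1$ and $2 - \sqrt{2}$ at $n = 2$, matching the values derived above. It gives $t_4 = 2/3$ and $t_9 = 0.75$, and the gaps between successive thresholds widen as $n$ shrinks, which is the accelerating fall described here.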
[ Figure: Feynman's non-linear threshold decay (uniform distribution). Vertical axis: quality threshold t_n, from 0.5 to 1.0. Horizontal axis: nights left n, from 10 down to 1. The curve starts high (hold out for something amazing), drops faster as time runs out, and reaches 0.5 at n = 1 (settle for anything above average). ]
"Feynman’s solution is completely uncompromising," says Russek. "It shows that if you have ten days in a city, you should start with an incredibly high standard. If your first restaurant is a solid $75\%$, Feynman’s math says: Forget it, keep exploring. Only when you get closer to the end of your trip should you lower your standards and settle."
Moving Beyond Feynman: The "Gems and Garbage" Scenario
While Feynman’s math was pristine, it relied on a major simplifying assumption: that the quality of restaurants is distributed uniformly between $0$ and $1$. In other words, he assumed that a city is equally likely to have terrible, mediocre, good, or world-class dining.
But anyone who has traveled knows that food landscapes vary wildly by geography. Some cities are highly consistent, filled with moderately good bistros but lacking true culinary genius. Other cities are highly volatile: block after block of terrible tourist traps, interspersed with a few hidden, Michelin-starred masterpieces.
In their PNAS paper, Christian, Russek, and Griffiths extended Feynman’s restaurant math by calculating closed-form solutions for several alternative probability distributions of quality. The results reveal how a traveler's optimal strategy must adapt to local reality.
GEOGRAPHIC DISTRIBUTION SHIFTS
City Type | Quality Range | Optimal Threshold
---------------------------------------------------------
Uniform City (Feynman's Model) | 0.0 to 1.0 | Standard decay curve (t_1 = 0.5)
"Gems & Garbage" (High Variance, Bimodal) | 0.0 to 1.0 | Threshold starts VERY high; explore aggressively.
Uniformly Mediocre (Low Variance, Tight) | 0.4 to 0.6 | Threshold starts lower; exploit early.
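How the threshold shifts with the spread of qualities can be explored numerically. The sketch below assumes the indifference condition in one-step look-ahead form, $n y = E[X] + (n-1) E[\max(y, X)]$, and solves it by bisection on sampled qualities. The function name `threshold` and the two Beta distributions standing in for the high-variance and low-variance cities are illustrative assumptions, not the distributions used in the paper.

```python
import random

def threshold(n, samples):
    """Solve n*y = E[X] + (n-1)*E[max(y, X)] for y by bisection, using sample means."""
    mu = sum(samples) / len(samples)
    lo, hi = 0.0, 1.0
    for _ in range(40):
        y = (lo + hi) / 2
        e_max = sum(max(y, x) for x in samples) / len(samples)
        if mu + (n - 1) * e_max > n * y:    # exploring still beats exploiting at this y
            lo = y
        else:
            hi = y
    return (lo + hi) / 2

rng = random.Random(0)
size = 5000
cities = {
    "uniform": [rng.random() for _ in range(size)],
    # Assumed stand-in for "Gems & Garbage": U-shaped, mass piled at both extremes.
    "gems_and_garbage": [rng.betavariate(0.5, 0.5) for _ in range(size)],
    # Assumed stand-in for the low-variance city: tightly clustered around 0.5.
    "low_variance": [rng.betavariate(20, 20) for _ in range(size)],
}
for name, samples in cities.items():
    print(name, [round(threshold(n, samples), 3) for n in (1, 5, 10)])
```

All three stand-in distributions have the same mean of 0.5, so their final-night thresholds nearly coincide; the differences appear only when several nights remain, where a wider spread pushes the bar higher.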
1. The "Gems and Garbage" City (High Variance)
Imagine a city where the quality distribution is bimodal or has high variance—meaning there are many awful places, but a few rare gems.
Under this distribution, the researchers proved that the optimal threshold $t_n$ starts much higher than it does in Feynman’s uniform model.
"If you are in a city with extreme quality variance, the value of exploring is elevated," Griffiths explains. "Because if you do find one of those rare gems early on, the payoff over the remaining nights of your trip is astronomically high. Therefore, you must set an incredibly high bar. You should reject even very good restaurants ($85\%$ or $90\%$) early in the trip, because the marginal benefit of finding that $99\%$ gem is worth the risk of eating garbage for a few nights."
2. The "Uniformly Mediocre" City (Low Variance)
Now, imagine a city where almost all restaurants are of similar, slightly above-average quality—say, every restaurant is a solid $7$ out of $10$, with very little deviation.
In this environment, the optimal threshold starts much lower and stays close to the average quality throughout the trip.
"If there is no variance, there is almost nothing to be gained from exploring," Christian says. "You cannot find a culinary masterpiece because they don't exist in this distribution. If you find a restaurant that is a $7.1$ on day one, you should immediately stop exploring and exploit it for the rest of your trip. The threshold drops quickly because the risk of landing a terrible meal is low, but the reward of finding a significantly better meal is also non-existent."
The 2,520-Diner Experiment: How Real Humans Decide
Having successfully decoded Feynman’s math and generalized it to different food landscapes, the research team was left with a natural, empirical question: What do real people actually do when faced with this dilemma?
To find out, they designed a large-scale, preregistered behavioral experiment involving $2,520$ human participants.
THE VIRTUAL CITY EXPERIMENT DESIGN
[ Participant Dashboard ]
======================================================
Your Trip: 10 Days Remaining
Current Favorite Restaurant Score: 72 / 100
Choose Action:
[A] Return to Favorite (Earns 72 points tonight)
[B] Explore New Restaurant (Select a square on the grid)
======================================================
* Grid of 100 Squares (Each square = 1 Hidden Restaurant)
* Clicking a square reveals its exact quality rating (0-100)
* Distribution of ratings (Uniform, High-Variance, Low-Variance)
is explained to the participant beforehand.
Each participant was introduced to a simulated vacation scenario: they were staying in a virtual city for a set number of nights. They were shown a grid of squares, where each square represented an undiscovered restaurant. On each "night" of their stay, they had to choose between returning to their favorite restaurant discovered so far (exploiting) or clicking a new square to reveal a new restaurant's quality score (exploring).
The researchers split the participants into different groups, with the underlying restaurant quality ratings drawn from one of the four probability distributions they had modeled (including Feynman's Uniform distribution, a high-variance "Gems and Garbage" distribution, and a low-variance distribution).
The results of the behavioral study revealed two major, surprising findings:
Finding 1: Humans Exploit Too Early, but Adapt to the Landscape
When analyzing the raw choices, the researchers found that real humans do not act like perfect "Feynman diners." Specifically, humans have a strong bias toward premature exploitation—they settle on a favorite restaurant much earlier in their stay than pure mathematical optimality dictates.
"If Feynman had ten days in a uniform city, his math says he should hold out for a restaurant scoring above $85\%$ on the first few nights," Christian says. "But we found that real people get nervous. If they find a restaurant scoring $70\%$ on night two, they often lock it in and stop exploring. Humans are risk-averse; we hate the idea of wasting a night of vacation on a terrible meal, even if the long-term mathematical expectation favors exploration."
However, the researchers observed that humans are incredibly sensitive to the distribution of quality.
- When participants were placed in the "Gems and Garbage" city, they intuitively realized they needed to be more adventurous. They set higher personal thresholds and explored for much longer.
- When placed in the low-variance, consistent city, they quickly settled on a familiar option, recognizing that searching was a waste of time.
"Real people might not be writing down integrals, but their intuition is remarkably well-attuned to the environmental statistics," says Griffiths. "They instantly understand when a food scene is volatile and requires more exploration, versus when it is safe and permits early exploitation."
Finding 2: The Surprise Triumph of the Linear Heuristic
The second behavioral finding was the most striking of all. When the researchers mathematically modeled the decision thresholds used by the human participants, they discovered that people were not using Feynman’s complex, non-linear decay curve.
Instead, humans use a linear threshold heuristic.
[ Figure: Feynman's non-linear vs. human linear thresholds. Vertical axis: threshold, from 0.5 to 1.0. Horizontal axis: nights left n, from 10 down to 1. Feynman's optimal curve starts high and decays rapidly; the human linear heuristic falls with a consistent slope. ]
In the human mind, the quality threshold for settling does not drop in an accelerating cascade. Instead, it drops by a fixed, steady amount for every night that passes.
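One way to write such a rule down, as a sketch rather than the paper's fitted model, is a straight line in the proportion of the trip remaining. Here $\mu$ is the average quality, $N$ the total number of nights, and $t_{\text{start}}$ an assumed starting bar:

$$t_n^{\text{lin}} = \mu + \frac{n}{N} \left( t_{\text{start}} - \mu \right)$$

With $N = 10$ and $n = 5$, the fraction $n/N$ is $\frac{1}{2}$, so the threshold sits exactly halfway between $t_{\text{start}}$ and $\mu$.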
"We found definitive evidence that human decision-making in this context is guided by a threshold that decreases linearly with the proportion of trials remaining," Christian explains. "It is a beautifully simple rule: if you have $10$ nights left, your threshold is high; if you have $5$ nights left, your threshold is exactly halfway between your starting threshold and the baseline average."
The natural assumption of any mathematician would be that this linear shortcut is vastly inferior to Feynman's absolute optimal curve. But when the researchers ran computer simulations to compare the actual points earned by the human linear strategy against Feynman's mathematical ideal, they hit a stunning result.
The linear heuristic is extraordinarily, almost uncannily efficient.
TOTAL SIMULATED DINING SATISFACTION (Efficiency Comparison)
Strategy | % of Theoretical Maximum Score
---------------------------------------------------------
Feynman's Optimal | [====================================] 100.0%
Human Linear Heuristic | [==================================] 99.1%
Random Settle | [====================] 58.2%
"The performance difference is negligible," Griffiths says. "The linear threshold strategy routinely captures over $99\%$ of the total dining value that you would get from running Feynman's incredibly complex non-linear equations. It is a classic example of what cognitive scientists call 'resource rationality.' The human brain has evolved to find cognitive shortcuts that avoid intense mental calculations, yet still yield nearly perfect real-world outcomes."
By using a linear decay threshold, a traveler can navigate a new city’s food scene with zero scratch paper, zero integrals, and almost exactly the same level of culinary satisfaction as a Nobel Prize-winning physicist.
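A comparison of this kind can be reproduced in miniature. The sketch below is not the researchers' simulation: it assumes uniform qualities, a 10-night trip, the uniform-case threshold $\sqrt{n}/(\sqrt{n}+1)$ obtained from the indifference equation, and a linear rule with an assumed starting bar of 0.8. The names `trip_value`, `optimal_bar`, and `linear_bar` are hypothetical.

```python
import math
import random

N = 10   # nights in the trip (assumed)

def trip_value(draws, bar):
    """Total quality for one trip; bar(n) is the threshold with n nights remaining."""
    best, total = 0.0, 0.0
    for i, x in enumerate(draws):
        n = len(draws) - i               # nights remaining, counting tonight
        if best > bar(n):
            return total + n * best      # settle on the favorite for the rest of the stay
        total += x                       # explore tonight's new restaurant
        best = max(best, x)
    return total

def optimal_bar(n):
    """Uniform-case threshold from the indifference equation (assumed closed form)."""
    return math.sqrt(n) / (math.sqrt(n) + 1)

def linear_bar(n):
    """Straight line in the fraction of nights left: assumed start 0.8, mean 0.5."""
    return 0.5 + (n / N) * (0.8 - 0.5)

rng = random.Random(42)
trips = [[rng.random() for _ in range(N)] for _ in range(50000)]   # same draws for both rules
avg_opt = sum(trip_value(t, optimal_bar) for t in trips) / len(trips)
avg_lin = sum(trip_value(t, linear_bar) for t in trips) / len(trips)
print(f"optimal: {avg_opt:.3f}  linear: {avg_lin:.3f}  ratio: {avg_lin / avg_opt:.4f}")
```

Both rules are scored on identical sequences of restaurant qualities, so the gap reflects the policies rather than luck. The printed ratio depends on the assumed starting bar and should not be expected to reproduce the reported figures exactly.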
Why Feynman’s Restaurant Math Matters for AI Alignment
While optimizing a vacation dining schedule is a fun intellectual exercise, the researchers behind the PNAS study point out that the implications of deciphering Feynman’s restaurant math run much deeper. In fact, the work touches on one of the most critical challenges in modern computer science: the AI Alignment Problem.
Brian Christian, who has written extensively on the subject in his book The Alignment Problem, explains that modern artificial intelligence agents, particularly those trained via reinforcement learning, face the exact same explore-exploit dilemma when trying to learn human preferences.
THE ALIGNMENT FEEDBACK LOOP
[ Human User ] <--- (What do you want to eat/do/read?) ---> [ AI Agent ]
- Human User: provides feedback on preferences.
- AI Agent: must choose between exploring new query styles and exploiting the current reward model.
"When we train an AI system using Reinforcement Learning from Human Feedback (RLHF), the system has to build a mathematical model of what the human actually values," Christian says. "To do that, the AI has to decide when to ask the human for clarification—which is a form of exploration—and when to simply execute what it thinks the human wants—which is exploitation."
If the AI exploits too early, it suffers from "premature alignment". It locks in on a flawed, surface-level understanding of the human's desires, missing deeper, more nuanced preferences. If it explores too much, it becomes annoying, constantly asking the user for feedback without ever delivering useful results.
"By studying how humans naturally solve these stopping problems using linear thresholds, we can design AI systems that model human preferences more accurately," Christian says. "Feynman’s restaurant problem provides a clean, mathematically tractable environment to study this exact dynamic. It lets us look under the hood of both human cognition and machine learning, showing us how to balance curiosity with utility."
How to Dine Like Feynman: A Practical Field Guide
For the modern traveler looking to put this academic breakthrough into practice, how should one navigate a dining schedule?
Based on the deciphered equations and the behavioral findings from the Princeton and UC Berkeley team, here is the scientifically validated, step-by-step guide to choosing where to eat on your next vacation.
YOUR NEXT VACATION FLOWCHART
How many nights are you staying in the city? (N)
|
Establish your local environment distribution:
* "Gems & Garbage" (Volatile) -> Set high initial bar
* "Uniformly Mediocre" -> Set modest initial bar
|
[ Night 1 ] -> Try a brand-new place.
|
===================[ NIGHTS 2 through N-1 ]===================
* Calculate your linear threshold for tonight:
t_n = (Remaining Nights / Total Nights) * (100 - Mean) + Mean
* Is your current favorite rating > t_n?
    [YES] Stop exploring! Eat at favorite for rest of trip.
    [NO]  Explore a new place. Update favorite. Move to next night.
==============================================================
|
[ Night N ] -> Return to your favorite if it beats the local average; otherwise gamble on a new place.
Step 1: Assess the Local Landscape
Before you unpack your bags, make a quick assessment of the food scene in your destination.
- High-Variance Scene (e.g., Tokyo or New York): If the city is famous for having both world-class dining and terrible tourist traps, prepare to be mathematically aggressive. Set your initial standards exceptionally high and don't settle too quickly.
- Low-Variance Scene (e.g., a small French village): If almost every bistro serves a highly consistent, good-but-not-transcendent meal, lower your exploration threshold. Once you find a solid place on night one or two, stick with it.
Step 2: Set a Decreasing Linear Standard
Do not try to calculate Feynman's non-linear decay curve in your head—your brain isn't wired for it, and the cognitive effort isn't worth the $0.9\%$ efficiency gain. Instead, use the Linear Heuristic.
- On your first night, always try somewhere new to establish a baseline.
- On subsequent nights, let your acceptable threshold for "settling" decline linearly with the proportion of your trip remaining.
- If you have a 10-day trip, and you find an $8.5/10$ restaurant on day two, check if it exceeds your day-two standard (which should be very high, around $9/10$). If it doesn't, force yourself to try a new spot on day three.
- By day eight, your standard should have dropped significantly (perhaps to a $6.5/10$). If your current favorite is better than that, stop exploring and eat there for the remaining three nights.
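The linear standard from the flowchart can be written as a small helper. This is a sketch on a 0 to 100 scale with an assumed local mean of 50; the function names `linear_threshold` and `decide` are hypothetical.

```python
def linear_threshold(remaining_nights, total_nights, mean=50.0, top=100.0):
    """Tonight's bar on a 0-100 scale: mean + (remaining / total) * (top - mean)."""
    return mean + (remaining_nights / total_nights) * (top - mean)

def decide(favorite, remaining_nights, total_nights, mean=50.0):
    """Return 'exploit' if the favorite clears tonight's bar, otherwise 'explore'."""
    bar = linear_threshold(remaining_nights, total_nights, mean)
    return "exploit" if favorite > bar else "explore"

for remaining in (9, 5, 3, 1):
    print(remaining, "nights left -> bar", linear_threshold(remaining, 10))

print(decide(85, 9, 10))   # an 8.5/10 favorite with nine nights left
print(decide(70, 3, 10))   # a 7/10 favorite with three nights left
```

With three nights left on a 10-night trip the bar works out to 65, in line with the day-eight figure above. With one night left the formula gives 55 rather than the mean, so the final night is better handled by comparing the favorite directly against the local average.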
Step 3: Settle on the Final Night
On your very last night, exploration has zero informational value because you cannot return to exploit anything new you discover. The only question is whether a random draw is likely to beat your favorite.
- If your current favorite is better than the local average, return to it.
- If your current favorite is mediocre or terrible, take a final, blind gamble on a new place.
The Legacy of a Scrap of Paper
When Richard Feynman passed away in 1988, he left behind a legacy defined by his contributions to quantum mechanics, the development of the atomic bomb, and his famous, clear-headed explanation of the Space Shuttle Challenger disaster. But to those who knew him personally, his true genius was his refusal to accept that any part of the human experience was too mundane for rigorous thought.
The deciphering of his restaurant math nearly fifty years after it was scribbled on a scrap of paper in Glendale is a testament to that legacy. It shows that the same mathematical principles that govern the decay of radioactive isotopes or the optimization of computational search engines are quietly at play every time we open a menu or walk down a new city street.
"Feynman looked at a dinner menu and saw the universe," says Brian Christian. "And what’s beautiful is that, half a century later, his math has shown us that our own messy, intuitive, human shortcuts are almost exactly as smart as his physics."
So the next time you find yourself paralyzed by a restaurant menu on vacation, remember that you don't need to be a Nobel laureate to make the perfect choice. Your brain is already running a simplified, highly elegant version of Feynman's math—all you have to do is trust the linear decline, take a deep breath, and order the ginger chicken.
What to Watch For Next
While the June 2026 PNAS paper provides a definitive mathematical resolution to the static restaurant problem, cognitive scientists are already looking toward more dynamic frontiers.
- The "Fatigue" Variable: The current model assumes a diner never gets tired of their favorite dish. Research is underway to incorporate "utility decay," modeling how the threshold changes when the satisfaction of eating the same ginger chicken drops with each consecutive visit.
- Social Coordination: How does the math change when a group of travelers with different individual preferences must agree on a single, shared dining strategy?
- Dynamic Environments: Re-calculating the thresholds for situations where restaurant quality is not static, but can change over time due to seasonal menus, rotating chefs, or shifting staff.
References:
- https://www.pnas.org/doi/10.1073/pnas.2509612123
- https://www.independent.co.uk/travel/news-and-advice/feynman-formula-best-holiday-restaurant-b2987704.html
- https://www.quora.com/What-is-Feynman-s-restaurant-problem
- https://www.sciencenews.org/article/math-restaurant-meal-feynman
- https://www.pnas.org/doi/abs/10.1073/pnas.2509612123?af=R
- https://www.theguardian.com/science/2026/jun/01/scientists-uncover-feynmans-formula-for-finding-best-holiday-restaurant
- https://www.perplexity.ai/page/75656076-f246-4f96-b428-a803d4af89b6
- https://brianchristian.org/research/
- https://www.bookey.app/book/algorithms-to-live-by-by-tom-griffiths
- https://3quarksdaily.com/3quarksdaily/2026/06/feynman-solved-restaurant-dilemma-50-years-ago-now-a-study-confirms-his-mathematics.html
- https://github.com/raffg/multi_armed_bandit
- https://en.wikipedia.org/wiki/Multi-armed_bandit
- https://www.youtube.com/shorts/3UhtCsrJwU4
- https://medium.com/data-science/feynmans-restaurant-problem-57121af0bb92