Algorithms to Live By: The Computer Science of Human Decisions

by Brian Christian and Tom Griffiths
August 13, 2025 · 115 min read
non-fiction, psychology

Page: 5, Location: 76

Note: C.


1   Optimal Stopping: When to Stop Looking

Page: 13, Location: 196-197

Note: C.


A 63% failure rate, when following the best possible strategy, is a sobering fact. Even when we act optimally in the secretary problem, we will still fail most of the time—that is, we won’t end up with the single best applicant in the pool. This is bad news for those of us who would frame romance as a search for “the one.”

Page: 19, Location: 288-291

Note: The one


but perhaps the most appropriate to Berezovsky’s case—with apologies to Russian oligarchs—is known as the “burglar problem.” In this problem, a burglar has the opportunity to carry out a sequence of robberies. Each robbery provides some reward, and there’s a chance of getting away with it each time. But if the burglar is caught, he gets arrested and loses all his accumulated gains. What algorithm should he follow to maximize his expected take?

Page: 33, Location: 498-501

Note: Burglar problem


Because for people there’s always a time cost. It doesn’t come from the design of the experiment. It comes from people’s lives. The “endogenous” time costs of searching, which aren’t usually captured by optimal stopping models, might thus provide an explanation for why human decision-making routinely diverges from the prescriptions of those models. As optimal stopping researcher Neil Bearden puts it, “After searching for a while, we humans just tend to get bored. It’s not irrational to get bored, but it’s hard to model that rigorously.”

Page: 36, Location: 547-551

Note: Time cost of searching


As such, the explicit premise of the optimal stopping problem is the implicit premise of what it is to be alive. It’s this that forces us to decide based on possibilities we’ve not yet seen, this that forces us to embrace high rates of failure even when acting optimally. No choice recurs. We may get similar choices again, but never that exact one. Hesitation—inaction—is just as irrevocable as action. What the motorist, locked on the one-way road, is to space, we are to the fourth dimension: we truly pass this way but once.

Page: 37, Location: 558-562

Note: Intuition And logic


2   Explore/Exploit: The Latest vs. the Greatest

Page: 39, Location: 595-597

Note: C.


Seizing a day and seizing a lifetime are two entirely different endeavors. We have the expression “Eat, drink, and be merry, for tomorrow we die,” but perhaps we should also have its inverse: “Start learning a new language or an instrument, and make small talk with a stranger, because life is long, and who knows what joy could blossom over many years’ time.” When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them.

Page: 43, Location: 655-659

Note: Inversion of seizing the day


So explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.

Page: 44, Location: 668-669

Note: The strategy


intransigence.”

Page: 47, Location: 711-712

Note: T.


For every slot machine we know little or nothing about, there is some guaranteed payout rate which, if offered to us in lieu of that machine, will make us quite content never to pull its handle again. This number—which Gittins called the “dynamic allocation index,” and which the world now knows as the Gittins index—suggests an obvious strategy on the casino floor: always play the arm with the highest index.

Page: 49, Location: 743-746

Note: Gittins index


The success of Upper Confidence Bound algorithms offers a formal justification for the benefit of the doubt. Following the advice of these algorithms, you should be excited to meet new people and try new things—to assume the best about them, in the absence of evidence to the contrary. In the long run, optimism is the best prevention for regret.

Page: 56, Location: 850-852

Note: T.


In some cases, it’s unlikely that any pair of users will have the exact same experience.

Page: 58, Location: 877-877

Note: A/B testing


In general, it seems that people tend to over-explore—to favor the new disproportionately over the best.

Page: 65, Location: 983-984

Note: Study result


What Carstensen and her colleagues found is that the shrinking of social networks with aging is due primarily to “pruning” peripheral relationships and focusing attention instead on a core of close friends and family members. This process seems to be a deliberate choice: as people approach the end of their lives, they want to focus more on the connections that are the most meaningful.

Page: 70, Location: 1065-1067

Note: Important point


Being sensitive to how much time you have left is exactly what the computer science of the explore/exploit dilemma suggests. We think of the young as stereotypically fickle; the old, stereotypically set in their ways. In fact, both are behaving completely appropriately with respect to their intervals. The deliberate honing of a social network down to the most meaningful relationships is the rational response to having less time to enjoy them.

Page: 71, Location: 1075-1078

Note: Age and people


Big-O: A Yardstick for the Worst Case

Page: 77, Location: 1170-1171

Note: T.


It gets worse from there. There’s “exponential time,” O(2ⁿ), where each additional guest doubles your work. Even worse is “factorial time,” O(n!), a class of problems so truly hellish that computer scientists only talk about it when they’re joking—as we were in imagining shuffling a deck until it’s sorted—or when they really, really wish they were.

Page: 80, Location: 1220-1224

Note: T.


Breaking the Quadratic Barrier: Divide and Conquer

Page: 82, Location: 1250-1250

Note: T.


In Bucket Sort, items are grouped together into a number of sorted categories, with no regard for finer, intracategory sorting; that can come later. (In computer science the term “bucket” simply refers to a chunk of unsorted data, but some of the most powerful real-world uses of Bucket Sort, as at the KCLS, take the name entirely literally.)

Page: 86, Location: 1316-1318

Note: Bucket Sort
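
A minimal Python sketch of the idea (my own illustration, not the book's; the value range and bucket count are assumed parameters):

```python
def bucket_sort(items, num_buckets=10, lo=0, hi=100):
    """Group items into ordered buckets; fine-grained sorting can come later."""
    buckets = [[] for _ in range(num_buckets)]
    width = (hi - lo) / num_buckets
    for item in items:
        index = min(int((item - lo) / width), num_buckets - 1)
        buckets[index].append(item)
    # Concatenating the buckets in order gives a roughly sorted list;
    # sort within each bucket only if and when full order is needed.
    return [x for bucket in buckets for x in sorted(bucket)]

print(bucket_sort([42, 7, 99, 18, 3, 64]))  # [3, 7, 18, 42, 64, 99]
```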


But in fact it isn’t Bubble Sort that emerges as the single best algorithm in the face of a noisy comparator. The winner of that particular honor is an algorithm called Comparison Counting Sort. In this algorithm, each item is compared to all the others, generating a tally of how many items it is bigger than. This number can then be used directly as the item’s rank. Since it compares all pairs, Comparison Counting Sort is a quadratic-time algorithm, like Bubble Sort. Thus it’s not a popular choice in traditional computer science applications, but it’s exceptionally fault-tolerant.

Page: 97, Location: 1477-1481

Note: Robustness is sometimes preferred over efficiency
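
That description translates almost directly into code; here's a toy version of mine:

```python
def comparison_counting_sort(items):
    """Compare every pair; each item's rank is how many items it beats."""
    n = len(items)
    result = [None] * n
    for i, item in enumerate(items):
        rank = sum(other < item for other in items)       # items it beats
        rank += sum(items[j] == item for j in range(i))   # break ties by position
        result[rank] = item
    return result

print(comparison_counting_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```

Since every pairwise comparison is made independently, one faulty comparison can shift an item's tally (and thus its rank) by at most one; that locality is where the fault tolerance comes from.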


Blood Sort: Pecking Orders and Dominance Hierarchies

Page: 98, Location: 1488-1489

Note: T.


Much as we bemoan the daily rat race, the fact that it’s a race rather than a fight is a key part of what sets us apart from the monkeys, the chickens—and, for that matter, the rats.

Page: 103, Location: 1573-1575

Note: Important Point


4   Caching: Forget About It

Page: 105, Location: 1608-1609

Note: C.


In an ideal world, they wrote, the machine would of course have limitless quantities of lightning-fast storage, but in practice this wasn’t possible. (It still isn’t.) Instead, the trio proposed what they believed to be the next best thing: “a hierarchy of memories, each of which has greater capacity than the preceding but which is less quickly accessible.” By having effectively a pyramid of different forms of memory—a small, fast memory and a large, slow one—maybe we could somehow get the best of both.

Page: 108, Location: 1643-1646

Note: Memory Hierarchy


if you want to store anything else, and in computer science this making of room is called “cache replacement” or “cache eviction.” As Wilkes wrote, “Since the [cache] can only be a fraction of the size of the main memory, words cannot be preserved in it indefinitely, and there must be wired into the system an algorithm by which they are progressively overwritten.” These algorithms are known as “replacement policies” or “eviction policies,” or simply as caching algorithms.

Page: 110, Location: 1681-1685

Note: Cache eviction
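
The eviction policy this chapter goes on to favor, Least Recently Used (LRU), evicts whatever has gone untouched the longest. A minimal sketch of mine, using Python's OrderedDict:

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-size cache that evicts the Least Recently Used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)   # touching a key makes it "recent"
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # touch "a", so "b" is now least recent
cache.put("c", 3)       # evicts "b"
print(cache.get("b"))   # None
```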


Turning the Library Inside Out

Page: 113, Location: 1728-1728

Note: T.


Here’s why: if temporal locality holds, then the rough-sorting shelves contain the most important books in the whole building. These are the books that were most recently used, so they are the ones that patrons are most likely to be looking for. It seems a crime that arguably the juiciest and most browseworthy shelf of the libraries’ miles of stacks is both hidden away and constantly eroded by earnest library staff just doing their jobs.

Page: 114, Location: 1742-1745

Note: Temporal Locality


A quarter of all Internet traffic at present is handled by a single corporation, one that manages to stay almost entirely out of the headlines. This Massachusetts-based company is called Akamai, and they’re in the caching business.

Page: 116, Location: 1766-1767

Note: Akamai


In short, the mathematics of self-organizing lists suggests something radical: the big pile of papers on your desk, far from being a guilt-inducing fester of chaos, is actually one of the most well-designed and efficient structures available. What might appear to others to be an unorganized mess is, in fact, a self-organizing mess. Tossing things back on the top of the pile is the very

Page: 123, Location: 1880-1882

Note: Noguchi filing system and LRU


The key to a good human memory then becomes the same as the key to a good computer cache: predicting which items are most likely to be wanted in the future.

Page: 125, Location: 1908-1909

Note: Practice and recall


Why don’t they make the whole plane out of that black box stuff? —STEVEN WRIGHT

Page: 127, Location: 1933-1934

Note: Wow


Recent work by a team of psychologists and linguists led by Michael Ramscar at the University of Tübingen has suggested that what we call “cognitive decline”—lags and retrieval errors—may not be about the search process slowing or deteriorating, but (at least partly) an unavoidable consequence of the amount of information we have to navigate getting bigger and bigger. Regardless of whatever other challenges aging brings, older brains—which must manage a greater store of memories—are literally solving harder computational problems with every passing day. The old can mock the young for their speed: “It’s because you don’t know anything yet!”

Page: 128, Location: 1955-1960

Note: Important


5   Scheduling: First Things First

Page: 130, Location: 1989-1991

Note: C.


Spending Time Becomes a Science

Page: 132, Location: 2010-2010

Note: T.


Priority Inversion and Precedence Constraints

Page: 141, Location: 2153-2153

Note: T.


The comedian Mitch Hedberg recounts a time when “I was at a casino, I was minding my own business, and this guy came up and said, ‘You’re gonna have to move, you’re blocking the fire exit.’ As though if there was a fire, I wasn’t gonna run.” The bouncer’s argument was priority inversion; Hedberg’s rebuttal was priority inheritance. Hedberg lounging casually in front of a fleeing mob puts his low-priority loitering ahead of their high-priority running for their lives—but not if he inherits their priority. And an onrushing mob has a way of making one inherit their priority rather quickly. As Hedberg explains, “If you’re flammable and have legs, you are never blocking a fire exit.”

Page: 142, Location: 2172-2177

Note: Priority inversion and inheritance


“Things which matter most must never be at the mercy of things which matter least,”

Page: 143, Location: 2181-2181

Note: Very important


When a certain task can’t be started until another one is finished, scheduling theorists call that a “precedence constraint.”

Page: 143, Location: 2183-2184

Note: Term


Drop Everything: Preemption and Uncertainty

Page: 146, Location: 2234-2234

Note: T.


Preemption Isn’t Free: The Context Switch

Page: 148, Location: 2265-2266

Note: T.


only. The truth is, there’s always overhead—time lost to metawork, to the logistics of bookkeeping and task management. This is one of the fundamental tradeoffs of scheduling. And the more you take on, the more overhead there is. At its nightmarish extreme, this turns into a phenomenon called thrashing.

Page: 150, Location: 2294-2296

Note: T.


But if a task requires keeping track of so many things that they won’t all fit into memory, then you might well end up spending more time swapping information in and out of memory than doing the actual work. What’s more, when you switch tasks, the newly active task might make space for its working set by evicting portions of other working sets from memory. The next task, upon reactivation, would then reacquire parts of its working set from the hard disk and muscle them back into memory, again displacing others. This problem—tasks stealing space from each other—can get even worse in systems with hierarchies of caches between the processor and the memory.

Page: 152, Location: 2319-2324

Note: Cache fighting


This is thrashing: a system running full-tilt and accomplishing nothing at all.

Page: 152, Location: 2327-2328

Note: Term. Thrashing


If you find yourself doing a lot of context switching because you’re tackling a heterogeneous collection of short tasks, you can also employ another idea from computer science: “interrupt coalescing.” If you have five credit card bills, for instance, don’t pay them as they arrive; take care of them all in one go when the fifth bill comes.

Page: 157, Location: 2397-2399

Note: Term.


6   Bayes’s Rule: Predicting the Future

Page: 161, Location: 2454-2455

Note: C.


Reasoning Backward with the Reverend Bayes

Page: 162, Location: 2475-2475

Note: T.


In fact, for any possible drawing of w winning tickets in n attempts, the expectation is simply the number of wins plus one, divided by the number of attempts plus two: (w+1)⁄(n+2).

Page: 166, Location: 2534-2537

Note: Laplace law


Laplace’s Law, and it is easy to apply in any situation where you need to assess the chances of an event based on its history. If you make ten attempts at something and five of them succeed, Laplace’s Law estimates your overall chances to be 6/12 or 50%, consistent with our intuitions. If you try only once and it works out, Laplace’s estimate of 2/3 is both more reasonable than assuming you’ll win every time, and more actionable than Price’s guidance (which would tell us that there is a 75% metaprobability of a 50% or greater chance of success).

Page: 166, Location: 2538-2541

Note: Important
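
The rule is a one-liner in code (the zero-data case is my extra example):

```python
def laplace_estimate(wins, attempts):
    """Laplace's Law: (w + 1) / (n + 2) after w wins in n attempts."""
    return (wins + 1) / (attempts + 2)

print(laplace_estimate(5, 10))  # 0.5    -- five successes in ten tries
print(laplace_estimate(1, 1))   # 0.667  -- one try, one success
print(laplace_estimate(0, 0))   # 0.5    -- no data at all: even odds
```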


Bayes’s Rule and Prior Beliefs

Page: 167, Location: 2551-2551

Note: T.


The mathematical formula that describes this relationship, tying together our previously held ideas and the evidence before our eyes, has come to be known—ironically, as the real heavy lifting was done by Laplace—as Bayes’s Rule. And it gives a remarkably straightforward solution to the problem of how to combine preexisting beliefs with observed evidence: multiply their probabilities together.

Page: 168, Location: 2568-2571

Note: Bayes rule


And if we assume that we’re arriving precisely halfway into something’s duration, the best guess we can make for how long it will last into the future becomes obvious: exactly as long as it’s lasted already. Gott saw the Berlin Wall eight years after it was built, so his best guess was that it would stand for eight years more. (It ended up being twenty.) This straightforward reasoning, which Gott named the Copernican Principle, results in a simple algorithm that can be used to make predictions about all sorts of topics.

Page: 170, Location: 2594-2598

Note: Copernician principle


When Bayes’s Rule combines all these probabilities—the more-probable short time spans pushing down the average forecast, the less-probable yet still possible long ones pushing it up—the Copernican Principle emerges: if we want to predict how long something will last, and have no other knowledge about it whatsoever, the best guess we can make is that it will continue just as long as it’s gone on so far.

Page: 172, Location: 2629-2632

Note: Important


The Copernican Principle seems reasonable exactly in those situations where we know nothing at all—such as looking at the Berlin Wall in 1969, when we’re not even sure what timescale is appropriate. And it feels completely wrong in those cases where we do know something about the subject matter. Predicting that a 90-year-old man will live to 180 years seems unreasonable precisely because we go into the problem already knowing a lot about human life spans—and so we can do better.

Page: 173, Location: 2639-2642

Note: Important


The richer the prior information we bring to Bayes’s Rule, the more useful the predictions we can get out of it.

Page: 173, Location: 2642-2643

Note: T.


Human life spans are clearly in the former category. They roughly follow what’s termed a “normal” distribution—also known as the “Gaussian” distribution, after the German mathematician Carl Friedrich Gauss, and informally called the “bell curve” for its characteristic shape.

Page: 173, Location: 2645-2647

Note: Term.


This kind of pattern typifies what are called “power-law distributions.” These are also known as “scale-free distributions” because they characterize quantities that can plausibly range over many scales: a town can have tens, hundreds, thousands, tens of thousands, hundreds of thousands, or millions of residents, so we can’t pin down a single value for how big a “normal” town should be.

Page: 174, Location: 2655-2658

Note: Term.


And for any power-law distribution, Bayes’s Rule indicates that the appropriate prediction strategy is a Multiplicative Rule: multiply the quantity observed so far by some constant factor. For an uninformative prior, that constant factor happens to be 2, hence the Copernican prediction; in other power-law cases, the multiplier will depend on the exact distribution you’re working with.

Page: 175, Location: 2677-2680

Note: Term. Important


When we apply Bayes’s Rule with a normal distribution as a prior, on the other hand, we obtain a very different kind of guidance. Instead of a multiplicative rule, we get an Average Rule: use the distribution’s “natural” average—its single, specific scale—as your guide.

Page: 176, Location: 2687-2689

Note: Term.


Between those two extremes, there’s actually a third category of things in life: those that are neither more nor less likely to end just because they’ve gone on for a while. Sometimes things are simply … invariant.

Page: 177, Location: 2700-2701

Note: Term.


The Danish mathematician Agner Krarup Erlang, who studied such phenomena, formalized the spread of intervals between independent events into the function that now carries his name: the Erlang distribution.

Page: 177, Location: 2701-2703

Note: Term.


The Erlang distribution gives us a third kind of prediction rule, the Additive Rule: always predict that things will go on just a constant amount longer.

Page: 177, Location: 2710-2711

Note: Term.


These three very different patterns of optimal prediction—the Multiplicative, Average, and Additive Rules—all result directly from applying Bayes’s Rule to the power-law, normal, and Erlang distributions, respectively.

Page: 178, Location: 2722-2723

Note: Important
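
A compact sketch of the three rules (the factor of 2 is the uninformative Copernican case; real multipliers and constants depend on the particular distribution, and the example numbers are mine):

```python
def multiplicative_rule(age, factor=2.0):
    # Power-law prior: predict a constant multiple of the age so far.
    return factor * age

def average_rule(age, natural_average):
    # Normal prior: predict the distribution's natural scale,
    # never less than what has already elapsed.
    return max(natural_average, age)

def additive_rule(age, constant):
    # Erlang prior: always predict a constant amount longer.
    return age + constant

print(multiplicative_rule(8))    # Berlin Wall at age 8 -> 16 years total
print(average_rule(90, 76))      # a 90-year-old -> about 90, not 180
print(additive_rule(10, 5))      # 15
```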


Our judgments betray our expectations, and our expectations betray our experience. What we project about the future reveals a lot—about the world we live in, and about our own past.

Page: 182, Location: 2779-2780

Note: Good point


Priors in the Age of Mechanical Reproduction

Page: 184, Location: 2810-2811

Note: T.


He is careful of what he reads, for that is what he will write. He is careful of what he learns, for that is what he will know. —ANNIE DILLARD

The best way to make good predictions, as Bayes’s Rule shows us, is to be accurately informed about the things you’re predicting.

Page: 184, Location: 2813-2816

Note: Important


There’s a curious tension, then, between communicating with others and maintaining accurate priors about the world. When people talk about what interests them—and offer stories they think their listeners will find interesting—it skews the statistics of our experience. That makes it hard to maintain appropriate prior distributions. And the challenge has only increased with the development of the printing press, the nightly news, and social media—innovations that allow our species to spread language mechanically.

Page: 185, Location: 2826-2830

Note: Base of availability bias


If you want to be a good intuitive Bayesian—if you want to naturally make good predictions, without having to think about what kind of prediction rule is appropriate—you need to protect your priors. Counterintuitively, that might mean turning off the news.

Page: 186, Location: 2837-2839

Note: Point


7   Overfitting: When to Think Less

Page: 187, Location: 2858-2860

Note: C.


If the study were repeated with different people, producing slight variations on the same essential pattern, the one- and two-factor models would remain more or less steady—but the nine-factor model would gyrate wildly from one instance of the study to the next. This is what statisticians call overfitting.

Page: 191, Location: 2927-2929

Note: Term.
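
A classic way to see this in code (my illustration, not the study in the text): fit polynomials of increasing degree to noisy points drawn from a simple line. The low-degree fit stays steady across repeats; the high-degree fit chases the noise and its predictions swing with every resampling.

```python
import numpy as np

for seed in (0, 1, 2):               # three "repeats of the study"
    rng = np.random.default_rng(seed)
    x = np.linspace(0, 1, 9)
    y = 2 * x + rng.normal(scale=0.2, size=x.size)  # truly linear, plus noise
    for degree in (1, 8):
        pred = np.polyval(np.polyfit(x, y, degree), 0.5)
        print(f"seed {seed}, degree {degree}: f(0.5) ~ {pred:.3f}")
```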


The Idolatry of Data

Page: 192, Location: 2932-2932

Note: T.


we had copious data, drawn from a perfectly representative sample, completely mistake-free, and representing exactly what we’re trying to evaluate, then using the most complex model available would indeed be the best approach. But if we try to perfectly fit our model to the data when any of these factors fails to hold, we risk overfitting.

Page: 192, Location: 2932-2935

Note: Imp.


How to Combat Overfitting: Penalizing Complexity

Page: 198, Location: 3028-3028

Note: T.


Occam’s razor principle, which suggests that, all things being equal, the simplest possible hypothesis is probably the correct one.

Page: 198, Location: 3034-3035

Note: Term.


Computer scientists refer to this principle—using constraints that penalize models for their complexity—as Regularization.

Page: 199, Location: 3038-3039

Note: Term.


Language forms yet another natural Lasso: complexity is punished by the labor of speaking at greater length and the taxing of our listener’s attention span. Business plans get compressed to an elevator pitch; life advice becomes proverbial wisdom only if it is sufficiently concise and catchy. And anything that needs to be remembered has to pass through the inherent Lasso of memory.

Page: 200, Location: 3057-3059

Note: Imp.


As a species, being constrained by the past makes us less perfectly adjusted to the present we know but helps keep us robust for the future we don’t.

Page: 204, Location: 3115-3116

Note: Imp.


8   Relaxation: Let It Slide

Page: 208, Location: 3185-3186

Note: C.


This is an instance of what’s known to mathematicians and computer scientists as a “constrained optimization” problem: how to find the single best arrangement of a set of variables, given particular rules and a scorekeeping measure.

Page: 211, Location: 3224-3225

Note: Term.


But like the secretary problem, it emerged in the mid-twentieth century, a period unmistakably evoked by its canonical name: “the traveling salesman problem.”

Page: 211, Location: 3227-3228

Note: Term.


One of the simplest forms of relaxation in computer science is known as Constraint Relaxation. In this technique, researchers remove some of the problem’s constraints and set about solving the problem they wish they had. Then, after they’ve made a certain amount of headway, they try to add the constraints back in. That is, they make the problem temporarily easier to handle before bringing it back to reality.

Page: 214, Location: 3271-3274

Note: Term.


Just a Speeding Ticket: Lagrangian Relaxation

Page: 218, Location: 3341-3341

Note: T.


When an optimization problem’s constraints say “Do it, or else!,” Lagrangian Relaxation replies, “Or else what?” Once we can color outside the lines—even just a little bit, and even at a steep cost—problems become tractable that weren’t tractable before.

Page: 219, Location: 3353-3355

Note: Term.


There are many ways to relax a problem, and we’ve seen three of the most important. The first, Constraint Relaxation, simply removes some constraints altogether and makes progress on a looser form of the problem before coming back to reality. The second, Continuous Relaxation, turns discrete or binary choices into continua: when deciding between iced tea and lemonade, first imagine a 50–50 “Arnold Palmer” blend and then round it up or down. The third, Lagrangian Relaxation, turns impossibilities into mere penalties, teaching the art of bending the rules (or breaking them and accepting the consequences). A rock band deciding which songs to cram into a limited set, for instance, is up against what computer scientists call the “knapsack problem”—a puzzle that asks one to decide which of a set of items of different bulk and importance to pack into a confined volume. In its strict formulation the knapsack problem is famously intractable, but that needn’t discourage our relaxed rock stars. As demonstrated in several celebrated examples, sometimes it’s better to simply play a bit past the city curfew and incur the related fines than to limit the show to the available slot. In fact, even when you don’t commit the infraction, simply imagining it can be illuminating.

Page: 222, Location: 3396-3404

Note: Relaxation types


F. Scott Fitzgerald once wrote that “the test of a first-rate intelligence is the ability to hold two opposing ideas in mind at the same time and still retain the ability to function.”

Page: 227, Location: 3473-3474

Note: Quote.


Ulam developed the idea further with John von Neumann, and worked with Nicholas Metropolis, another of the physicists from the Manhattan Project, on implementing the method on the Los Alamos computer. Metropolis named this approach—replacing exhaustive probability calculations with sample simulations—the Monte Carlo Method, after the Monte Carlo casino in Monaco, a place equally dependent on the vagaries of chance.

Page: 228, Location: 3490-3493

Note: Term.


For millennia, the study of prime numbers was believed to be, as G. H. Hardy put it, “one of the most obviously useless branches” of mathematics.

Page: 230, Location: 3514-3515

Note: Hardy


what if we only needed to be mostly sure this URL was new to us? That’s where the Bloom filter comes in. Named for its inventor, Burton H. Bloom, a Bloom filter works much like the Rabin-Miller primality test: the URL is entered into a set of equations that essentially check for “witnesses” to its novelty. (Rather than proclaim “n is not prime,” these equations say “I have not seen n before.”) If you’re willing to tolerate an error rate of just 1% or 2%, storing your findings in a probabilistic data structure like a Bloom filter will save you significant amounts of both time and space. And the usefulness of such filters is not confined to search engines: Bloom filters have shipped with a number of recent web browsers to check URLs against a list of known malicious websites, and they are also an important part of cryptocurrencies like Bitcoin.

Page: 239, Location: 3654-3661

Note: Term. Bloom filter
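
A toy Bloom filter sketch (the hashing scheme is my own choice): k hash functions set bits in an m-bit array, and a lookup answers either "definitely new" or "probably seen before."

```python
import hashlib

class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = bytearray(m)   # m one-bit "witness" slots

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False: definitely never added. True: probably added (a small
        # false-positive rate, tunable via m and k).
        return all(self.bits[pos] for pos in self._positions(item))

seen = BloomFilter()
seen.add("http://example.com/")
print(seen.might_contain("http://example.com/"))   # True
print(seen.might_contain("http://example.org/"))   # almost surely False
```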


This is an algorithm known as Hill Climbing—since the search through a space of solutions, some better and some worse, is commonly thought of in terms of a landscape with hills and valleys, where your goal is to reach the highest peak.

Page: 241, Location: 3685-3687

Note: Term.


for instance, the physics of annealing, the way that materials change state as they are heated and cooled.

Page: 243, Location: 3724-3725

Note: Term.


Over the next few decades, that paper would be cited a whopping thirty-two thousand times. To this day, simulated annealing remains one of the most promising approaches to optimization problems known to the field.

Page: 245, Location: 3756-3757

Note: Term.
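
A minimal simulated-annealing sketch (the objective function and cooling schedule are stand-ins of mine): hill climbing, but with downhill moves accepted at a probability that shrinks as the temperature cools.

```python
import math
import random

def simulated_annealing(f, x, steps=10_000, temp=1.0, cooling=0.999):
    """Maximize f, escaping local peaks via occasional random setbacks."""
    best_x = x
    for _ in range(steps):
        candidate = x + random.uniform(-1, 1)
        delta = f(candidate) - f(x)
        # Accept improvements always; accept setbacks with probability
        # exp(delta / temp), which falls toward zero as temp cools.
        if delta > 0 or random.random() < math.exp(delta / temp):
            x = candidate
            if f(x) > f(best_x):
                best_x = x
        temp *= cooling
    return best_x

bumpy = lambda x: math.sin(5 * x) - 0.1 * x * x    # many local maxima
print(simulated_annealing(bumpy, x=4.0))           # often near x = 0.31, the global peak
```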


it’s a common enough phenomenon that a word was invented to capture it: in 1754, Horace Walpole coined the term “serendipity,” based on the fairy tale adventures of The Three Princes of Serendip (Serendip being the archaic name of Sri Lanka), who “were always making discoveries, by accidents and sagacity, of things they were not in quest of.”

Page: 247, Location: 3780-3782

Note: Word.


Being randomly jittered, thrown out of the frame and focused on a larger scale, provides a way to leave what might be locally good and get back to the pursuit of what might be globally optimal.

Page: 249, Location: 3816-3817

Note: Impo.


Protocol is how we get on the same page; in fact, the word is rooted in the Greek protokollon, “first glue,” which referred to the outer page attached to a book or manuscript.

Page: 255, Location: 3901-3902

Note: Term.


Exponential Backoff: The Algorithm of Forgiveness

Page: 262, Location: 4017-4018

Note: T.


Since the maximum delay length (2, 4, 8, 16…) forms an exponential progression, it’s become known as Exponential Backoff.

Page: 264, Location: 4047-4048

Note: Term.
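
A sketch of that pattern as retry logic (the `send` callable and the slot length are placeholders of mine):

```python
import random
import time

def send_with_backoff(send, max_tries=8, slot=0.01):
    """Retry with Exponential Backoff: the delay ceiling doubles per failure."""
    for attempt in range(max_tries):
        if send():
            return True
        # Wait a random delay of up to 2, 4, 8, 16, ... slots before retrying.
        time.sleep(random.uniform(0, 2 ** (attempt + 1)) * slot)
    return False

print(send_with_backoff(lambda: random.random() < 0.3))  # a flaky "send"
```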


Flow Control and Congestion Avoidance

Page: 267, Location: 4087-4088

Note: T.


Packet switching is radically different. The phone system gets full; the mail system gets slow. There’s nothing in the network to explicitly tell a sender how many other senders there are, or how congested the network is at any given moment, and the amount of congestion is constantly changing. Therefore, the sender and receiver must not only communicate but metacommunicate: they need to figure out how fast the data should be sent. Somehow, assorted packet flows—without explicit management or coordination—must both get out of each other’s way and quickly take advantage of any newly available space.

Page: 268, Location: 4101-4105

Note: Imp.


Backchannels: Flow Control in Linguistics

Page: 272, Location: 4160-4160

Note: T.


When a networking buffer fills up, what typically happens is called Tail Drop: an unceremonious way of saying that every packet arriving after that point is simply rejected, and effectively deleted. (Turning new customers away from the crêpe stand once the line gets too long would be a version of Tail Drop in a human context.) Given the postal metaphor for packet switching,

Page: 276, Location: 4229-4231

Note: Term.


The most prevalent critique of modern communications is that we are “always connected.” But the problem isn’t that we’re always connected; we’re not. The problem is that we’re always buffered. The difference is enormous.

Page: 279, Location: 4264-4266

Note: Imp.


The feeling that one needs to look at everything on the Internet, or read all possible books, or see all possible shows, is bufferbloat.

Page: 279, Location: 4266-4267

Note: Yolo


There’s also a proposal for a new backchannel for TCP, the first such modification in many years: Explicit Congestion Notification, or ECN. Fully extricating the Internet from bufferbloat will draw on all of these changes and require the patience of many years. “This is a long-term swamp,” says Gettys.

Page: 280, Location: 4289-4291

Note: Term.


11  Game Theory: The Minds of Others

Page: 282, Location: 4314-4315

Note: C.


Computer science illustrates the fundamental limitations of this kind of reasoning with what’s called the “halting problem.”

Page: 284, Location: 4349-4350

Note: Term.


Dominant Strategies, for Better or Worse

Page: 290, Location: 4433-4433

Note: T.


In fact, this makes defection not merely the equilibrium strategy but what’s known as a dominant strategy. A dominant strategy avoids recursion altogether, by being the best response to all of your opponent’s possible strategies—so you don’t even need to trouble yourself getting inside their head at all. A dominant strategy is a powerful thing.

Page: 291, Location: 4448-4451

Note: Term.


But from a congestion standpoint, the fact that anarchy is only 4/3 as congested as perfect coordination means that perfectly coordinated commutes will only be 3/4 as congested as they are now. It’s a bit like the famous line by James Branch Cabell: “The optimist proclaims that we live in the best of all possible worlds; and the pessimist fears this is true.” Congestion will always be a problem solvable more by planners and by overall demand than by the decisions of individual drivers, human or computer, selfish or cooperative.

Page: 292, Location: 4473-4477

Note: Imp.


The Tragedy of the Commons

Page: 293, Location: 4482-4482

Note: T.


In theory, all the villagers should graze only as many animals as would leave some grass for everyone. In practice, though, the benefits of grazing a little bit more than that accrue directly to you, while the harms seem too small to be of consequence. Yet if everyone follows this logic of using just slightly more of the commons than they should, a dreadful equilibrium results: a completely devastated lawn, and no grass for anyone’s livestock thereafter. Hardin called this the “tragedy of the commons,”

Page: 293, Location: 4485-4489

Note: Tragedy of commons


Mechanism Design: Change the Game

Page: 295, Location: 4523-4523

Note: T.


If the forest could only somehow agree to a kind of truce, the ecosystem could enjoy the photosynthetic bounty without the wood-making arms race wasting it all. But as we’ve seen, good outcomes in these scenarios tend only to arise in the context of an authority outside the game—someone changing the payoffs from the top down. It would seem as though in nature, then, there is simply no way of establishing good equilibria between individuals.

Page: 300, Location: 4591-4594

Note: Imp.


Just as with the tragedy of the commons, this failure is not necessarily the players’ fault. An enormously influential paper by the economists Sushil Bikhchandani, David Hirshleifer, and Ivo Welch has demonstrated that under the right circumstances, a group of agents who are all behaving perfectly rationally and perfectly appropriately can nonetheless fall prey to what is effectively infinite misinformation. This has come to be known as an “information cascade.”

Page: 307, Location: 4702-4705

Note: Term.


Named for Nobel Prize–winning economist William Vickrey, the Vickrey auction, just like the first-price auction, is a “sealed bid” auction process. That is, every participant simply writes down a single number in secret, and the highest bidder wins. However, in a Vickrey auction, the winner ends up paying not the amount of their own bid, but that of the second-place bidder. That is to say, if you bid $25 and I bid $10, you win the item at my price: you only have to pay $10.

Page: 311, Location: 4755-4758

Note: Term.
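
The mechanism in miniature (the bidder names and amounts echo the example above):

```python
def vickrey_auction(bids):
    """Sealed-bid second-price auction: the highest bidder wins,
    but pays the second-highest bid."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

print(vickrey_auction({"you": 25, "me": 10}))  # ('you', 10)
```

Because the price you pay is set by the other bidders rather than by your own bid, shading your bid gains you nothing; bidding your true valuation is the safe strategy.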


In fact, a game-theoretic principle called “revenue equivalence” establishes that over time, the average expected sale price in a first-price auction will converge to precisely the same as in a Vickrey auction. Thus the Vickrey equilibrium involves the same bidder winning the item for the same price—without any strategizing by any of the bidders whatsoever. As Tim Roughgarden tells his Stanford students, the Vickrey auction is “awesome.”

Page: 312, Location: 4771-4774

Note: Term.


In fact, the lesson here goes far beyond auctions. In a landmark finding called the “revelation principle,” Nobel laureate Roger Myerson proved that any game that requires strategically masking the truth can be transformed into a game that requires nothing but simple honesty.

Page: 312, Location: 4779-4781

Note: Term.


Introduction

Page: 5, Location: 76-76


How are you to know that an apartment is indeed the best unless you have a baseline to judge it by? And how are you to establish that baseline unless you look at (and lose) a number of apartments? The more information you gather, the better you’ll know the right opportunity when you see it—but the more likely you are to have already passed it by.

Page: 6, Location: 86-89


Thirty-seven percent. If you want the best odds of getting the best apartment, spend 37% of your apartment hunt (eleven days, if you’ve given yourself a month for the search) noncommittally exploring options. Leave the checkbook at home; you’re just calibrating. But after that point, be prepared to immediately commit—deposit and all—to the very first place you see that beats whatever you’ve already seen. This is not merely an intuitively satisfying compromise between looking and leaping. It is the provably optimal solution.

Page: 7, Location: 94-98


But an algorithm is just a finite sequence of steps used to solve a problem, and algorithms are much broader—and older by far—than the computer. Long before algorithms were ever used by machines, they were used by people.

Page: 8, Location: 122-123


The word “algorithm” comes from the name of Persian mathematician al-Khwārizmī, author of a ninth-century book of techniques for doing mathematics by hand. (His book was called al-Jabr wa’l-Muqābala—and the “al-jabr” of the title in turn provides the source of our word “algebra.”)

Page: 9, Location: 123-126


Optimal stopping tells us when to look and when to leap. The explore/exploit tradeoff tells us how to find the balance between trying new things and enjoying our favorites. Sorting theory tells us how (and whether) to arrange our offices. Caching theory tells us how to fill our closets. Scheduling theory tells us how to fill our time.

Page: 9, Location: 133-136


As Carl Sagan put it, “Science is a way of thinking much more than it is a body of knowledge.”

Page: 9, Location: 137-137


Alan Turing defined the very notion of computation by an analogy to a human mathematician who carefully works through the steps of a lengthy calculation, yielding an unmistakably right answer.

Page: 10, Location: 147-149


Though all Christians start a wedding invitation by solemnly declaring their marriage is due to special Divine arrangement, I, as a philosopher, would like to talk in greater detail about this … —JOHANNES KEPLER

Page: 13, Location: 197-199


It’s such a common phenomenon that college guidance counselors even have a slang term for it: the “turkey drop.” High-school sweethearts come home for Thanksgiving of their freshman year of college and, four days later, return to campus single.

Page: 14, Location: 202-204


The nature of serial monogamy, writ large, is that its practitioners are confronted with a fundamental, unavoidable problem. When have you met enough people to know who your best match is? And what if acquiring the data costs you that very match? It seems the ultimate Catch-22 of the heart.

Page: 14, Location: 208-210


In any optimal stopping problem, the crucial dilemma is not which option to pick, but how many options to even consider. These problems turn out to have implications not only for lovers and renters, but also for drivers, homeowners, burglars, and beyond.

Page: 14, Location: 212-215


Whence 37%? In your search for a secretary, there are two ways you can fail: stopping early and stopping late. When you stop too early, you leave the best applicant undiscovered. When you stop too late, you hold out for a better applicant who doesn’t exist. The optimal strategy will clearly require finding the right balance between the two, walking the tightrope between looking too much and not enough.

Page: 17, Location: 249-252


As a result, best-yet applicants will become steadily more impressive as the search continues (by definition, again, they’re better than all those who came before)—but they will also become more and more infrequent.

Page: 17, Location: 257-258


the optimal solution takes the form of what we’ll call the Look-Then-Leap Rule: You set a predetermined amount of time for “looking”—that is, exploring your options, gathering data—in which you categorically don’t choose anyone, no matter how impressive. After that point, you enter the “leap” phase, prepared to instantly commit to anyone who outshines the best applicant you saw in the look phase.

Page: 18, Location: 264-267


As the applicant pool grows, the exact place to draw the line between looking and leaping settles to 37% of the pool, yielding the 37% Rule: look at the first 37% of the applicants, choosing none, then be ready to leap for anyone better than all those you’ve seen so far.

Page: 19, Location: 280-283


As it turns out, following this optimal strategy ultimately gives us a 37% chance of hiring the best applicant; it’s one of the problem’s curious mathematical symmetries that the strategy itself and its chance of success work out to the very same number.

Page: 19, Location: 285-286
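
That symmetry is easy to check by simulation (a quick script of mine, not from the book):

```python
import random

def look_then_leap(n=100, look_fraction=0.37, trials=100_000):
    """Estimate how often the 37% Rule lands the single best applicant."""
    cutoff, wins = int(n * look_fraction), 0
    for _ in range(trials):
        applicants = random.sample(range(n), n)   # quality ranks, shuffled
        benchmark = max(applicants[:cutoff])      # best seen while "looking"
        chosen = next((a for a in applicants[cutoff:] if a > benchmark),
                      applicants[-1])             # else forced to take the last
        wins += (chosen == n - 1)
    return wins / trials

print(look_then_leap())  # approximately 0.37
```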


It’s true that you’re unlikely to find the needle the majority of the time, but optimal stopping is your best defense against the haystack, no matter how large.

Page: 20, Location: 295-297


I married the first man I ever kissed. When I tell this to my children they just about throw up. —BARBARA BUSH

Page: 20, Location: 299-301


Both Kepler and Trick—in opposite ways—experienced firsthand some of the ways that the secretary problem oversimplifies the search for love.

Page: 21, Location: 320-321


In the classical secretary problem, applicants always accept the position, preventing the rejection experienced by Trick. And they cannot be “recalled” once passed over, contrary to the strategy followed by Kepler.

Page: 21, Location: 321-323


For example, assume an immediate proposal is a sure thing but belated proposals are rejected half the time. Then the math says you should keep looking noncommittally until you’ve seen 61% of applicants, and then only leap if someone in the remaining 39% of the pool proves to be the best yet. If you’re still single after considering all the possibilities—as Kepler was—then go back to the best one that got away. The symmetry between strategy and outcome holds in this case once again, with your chances of ending up with the best applicant under this second-chances-allowed scenario also being 61%.

Page: 22, Location: 335-339


It’s this fact that gives rise to the unavoidable “look” phase, in which we risk passing up a superb early applicant while we calibrate our expectations and standards. Mathematicians refer to this genre of optimal stopping problems as “no-information games.”

Page: 23, Location: 349-350


We can instead use the Threshold Rule, where we immediately accept an applicant if she is above a certain percentile. We don’t need to look at an initial group of candidates to set this threshold—but we do, however, need to be keenly aware of how much looking remains available.

Page: 24, Location: 363-365


The math shows that when there are a lot of applicants left in the pool, you should pass up even a very good applicant in the hopes of finding someone still better than that—but as your options dwindle, you should be prepared to hire anyone who’s simply better than average. It’s a familiar, if not exactly inspiring, message: in the face of slim pickings, lower your standards. It also makes clear the converse: with more fish in the sea, raise them. In both cases, crucially, the math tells you exactly by how much.

Page: 24, Location: 366-369


being more choosy the more applicants are left.

Page: 25, Location: 374-374


The chance of ending up with the single best applicant in this full-information version of the secretary problem comes to 58%—still far from a guarantee, but considerably better than the 37% success rate offered by the 37% Rule in the no-information game. If you have all the facts, you can succeed more often than not, even as the applicant pool grows arbitrarily large.

Page: 25, Location: 376-378


The full-information game thus offers an unexpected and somewhat bizarre takeaway. Gold digging is more likely to succeed than a quest for love. If you’re evaluating your partners based on any kind of objective criterion—say, their income percentile—then you’ve got a lot more information at your disposal than if you’re after a nebulous emotional response (“love”) that might require both experience and comparison to calibrate.

Page: 25, Location: 380-384


But if neither concern leads us to believe that our backs are against the wall, then we can simply focus on a cost-benefit analysis of the waiting game.

Page: 27, Location: 405-406


First, if the cost of waiting is trivial, we’re able to be almost infinitely choosy.

Page: 27, Location: 414-415


Finally, if waiting costs half or more of our expected range of offers—in this case, $50,000—then there’s no advantage whatsoever to holding out; we’ll do best by taking the very first offer that comes along and calling it done. Beggars can’t be choosers.

Page: 28, Location: 417-419


The critical thing to note in this problem is that our threshold depends only on the cost of search. Since the chances of the next offer being a good one—and the cost of finding out—never change, our stopping price has no reason to ever get lower as the search goes on, regardless of our luck. We set it once, before we even begin, and then we quite simply hold fast.

Page: 28, Location: 421-424


But in house selling and job hunting, even if it’s possible to reconsider an earlier offer, and even if that offer is guaranteed to still be on the table, you should nonetheless never do so. If it wasn’t above your threshold then, it won’t be above your threshold now. What you’ve paid to keep searching is a sunk cost. Don’t compromise, don’t second-guess. And don’t look back.

Page: 29, Location: 433-436


The key impact that occupancy rate has on parking strategy becomes clear once we recognize that parking is an optimal stopping problem. As you drive along the street, every time you see the occasional empty spot you have to make a decision: should you take this spot, or go a little closer to your destination and try your luck?

Page: 31, Location: 462-464


When to Quit

Page: 32, Location: 483-483


the number of robberies you should carry out is roughly equal to the chance you get away, divided by the chance you get caught. If you’re a skilled burglar and have a 90% chance of pulling off each robbery (and a 10% chance of losing it all), then retire after 90/10 = 9 robberies. A ham-fisted amateur with a 50/50 chance of success? The first time you have nothing to lose, but don’t push your luck more than once.

Page: 33, Location: 503-506
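
The rule can be sanity-checked directly: with per-robbery success probability p, the expected haul from planning n robberies is n·pⁿ (you keep the loot only if you are never caught), and that product peaks at roughly n = p/(1 − p). A quick check of mine:

```python
def expected_take(p, n, reward=1.0):
    # Keep n rewards only if all n attempts succeed; one arrest loses all.
    return n * reward * p ** n

p = 0.9
for n in (5, 9, 15):
    print(n, round(expected_take(p, n), 3))
# n = 9 (= 0.9 / 0.1) comes out on top, matching the rule above.
```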


Always Be Stopping

Page: 35, Location: 524-525


So the irresistible question is whether—by evolution or education or intuition—we actually do follow the best strategies. At first glance, the answer is no. About a dozen studies have produced the same result: people tend to stop early, leaving better applicants unseen.

Page: 35, Location: 531-533


But in practice, when the clock—or the ticker—is ticking, few aspects of decision-making (or of thinking more generally) are as important as this one: when to stop.

Page: 37, Location: 563-564


Every day we are constantly forced to make decisions between options that differ in a very specific dimension: do we try new things or stick with our favorite ones? We intuitively understand that life is a balance between novelty and tradition, between the latest and the greatest, between taking risks and savoring what we know and love.

Page: 40, Location: 602-604


But the reality is not so simple. Remembering that every “best” song and restaurant among your favorites began humbly as something merely “new” to you is a reminder that there may be yet-unknown bests still out there—and thus that the new is indeed worthy of at least some of our attention.

Page: 40, Location: 608-610


Computer scientists have been working on finding this balance for more than fifty years. They even have a name for it: the explore/exploit tradeoff.

Page: 40, Location: 613-614


Explore/Exploit

Page: 41, Location: 615-615


Simply put, exploration is gathering information, and exploitation is using the information you have to get a known good result.

Page: 41, Location: 616-618


What’s more, exploration can be a curse. Part of what’s nice about music, for instance, is that there are constantly new things to listen to. Or, if you’re a music journalist, part of what’s terrible about music is that there are constantly new things to listen to. Being a music journalist means turning the exploration dial all the way to 11, where it’s nothing but new things all the time.

Page: 41, Location: 622-625


Journalists are martyrs, exploring so that others may exploit.

Page: 42, Location: 630-631


In computer science, the tension between exploration and exploitation takes its most concrete form in a scenario called the “multi-armed bandit problem.” The odd name comes from the colloquial term for a casino slot machine, the “one-armed bandit.”

Page: 42, Location: 631-633


People tend to treat decisions in isolation, to focus on finding each time the outcome with the highest expected value. But decisions are almost never isolated, and expected value isn’t the end of the story. If you’re thinking not just about the next decision, but about all the decisions you are going to make about the same options in the future, the explore/exploit tradeoff is crucial to the process.

Page: 43, Location: 647-650


Seize the Interval

Page: 43, Location: 653-653


A sobering property of trying new things is that the value of exploration, of finding a new favorite, can only go down over time, as the remaining opportunities to savor it dwindle. Discovering an enchanting café on your last night in town doesn’t give you the opportunity to return.

Page: 44, Location: 664-666


Win-Stay

Page: 45, Location: 687-687


Robbins specifically considered the case where there are exactly two slot machines, and proposed a solution called the Win-Stay, Lose-Shift algorithm: choose an arm at random, and keep pulling it as long as it keeps paying off. If the arm doesn’t pay off after a particular pull, then switch to the other one. Although this simple strategy is far from a complete solution, Robbins proved in 1952 that it performs reliably better than chance.

Page: 46, Location: 692-695
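
A sketch of Robbins's scheme for two machines (the payoff probabilities are made up by me):

```python
import random

def win_stay_lose_shift(payoffs=(0.6, 0.4), pulls=100_000):
    """Keep pulling an arm while it pays; switch arms after any loss."""
    arm = random.randrange(2)
    total = 0
    for _ in range(pulls):
        win = random.random() < payoffs[arm]
        total += win
        if not win:
            arm = 1 - arm   # shift to the other machine
    return total / pulls

print(win_stay_lose_shift())  # about 0.52, reliably better than chance (0.50)
```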


Good options shouldn’t be penalized too strongly for being imperfect.

Page: 46, Location: 700-701


For these reasons, the multi-armed bandit problem effectively stayed unsolved. In Whittle’s words, “it quickly became a classic, and a byword for intransigence.”

Page: 47, Location: 710-712


The Gittins Index

Page: 47, Location: 712-712


Economists refer to this idea, of valuing the present more highly than the future, as “discounting.”

Page: 48, Location: 726-726


The Gittins index, then, provides a formal, rigorous justification for preferring the unknown, provided we have some opportunity to exploit the results of what we learn from exploring.

Page: 51, Location: 777-778


The old adage tells us that “the grass is always greener on the other side of the fence,” but the math tells us why: the unknown has a chance of being better, even if we actually expect it to be no different, or if it’s just as likely to be worse. The untested rookie is worth more (early in the season, anyway) than the veteran of seemingly equal ability, precisely because we know less about him. Exploration in itself has value, since trying new things increases our chances of finding the best. So taking the future into account, rather than focusing just on the present, drives us toward novelty.

Page: 51, Location: 778-782


Regret and Optimism

Page: 52, Location: 795-795


For myself I am an optimist. It does not seem to be much use being anything else. —WINSTON CHURCHILL

Page: 52, Location: 797-798


In the memorable words of management theorist Chester Barnard, “To try and fail is at least to learn; to fail to try is to suffer the inestimable loss of what might have been.”

Page: 53, Location: 802-803


Regret is the result of comparing what we actually did with what would have been best in hindsight.

Page: 53, Location: 813-813


In general we can’t realistically expect someday to never have any more regrets. But if we’re following a regret-minimizing algorithm, every year we can expect to have fewer new regrets than we did the year before.

Page: 54, Location: 825-827


Of the ones they’ve discovered, the most popular are known as Upper Confidence Bound algorithms.

Page: 54, Location: 828-829


In a multi-armed bandit problem, an Upper Confidence Bound algorithm says, quite simply, to pick the option for which the top of the confidence interval is highest.

Page: 55, Location: 833-834
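
One well-known member of that family is UCB1, whose bound is the arm's observed mean plus an uncertainty bonus that shrinks as the arm gets pulled. A sketch:

```python
import math

def ucb1_choice(counts, rewards, t):
    """Pick the arm with the highest upper confidence bound (UCB1).

    counts[i]: pulls of arm i so far; rewards[i]: its total payoff;
    t: total pulls so far (t >= 1). Untried arms score infinity,
    so every arm gets the benefit of the doubt at least once.
    """
    def bound(i):
        if counts[i] == 0:
            return math.inf
        mean = rewards[i] / counts[i]
        return mean + math.sqrt(2 * math.log(t) / counts[i])
    return max(range(len(counts)), key=bound)

# The barely-tried arm wins despite a lower observed mean: optimism in action.
print(ucb1_choice(counts=[10, 2], rewards=[6.0, 1.0], t=12))  # 1
```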


Bandits Online

Page: 56, Location: 852-852


Clinical Trials on Trial

Page: 59, Location: 900-900


Clinical testing of new drugs and treatments, the report notes, often requires risking harm to some patients, even if steps are taken to minimize that risk.

Page: 60, Location: 914-915


The widespread difficulty with accepting results from adaptive clinical trials might seem incomprehensible. But consider that part of what the advent of statistics did for medicine, at the start of the twentieth century, was to transform it from a field in which doctors had to persuade each other in ad hoc ways about every new treatment into one where they had clear guidelines about what sorts of evidence were and were not persuasive. Changes to accepted standard statistical practice have the potential to upset this balance, at least temporarily.

Page: 64, Location: 970-973


The Restless World

Page: 64, Location: 979-979


In that case, people chose to observe 505 times, on average, placing bets the other 495 times. But the math says they should have started to bet after just 38 observations—leaving 962 chances to cash in.

Page: 65, Location: 991-992


One of the curious things about human beings, which any developmental psychologist aspires to understand and explain, is that we take years to become competent and autonomous. Caribou and gazelles must be prepared to run from predators the day they’re born, but humans take more than a year to make their first steps.

Page: 68, Location: 1031-1033


What we take to be the caprice of children may be wiser than we know.

Page: 69, Location: 1051-1051


I had reached a juncture in my reading life that is familiar to those who have been there: in the allotted time left to me on earth, should I read more and more new books, or should I cease with that vain consumption—vain because it is endless—and begin to reread those books that had given me the intensest pleasure in my past. —LYDIA DAVIS

Page: 69, Location: 1052-1055


In an experiment testing this hypothesis, Carstensen and her collaborator Barbara Fredrickson asked people to choose who they’d rather spend thirty minutes with: an immediate family member, the author of a book they’d recently read, or somebody they had met recently who seemed to share their interests. Older people preferred the family member; young people were just as excited to meet the author or make a new friend. But in a critical twist, if the young people were asked to imagine that they were about to move across the country, they preferred the family member too.

Page: 70, Location: 1067-1071


The explore/exploit tradeoff also tells us how to think about advice from our elders. When your grandfather tells you which restaurants are good, you should listen—these are pearls gleaned from decades of searching. But when he only goes to the same restaurant at 5:00 p.m. every day, you should feel free to explore other options, even though they’ll likely be worse.

Page: 71, Location: 1082-1084


Shifting the bulk of one’s attention to one’s favorite things should increase quality of life. And it seems like it does: Carstensen has found that older people are generally more satisfied with their social networks, and often report levels of emotional well-being that are higher than those of younger adults.

Page: 71, Location: 1088-1090


The basic summary of this section: git while the Gittins’s good.

Page: 72, Location: 1096-1096


3   Sorting Making Order

Page: 72, Location: 1101-1102


The Ecstasy of Sorting

Page: 73, Location: 1119-1119


The first code ever written for a “stored program” computer was a program for efficient sorting.

Page: 74, Location: 1132-1132


The truncated top of an immense, sorted list is in many ways the universal user interface.

Page: 75, Location: 1146-1147


The Agony of Sorting

Page: 76, Location: 1155-1155


“To lower costs per unit of output, people usually increase the size of their operations,” wrote J. C. Hosken in 1955, in the first scientific article published on sorting. This is the economy of scale familiar to any business student.

Page: 76, Location: 1155-1156


But with sorting, size is a recipe for disaster: perversely, as a sort grows larger, “the unit cost of sorting, instead of falling, rises.” Sorting involves steep diseconomies of scale, violating our normal intuitions about the virtues of doing things in bulk.

Page: 76, Location: 1156-1159


This is the first and most fundamental insight of sorting theory. Scale hurts.

Page: 76, Location: 1161-1162


All we need are about 80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000 attempts at the title. This number, a bit over 80 unvigintillion, is 52 factorial, or “52!” in mathematical notation—the number of ways that a deck of 52 cards can possibly be ordered.

Page: 77, Location: 1176-1178


Moreover, a computer scientist would want to know the worst sort time. Worst-case analysis lets us make hard guarantees: that a critical process will finish in time, that deadlines won’t be blown.

Page: 78, Location: 1186-1188


This is the rosiest class of problems there is: called “Big-O of one,” written O(1), it is also known as “constant time.”

Page: 78, Location: 1195-1196


Now, the time required to pass the roast around the table will be “Big-O of n,” written O(n), also known as “linear time”—with twice the guests, you’ll wait twice as long for the dish to come around.

Page: 79, Location: 1198-1201


This turns out to be “Big-O of n-squared,” written O(n²) and also known as “quadratic time.” Here again, we only care about the basic contours of the relationship between n and time. There’s no O(2n²) for two hugs apiece, or O(n² + n) for hugs plus passing the food around, or O(n² + 1) for hugs plus home cleaning. It’s all quadratic time, so O(n²) covers everything.

Page: 79, Location: 1210-1217
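
A toy rendering of the dinner-party analogy in Python (my own, not the book's), just counting the "work" each ritual requires as the guest list grows:

```python
from itertools import combinations

def toast(guests):
    # O(1): a single toast covers the whole table, however large.
    return 1

def pass_the_roast(guests):
    # O(n): the dish visits each guest once.
    return len(guests)

def exchange_hugs(guests):
    # O(n^2): one hug per pair of guests, n*(n-1)/2 in all.
    return sum(1 for _ in combinations(guests, 2))

for n in (5, 10, 20):
    guests = list(range(n))
    print(n, toast(guests), pass_the_roast(guests), exchange_hugs(guests))
```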


The Squares: Bubble Sort and Insertion Sort

Page: 80, Location: 1224-1224


Obama was right to eschew Bubble Sort, an algorithm which has become something of a punching bag for computer science students: it’s simple, it’s intuitive, and it’s extremely inefficient.

Page: 81, Location: 1228-1229


Computer scientists call this, appropriately enough, Insertion Sort. The good news is that it’s arguably even more intuitive than Bubble Sort and doesn’t have quite the bad reputation.

Page: 82, Location: 1245-1247
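
Textbook-style sketches of the two quadratic sorts, for reference (my code, not the book's):

```python
def bubble_sort(items):
    """Sweep the list repeatedly, swapping adjacent out-of-order pairs,
    until a full sweep makes no swaps."""
    items = list(items)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items

def insertion_sort(items):
    """Grow a sorted prefix, sliding each new item back to its place."""
    result = []
    for item in items:
        i = len(result)
        while i > 0 and result[i - 1] > item:
            i -= 1
        result.insert(i, item)
    return result
```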


Breaking the Quadratic Barrier: Divide and Conquer

At this point, having seen two entirely sensible approaches fall into unsustainable quadratic time, it’s natural to wonder whether faster sorting is even possible.

Page: 82, Location: 1250-1251


This approach is known today as Mergesort, one of the legendary algorithms in computer science. As a 1997 paper put it, “Mergesort is as important in the history of sorting as sorting in the history of computing.”

Page: 84, Location: 1282-1284


The power of Mergesort comes from the fact that it indeed ends up with a complexity between linear and quadratic time—specifically, O(n log n), known as “linearithmic” time.

Page: 84, Location: 1284-1286


No wonder it’s the method of choice for large-scale industrial sorting problems.

Page: 85, Location: 1292-1292
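
A compact recursive Mergesort (again a standard textbook rendering, not the book's code): split the list, sort each half, then merge the two sorted halves in one linear pass.

```python
def merge_sort(items):
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    # Merge the two sorted halves in a single linear pass.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```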


Computer science, as undergraduates are taught, is all about tradeoffs.

Page: 88, Location: 1346-1346


And one of the most central tradeoffs is between sorting and searching. The basic principle is this: the effort expended on sorting materials is just a preemptive strike against the effort it’ll take to search through them later. What the precise balance should be depends on the exact parameters of the situation, but thinking about sorting as valuable only to support future search tells us something surprising: Err on the side of messiness. Sorting something that you will never search is a complete waste; searching something you never sorted is merely inefficient.

Page: 88, Location: 1347-1352


The verdict is clear: ordering your bookshelf will take more time and energy than scanning through it ever will.

Page: 89, Location: 1364-1365


Computer science shows that the hazards of mess and the hazards of order are quantifiable and that their costs can be measured in the same currency: time. Leaving something unsorted might be thought of as an act of procrastination—passing the buck to one’s future self, who’ll have to pay off with interest what we chose not to pay up front. But the whole story is subtler than that. Sometimes mess is more than just the easy choice. It’s the optimal choice.

Page: 90, Location: 1374-1378


Charles Lutwidge Dodgson

Page: 90, Location: 1380-1380


“The present method of assigning prizes is, except in the case of the first prize, entirely unmeaning.” Said plainly, the silver medal is a lie. “As a mathematical fact,” he continued, “the chance that the 2nd best Player will get the prize he deserves is only 16/31sts; while the chance that the best 4 shall get their proper prizes is so small, that the odds are 12 to 1 against its happening!”

Page: 92, Location: 1396-1399
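
Where does 16/31 come from? The second-best player wins silver only if the draw keeps them away from the best player until the final, which means landing in the opposite half of the bracket: 16 of the 31 remaining slots. A quick simulation (my own; it assumes a random 32-player draw in which the stronger player always wins) agrees:

```python
import random

def single_elimination(players):
    """Run one bracket; the stronger player always wins each match."""
    while len(players) > 2:
        players = [max(players[i], players[i + 1]) for i in range(0, len(players), 2)]
    return max(players), min(players)   # (champion, runner-up)

trials, silver_to_second_best = 100_000, 0
for _ in range(trials):
    bracket = list(range(32))           # skill 0..31; 31 is best, 30 second best
    random.shuffle(bracket)
    _, runner_up = single_elimination(bracket)
    silver_to_second_best += (runner_up == 30)

print(silver_to_second_best / trials, "vs", 16 / 31)   # both about 0.516
```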


One of the most familiar algorithms in sports is the Round-Robin format, where each of n teams eventually plays every one of the other n − 1 teams. While arguably the most comprehensive, it’s also one of the most laborious. Having every team grapple with every other team is like having guests exchange hugs at our dinner party: the dreaded O(n²), quadratic time.

Page: 92, Location: 1405-1409


Ladder tournaments—popular in sports like badminton, squash, and racquetball—put players in a linear ranking, with each player allowed to issue a direct challenge to the player immediately above them, exchanging places if they prevail. Ladders are the Bubble Sorts of the athletic world and are thus also quadratic, requiring O(n²) games to reach a stable ranking.

Page: 92, Location: 1410-1413


Computer scientists call this phenomenon noise. All of the sorting algorithms that we’ve considered thus far assume perfect, flawless, foolproof comparisons, ones that never mess up and mistakenly judge the lesser of two quantities to be the greater. Once you allow for a “noisy comparator,” some of computer science’s most hallowed algorithms go out the window—and some of its most maligned have their day of redemption.

Page: 96, Location: 1462-1465


That Comparison Counting Sort is the single most robust sorting algorithm known, quadratic or better, should offer something very specific to sports fans: if your team doesn’t make the playoffs, don’t whine.

Page: 97, Location: 1483-1485
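
A sketch of Comparison Counting Sort (my rendering; the function name follows the text): each item is compared against every other, and its number of "wins" becomes its rank. Quadratic, but a single faulty comparison can shift an item's rank by at most one spot, which is the source of its robustness to noise.

```python
def comparison_counting_sort(items):
    """Rank each item by how many others it beats; ties broken by position."""
    result = [None] * len(items)
    for i, item in enumerate(items):
        wins = sum(1 for j, other in enumerate(items)
                   if other < item or (other == item and j < i))
        result[wins] = item
    return result

print(comparison_counting_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```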


Put differently, if your team is eliminated early in the postseason, it’s tough luck. But if your team fails to get to the postseason, it’s tough truth. You may get sports-bar sympathy from your fellow disappointed fans, but you won’t get any from a computer scientist.

Page: 97, Location: 1486-1488


Displacement happens when an animal uses its knowledge of the hierarchy to determine that a particular confrontation simply isn’t worth it. In many animal societies, resources and opportunities—food, mates, preferred spaces, and so forth—are scarce, and somehow it must be decided who gets what. Establishing an order ahead of time is less violent than coming to blows every time a mating opportunity or a prime spot of grass becomes available. Though we may cringe when we see creatures turning their claws and beaks on each other, biologists tend to think of pecking orders as the violence that preempts violence.

Page: 99, Location: 1511-1516


We’ve now seen two separate downsides to the desire of any group to sort itself. You have, at minimum, a linearithmic number of confrontations, making everyone’s life more combative as the group grows—and you also oblige every competitor to keep track of the ever-shifting status of everyone else, otherwise they’ll find themselves fighting battles they didn’t need to. It taxes not only the body but the mind.

Page: 101, Location: 1535-1537


The only caveat is that the time required for the event is determined by its slowest competitors. This sporting contest is the marathon, and it suggests something critical: a race is fundamentally different from a fight.

Page: 101, Location: 1540-1542


This move from “ordinal” numbers (which only express rank) to “cardinal” ones (which directly assign a measure to something’s caliber) naturally orders a set without requiring pairwise comparisons. Accordingly, it makes possible dominance hierarchies that don’t require direct head-to-head matchups.

Page: 101, Location: 1548-1550


It’s possible for the individuals to resent the basis of this hierarchy, but not really to contest its verdict. As a result, individual pairwise interactions take place with a minimum of jockeying for status. By and large, any pair of people can tell, without needing to negotiate, who is supposed to show what level of respect to whom. Everyone knows where to meet.

Page: 102, Location: 1556-1559


in practice one straightforward principle determines which ships give way to which: the “Law of Gross Tonnage.” Quite simply, the smaller ship gets out of the way of the larger one.

Page: 102, Location: 1559-1561


Linearithmic numbers of fights might work fine for small-scale groups; they do in nature. But in a world where status is established through pairwise comparisons—whether they involve exchanging rhetoric or gunfire—the amount of confrontation quickly spirals out of control as society grows. Operating at industrial scale, with many thousands or millions of individuals sharing the same space, requires a leap beyond. A leap from ordinal to cardinal.

Page: 103, Location: 1570-1573


4   Caching Forget About It

Page: 105, Location: 1608-1609


In the practical use of our intellect, forgetting is as important a function as remembering. —WILLIAM JAMES

Page: 105, Location: 1609-1611


But just as the tidiness of a scholar’s desk may hide the messiness of their mind, so does the apparent tidiness of a computer’s file system obscure the highly engineered chaos of how data is actually being stored underneath the nested-folder veneer.

Page: 107, Location: 1628-1630


The Memory Hierarchy

Page: 107, Location: 1633-1633


In 1946, Arthur Burks, Herman Goldstine, and John von Neumann, working at the Institute for Advanced Study in Princeton, laid out a design proposal for what they called an electrical “memory organ.”

Page: 108, Location: 1641-1643


Wilkes’s proposal was implemented in the IBM 360/85 supercomputer later in the 1960s, where it acquired the name of the “cache.”

Page: 109, Location: 1662-1663


Likewise, a factory that doubles its manufacturing speed each year—but has the same number of parts shipped to it from overseas at the same sluggish pace—will mean little more than a factory that’s twice as idle.

Page: 109, Location: 1671-1673


For a while it seemed that Moore’s Law was yielding little except processors that twiddled their thumbs ever faster and ever more of the time. In the 1990s this began to be known as the “memory wall.”

Page: 110, Location: 1673-1674


Modern consumer laptops, tablets, and smartphones have on the order of a six-layer memory hierarchy, and managing memory smartly has never been as important to computer science as it is today.

Page: 110, Location: 1675-1677


Eviction and Clairvoyance

Page: 110, Location: 1678-1678


Bélády’s 1966 paper on caching algorithms would become the most cited piece of computer science research for fifteen years.

Page: 111, Location: 1690-1691


As it explains, the goal of cache management is to minimize the number of times you can’t find what you’re looking for in the cache and must go to the slower main memory to find it; these are known as “page faults” or “cache misses.” The optimal cache eviction policy—essentially by definition, Bélády wrote—is, when the cache is full, to evict whichever item we’ll need again the longest from now.

Page: 111, Location: 1691-1694


The hypothetical all-knowing, prescient algorithm that would look ahead and execute the optimal policy is known today in tribute as Bélády’s Algorithm. Bélády’s Algorithm is an instance of what computer scientists call a “clairvoyant” algorithm: one informed by data from the future.

Page: 111, Location: 1695-1698


We could just try Random Eviction, adding new data to the cache and overwriting old data at random.

Page: 111, Location: 1701-1702


Another simple strategy is First-In, First-Out (FIFO), where you evict or overwrite whatever has been sitting in the cache the longest (as in Martha Stewart’s question “How long have I had it?”).

Page: 112, Location: 1704-1706


third approach is Least Recently Used (LRU): evicting the item that’s gone the longest untouched (Stewart’s “When was the last time I wore it or used it?”).

Page: 112, Location: 1706-1707


The LRU principle is effective because of something computer scientists call “temporal locality”: if a program has called for a particular piece of information once, it’s likely to do so again in the near future.

Page: 112, Location: 1710-1711
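
A minimal LRU cache in Python, built on the standard library's OrderedDict (my sketch; the class name and capacity handling are mine):

```python
from collections import OrderedDict

class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # a cache miss ("page fault")
        self.data.move_to_end(key)         # touching an item makes it most recent
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the longest-untouched item
```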


LRU teaches us that the next thing we can expect to need is the last one we needed, while the thing we’ll need after that is probably the second-most-recent one. And the last thing we can expect to need is the one we’ve already gone longest without. Unless we have good reason to think otherwise, it seems that our best guide to the future is a mirror image of the past. The nearest thing to clairvoyance is to assume that history repeats itself—backward.

Page: 113, Location: 1724-1727


Turning the Library Inside Out

Page: 113, Location: 1728-1728


ordered. Here’s why: if temporal locality holds, then the rough-sorting shelves contain the most important books in the whole building. These are the books that were most recently used, so they are the ones that patrons are most likely to be looking for. It seems a crime that arguably the juiciest and most browseworthy

Page: 114, Location: 1741-1744


Says Akamai’s chief architect, Stephen Ludin, “It’s our belief—and we build the company around the fact—that distance matters.”

Page: 117, Location: 1781-1782


Caching is just as useful when it’s proximity, rather than performance, that’s the scarce resource.

Page: 117, Location: 1784-1785


“I have to emphasize,” says Noguchi, “that a very fundamental principle in my method is not to group files according to content.”

Page: 120, Location: 1840-1841


But computer science gives us something that most efficiency gurus don’t: guarantees.

Page: 122, Location: 1858-1859


LRU tells us that when we add something to our cache we should discard the oldest item—but it doesn’t tell us where we should put the new item.

Page: 122, Location: 1859-1861


In short, the mathematics of self-organizing lists suggests something radical: the big pile of papers on your desk, far from being a guilt-inducing fester of chaos, is actually one of the most well-designed and efficient structures available. What might appear to others to be an unorganized mess is, in fact, a self-organizing mess. Tossing things back on the top of the pile is the very best you can do, shy of knowing the future.

Page: 123, Location: 1880-1883


His results mapped out a graph of how memory fades over time, known today by psychologists as “the forgetting curve.” Ebbinghaus’s results established the credibility of a quantitative science of human memory, but they left open something of a mystery.

Page: 124, Location: 1892-1894


The key idea behind Anderson’s new account of human memory is that the problem might be not one of storage, but of organization. According to his theory, the mind has essentially infinite capacity for memories, but we have only a finite amount of time in which to search for them. Anderson made the analogy to a library with a single, arbitrarily long shelf—the Noguchi Filing System at Library of Congress scale. You can fit as many items as you want on that shelf, but the closer something is to the front the faster it will be to find.

Page: 125, Location: 1903-1908


The Tyranny of Experience

Page: 126, Location: 1931-1931


A big book is a big nuisance. —CALLIMACHUS (305–240 BC), LIBRARIAN AT ALEXANDRIA

Page: 126, Location: 1931-1932


The fastest cache on current computers, for instance, is made with what’s called SRAM, which costs roughly a thousand times as much per byte as the flash memory in solid-state drives.

Page: 127, Location: 1936-1937


Unavoidably, the larger a memory is, the more time it takes to search for and extract a piece of information from it.

Page: 127, Location: 1945-1945


And when it comes to episodic memory, well, every year adds a third of a million waking minutes to one’s total lived experience. Considered this way, it’s a wonder that the two of us—or anyone—can mentally keep up at all. What’s surprising is not memory’s slowdown, but the fact that the mind can possibly stay afloat and responsive as so much data accumulates.

Page: 128, Location: 1951-1954


Caching gives us the language to understand what’s happening. We say “brain fart” when we should really say “cache miss.”

Page: 129, Location: 1968-1969


Though we always manage to find some way to order the things we do in our days, as a rule we don’t consider ourselves particularly good at it—hence the perennial bestseller status of time-management guides.

Page: 131, Location: 2001-2003


William James, the “father of American psychology,” asserts that “there’s nothing so fatiguing as the eternal hanging on of an uncompleted task,”

Page: 131, Location: 2007-2008


This practice would be built upon by Taylor’s colleague Henry Gantt, who in the 1910s developed the Gantt charts that would help organize many of the twentieth century’s most ambitious construction projects, from the Hoover Dam to the Interstate Highway System. A century later, Gantt charts still adorn the walls and screens of project managers at firms like Amazon, IKEA, and SpaceX.

Page: 132, Location: 2017-2020


Thus you can keep the total amount of time spent doing laundry to the absolute minimum. Johnson’s analysis had yielded scheduling’s first optimal algorithm: start with the lightest wash, end with the smallest hamper.

Page: 133, Location: 2034-2035


This is a sufficiently fundamental and counterintuitive point that it’s worth repeating. If you have only a single machine, and you’re going to do all of your tasks, then any ordering of the tasks will take you the same amount of time.

Page: 134, Location: 2043-2045


Thus we encounter the first lesson in single-machine scheduling literally before we even begin: make your goals explicit.

Page: 134, Location: 2045-2046


If you’re concerned with minimizing maximum lateness, then the best strategy is to start with the task due soonest and work your way toward the task due last. This strategy, known as Earliest Due Date, is fairly intuitive. (For instance, in a service-sector context, where each arriving patron’s “due date” is effectively the instant they walk in the door, it just means serving customers in order of arrival.)

Page: 134, Location: 2054-2057
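
Earliest Due Date reduces to a single sort. A sketch (mine; the (name, duration, due) tuple format is an assumption for illustration):

```python
def earliest_due_date(tasks):
    """tasks: (name, duration, due) tuples. Sorting by due date minimizes
    the maximum lateness of any single task."""
    schedule = sorted(tasks, key=lambda t: t[2])
    now, worst_lateness = 0, 0
    for name, duration, due in schedule:
        now += duration
        worst_lateness = max(worst_lateness, now - due)
    return schedule, worst_lateness
```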


Earliest Due Date is optimal for reducing maximum lateness, which means it will minimize the rottenness of the single most rotten thing you’ll have to eat; that may not be the most appetizing metric to eat by.

Page: 135, Location: 2065-2067


Moore’s Algorithm says that we start out just like with Earliest Due Date—by scheduling out our produce in order of spoilage date, earliest first, one item at a time. However, as soon as it looks like we won’t get to eating the next item in time, we pause, look back over the meals we’ve already planned, and throw out the biggest item (that is, the one that would take the most days to consume).

Page: 135, Location: 2069-2072
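
Moore's Algorithm in sketch form (my rendering, using a max-heap to find the biggest job scheduled so far; the tuple layout matches the sketch above):

```python
import heapq

def moores_algorithm(tasks):
    """tasks: (name, duration, due) tuples. Maximizes the number of tasks
    finished on time by discarding the longest job whenever we fall behind."""
    kept, discarded, now = [], [], 0       # kept is a max-heap keyed on duration
    for name, duration, due in sorted(tasks, key=lambda t: t[2]):  # EDD order
        heapq.heappush(kept, (-duration, name))
        now += duration
        if now > due:                      # about to miss this deadline...
            longest, victim = heapq.heappop(kept)
            now += longest                 # ...so drop the biggest job so far
            discarded.append(victim)
    return sorted(name for _, name in kept), discarded  # on-time set, dropped set
```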


Getting Things Done

Page: 136, Location: 2079-2080


Minimizing the sum of completion times leads to a very simple optimal algorithm called Shortest Processing Time: always do the quickest task you can.

Page: 137, Location: 2090-2091


Its sum-of-completion-times metric can be expressed another way: it’s like focusing above all on reducing the length of your to-do list. If each piece of unfinished business is like a thorn in your side, then racing through the easiest items may bring some measure of relief.

Page: 137, Location: 2095-2097


this strategy nonetheless offers a nice rule of thumb: only prioritize a task that takes twice as long if it’s twice as important.

Page: 138, Location: 2106-2106
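
Both versions again reduce to a sort: by duration alone, or by importance per unit time. A sketch (my code; the tuple layouts are assumptions):

```python
def schedule_spt(tasks):
    """tasks: (name, duration) pairs. Minimizes the sum of completion times."""
    return sorted(tasks, key=lambda t: t[1])

def schedule_weighted_spt(tasks):
    """tasks: (name, duration, importance). Order by importance per unit time:
    a task that takes twice as long must matter twice as much to go first."""
    return sorted(tasks, key=lambda t: t[2] / t[1], reverse=True)
```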


the “debt avalanche.” This debt-reduction strategy says to ignore the number and size of your debts entirely, and simply funnel your money toward the debt with the single highest interest rate. This corresponds rather neatly to working through jobs in order of importance per unit time. And it’s the strategy that will reduce the total burden of your debt as quickly as possible.

Page: 138, Location: 2113-2115


It’s said that “a man with one watch knows what time it is; a man with two watches is never sure.”

Page: 139, Location: 2122-2123


Computer scientists would call this a “ping attack” or a “denial of service” attack: give a system an overwhelming number of trivial things to do, and the important things get lost in the chaos.

Page: 139, Location: 2129-2131


Live by the metric, die by the metric. If all tasks are indeed of equal weight, then that’s exactly what we should be doing. But if we don’t want to become slaves to minutiae, then we need to take measures toward that end.

Page: 140, Location: 2146-2147


Once JPL engineers had identified the Pathfinder problem as a case of priority inversion, they wrote up a fix and beamed the new code across millions of miles to Pathfinder. What was the solution they sent flying across the solar system? Priority inheritance. If a low-priority task is found to be blocking a high-priority resource, well, then all of a sudden that low-priority task should momentarily become the highest-priority thing on the system, “inheriting” the priority of the thing it’s blocking.

Page: 142, Location: 2169-2172


A commitment to fastidiously doing the most important thing you can, if pursued in a head-down, myopic fashion, can lead to what looks for all the world like procrastination.

Page: 143, Location: 2179-2180


this problem belongs to a class that most computer scientists believe has no efficient solution—it’s what the field calls “intractable.”

Page: 145, Location: 2213-2215


The drawing of the borders of scheduling theory continues to this day. A recent survey showed that the status of about 7% of all problems is still unknown, scheduling’s terra incognita. Of the 93% of the problems that we do understand, however, the news isn’t great: only 9% can be solved efficiently, and the other 84% have been proven intractable.

Page: 146, Location: 2228-2231


The best time to plant a tree is twenty years ago. The second best time is now. —PROVERB

Page: 146, Location: 2234-2236


But there is one twist that can make it easier: being able to stop one task partway through and switch to another. This property, “preemption,” turns out to change the game dramatically.

Page: 146, Location: 2236-2238


In fact, the weighted version of Shortest Processing Time is a pretty good candidate for best general-purpose scheduling strategy in the face of uncertainty.

Page: 147, Location: 2253-2254


It offers a simple prescription for time management: each time a new piece of work comes in, divide its importance by the amount of time it will take to complete. If that figure is higher than for the task you’re currently doing, switch to the new one; otherwise stick with the current task. This algorithm is the closest thing that scheduling theory has to a skeleton key or Swiss Army knife, the optimal strategy not just for one flavor of problem but for many.

Page: 147, Location: 2254-2257
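
The prescription fits in a few lines (my sketch; "density," meaning importance per unit of remaining time, is my name for the ratio):

```python
def should_switch(current, incoming):
    """Each task is (importance, time_to_complete). Switch to the new task
    only if its density beats the task you're working on now."""
    return incoming[0] / incoming[1] > current[0] / current[1]

print(should_switch((4, 2), (3, 1)))  # 3/1 > 4/2 -> True, switch
print(should_switch((4, 2), (1, 1)))  # 1/1 < 4/2 -> False, stick
```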


When the future is foggy, it turns out you don’t need a calendar—just a to-do list.

Page: 148, Location: 2265-2265


The hurrieder I go / The behinder I get —NEEDLEPOINT SEEN IN BOONVILLE, CA

Page: 148, Location: 2266-2267


Psychologists have shown that for us, the effects of switching tasks can include both delays and errors—at the scale of minutes rather than microseconds.

Page: 149, Location: 2283-2284


Brian, for his part, thinks of writing as a kind of blacksmithing, where it takes a while just to heat up the metal before it’s malleable.

Page: 150, Location: 2288-2289


Thrashing

Page: 150, Location: 2296-2296


Computers multitask through a process called “threading,” which you can think of as being like juggling a set of balls. Just as a juggler only hurls one ball at a time into the air but keeps three aloft, a CPU only works on one program at a time, but by swapping between them quickly enough (on the scale of ten-thousandths of a second) it appears to be playing a movie, navigating the web, and alerting you to incoming email all at once.

Page: 150, Location: 2299-2303


Think again about our image of a juggler. With one ball in the air, there’s enough spare time while that ball is aloft for the juggler to toss some others upward as well. But what if the juggler takes on one more ball than he can handle? He doesn’t drop that ball; he drops everything. The whole system, quite literally, goes down.

Page: 151, Location: 2311-2314


As Peter Zijlstra, one of the head developers on the Linux operating system scheduler, puts it, “The caches are warm for the current workload, and when you context switch you pretty much invalidate all caches. And that hurts.”

Page: 152, Location: 2324-2326


At the extreme, a program may run just long enough to swap its needed items into memory, before giving way to another program that runs just long enough to overwrite them in turn. This is thrashing: a system running full-tilt and accomplishing nothing at all.

Page: 152, Location: 2326-2328


Interrupt Coalescing

Page: 154, Location: 2359-2359


Part of what makes real-time scheduling so complex and interesting is that it is fundamentally a negotiation between two principles that aren’t fully compatible. These two principles are called responsiveness and throughput: how quickly you can respond to things, and how much you can get done overall.

Page: 154, Location: 2359-2362


Establishing a minimum amount of time to spend on any one task helps to prevent a commitment to responsiveness from obliterating throughput entirely: if the minimum slice is longer than the time it takes to context-switch, then the system can never get into a state where context switching is the only thing it’s doing. It’s also a principle that is easy to translate into a recommendation for human lives.

Page: 155, Location: 2375-2378


Methods such as “timeboxing” or “pomodoros,” where you literally set a kitchen timer and commit to doing a single task until it runs out, are one embodiment of this idea.

Page: 156, Location: 2378-2379


So the rule that computer operating systems follow when deciding how long they can afford to dedicate themselves to some task is simple: as long as possible without seeming jittery or slow to the user.

Page: 156, Location: 2385-2386


When we humans leave the house to run a quick errand, we might say something like, “You won’t even notice I’m gone.” When our machines context-switch into a computation, they must literally return to us before we notice they’re gone.

Page: 156, Location: 2386-2388


The moral is that you should try to stay on a single task as long as possible without decreasing your responsiveness below the minimum acceptable limit. Decide how responsive you need to be—and then, if you want to get things done, be no more responsive than that.

Page: 157, Location: 2395-2396


Perhaps the patron saint of the minimal-context-switching lifestyle is the legendary programmer Donald Knuth. “I do one thing at a time,” he says. “This is what computer scientists call batch processing—the alternative is swapping in and out. I don’t swap in and out.”

Page: 158, Location: 2414-2416


But often the problems most germane to daily human life are at the opposite extreme. Our days are full of “small data.” In fact, like Gott standing at the Berlin Wall, we often have to make an inference from the smallest amount of data we could possibly have: a single observation.

Page: 162, Location: 2471-2473


More than 250 years ago, the question of making predictions from small data weighed heavily on the mind of the Reverend Thomas Bayes, a Presbyterian minister in the charming spa town of Tunbridge Wells, England.

Page: 162, Location: 2478-2479


And in either 1746, ’47, ’48, or ’49 he would write one of the most influential papers in all of mathematics, abandon it unpublished, and move on to other things.

Page: 162, Location: 2484-2485


The essay concerned exactly the kind of raffle problem under discussion: Let us then imagine a person present at the drawing of a lottery, who knows nothing of its scheme or of the proportion of Blanks to Prizes in it. Let it further be supposed, that he is obliged to infer this from the number of blanks he hears drawn compared with the number of prizes; and that it is enquired what conclusions in these circumstances he may reasonably make.

Page: 163, Location: 2491-2496


Bayes’s critical insight was that trying to use the winning and losing tickets we see to figure out the overall ticket pool that they came from is essentially reasoning backward. And to do that, he argued, we need to first reason forward from hypotheticals. In other words, we need to first determine how probable it is that we would have drawn the tickets we did if various scenarios were true. This probability—known to modern statisticians as the “likelihood”—gives us the information we need to solve the problem.

Page: 163, Location: 2496-2500


This is the crux of Bayes’s argument. Reasoning forward from hypothetical pasts lays the foundation for us to then work backward to the most probable one.

Page: 165, Location: 2516-2517


Laplace’s Law

Page: 165, Location: 2523-2524


In 1774, completely unaware of the previous work by Bayes, Laplace published an ambitious paper called “Treatise on the Probability of the Causes of Events.” In it, Laplace finally solved the problem of how to make inferences backward from observed effects to their probable causes.

Page: 165, Location: 2526-2528


Count the number of times it has happened in the past plus one, then divide by the number of opportunities plus two.

Page: 167, Location: 2548-2548
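
Laplace's Law as a one-liner, with two illustrative calls (my sketch): a single success in a single attempt gives an estimate of 2/3 rather than certainty, and something never yet observed still gets a nonzero chance.

```python
def laplace(successes, attempts):
    """Laplace's Law: (successes + 1) / (attempts + 2)."""
    return (successes + 1) / (attempts + 2)

print(laplace(1, 1))    # 1 win in 1 try -> 2/3, not 100%
print(laplace(0, 10))   # never seen it happen -> 1/12, not impossible
```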


This sense of what was “in the bag” before the coin flip—the chances for each hypothesis to have been true before you saw any data—is known as the prior probabilities, or “prior” for short. And Bayes’s Rule always needs some prior from you, even if it’s only a guess.

Page: 168, Location: 2575-2577


The Copernican Principle

Page: 169, Location: 2583-2583


It’s difficult to make predictions, especially about the future. —DANISH PROVERB

Page: 169, Location: 2584-2585


And if a municipal transit system cannot afford the incredibly useful but expensive real-time signs that tell riders when the next bus is going to arrive, the Copernican Principle suggests that there might be a dramatically simpler and cheaper alternative. Simply displaying how long it’s been since the previous bus arrived at that stop offers a substantial hint about when the next one will.

Page: 170, Location: 2607-2610


And it turns out that the Copernican Principle is exactly what results from applying Bayes’s Rule using what is known as an uninformative prior.

Page: 171, Location: 2617-2619


but if the wall were going to be around for a million years, it would be a big coincidence that we happened to bump into it so very close to the start of its existence. Therefore, even though enormously long life spans cannot be ruled out, neither are they very likely.

Page: 172, Location: 2627-2629


Real-World Priors

Page: 173, Location: 2643-2644


In the broadest sense, there are two types of things in the world: things that tend toward (or cluster around) some kind of “natural” value, and things that don’t.

Page: 173, Location: 2644-2645


It’s often lamented that “the rich get richer,” and indeed the process of “preferential attachment” is one of the surest ways to produce a power-law distribution.

Page: 174, Location: 2665-2667


The most popular websites are the most likely to get incoming links; the most followed online celebrities are the ones most likely to gain new fans; the most prestigious firms are the ones most likely to attract new clients; the biggest cities are the ones most likely to draw new residents. In every case, a power-law distribution will result.

Page: 174, Location: 2667-2669


And for any power-law distribution, Bayes’s Rule indicates that the appropriate prediction strategy is a Multiplicative Rule: multiply the quantity observed so far by some constant factor. For an uninformative prior, that constant factor happens to be 2, hence the Copernican prediction; in other power-law cases, the multiplier will depend on the exact distribution you’re working with. For the grosses of movies, for instance, it happens to be about 1.4. So if you hear a movie has made $6 million so far, you can guess it will make about $8.4 million overall; if it’s made $90 million, guess it will top out at $126 million. This multiplicative rule is a direct consequence of the fact that power-law distributions do not specify a natural scale for the phenomenon they’re describing. The only thing that gives us a sense of scale for our prediction, therefore, is the single data point we have—such as the fact that the Berlin Wall has stood for eight years. The larger the value

Page: 175, Location: 2677-2684
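
The rule is a one-line computation (my sketch; a multiplier of 2 is the Copernican, uninformative-prior case, and 1.4 is the movie-gross figure from the passage above):

```python
def predict_total(observed, multiplier=2.0):
    """Multiplicative Rule for power-law quantities: scale up what you've seen."""
    return observed * multiplier

print(predict_total(8))          # Berlin Wall at 8 years -> guess 16 in total
print(predict_total(6.0, 1.4))   # $6M gross so far -> about $8.4M
print(predict_total(90.0, 1.4))  # $90M so far -> about $126M
```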


Indeed, distributions that yield the same prediction, no matter their history or current state, are known to statisticians as “memoryless.”

Page: 178, Location: 2719-2720


In a power-law distribution, the longer something has gone on, the longer we expect it to continue going on. So a power-law event is more surprising the longer we’ve been waiting for it—and maximally surprising right before it happens.

Page: 178, Location: 2725-2727


In a normal distribution, events are surprising when they’re early—since we expected them to reach the average—but not when they’re late. Indeed, by that point they seem overdue to happen, so the longer we wait, the more we expect them.

Page: 178, Location: 2728-2729


And in an Erlang distribution, events by definition are never any more or less surprising no matter when they occur. Any state of affairs is always equally likely to end regardless of how long it’s lasted. No wonder politicians are always thinking about their next election.

Page: 178, Location: 2730-2732


In “The Gambler,” Kenny Rogers famously advised that you’ve got to “Know when to walk away / Know when to run”—but for a memoryless distribution, there is no right time to quit. This may in part explain these games’ addictiveness.

Page: 179, Location: 2739-2741


Whether we know it or not, we appear to carry around in our heads surprisingly accurate priors about movie grosses and running times, poem lengths, and political terms of office, not to mention human life spans. We don’t need to gather them explicitly; we absorb them from the world.

Page: 181, Location: 2763-2765


Good predictions require good priors.

Page: 182, Location: 2778-2779


Learning self-control is important, but it’s equally important to grow up in an environment where adults are consistently present and trustworthy.

Page: 184, Location: 2809-2810


He is careful of what he reads, for that is what he will write. He is careful of what he learns, for that is what he will know. —ANNIE DILLARD

Page: 184, Location: 2813-2814


The best way to make good predictions, as Bayes’s Rule shows us, is to be accurately informed about the things you’re predicting.

Page: 184, Location: 2815-2816


If you want to be a good intuitive Bayesian—if you want to naturally make good predictions, without having to think about what kind of prediction rule is appropriate—you need to protect your priors. Counterintuitively, that might mean turning off the news.

*There’s a certain irony here: when it comes to time, assuming that there’s nothing special about our arrival does result in us imagining ourselves at the very center after all.

Page: 186, Location: 2837-2845


Q.E.D.” Quod erat demonstrandum,

Page: 187, Location: 2864-2865


The question of how hard to think, and how many factors to consider, is at the heart of a knotty problem that statisticians and machine-learning researchers call “overfitting.”

Page: 188, Location: 2882-2883


The Case Against Complexity

Page: 189, Location: 2885-2885


And every prediction, crucially, involves thinking about two distinct things: what you know and what you don’t.

Page: 189, Location: 2889-2890


If the study were repeated with different people, producing slight variations on the same essential pattern, the one- and two-factor models would remain more or less steady—but the nine-factor model would gyrate wildly from one instance of the study to the next.

Page: 191, Location: 2927-2928


So one of the deepest truths of machine learning is that, in fact, it’s not always better to use a more complex model, one that takes a greater number of factors into account. And the issue is not just that the extra factors might offer diminishing returns—performing better than a simpler model, but not enough to justify the added complexity. Rather, they might make our predictions dramatically worse.

Page: 191, Location: 2929-2932


Fundamentally, overfitting is a kind of idolatry of data, a consequence of focusing on what we’ve been able to measure rather than what matters.

Page: 192, Location: 2942-2943


Overfitting Everywhere

Page: 193, Location: 2951-2951


The original goal of fencing was to teach people how to defend themselves in a duel, hence the name: “defencing.”

Page: 194, Location: 2965-2966


It’s as exciting a sport as ever, but as athletes overfit their tactics to the quirks of scorekeeping, it becomes less useful in instilling the skills of real-world swordsmanship.

Page: 194, Location: 2970-2972


Avinash Kaushik, digital marketing evangelist at Google, warns that trying to get website users to see as many ads as possible naturally devolves into trying to cram sites with ads:

Page: 195, Location: 2983-2984


Kaushik’s conclusion: “Friends don’t let friends measure Page Views. Ever.”

Page: 195, Location: 2988-2988


Mistakes like these are known in law enforcement and the military as “training scars,” and they reflect the fact that it’s possible to overfit one’s own preparation.

Page: 196, Location: 2997-2998


Detecting Overfitting: Cross-Validation

Page: 196, Location: 3000-3000


Research in machine learning has yielded several concrete strategies for detecting overfitting, and one of the most important is what’s known as Cross-Validation.

Page: 196, Location: 3005-3006


Simply put, Cross-Validation means assessing not only how well a model fits the data it’s given, but how well it generalizes to data it hasn’t seen. Paradoxically, this may involve using less data.

Page: 197, Location: 3007-3008
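
A small numpy sketch of the idea (mine, not the book's; the data is synthetic, with a genuinely linear pattern plus noise): fit models of increasing complexity on half the points, score them on the held-out half, and the most complex model will typically do worst on the data it hasn't seen.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = 2 * x + rng.normal(0, 0.2, size=x.size)    # real pattern: a straight line

train, test = np.arange(0, 30, 2), np.arange(1, 30, 2)   # hold out every other point
for degree in (1, 2, 9):
    coeffs = np.polyfit(x[train], y[train], degree)      # fit on training half
    error = np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2)  # score on held-out half
    print(f"degree {degree}: held-out error {error:.4f}")
```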


How to Combat Overfitting: Penalizing Complexity

If you can’t explain it simply, you don’t understand it well enough. —ANONYMOUS

We’ve seen some of the ways that overfitting can rear its head, and we’ve looked at some of the methods to detect and measure it.

Page: 198, Location: 3028-3031


From a statistics viewpoint, overfitting is a symptom of being too sensitive to the actual data we’ve seen. The solution, then, is straightforward: we must balance our desire to find a good fit against the complexity of the models we use to do so.

Page: 198, Location: 3031-3033


So what do these complexity penalties look like? One algorithm, discovered in 1996 by biostatistician Robert Tibshirani, is called the Lasso and uses as its penalty the total weight of the different factors in the model.* By putting this downward pressure on the weights of the factors, the Lasso drives as many of them as possible completely to zero. Only the factors that have a big impact on the results remain in the equation—thus potentially transforming, say, an overfitted nine-factor model into a simpler, more robust formula with just a couple of the most critical factors.

Page: 199, Location: 3040-3045
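
A sketch using scikit-learn's Lasso (the data is synthetic and mine: nine candidate factors, only two of which actually matter; alpha sets the strength of the penalty):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))                                   # nine candidate factors...
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.5, size=100)    # ...only two real ones

model = Lasso(alpha=0.1).fit(X, y)
print(np.round(model.coef_, 2))   # the penalty drives most weights to zero
```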


Neuroscientists have suggested, for instance, that brains try to minimize the number of neurons that are firing at any given moment—implementing the same kind of downward pressure on complexity as the Lasso.

Page: 200, Location: 3055-3056


The Upside of Heuristics

Page: 200, Location: 3059-3060


When it comes to portfolio management, it turns out that unless you’re highly confident in the information you have about the markets, you may actually be better off ignoring that information altogether.

Page: 201, Location: 3069-3070


The Weight of History

Page: 202, Location: 3085-3085


For example, the oddly cross-wired arrangement of our nervous system (the left side of our body controlled by the right side of our brain and vice versa) reflects the evolutionary history of vertebrates. This phenomenon, called “decussation,”

Page: 203, Location: 3103-3104


This kind of setup—where more time means more complexity—characterizes a lot of human endeavors. Giving yourself more time to decide about something does not necessarily mean that you’ll make a better decision. But it does guarantee that you’ll end up considering more factors, more hypotheticals, more pros and cons, and thus risk overfitting.

Page: 205, Location: 3130-3132


The greater the uncertainty, the bigger the gap between what you can measure and what matters, the more you should watch out for overfitting—that is, the more you should prefer simplicity, and the earlier you should stop.

Page: 206, Location: 3149-3151


When you’re truly in the dark, the best-laid plans will be the simplest. When our expectations are uncertain and the data are noisy, the best bet is to paint with a broad brush, to think in broad strokes.

Page: 206, Location: 3151-3152


The upshot of Early Stopping is that sometimes it’s not a matter of choosing between being rational and going with our first instinct. Going with our first instinct can be the rational solution. The more complex, unstable, and uncertain the decision, the more rational an approach that is.

Page: 207, Location: 3160-3162


nuptial

Page: 208, Location: 3190-3190


But as computer scientists have discovered over the past few decades, there are entire classes of problems where a perfect solution is essentially unreachable, no matter how fast we make our computers or how cleverly we program them. In fact, no one understands as well as a computer scientist that in the face of a seemingly unmanageable challenge, you should neither toil forever nor give up, but—as we’ll see—try a third thing entirely.

Page: 210, Location: 3216-3219


In fact, it’s the most famous optimization problem of them all.

Page: 211, Location: 3225-3226


They asserted what’s now known as the Cobham–Edmonds thesis: an algorithm should be considered “efficient” if it runs in what’s called “polynomial time”—that is, O(n2), O(n3), or in fact n to the power of any number at all. A problem, in turn, is considered “tractable” if we know how to solve it using an efficient algorithm. A problem we don’t know how to solve in polynomial time, on the other hand, is considered “intractable.” And at anything but the smallest scales, intractable problems are beyond the reach of solution by computers, no matter how powerful.

Page: 212, Location: 3247-3253


As scheduling expert Jan Karel Lenstra told us, “When the problem is hard, it doesn’t mean that you can forget about it, it means that it’s just in a different status. It’s a serious enemy, but you still have to fight it.”

Page: 213, Location: 3263-3264


If you can’t solve the problem in front of you, solve an easier version of it—and then see if that solution offers you a starting point, or a beacon, in the full-blown problem. Maybe it does.

Page: 215, Location: 3295-3296


The message is simple but profound: if we’re willing to accept solutions that are close enough, then even some of the hairiest problems around can be tamed with the right techniques.

Page: 216, Location: 3299-3300


And it’s also the problem that epidemiologists study in thinking about, say, the minimum number of people in a population—and which people—to vaccinate to protect a society from communicable diseases.

Page: 217, Location: 3319-3321


Ever since Cobham and Edmonds’s work, this chasm between “polynomials” (n-to-the-something) and “exponentials” (something-to-the-n) has served as the field’s de facto out-of-bounds marker.

Page: 224, Location: 3428-3430


9   Randomness When to Leave It to Chance

Page: 224, Location: 3435-3436


F. Scott Fitzgerald once wrote that “the test of a first-rate intelligence is the ability to hold two opposing ideas in mind at the same time and still retain the ability to function.”

Page: 227, Location: 3473-3474


In modern encryption, for instance, secret primes known only to the sender and recipient get multiplied together to create huge composite numbers that can be transmitted publicly without fear, since factoring the product would take any eavesdropper way too long to be worth attempting. Thus virtually all secure communication online—be it commerce, banking, or email—begins with a hunt for prime numbers.

Page: 230, Location: 3518-3520


Though you may have never heard of the Miller-Rabin test, your laptop, tablet, and phone know it well. Several decades after its discovery, it is still the standard method used to find and check primes in many domains. It’s working behind the scenes whenever you use your credit card online, and almost any time secure communications are sent through the air or over wires.

Page: 232, Location: 3556-3559
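
A standard textbook Miller-Rabin in Python (my rendering): each random "witness" a that fails to behave as it would for a prime proves n composite, and enough passing witnesses give near-certainty.

```python
import random

def is_probably_prime(n: int, rounds: int = 40) -> bool:
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):     # dispatch tiny cases directly
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:                  # write n - 1 as d * 2**r with d odd
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False               # a is a witness: n is definitely composite
    return True                        # survived every round: almost surely prime

print(is_probably_prime(2**61 - 1))    # a known Mersenne prime -> True
```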


Here again randomness offers a way forward: just generate some random xs and plug them in. If the two expressions are not the same, it would be a big coincidence if they gave the same answer for some randomly generated input. And an even bigger coincidence if they also gave identical answers for a second random input. And a bigger coincidence still if they did it for three random inputs in a row. Since there is no known deterministic algorithm for efficiently testing polynomial identity, this randomized method—with multiple observations quickly giving rise to near-certainty—is the only practical one we

Page: 233, Location: 3568-3573
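
The same pattern of stacking random checks, sketched for polynomial identity testing (function names mine):

```python
import random

def probably_identical(f, g, trials=10, bound=10**9):
    """Two distinct low-degree polynomials can agree at only a handful of
    points, so matching on several random inputs is damning evidence that
    they are the same expression in disguise."""
    return all(f(x) == g(x) for x in random.sample(range(bound), trials))

print(probably_identical(lambda x: (x + 1) ** 2, lambda x: x * x + 2 * x + 1))  # True
print(probably_identical(lambda x: (x + 1) ** 2, lambda x: x * x + 1))          # False (almost surely)
```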


The polynomial identity test shows that sometimes our effort is better spent checking random values—sampling from the two expressions we want to know about—than trying to untangle their inner workings. To some extent this seems reasonably intuitive.

Page: 234, Location: 3573-3575


When we need to make sense of, say, national health care reform—a vast apparatus too complex to be readily understood—our political leaders typically offer us two things: cherry-picked personal anecdotes and aggregate summary statistics.

Page: 236, Location: 3609-3611


There is no such thing as absolute certainty, but there is assurance sufficient for the purposes of human life. —JOHN STUART MILL

Page: 238, Location: 3639-3641


This is an example of a so-called greedy algorithm, which you can also think of as a “myopic algorithm”: one that shortsightedly takes the best thing available every step of the way.

Page: 240, Location: 3677-3678


This technique, developed by the same Los Alamos team that came up with the Monte Carlo Method, is called the Metropolis Algorithm. The Metropolis Algorithm is like Hill Climbing, trying out different small-scale tweaks on a solution, but with one important difference: at any given point, it will potentially accept bad tweaks as well as good ones.

Page: 242, Location: 3710-3713
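
A sketch of the acceptance rule (mine; the exp(-delta/temp) form is the classic Metropolis criterion, and letting temp shrink each step folds in the Simulated Annealing advice that appears a little later):

```python
import math
import random

def metropolis(cost, tweak, start, steps=10_000, temp=1.0, cooling=0.999):
    """Hill Climbing that sometimes accepts a worse solution: a bad tweak is
    taken with probability exp(-delta / temp), so the worse the move, the
    less likely we follow it. Cooling front-loads the randomness."""
    current, current_cost = start, cost(start)
    for _ in range(steps):
        candidate = tweak(current)
        delta = cost(candidate) - current_cost
        if delta <= 0 or random.random() < math.exp(-delta / temp):
            current, current_cost = candidate, current_cost + delta
        temp *= cooling                 # ever less randomness as time goes on
    return current

# Toy usage: minimize a bumpy one-dimensional function.
bumpy = lambda x: x * x + 3 * math.sin(5 * x)
print(round(metropolis(bumpy, lambda x: x + random.uniform(-0.5, 0.5), start=4.0), 2))
```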


apocryphal)

Page: 247, Location: 3779-3779


with Mach going so far as to declare that “thus are to be explained the statements of Newton, Mozart, Richard Wagner, and others, when they say that thought, melodies, and harmonies had poured in upon them, and that they had simply retained the right ones.”

Page: 249, Location: 3806-3808


But perhaps it’s just a case of a little knowledge being a dangerous thing. If the Dice Man had only had a deeper grasp of computer science, he’d have had some guidance. First, from Hill Climbing: even if you’re in the habit of sometimes acting on bad ideas, you should always act on good ones. Second, from the Metropolis Algorithm: your likelihood of following a bad idea should be inversely proportional to how bad an idea it is. Third, from Simulated Annealing: you should front-load randomness, rapidly cooling out of a totally random state, using ever less and less randomness as time goes on, lingering longest as you approach freezing. Temper yourself—literally.

Page: 251, Location: 3835-3839


“Once you got somewhere you were happy,” he told the Guardian, “you’d be stupid to shake it up any further.”

Page: 251, Location: 3843-3844


10  Networking How We Connect

Page: 253, Location: 3879-3880


portent—

Page: 254, Location: 3886-3886


The long-distance telegraph began with a portent—Samuel F. B. Morse, standing in the chambers of the US Supreme Court on May 24, 1844, wiring his assistant Alfred Vail in Baltimore a verse from the Old Testament: “WHAT HATH GOD WROUGHT.” The first thing we ask of any new connection is how it began, and from that origin can’t help trying to augur its future.

Page: 254, Location: 3886-3888


The first telephone call in history, made by Alexander Graham Bell to his assistant on March 10, 1876, began with a bit of a paradox. “Mr. Watson, come here; I want to see you”—a simultaneous testament to its ability and inability to overcome physical distance.

Page: 254, Location: 3888-3890


The cell phone began with a boast—Motorola’s Martin Cooper walking down Sixth Avenue on April 3, 1973, as Manhattan pedestrians gawked, calling his rival Joel Engel at AT&T: “Joel, I’m calling you from a cellular phone. A real cellular phone: a handheld, portable, real cellular phone.” (“I don’t remember exactly what he said,” Cooper recalls, “but it was really quiet for a while. My assumption was that he was grinding his teeth.”)

Page: 254, Location: 3891-3894


And the text message began, on December 3, 1992, with cheer: Neil Papworth at Sema Group Telecoms wishing Vodafone’s Richard Jarvis an early “Merry Christmas.”

Page: 254, Location: 3894-3895


The beginnings of the Internet were, somehow fittingly, much humbler and more inauspicious than all of that. It was October 29, 1969, and Charley Kline at UCLA sent to Bill Duvall at the Stanford Research Institute the first message ever transmitted from one computer to another via the ARPANET. The message was “login”—or would have been, had the receiving machine not crashed after “lo.”

Page: 255, Location: 3895-3898


What we now think of as “the Internet” is actually a collection of many protocols, but the chief among them (so much so that it’s often referred to more or less synonymously with the Internet) is what’s known as Transmission Control Protocol, or TCP. It was born from a 1973 talk and a 1974 paper by Vinton “Vint” Cerf and Robert “Bob” Kahn, who laid out a proposal for the language of—as they imagined calling it—an “internetwork.”

Page: 255, Location: 3909-3912


Phone calls use what’s called “circuit switching”: the system opens a channel between the sender and the receiver, which supplies constant bandwidth between the parties in both directions as long as the call lasts.

Page: 256, Location: 3914-3915


In a packet-switched network, rather than using a dedicated channel for each connection, senders and receivers atomize their messages into tiny shards known as “packets,” and merge them into the communal flow of data—a bit like postcards moving at the speed of light.

Page: 257, Location: 3929-3930


In circuit-switched networks, a call fails if any one of its links gets disrupted—which means that reliability goes down exponentially as a network grows larger. In packet switching, on the other hand, the proliferation of paths in a growing network becomes a virtue: there are now that many more ways for data to flow, so the reliability of the network increases exponentially with its size.

Page: 257, Location: 3941-3944


Computer scientists know this concept as the “Byzantine generals problem.” Imagine two generals, on opposite sides of a valley that contains their common enemy, attempting to coordinate an attack. Only by perfect synchronization will they succeed; for either to attack alone is suicide. What’s worse, any messages from one general to the other must be delivered by hand across the very terrain that contains the enemy, meaning there’s a chance that any given message will never arrive.

Page: 259, Location: 3961-3964


Communication is one of those delightful things that work only in practice; in theory it’s impossible.

Page: 259, Location: 3968-3969


so it’s considered enough for a session to begin with what’s called a “triple handshake.” The visitor says hello, the server acknowledges the hello and says hello back, the visitor acknowledges that, and if the server receives this third message, then no further confirmation is needed and they’re off to the races.

Page: 259, Location: 3970-3973
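
The exchange described here is what TCP implementations call the three-way handshake, with messages conventionally named SYN, SYN-ACK, and ACK. A toy version with lossy delivery (the loss model and numbers are mine):

```python
import random

# Toy triple handshake: the session opens only if all three messages land.
# SYN / SYN-ACK / ACK are TCP's conventional names; the loss rate is made up.
def triple_handshake(loss=0.1):
    for msg in ("SYN: hello", "SYN-ACK: hello back", "ACK: got it"):
        if random.random() < loss:
            return f"stalled, lost '{msg}'"
    return "session established"

random.seed(0)
print([triple_handshake() for _ in range(5)])
```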


A report from the second half of 2014 showed that almost 10% of upstream Internet traffic during peak hours was due to Netflix—which we tend to think of as sending data almost exclusively downstream, to users. But all that video generates an awful lot of ACKs.

Page: 260, Location: 3987-3989


For this reason, phone services that automatically reduce background noise to silence are doing their users a major disservice. Background static is a continual reassurance that the call is still connected and that any silence is a deliberate choice by the other party. Without it, one must constantly confront the possibility that the call has dropped, and constantly offer reassurances that it has not.

Page: 262, Location: 4006-4008


Ironically, one of the few exceptions to this is in transmitting the human voice. Real-time voice communications, such as Skype, typically do not use TCP, which underlies most of the rest of the Internet. As researchers discovered in the early days of networking, using reliable, robust protocols—with all their ACKs and retransmission of lost packets—to transmit the human voice is overkill. The humans provide the robustness themselves. As Cerf explains, “In the case of voice, if you lose a packet, you just say, ‘Say that again, I missed something.’”

Page: 261, Location: 4001-4005


Every message could be the last, and there is often no telling the difference between someone taking their time to respond and someone who has long since ended the conversation.

Page: 262, Location: 4010-4011


The world’s most difficult word to translate has been identified as “ilunga,” from the Tshiluba language spoken in south-eastern DR Congo.… Ilunga means “a person who is ready to forgive any abuse for the first time, to tolerate it a second time, but never a third time.” —BBC NEWS

Page: 263, Location: 4018-4021


Clearly the ALOHAnet protocol was going to need to tell competing signals how to give each other space, how to yield and make way for one another.

Page: 263, Location: 4031-4031


The first thing that the senders need to do here is what’s called “breaking symmetry.” As any sidewalk pedestrian knows, dodging right as an oncoming walker dodges left, and then having both of you simultaneously dodge back the other way, doesn’t solve anything.

Page: 263, Location: 4032-4034


As one influential paper puts it, “For a transport endpoint embedded in a network of unknown topology and with an unknown, unknowable and constantly changing population of competing conversations, only one scheme has any hope of working—exponential backoff.”

Page: 265, Location: 4050-4052


Beyond just collision avoidance, Exponential Backoff has become the default way of handling almost all cases of networking failure or unreliability. For instance, when your computer is trying to reach a website that appears to be down, it uses Exponential Backoff—trying again one second later, again a few seconds after that, and so forth. This is good for everyone: it prevents a host server that’s down from getting slammed with requests as soon as it comes back online, and it prevents your own machine from wasting too much effort trying to get blood from a stone.

Page: 265, Location: 4052-4056
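
The pattern is easy to carry around. A minimal sketch of a backoff retry loop, assuming `attempt` is any zero-argument callable that returns True on success (the names and parameters are mine, not a quoted implementation):

```python
import random
import time

# Minimal Exponential Backoff retry loop (my sketch). The wait ceiling doubles
# after each failure; the random jitter breaks symmetry, so two colliding
# senders are unlikely to keep retrying in lockstep.
def call_with_backoff(attempt, max_tries=6, base=1.0):
    for n in range(max_tries):
        if attempt():
            return True
        time.sleep(random.uniform(0, base * 2 ** n))
    return False
```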


The first efforts at computer networking focused on establishing reliable transmissions over unreliable links. These efforts proved to be so successful that a second concern immediately arose: making sure that an overloaded network could avoid catastrophic meltdown. No sooner had TCP solved the problem of getting data from point A to point B than it was confronted with the problem of gridlock.

Page: 267, Location: 4088-4091


At the heart of TCP congestion control is an algorithm called Additive Increase, Multiplicative Decrease, or AIMD. Before AIMD kicks in, a new connection will ramp up its transmission rate aggressively: if the first packet is received successfully it sends out two more, if both of those get through it sends out a batch of four, and so on. But as soon as any packet’s ACK does not come back to the sender, the AIMD algorithm takes over. Under AIMD, any fully received batch of packets causes the number of packets in flight not to double but merely to increase by 1, and dropped packets cause the transmission rate to cut back by half (hence the name Additive Increase, Multiplicative Decrease). Essentially, AIMD takes the form of someone saying, “A little more, a little more, a little more, whoa, too much, cut way back, okay a little more, a little more…” Thus it leads to a characteristic bandwidth shape known as the “TCP sawtooth”—steady upward climbs punctuated by steep drops.

Page: 268, Location: 4107-4114
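
A toy trace makes the sawtooth visible. This sketch uses my own parameters and skips the initial slow-start doubling the passage mentions:

```python
import random

# Toy AIMD trace (my parameters; slow start omitted). "window" is the number
# of packets in flight: a clean round adds one, a drop cuts it in half.
random.seed(1)
window, trace = 1.0, []
for _ in range(60):
    trace.append(round(window))
    if random.random() < 0.1:          # pretend ~10% of rounds drop a packet
        window = max(1.0, window / 2)  # multiplicative decrease
    else:
        window += 1                    # additive increase
print(trace)                           # steady climbs, punctuated by halvings
```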


And it may be that human communications themselves mirror the very protocols that transmit them: every text message or email reply encourages yet another, while every unreturned message stanches the flow.

Page: 270, Location: 4131-4132


Some fifty years before Peter’s formulation, Spanish philosopher José Ortega y Gasset in 1910 voiced the same sentiment. “Every public servant should be demoted to the immediately lower rank,” he wrote, “because they were advanced until they became incompetent.”

Page: 270, Location: 4139-4141


Some organizations have attempted to remediate the Peter Principle by simply firing employees who don’t advance. The so-called Cravath System, devised by leading law firm Cravath, Swaine & Moore, involves hiring almost exclusively recent graduates, placing them into the bottom ranks, and then routinely either promoting or firing them over the following years. In 1980, the US Armed Forces adopted a similar “up or out” policy with the Defense Officer Personnel Management Act. The United Kingdom has likewise pursued what they call “manning control,” to great controversy.

Page: 271, Location: 4141-4145


As Laurence J. Peter saw it, the insidious Peter Principle arises in corporations because of “the first commandment of hierarchical life: the hierarchy must be preserved.” TCP, in contrast, teaches the virtues of flexibility. Companies speak of “flat” hierarchies and “tall” hierarchies, but they might consider speaking of dynamic ones. Under an AIMD system, no one is long anxious about being overtaxed, nor long resentful about a missed promotion; both are temporary and frequent correctives, and the system hovers near its equilibrium despite everything changing all the time. Perhaps one day we’ll speak not of the arc of one’s career, but rather of its sawtooth.

Page: 271, Location: 4155-4160


In TCP, as we’ve seen, there’s no such thing as a one-way transmission: without consistent feedback, the sender will slow down almost immediately.

Page: 272, Location: 4163-4164


As linguist Victor Yngve would write in 1970, “In fact, both the person who has the turn and his partner are simultaneously engaged in both speaking and listening. This is because of the existence of what I call the back channel, over which the person who has the turn receives short messages such as ‘yes’ and ‘uh-huh’ without relinquishing the turn.”

Page: 272, Location: 4170-4172


We’ve all had the experience of talking to someone whose eyes drifted away—to their phone, perhaps—making us wonder whether our lackluster storytelling was to blame. In fact, it’s now clear that the cause and effect are often the reverse: a poor listener destroys the tale.

Page: 273, Location: 4179-4181


As the saying goes, “the most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ but ‘That’s funny.’”

Page: 274, Location: 4200-4201


A buffer is essentially a queue whose function is to smooth out bursts. If you walked into a doughnut shop at roughly the same time as another customer, it wouldn’t do for the very momentarily overwhelmed cashier to make one of you leave the store and come back another time. Customers wouldn’t have it, of course, but neither would management: such a policy is virtually guaranteed to underutilize the cashier. Putting the customers in a queue instead ensures that the average throughput of the store approaches its maximum throughput. That’s a good thing.

Page: 275, Location: 4208-4213
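
A tiny simulation of that doughnut-shop queue (the numbers are mine): arrivals come in bursts that average one customer per tick, and the cashier serves one customer per tick.

```python
import random

# Toy single-cashier queue (my numbers): bursty arrivals averaging one
# customer per tick, service of one customer per tick.
random.seed(2)
queue = served = waiting = 0
for tick in range(1000):
    queue += random.choice((0, 0, 3))  # bursts of three; most ticks are quiet
    if queue:
        queue -= 1                     # the cashier serves one customer
        served += 1
    waiting += queue                   # everyone still in line waits a tick
print(f"throughput: {served / 1000:.2f}/tick, mean queue: {waiting / 1000:.1f}")
```

Throughput comes out near the cashier's maximum, and the cost shows up as a standing line, which is exactly the next highlight's point.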


This superior resource utilization comes with a very real cost, however: delay.

Page: 275, Location: 4213-4213


It was like trying to have a conversation where every time you say “uh-huh” it is delayed by ten or twenty seconds. The speaker is going to slow way down, assuming you aren’t comprehending them, and there’s nothing you can do about it.

Page: 276, Location: 4227-4228


Dropped packets are the Internet’s primary feedback mechanism.

Page: 277, Location: 4233-4234


Fundamentally, buffers use delay—known in networking as “latency”—in order to maximize throughput. That is, they cause packets (or customers) to wait, to take advantage of later periods when things are slow.

Page: 277, Location: 4236-4237


One of the fundamental principles of buffers, be they for packets or patrons, is that they only work correctly when they are routinely zeroed out.

Page: 277, Location: 4240-4241


Photons that miss the retina aren’t queued for later viewing. In real life, packet loss is almost total.

Page: 278, Location: 4262-4262


The most prevalent critique of modern communications is that we are “always connected.” But the problem isn’t that we’re always connected; we’re not. The problem is that we’re always buffered.

Page: 279, Location: 4264-4265


It used to be that people knocked on your door, got no response, and went away. Now they’re effectively waiting in line when you come home.

Page: 279, Location: 4268-4269


We used to request dedicated circuits with others; now we send them packets and wait expectantly for ACKs. We used to reject; now we defer.

Page: 279, Location: 4275-4277


The much-lamented “lack of idleness” one reads about is, perversely, the primary feature of buffers: to bring average throughput up to peak throughput. Preventing idleness is what they do. You check email from the road, from vacation, on the toilet, in the middle of the night. You are never, ever bored. This is the mixed blessing of buffers, operating as advertised.

Page: 279, Location: 4277-4280


in fact, gaming is so sensitive to latency that all important gaming honors are still contested in person, with players boarding airplanes to gather and compete over a network serving just a single room.

Page: 281, Location: 4294-4295


And indeed, bringing latencies down is one of the current frontiers of networking research, and it will be interesting to see what that brings.

Page: 281, Location: 4305-4306


Schoolchildren are taught to conceive of literary plots as belonging to one of several categories: man vs. nature, man vs. self, man vs. man, man vs. society. Thus far in this book we have considered primarily cases in the first two categories—that is to say, computer science has thus far been our guide to problems created by the fundamental structure of the world, and by our limited capacities for processing information.

Page: 282, Location: 4324-4327


Arguably the most influential economist of the twentieth century, John Maynard Keynes, once said that “successful investing is anticipating the anticipations of others.”

Page: 283, Location: 4337-4338


In this way, the value of a stock isn’t what people think it’s worth but what people think people think it’s worth.

Page: 284, Location: 4340-4341


As Alan Turing proved in 1936, a computer program can never tell you for sure whether another program might end up calculating forever without end—except by simulating the operation of that program and thus potentially going off the deep end itself. (Accordingly, programmers will never have automated tools that can tell them whether their software will freeze.)

Page: 284, Location: 4350-4352
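
The shape of Turing's argument fits in a few lines. A sketch, where `halts` is the hypothetical oracle and `trouble` is the standard diagonal construction (illustration only; nothing here performs a real analysis):

```python
# Illustration only: 'halts' is the hypothetical oracle Turing ruled out,
# and 'trouble' is the standard diagonal construction that defeats it.
def halts(program, argument):
    """Hypothetical: True iff program(argument) would eventually halt."""
    raise NotImplementedError  # no total, always-correct version can exist

def trouble(program):
    if halts(program, program):
        while True:       # told it halts? then loop forever
            pass
    return "done"         # told it loops? then halt immediately

# trouble(trouble) halts exactly when halts() says it doesn't: contradiction.
```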


“I don’t know if this is an actual game-theory term,” says the world’s top-rated poker player, Dan Smith, “but poker players call it ‘leveling.’ Level one is ‘I know.’ Two is ‘you know that I know.’ Three, ‘I know that you know that I know.’”

Page: 285, Location: 4359-4361


Reaching Equilibrium

Page: 286, Location: 4383-4383


Mathematicians analyzing these games seek to identify a so-called equilibrium: that is, a set of strategies that both players can follow such that neither player would want to change their own play, given the play of their opponent. It’s called an equilibrium because it’s stable—no amount of further reflection by either player will bring them to different choices. I’m content with my strategy, given yours, and you’re content with your strategy, given mine.

Page: 287, Location: 4387-4391


the mathematician John Nash proved in 1951 that every two-player game has at least one equilibrium. This major discovery would earn Nash the Nobel Prize in Economics in 1994 (and lead to the book and film A Beautiful Mind, about Nash’s life). Such an equilibrium is now often spoken of as the “Nash equilibrium”—the “Nash” that Dan Smith always tries to keep track of.

Page: 287, Location: 4400-4403


In a game-theory context, knowing that an equilibrium exists doesn’t actually tell us what it is—or how to get there. As UC Berkeley computer scientist Christos Papadimitriou writes, game theory “predicts the agents’ equilibrium behavior typically with no regard to the ways in which such a state will be reached—a consideration that would be a computer scientist’s foremost concern.”

Page: 288, Location: 4414-4417


And so, the original field of game theory begat algorithmic game theory.

Page: 289, Location: 4418-4419


MIT’s Scott Aaronson agrees. “In my opinion,” he says, “if the theorem that Nash equilibria exist is considered relevant to debates about (say) free markets versus government intervention, then the theorem that finding those equilibria is [intractable] should be considered relevant also.”

Page: 289, Location: 4429-4431


Even when we can reach an equilibrium, just because it’s stable doesn’t make it good. It may seem paradoxical, but the equilibrium strategy—where neither player is willing to change tack—is by no means necessarily the strategy that leads to the best outcomes for the players. Nowhere is that better illustrated than in game theory’s most famous, provocative, and controversial two-player game: “the prisoner’s dilemma.”

Page: 290, Location: 4433-4437


Here’s the problem. No matter what your accomplice does, it’s always better for you to defect.

Page: 290, Location: 4444-4444


This has emerged as one of the major insights of traditional game theory: the equilibrium for a set of players, all acting rationally in their own interest, may not be the outcome that is actually best for those players.

Page: 291, Location: 4454-4455
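
Both ideas, finding an equilibrium by mutual best response and seeing that it can be collectively lousy, fit in a brute-force check of the prisoner's dilemma. The payoffs below are the standard textbook years in prison (my choice of numbers, lower is better):

```python
# Brute-force search for pure-strategy Nash equilibria in the prisoner's
# dilemma. Strategy 0 = stay silent, 1 = defect (testify).
payoff = {
    (0, 0): (1, 1),    # both silent
    (0, 1): (10, 0),   # I stay silent, you testify
    (1, 0): (0, 10),
    (1, 1): (8, 8),    # both testify
}

def years(player, a, b):
    return payoff[(a, b)][player]

def is_equilibrium(a, b):
    # Neither player can cut their own sentence by switching unilaterally.
    row_ok = all(years(0, a, b) <= years(0, a2, b) for a2 in (0, 1))
    col_ok = all(years(1, a, b) <= years(1, a, b2) for b2 in (0, 1))
    return row_ok and col_ok

print([cell for cell in payoff if is_equilibrium(*cell)])
# -> [(1, 1)]: mutual defection is the only equilibrium, even though
#    (0, 0) would leave both players strictly better off.
```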


Surprisingly, Tim Roughgarden and Cornell’s Éva Tardos proved in 2002 that the “selfish routing” approach has a price of anarchy that’s a mere 4/3. That is, a free-for-all is only 33% worse than perfect top-down coordination.

Page: 292, Location: 4465-4466
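
The standard illustration behind that 4/3 figure is Pigou's two-road network (a textbook example, not the book's own numbers): one road always takes an hour, the other takes x hours when a fraction x of the traffic uses it.

```python
# Pigou's example. One unit of traffic splits between road A (always 1 hour)
# and road B (x hours when a fraction x of drivers use it).
def total_cost(x):                     # x = fraction of traffic on road B
    return (1 - x) * 1 + x * x

# Selfish routing: road B costs x <= 1, never worse than A, so every driver
# takes B and the average trip costs a full hour.
selfish = total_cost(1.0)

# Coordinated optimum: grid-search the split (calculus gives x = 1/2).
optimal = min(total_cost(x / 1000) for x in range(1001))   # 0.75

print(selfish / optimal)   # 1.333..., i.e. the 4/3 price of anarchy
```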


Every corporation (and, to some degree, every nation) is better off being a bit more reckless than their peers for the sake of competitive advantage. Yet if they all act more recklessly, it leads to a ravaged Earth, and all for nothing: there’s no economic advantage for anyone relative to where they started.

Page: 294, Location: 4494-4496


Imagine two shopkeepers in a small town. Each of them can choose either to stay open seven days a week or to be open only six days a week, taking Sunday off to relax with their friends and family. If both of them take a day off, they’ll retain their existing market share and experience less stress. However, if one shopkeeper decides to open his shop seven days a week, he’ll draw extra customers—taking them away from his competitor and threatening his livelihood. The Nash equilibrium, again, is for everyone to work all the time.

Page: 295, Location: 4511-4515


either the two-party prisoner’s dilemma, or the multi-party tragedy of the commons?

Page: 295, Location: 4520-4520


This brings us to a branch of game theory known as “mechanism design.” While game theory asks what behavior will emerge given a set of rules, mechanism design (sometimes called “reverse game theory”) works in the other direction, asking: what rules will give us the behavior we want to see?

Page: 296, Location: 4532-4534


For the small-town shopkeepers, a verbal truce to take Sundays off would be unstable: as soon as either shopkeeper needed some extra cash he’d be liable to violate it, prompting the other to start working Sundays as well so as not to lose market share. This would land them right back in the bad equilibrium where they get the worst of both worlds—they’re exhausted and don’t get any competitive advantage for it. But they might be able to act as their own don by signing a legally binding contract to the effect that, say, any proceeds earned by either shop on a Sunday go to the other shop. By worsening the unsatisfactory equilibrium, they’d make a new and better one.

Page: 297, Location: 4543-4547
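
This is mechanism design in miniature: the contract doesn't change anyone's rationality, it changes the payoffs. A sketch with made-up numbers chosen to match the story:

```python
# The shopkeepers' game before and after the contract (payoff numbers are
# mine; higher is better). Strategy 0 = take Sundays off, 1 = open seven days.
before = {
    (0, 0): (6, 6),   # both rest: shared market, low stress
    (0, 1): (3, 7),   # only my rival opens Sundays: he poaches my customers
    (1, 0): (7, 3),
    (1, 1): (4, 4),   # both open: exhausted, market share unchanged
}

# The contract hands Sunday proceeds to the *other* shop, so the temptation
# payoffs swap: opening alone now rewards your rival, not you.
after = {**before, (0, 1): (7, 3), (1, 0): (3, 7)}

def equilibria(game):
    def happy(player, a, b):  # no profitable unilateral deviation?
        mine = game[(a, b)][player]
        alts = [(s, b) for s in (0, 1)] if player == 0 else [(a, s) for s in (0, 1)]
        return all(mine >= game[cell][player] for cell in alts)
    return [(a, b) for (a, b) in game if happy(0, a, b) and happy(1, a, b)]

print("before:", equilibria(before))  # [(1, 1)]: everyone works Sundays
print("after: ", equilibria(after))   # [(0, 0)]: resting is now stable
```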


Scaling up this logic results in a potent argument for the role of government. In fact, many governments do have laws on the books mandating minimum vacations and limiting shop hours. And while the United States is one of the only developed nations without federal requirements for paid vacation, Massachusetts, Maine, and Rhode Island do have state-level prohibitions on Thanksgiving commerce.

Page: 298, Location: 4567-4569


God happens to be even better than government in this respect, since omniscience and omnipotence provide a particularly strong guarantee that taking bad actions will have dire consequences. It turns out there’s no Godfather quite like God the Father.

Page: 299, Location: 4574-4576


Mechanism Design by Evolution

Page: 299, Location: 4579-4579


The heart has its reasons which reason knows nothing of. —BLAISE PASCAL

Page: 299, Location: 4583-4584


Whether prompted by a shoddy business or a petty thief, outrage can override rationality. And in these instances, it may be that the hand of evolution has done what it would otherwise have taken an authority outside the game to accomplish.

Page: 301, Location: 4611-4613


Emotion, for the bitter, retaliatory consumer and for the convenience-store hero alike, is our own species taking over the controls for a minute. “Morality is herd instinct in the individual,” wrote Nietzsche.

Page: 302, Location: 4617-4618


As Cornell economist Robert Frank puts it, “If people expect us to respond irrationally to the theft of our property, we will seldom need to, because it will not be in their interests to steal it. Being predisposed to respond irrationally serves much better here than being guided only by material self-interest.”

Page: 302, Location: 4621-4624


In both cases this so-called commitment problem can be at least partially addressed by a contract. But game theory suggests that in the case of dating, the voluntary bonds of the law are less relevant to an enduring partnership than the involuntary bonds of love itself.

Page: 303, Location: 4633-4635


As Robert Frank puts it, “The worry that people will leave relationships because it may later become rational for them to do so is largely erased if it is not rational assessment that binds them in the first place.”

Page: 303, Location: 4635-4636


Yes, people search for objective characteristics they care about. Everybody wants somebody who’s kind and intelligent and interesting and healthy and maybe physically attractive, good earning power, the whole laundry list of features, but that’s the first pass.… After you’ve spent enough time together, it’s not those things that make you want to stay together. It’s just the fact that it’s that particular person—that is what’s valuable to you, so you don’t really need the contract so much as you need a feeling that makes you not want to separate, even though objectively there might be a better option available to you.

Page: 303, Location: 4637-4641


Playwright George Bernard Shaw once wrote of marriage that “If the prisoner is happy, why lock him in? If he is not, why pretend that he is?” Game theory offers a subtle answer to this particular riddle. Happiness is the lock.

Page: 303, Location: 4643-4645


One of the simplest auction formats has each participant write down their bid in secret, and the one whose bid is highest wins the item for whatever price they wrote down. This is known as a “sealed-bid first-price auction,”

Page: 305, Location: 4672-4673


Another classic auction format, the “Dutch auction” or “descending auction,” gradually lowers an item’s price until someone is willing to buy it. The name references the Aalsmeer Flower Auction, the largest flower auction in the world, which takes place daily in the Netherlands—but Dutch auctions are more prevalent than they might initially seem.

Page: 306, Location: 4679-4681


The inverse of a Dutch or descending auction is what’s known as an “English auction” or “ascending auction”—the most familiar auction format. In an English auction, bidders alternate raising the price until all but one of them drop out.

Page: 306, Location: 4686-4688
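
A toy ascending auction shows the format's signature: the strongest bidder wins, but the price stops near the runner-up's limit (the simulation and numbers are mine):

```python
# Toy English (ascending) auction: bidders stay in while the next price is
# still at or below their private valuation.
def english_auction(valuations, increment=1):
    price, active = 0, list(range(len(valuations)))
    while len(active) > 1:
        still_in = [i for i in active if valuations[i] >= price + increment]
        if not still_in:          # everyone balked at once; last price stands
            break
        if len(still_in) > 1:
            price += increment
        active = still_in
    return active[0], price       # (winner's index, final price)

print(english_auction([60, 85, 100]))  # -> (2, 85): the runner-up's limit
```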


It’s easy to imagine a bunch of people all going over a cliff together because “everyone else” was acting as though it’d all be fine—when in reality each person had qualms, but suppressed them because of the apparent confidence of everyone else in the group.

Page: 307, Location: 4700-4702


Investors are said to fall into two broad camps: “fundamental” investors, who trade on what they perceive as the underlying value of a company, and “technical” investors, who trade on the fluctuations of the market.

Page: 309, Location: 4730-4731


For one, be wary of cases where public information seems to exceed private information, where you know more about what people are doing than why they’re doing it, where you’re more concerned with your judgments fitting the consensus than fitting the facts. When you’re mostly looking to others to set a course, they may well be looking right back at you to do the same.

Page: 309, Location: 4737-4740


Second, remember that actions are not beliefs; cascades get caused in part when we misinterpret what others think based on what they do. We should be especially hesitant to overrule our own doubts—and if we do, we might want to find some way to broadcast those doubts even as we move forward, lest others fail to distinguish the reluctance in our minds from the implied enthusiasm in our actions.

Page: 310, Location: 4740-4743


Last, we should remember from the prisoner’s dilemma that sometimes a game can have irredeemably lousy rules. There may be nothing we can do once we’re in it, but the theory of information cascades may help us to avoid such a game in the first place.

Page: 310, Location: 4743-4744


And to an algorithmic game theorist in particular, one property especially stands out: the participants are incentivized to be honest.

Page: 311, Location: 4759-4760


French existentialist philosopher Jean-Paul Sartre famously wrote that “Hell is other people.” He didn’t mean that others are inherently malicious or unpleasant, but rather that they complicate our own thoughts and beliefs:

Page: 313, Location: 4796-4798


The road to hell is paved with intractable recursions, bad equilibria, and information cascades. Seek out games where honesty is the dominant strategy. Then just be yourself.

Page: 314, Location: 4807-4808


Binmore adds another insight: games like the prisoner’s dilemma seemingly obliterate Immanuel Kant’s argument that rationality consists of what he called the “categorical imperative,” acting the way you wish everyone else would act. The categorical imperative would give us a better outcome in the prisoner’s dilemma than the equilibrium strategy, but there’s no getting around the fact that this outcome isn’t a stable one.

Page: 315, Location: 4819-4822