Topic IX. Seeing Patterns in Random Noise
- We often mistake noise for signal; how do we minimize these mistakes, given that the two are not always easy to tell apart?
- Humans are so good at finding signals in noise that sometimes they do so even when there is no signal. Many techniques of science and much of statistics are aimed at avoiding fooling yourself this way. A further problem is that we often aren’t aware of how much noise we have searched through when we believe we have found a signal: the “Look Elsewhere Effect.” For example, we tend to think coincidences are meaningful. Statistics was invented primarily to deal with the problem of distinguishing real signal from noise fluctuations that look like signal.
- Addressing the Question: How confident should we be?
- False pattern detection
- Guarding against false pattern detection
TOPIC RESOURCES
EXAMPLES
- Introductory Examples
- Higgs boson: Is this peak real or is it just a statistical fluctuation?
- Pulsar that people thought might be extraterrestrials.
- Animals in cloud shapes.
- Running into an acquaintance while traveling on the other side of the world from where you both live.
- Running into someone when you were just thinking about them the previous day.
- Dreaming an event and then something similar happening within the next month.
- Finding that two people in your class share a birthday, and thinking it indicates a special connection.
- Exemplary Quotes
- "I know it seems super meaningful that we ran into each other in Australia, when neither of us live in Australia, but I guess the chances of running into someone you know at some point, if you travel a lot and know a lot of people, are pretty high."
- "There are many many cases of people making insanely correct predictions, so many that some people are convinced clairvoyance is real. But there are many more cases of people making totally wrong predictions. So it's probably just noise; with enough predictions, someone will be correct by luck."
- If I ace a test every time I wear a certain t-shirt, then over time I might begin to attribute my success to the t-shirt and continue wearing it, deeming it "lucky". This could easily just be coincidence, but since I notice this pattern of correlation I am likely to keep up the pattern until it fails me.
- Cautionary Quotes: Mistakes, Misconceptions, & Misunderstandings
- Students had great difficulty recognizing the Look Elsewhere Effect in sets of studies, in part because they struggled with the conceptual underpinnings of statistical significance.
- "If you draw lines correctly between the stars, you can make out a message from the aliens. You just have to know which stars to connect to see the message."
LEARNING GOALS
- A. ATTITUDES
- Be wary of our tendency to see patterns that do not exist (to see signal where there is in fact only noise).
- B. CONCEPT ACQUISITION
- People are (evolutionarily?) disposed to over-perceive signal (i.e., noise often gets misinterpreted as signal), perhaps because the cost of missing real signal (false negatives) is typically higher than the cost of mistaking noise for signal (false positives).
- People tend to see any regularity as a pattern (i.e., see more signal than there is), even when “patterns” occur by chance (i.e., are pure noise). For example, people underestimate how often randomness produces apparent patterns, so they perceive spurious signal far more often than they realize. (Events that are just coincidental are much more likely than most people expect.)
- Gambler’s fallacy: Expecting that streaks will be broken, such that future results will “average out” earlier ones, even when all trials are independent.
- Hot-hand fallacy: Expecting that streaks will continue, even when all trials are independent.
- Look Elsewhere Effect: Even if there is a low probability of pure noise passing a given threshold for signal, if we look at enough noise some of it will pass that threshold by chance. I.e., if there is a low probability of obtaining a false positive in any given instance, the more times you try (the more questions you ask, measures you take, or studies you run without statistical correction), the more you increase the probability of getting a false positive. This occurs when one:
- a. Asks too many questions of the same data set, reporting only statistically significant results.
- b. Asks the same question of multiple data sets, reporting only statistically significant results.
- c. Runs a test or similar tests too many times, reporting only statistically significant results.
- d. This also occurs in everyday life, e.g. when one looks at a whole lot of phenomena and only takes note of the most surprising-looking patterns, not properly taking into account the larger number of unsurprising patterns/lack of pattern.
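The Look Elsewhere Effect described above can be made concrete with a short calculation. The sketch below assumes every test is independent and run on pure noise, with the same false-positive threshold `alpha` for each; the function names are illustrative, and the Bonferroni adjustment is shown only as one common example of a statistical correction, not necessarily the one used in the course.

```python
def prob_any_false_positive(alpha: float, n_tests: int) -> float:
    """Chance that at least one of n_tests independent tests on pure noise
    passes a threshold with per-test false-positive rate alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Stricter per-test threshold (assumed correction: Bonferroni)."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05  # assumed conventional threshold
    for n in (1, 5, 20, 100):
        uncorrected = prob_any_false_positive(alpha, n)
        corrected = prob_any_false_positive(bonferroni_threshold(alpha, n), n)
        print(f"{n:>3} tests: uncorrected {uncorrected:.3f}, corrected {corrected:.3f}")
```

With 20 uncorrected tests at a 0.05 threshold, the chance of at least one false positive is roughly 64%, even though each individual test is wrong only 5% of the time.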
- Statistical Significance: How unlikely a given set of results would be if the null hypothesis were true (i.e. if the hypothesized effect did not actually exist).
- [Technical term: P-value: the probability of getting a result at least as extreme as the one observed, simply through random noise, if in fact the null hypothesis is true (i.e. if the hypothesized effect does not exist).]
- Lack of statistical significance does NOT prove the null hypothesis.
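As a small illustration of a p-value, the sketch below takes a hypothetical experiment (not from the course materials): a coin is flipped `n_flips` times, and the null hypothesis is that the coin is fair. The function computes the one-sided probability of seeing at least the observed number of heads under that null hypothesis.

```python
from math import comb


def binomial_p_value(n_flips: int, n_heads: int) -> float:
    """One-sided p-value: probability of at least n_heads heads in n_flips
    flips, assuming the null hypothesis that the coin is fair."""
    favorable = sum(comb(n_flips, k) for k in range(n_heads, n_flips + 1))
    return favorable / 2 ** n_flips


if __name__ == "__main__":
    # Hypothetical observation: 60 heads in 100 flips.
    print(f"p-value for 60 heads in 100 flips: {binomial_p_value(100, 60):.4f}")
```

A small p-value says only that the result would be unusual if the coin were fair. A large one does not show that the coin is fair, which is the point of the last item above.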
- C. CONCEPT APPLICATION
- Describe how scientists guard against detecting a signal that does not exist.
- Recognize and explain the flaw in everyday scenarios in which people mistake noise for signal (e.g. Look Elsewhere Effect, gambler’s fallacy, hot-hand fallacy).
- Recognize and explain the flaw in a scenario where scientists mistake noise for signal.
- Given a news article or other concrete example, correctly extract the effect size versus statistical significance of a causal factor, and explain how each affects the importance and usefulness of the results.
CLASS ELEMENTS
- Suggested Readings & Reading Questions
- Clicker Questions
- Students identify Look Elsewhere Effect mistakes in various scenarios (including medical studies).
- A friend tells you that, when conducting coin flips, there were ten heads in a row. This is, of course, a surprising result. In what situation would it be most surprising?
- A. She is the only person flipping a coin, and she flipped it only ten times.
- B. There are many people flipping coins, and everyone flips ten times.
- C. There are many people flipping coins, and each person conducts 100 flips.
- D. She is the only person flipping a coin, and she conducts 100 flips.
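The four scenarios in the coin-flip question can be compared numerically. The sketch below assumes fair, independent flips; the helper names and the group size of 1000 people are hypothetical choices for illustration, since the question says only "many people".

```python
def prob_run_of_heads(n_flips: int, run_len: int) -> float:
    """Probability that n_flips fair flips contain at least one run of
    run_len consecutive heads."""
    # state[k]: probability of a current streak of k heads, no full run yet
    state = [1.0] + [0.0] * (run_len - 1)
    hit = 0.0
    for _ in range(n_flips):
        new = [0.0] * run_len
        new[0] = 0.5 * sum(state)        # tails resets the streak
        for k in range(run_len - 1):
            new[k + 1] = 0.5 * state[k]  # heads extends the streak
        hit += 0.5 * state[run_len - 1]  # heads completes the run
        state = new
    return hit


def prob_anyone(p_single: float, n_people: int) -> float:
    """Chance that at least one of n_people independent flippers succeeds."""
    return 1.0 - (1.0 - p_single) ** n_people


if __name__ == "__main__":
    n_people = 1000  # hypothetical size for "many people"
    for label, flips, people in [("A", 10, 1), ("B", 10, n_people),
                                 ("C", 100, n_people), ("D", 100, 1)]:
        p = prob_anyone(prob_run_of_heads(flips, 10), people)
        print(f"{label}: {flips} flips, {people} people -> {p:.4f}")
```

The lower the probability of seeing ten heads in a row somewhere, the more surprising it is when it actually happens.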
- Discussion Questions
- Students discuss Look Elsewhere Effect mistakes in various scenarios, with Clicker Questions.
- Listen to excerpt from Radiolab, "A Very Lucky Wind" (first part of their episode on Stochasticity) and discuss.
- Practice Problems
- Discuss: https://xkcd.com/882/
- Class Exercises
- Professor leaves room and students write down two lists of 40 coin-toss results: “heads, tails, tails, heads...,” the first generated by students sequentially calling out “heads” or “tails,” trying to simulate random coin flips and the second by actually flipping coins. The professor returns, and has to guess which is random and which is simulated random.
- Stock picking activity. Students guess whether each of six fictional stocks will rise or fall. The instructor decides whether each stock rises or falls by flipping a coin, and then asks the students if anyone got all six right. Typically, at least one student will, even though the outcome is purely a matter of chance.
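The stock picking activity can be checked with a quick calculation and simulation. The sketch below assumes each student guesses independently and each stock is decided by a fair coin; the class sizes, trial count, and function names are hypothetical.

```python
import random


def prob_someone_all_correct(n_students: int, n_stocks: int = 6) -> float:
    """Analytic chance that at least one student guesses every stock right."""
    p_one = 0.5 ** n_stocks  # one student getting all stocks right
    return 1.0 - (1.0 - p_one) ** n_students


def simulate(n_students: int, n_stocks: int = 6,
             n_trials: int = 5000, seed: int = 0) -> float:
    """Fraction of simulated classes in which some student is all correct."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        outcome = [rng.random() < 0.5 for _ in range(n_stocks)]
        if any([rng.random() < 0.5 for _ in range(n_stocks)] == outcome
               for _ in range(n_students)):
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for size in (30, 100, 300):  # hypothetical class sizes
        print(f"{size} students: analytic {prob_someone_all_correct(size):.3f}, "
              f"simulated {simulate(size, n_trials=2000):.3f}")
```

Under these assumptions the chance of a "perfect" picker grows quickly with class size, so a perfect record in a large lecture is weak evidence of skill.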
- Homework Questions
- Describe a case not discussed in class where someone (or many people) see signal where in fact there is only noise.