Reasoning Tools (2016)

The Psychology of Thinking – with Richard Nisbett
The Royal Institution, Jun 22, 2016
Reasoning Tools for the Information Age

Vocabulary has increased
The more words you have, the more concepts you have.
The more concepts you have, the smarter you are.

Calculus is routinely taught in high school

there’s been no continuous change: Scandinavia
Scandinavia does a better job than the rest of us bringing up the bottom
industrial revolution skills

Examples of the new tools: Statistics and probability theory
– sample
– population
– sample bias
– randomness
– law of large numbers (23:30)
– normal distribution
– standard deviation
– statistical significance
– regression to the mean (29:27-> 32:32)
26:57 sophomore slump, second novels, albums
– base rate
– correlation (odds) 17:55
Perceived and actual correlations across two occasions and across 20 (21:44)
+ abilities (test scores)
+ traits (honesty)
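
A minimal sketch of regression to the mean (the "sophomore slump" noted above), assuming each observed score is a stable skill plus independent luck of equal variance. The model, the function names, and all parameters are illustrative assumptions, not taken from the talk.

```python
import random
import statistics

def simulate_two_occasions(n_people=10_000, seed=0):
    """Observed score = stable skill + independent luck (both standard normal)."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n_people)]
    first = [s + rng.gauss(0, 1) for s in skill]   # e.g. debut season or first novel
    second = [s + rng.gauss(0, 1) for s in skill]  # same skill, fresh luck
    return first, second

def top_decile_means(first, second):
    """Mean score on both occasions for the top 10% on the first occasion."""
    cutoff = sorted(first)[int(0.9 * len(first))]
    top = [i for i, x in enumerate(first) if x >= cutoff]
    mean_first = statistics.mean(first[i] for i in top)
    mean_second = statistics.mean(second[i] for i in top)
    return mean_first, mean_second

first, second = simulate_two_occasions()
mean_first, mean_second = top_decile_means(first, second)
print(f"top 10% on occasion 1: {mean_first:.2f}")
print(f"same people on occasion 2: {mean_second:.2f}")
```

The top performers stay above average on the second occasion, because skill persists, but they fall back toward the mean, because the luck that helped put them in the top group does not repeat.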

10:47 Scientific methodology
– control group
– randomized control experiment
– confounded variable
multiple regression analysis (39:40)
control for social class: the prestige of their occupation
– self-selection
– independence of observations
– natural experiment
– artifact

11:01 Decision Theory
– cost/benefit analysis
– opportunity cost (48:59)
– sunk cost (52:31)
– loss aversion

Economics Claims a Precision Rarely Found

Thanks for Mr. Roberts’s views on how dismal the science is in the dismal science.
May 24, 2016

Regarding Kyle Peterson’s “The Weekend Interview with Russ Roberts: When All Economics Is Political” (May 14): … reliance on regression analysis to find confirmation of our preconceptions

… all systems, and especially economic systems, are highly nonlinear in how they respond over time to any disturbances to the system, especially human, but also natural, disturbances.
It is effectively impossible to include all effects in any model of such a system, so we simplify for the sake of obtaining a computational forecast estimate in our lifetime.
Unfortunately, any nonlinear system unwinds over time in ways not exactly predictable given the limits of our computational model.
It is hubris to think that more data can make predictions of large material, human, environmental and econometric ensembles more reliable.
This is commonly known in chaos theory as the butterfly effect.
Even meteorologists …

This letter responds to:
When All Economics Is Political
The dismal science has too much junk science, says Russ Roberts, an evangelist for humility in a discipline where it is often hard to find.
By Kyle Peterson
May 13, 2016
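
The letter's point about nonlinear systems and the butterfly effect can be illustrated with a standard toy model. The sketch below uses the logistic map, which is my choice of example and not something the letter mentions; the starting value, the size of the perturbation, and the number of steps are all assumptions.

```python
def logistic_trajectory(x0, r=4.0, steps=100):
    """Iterate x -> r * x * (1 - x), a simple nonlinear system (chaotic at r = 4)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two runs whose starting points differ by one part in ten billion.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)

# Absolute gap between the two trajectories at each step.
gaps = [abs(x - y) for x, y in zip(a, b)]
print(f"initial gap: {gaps[0]:.1e}")
print(f"largest gap over {len(gaps) - 1} steps: {max(gaps):.2f}")
```

A difference far below any realistic measurement error grows until the two runs are unrelated, which is why better data extends the forecast horizon of such a system only modestly.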

Training Data Scientists (2014)

Structure Data 2014: How Will We Train Data Scientists of the Future?
GIGAOM, April 13, 2014
AnnaLee Saxenian — Professor and Dean, UC Berkeley

12:50 the core that everybody would agree to is pretty small:

  • statistics
  • computer science programming
  • Big Data tools

Introduction to Big Data
September 2015
by University of California, San Diego

Sampling bias

Accuracy of X-rays





D1: tuberculosis
D2: no tuberculosis
T+: positive X-ray
T-: negative X-ray
It is useful to find P(D1|T+), the probability that an individual has the disease given that he tests positive. This probability is also called the predictive value of a positive test.

Remark: Because of sampling bias, it is not correct to estimate P(D1|T+) directly from the observed data, i.e., the numbers given in the table. That incorrect estimate gives 22/73, which has a large upward bias and overestimates P(D1|T+).

We use Bayes' theorem to find P(D1|T+). Bayes' theorem says:

P(D1|T+) = P(T+|D1) × P(D1) / [P(T+|D1) × P(D1) + P(T+|D2) × P(D2)]

There are two important things here:
1. Prior probability P(D1): the probability of TB before seeing the data. Usually, a judgement call has to be made as to what prior probability to use. For the present problem, it seems reasonable to use the population prevalence as the prior probability. In 1987, there were 9.3 TB cases per 100,000 population. Therefore, we specify: …

Biostatistics, 2003.
Research Interests: analysis of high-dimensional data
Teaching: Topics in High Dimensional Data Analysis (STAT:7190)
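
A minimal sketch of the X-ray calculation above. The prior (9.3 per 100,000) and the naive estimate (22/73) come from the text. The table of X-ray results is missing from these notes, so the sensitivity and false-positive rate below are placeholder assumptions chosen only to show the mechanics, and the function name is hypothetical.

```python
def positive_predictive_value(prior, sensitivity, false_positive_rate):
    """P(D1 | T+) by Bayes' theorem for a two-state disease (D1 / D2)."""
    true_pos = sensitivity * prior                    # P(T+ | D1) * P(D1)
    false_pos = false_positive_rate * (1.0 - prior)   # P(T+ | D2) * P(D2)
    return true_pos / (true_pos + false_pos)

prior = 9.3 / 100_000        # from the text: 1987 TB prevalence
sensitivity = 0.73           # ASSUMED P(T+ | D1), illustrative only
false_positive_rate = 0.03   # ASSUMED P(T+ | D2), illustrative only

ppv = positive_predictive_value(prior, sensitivity, false_positive_rate)
naive = 22 / 73              # from the text: the biased estimate

print(f"naive estimate from the sample: {naive:.3f}")
print(f"Bayes estimate with population prior: {ppv:.5f}")
```

Under these assumed test characteristics the predictive value of a positive X-ray is a fraction of one percent, far below the naive 22/73, because the disease is so rare in the general population.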


Bayes Theorem

The multiplication rule for the joint occurrence of two or more events is as follows. If A, B, and C are independent, then

P(A and B and C) = P(A) × P(B) × P(C)

If two events such as B and D are not independent, then

P(B and D) = P(B | D) × P(D) = P(B) × P(D | B)

The multiplication rule for probabilities when events are not independent can be used to derive one form of an important formula called Bayes’ theorem. Because P(B and D) equals both P(B | D) × P(D) and P(B) × P(D | B), these latter two expressions are equal. Assuming P(B) and P(D) are not equal to zero, we can solve for one in terms of the other, as follows:

P(B | D) = P(D | B) × P(B) / P(D)

which is found by dividing both sides of the equation by P(D). Similarly,

P(D | B) = P(B | D) × P(D) / P(B)

In the equation for P(B | D), P(B) in the right-hand side of the equation is sometimes called the prior probability, because its value is known prior to the calculation; P(B | D) is called the posterior probability, because its value is known only after the calculation.
The two formulas of Bayes’ theorem are important because investigators frequently know only one of the pertinent probabilities and must determine the other. Examples are diagnosis and management

Basic & Clinical Biostatistics, Fourth Edition
Copyright © 2004 by The McGraw-Hill Companies, Inc.
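
The two forms of Bayes' theorem in the excerpt above can be checked on a small joint table. The sketch below uses exact fractions; the joint probabilities for B and D are made-up values for illustration, not figures from the book.

```python
from fractions import Fraction

# Hypothetical joint probabilities for events B and D (they sum to 1).
joint = {
    (True, True): Fraction(3, 20),     # B and D
    (True, False): Fraction(5, 20),    # B and not D
    (False, True): Fraction(1, 20),    # not B and D
    (False, False): Fraction(11, 20),  # neither
}

# Marginal probabilities, summed from the joint table.
p_b = sum(p for (b, _), p in joint.items() if b)
p_d = sum(p for (_, d), p in joint.items() if d)
p_b_and_d = joint[(True, True)]

# Conditional probabilities straight from their definitions.
p_b_given_d = p_b_and_d / p_d
p_d_given_b = p_b_and_d / p_b

# Multiplication rule for events that are not independent.
assert p_b_and_d == p_b_given_d * p_d == p_d_given_b * p_b

# Bayes' theorem: recover one conditional from the other.
bayes_b_given_d = p_d_given_b * p_b / p_d
bayes_d_given_b = p_b_given_d * p_d / p_b

print(f"P(B | D) = {p_b_given_d}, via Bayes = {bayes_b_given_d}")
print(f"P(D | B) = {p_d_given_b}, via Bayes = {bayes_d_given_b}")
```

Here P(B) plays the role of the prior and P(B | D) the posterior, matching the terminology in the excerpt.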

Markov processes

Ch. 5 Markov processes
by Scott E Page
Director, Center for the Study of Complex Systems
University of Michigan.

Ch. 1 Decision Trees
other resources:


Untangling Skill and Luck
July 15, 2010
The outcomes for most activities combine skill and luck.