
Examples of a Common Kind of Fallacy in the Social Sciences
Rick Garlikov










Because it makes the kind of error common in the social sciences, I want to examine a passage from a recent article printed in Hillsdale College’s publication Imprimis. The article is “Restoring America’s Economic Mobility” by Frank Buckley and is in the September 2016 issue. It is also online at https://imprimis.hillsdale.edu/restoring-americas-economic-mobility/. I think there are a number of common kinds of errors in the reasoning stated in the article, and these errors are not errors of form, and are thus ‘informal’ fallacies or what I think would be better called 'non-formal' fallacies. However, I don’t believe these are informal fallacies that have names or would be listed in lists of informal fallacies, though I could be mistaken about that.


After I analyze this passage, I will also try to show that some experimental results in psychology make the same kind of error, using as examples some particular research methods interpreted to show that people are not as rational as previously supposed. I will not be arguing that people are necessarily rational, but that the experiments believed to show they are not do not show that.

 

“[Many people today imagine America to still be] a country defined by the promise that whoever you are, you have the same chance as anyone else to rise, with pluck, industry, and talent. But they imagine wrong.  The U.S. today lags behind many of its First World rivals in terms of mobility. A class society has inserted itself within the folds of what was once a classless country, and a dominant New Class—as social critic Christopher Lasch called it—has pulled up the ladder of social advancement behind it.

 

“One can measure these things empirically by comparing the correlation between the earnings of fathers and sons. Pew’s Economic Mobility Project ranks Britain at 0.5, which means that if a father earns £100,000 more than the median, his son will earn £50,000 more than the average member of his cohort. That’s pretty aristocratic. On the other end of the scale, the most economically mobile society is Denmark, with a correlation of 0.15. The U.S. is at 0.47, almost as immobile as Britain.

 

“A complacent Republican establishment denies this change has occurred. If they don’t get it, however, American voters do. For the first time, Americans don’t believe their children will be as well off as they have been.” [1] 

 

Social scientists often claim to be able to empirically and objectively measure a subjective or vague characteristic by equating it with a precise objective criterion they say captures the meaning of the concept or provides necessary and sufficient evidence for the subjective determinations we make. I think all too often they fail to get it right, if it can be done right at all. Any and every such claim should be viewed with suspicion and subjected to serious scrutiny. An easy example is IQ measurement by means of a test score on a certain kind of test, whereby the higher one’s IQ score, the more intelligent one supposedly is. However, clearly “intelligence” involves characteristics more than, and often different from, scoring high on such a test, particularly if the high score is attributable to coaching more than, or rather than, to some sort of inherent ability. Intelligence may be about seeing connections others don’t (whether in comedy or in physics) or seeing them much faster, about learning new things quickly, about having deeper understanding and seeing ramifications, about capacity for learning and/or remembering, about having great common sense, being perceptive, etc. in ways a standard IQ test doesn’t measure or pretend to measure. It might be about combining many different ideas over time to discover or invent something else no one has before, and which cannot be tested at some given time on a test where the answer is already known. When one of my daughters was in sixth grade, she auditioned for something where one of the skill tests was sight-reading, but the piece they gave her was one she had played before. Had she not told them, she would have likely seemed to be a great sight reader, though maybe she would have had to make some slight errors to carry out the pretense successfully. (Possibly they knew she had studied this piece previously and it was a test of her candor, perhaps considered to be a test of honesty.)


So in the above passage we are given a way that is claimed to measure the notion of economic mobility and whether people have a good chance to succeed in America and “rise above their station of birth” even if they start from lowly beginnings.

 

Notice the formula is something of a complicated measure and is different from what it might seem to be at first. Nothing in the measure shows whether a child makes more or less than its parent. (I am using “parent/child” rather than “father/son”, since the latter seems sexist in this day and age.) And it is misleading to claim, as it does, that this is about “comparing the correlation between the earnings of fathers and sons.” It is not that the child in the example makes half what its parent did, but that if you subtract the average income of the child’s generation in the country from the child’s income, it will be half the number you get from subtracting the average income of the parent’s generation in the country from the parent’s income. It is the ratio of the child’s income difference from the child's peers to the parent’s income difference from the parent’s peers. It is not easy to see what sort of “thing” that number represents. And though in some cases it might coincide with economic mobility, or with signs of economic mobility, it need not. It is not necessarily, and perhaps even not at all, the same thing as what is meant by economic mobility or as economic opportunity to rise economically or to be economically successful.

 

To see what it means or doesn’t mean, consider this scenario: suppose your mother made $100,000 a year while the median income for people her age was $40,000. That would yield a difference for her of $60,000 more than the median, so she made quite a bit more money than most of her contemporaries. She made two and a half times what the median person her age did, and she was in a much higher ‘class’ than they were (if we equate class with income). And now suppose that by some economic booming circumstance, the average contemporary of yours makes $1 million annual income, and you make $1,001,500. Now, you are making more than ten times what your mother made, but you are making ‘only’ $1500 more than your fellow citizens, so the comparison gives the number (1500/60,000 = .025) which would make your society highly mobile by this measure, as it indeed seems to be. Lots of people went from lower class with their parents being relatively poor to themselves being quite rich, and you went from your mother’s upper class to being even richer (10 times richer) than she was, though not as much as your peers rose above their parents because they became 25 times richer than their parents. And your position relative to your peers declined quite a bit compared with your mother’s position relative to her contemporaries. So on this measure you were downwardly mobile, even though you make more than ten times what your mother made.

 

Now suppose that instead of your income being $1,001,500, your income is $1,030,000, while everyone else’s is $1,000,000. That puts the correlation now at (30,000/60,000), which is .5 and supposedly indicates a society that is not very economically mobile, even though now everyone is rich, when only a few were before. Everyone else grew the same amount as they did in the first case, and you earned a lot more too, but not proportionately as much more as they did, and you are not as far above them, proportionately, as your mother was above her peers.

 

Oppositely, if the country were to fall on really desperate economic times and your income was $10,000 and everyone else’s was $5,000, the correlation would then be (5000/60,000 = .083) making it almost twice as “economically mobile” as Denmark.  But notice the “mobility,” if anything, is downward, not upward.  So this measure does not measure a good thing and insofar as it is a measure of economic mobility, economic mobility is then not a good thing – and certainly not something that shows you have a better chance of being more well off than your mother was.
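
The arithmetic in these three scenarios can be checked with a short Python sketch. It follows the simplified single-family reading used above (the child's income minus the peer figure, divided by the parent's income minus the parent's peer figure), not a correlation computed over a whole population. The function name `excess_ratio` and the scenario labels are made up for illustration.

```python
def excess_ratio(child_income, child_peer_income, parent_income, parent_peer_income):
    """Child's income gap over peers divided by parent's income gap over peers."""
    child_excess = child_income - child_peer_income
    parent_excess = parent_income - parent_peer_income
    return child_excess / parent_excess


# Figures taken from the scenarios in the text.
MOTHER_INCOME = 100_000
MOTHER_PEER_INCOME = 40_000

# Each scenario: (your income, your peers' income). Labels are illustrative.
scenarios = {
    "boom, barely above peers": (1_001_500, 1_000_000),
    "boom, further above peers": (1_030_000, 1_000_000),
    "bust": (10_000, 5_000),
}

ratios = {
    label: excess_ratio(income, peers, MOTHER_INCOME, MOTHER_PEER_INCOME)
    for label, (income, peers) in scenarios.items()
}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.3f}")
```

The three printed values are 0.025, 0.500, and 0.083. The number depends only on the gaps between each person and that person's peers, so it comes out small both when everyone becomes rich and when everyone becomes poor.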

 

Even if you find all this hard to follow, you can see the objective measure is not something that reflects what we would consider to be economic mobility, which should have more to do with how your income stacks up in ability to meet or exceed your needs and whether you have additional resources to purchase conveniences and luxuries. How you stack up against your peers (which may determine your relative ‘class’ rung on the ladder) is not as much a sign of upward mobility as being able to live a higher lifestyle than your parents in terms of your access to more security, conveniences, and luxuries. In fact, the lament at the end of the quoted passage that “For the first time, Americans don’t believe their children will be as well off as they have been” points to a serious problem, but it is a problem of economic progress, not of economic “mobility”, and it actually would even contribute to ‘mobility’ on the measure used by the Pew Economic Mobility Project if it leveled out people’s incomes or just substantially lowered the incomes of the children of wealthier parents.

 

Finally, consider what it means to be “a country defined by the promise that whoever you are, you have the same chance as anyone else to rise, with pluck, industry, and talent”. You are not kept down by your original class. At first blush that would seem to be a good measure of an economic system’s mobility, and one of the major conservative economists of the last century, Friedrich Hayek, thought that an economic system’s fairness was signified by something of this sort: that everyone had the same chance to become wealthy as anyone else. But 1) if no one has much chance to rise, that meets the criterion but it is a hollow promise or empty enterprise, and 2) notice that everyone has the same chance of becoming wealthy by winning the lottery, but that would not make an economic system based on a lottery or any other sort of gambling or unproductive distribution of existing wealth a fair or good system, even if there were numerous lotteries.

 

I don’t think that mobility of the sort that is desirable has to do with ratios of poor to rich or with relative changes in class among different generations of rich and poor. What is desirable is that everyone have a fair and reasonable opportunity to apply themselves and work hard at something right or good to do (not something like crime) and be likely to succeed because of that, not that they have the same chance as everyone else to succeed (which could be zero or 1 in a billion). And by ‘succeed’ I mean be able to have a decent life and one that fairly rewards people proportionately to the contribution they make toward the total bounty of available goods and services. What you want is that if everyone who is able to work does work and has a fair opportunity to work, they all together produce enough for a good life for all and that each person receives his/her fair share of what they all produce.[2] Upward economic mobility is about the reasonably good opportunity to have more goods and services than your parents did, particularly if they had relatively little access to security, necessities and conveniences, and you have great access to them and to at least some luxuries. Upward mobility from everyone's being millionaires to being billionaires is not nearly as important as upward mobility from starving or barely getting by to being reasonably comfortable and secure.


Experimental Results in Psychology Making the Same Kind of Error: using particular objective measures to confirm or deny claims about subjective, abstruse, or open-ended concepts

The following is from different sections of "Rethinking Rationality: From Bleak Implications to Darwinian Modules" by Richard Samuels, Stephen Stich, and Patrice D. Tremoulet. My analysis of it appears in square brackets, or in separate unquoted paragraphs, in the appropriate places:
"About thirty years ago, Amos Tversky, Daniel Kahneman and a number of other psychologists began reporting findings suggesting much deeper problems with the traditional idea that human beings are intrinsically rational animals. What these studies demonstrated is that even under quite ordinary circumstances where fatigue, drugs and strong emotions are not factors, people reason and make judgments in ways that systematically violate familiar canons of rationality on a wide array of problems. [The canons involved here are not that familiar to the test subjects.  And the questions do not necessarily test for understanding of, or compliance with, those canons.] Those first surprising studies sparked the growth of a major research tradition whose impact has been felt in economics, political theory, medicine and other areas far removed from cognitive science.
...

"The Selection Task:
In 1966, Peter Wason reported the first experiments using a cluster of reasoning problems that came to be called the Selection Task. A recent textbook on reasoning has described that task as "the most intensively researched single problem in the history of the psychology of reasoning." (Evans, Newstead & Byrne, 1993, p. 99) A typical example of a Selection Task problem looks like this:

"What Wason and numerous other investigators have found is that subjects typically do very poorly on questions like this. Most subjects respond, correctly, that the E card must be turned over, but many also judge that the 5 card must be turned over, despite the fact that the 5 card could not falsify the claim no matter what is on the other side. Also, a large majority of subjects judge that the 4 card need not be turned over, though without turning it over there is no way of knowing whether it has a vowel on the other side. And, of course, if it does have a vowel on the other side then the claim is not true. It is not the case that subjects do poorly on all selection task problems, however. A wide range of variations on the basic pattern have been tried, and on some versions of the problem a much larger percentage of subjects answer correctly. These results form a bewildering pattern, since there is no obvious feature or cluster of features that separates versions on which subjects do well from those on which they do poorly.  [A sample question which many more people answer better, though the logic is identical to the letter/number problem, is below.  I will argue there is an obvious recognizable feature that separates the version on which subjects do well from the one on which they do poorly.  And this difference is one commonly problematic in trying to teach students any sort of logic by starting with examples they can understand from knowledge or partial knowledge of the subject matter, but then trying to apply the same principles to cases whose subject matter they don't know as well.  They often cannot see the relevant analogies. More about that later.]
...

"The Conjunction Fallacy
Ronald Reagan was elected President of the United States in November 1980. The following month, Amos Tversky and Daniel Kahneman administered a questionnaire to 93 subjects who had had no formal training in statistics. The instructions on the questionnaire were as follows:

In this questionnaire you are asked to evaluate the probability of various events that may occur during 1981. Each problem includes four possible events. Your task is to rank order these events by probability, using 1 for the most probable event, 2 for the second, 3 for the third and 4 for the least probable event.

Here is one of the questions presented to the subjects:
Please rank order the following events by their probability of occurrence in 1981:
(a) Reagan will cut federal support to local government.
(b) Reagan will provide federal support for unwed mothers.
(c) Reagan will increase the defense budget by less than 5%.
(d) Reagan will provide federal support for unwed mothers and cut federal support to local governments.
The unsettling outcome was that 68% of the subjects rated (d) as more probable than (b), despite the fact that (d) could not happen unless (b) did (Tversky & Kahneman, 1982). In another experiment, which has since become quite famous, Tversky and Kahneman (1982) presented subjects with the following task:
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Please rank the following statements by their probability, using 1 for the most probable and 8 for the least probable.
(a) Linda is a teacher in elementary school.
(b) Linda works in a bookstore and takes Yoga classes.
(c) Linda is active in the feminist movement.
(d) Linda is a psychiatric social worker.
(e) Linda is a member of the League of Women Voters.
(f) Linda is a bank teller.
(g) Linda is an insurance sales person.
(h) Linda is a bank teller and is active in the feminist movement.
In a group of naive subjects with no background in probability and statistics, 89% judged that statement (h) was more probable than statement (f). When the same question was presented to statistically sophisticated subjects -- graduate students in the decision science program of the Stanford Business School -- 85% made the same judgment! Results of this sort, in which subjects judge that a compound event or state of affairs is more probable than one of the components of the compound, have been found repeatedly since Kahneman and Tversky's pioneering studies.
 
Base-Rate Neglect
On the familiar Bayesian account, the probability of an hypothesis on a given body of evidence depends, in part, on the prior probability of the hypothesis. However, in a series of elegant experiments, Kahneman and Tversky (1973) showed that subjects often seriously undervalue the importance of prior probabilities. One of these experiments presented half of the subjects with the following "cover story."

A panel of psychologists have interviewed and administered personality tests to 30 engineers and 70 lawyers, all successful in their respective fields. On the basis of this information, thumbnail descriptions of the 30 engineers and 70 lawyers have been written. You will find on your forms five descriptions, chosen at random from the 100 available descriptions. For each description, please indicate your probability that the person described is an engineer, on a scale from 0 to 100.

The other half of the subjects were presented with the same text, except the "base-rates" were reversed. They were told that the personality tests had been administered to 70 engineers and 30 lawyers. Some of the descriptions that were provided were designed to be compatible with the subjects' stereotypes of engineers, though not with their stereotypes of lawyers. Others were designed to fit the lawyer stereotype, but not the engineer stereotype. And one was intended to be quite neutral, giving subjects no information at all that would be of use in making their decision. Here are two examples, the first intended to sound like an engineer, the second intended to sound neutral:

Jack is a 45-year-old man. He is married and has four children. He is generally conservative, careful and ambitious. He shows no interest in political and social issues and spends most of his free time on his many hobbies which include home carpentry, sailing, and mathematical puzzles.

Dick is a 30-year-old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues.

As expected, subjects in both groups thought that the probability that Jack is an engineer is quite high. Moreover, in what seems to be a clear violation of Bayesian principles, the difference in cover stories between the two groups of subjects had almost no effect at all. The neglect of base-rate information was even more striking in the case of Dick. That description was constructed to be totally uninformative with regard to Dick's profession. Thus the only useful information that subjects had was the base-rate information provided in the cover story. But that information was entirely ignored. The median probability estimate in both groups of subjects was 50%. Kahneman and Tversky's subjects were not, however, completely insensitive to base-rate information. Following the five descriptions on their form, subjects found the following "null" description:

Suppose now that you are given no information whatsoever about an individual chosen at random from the sample.

The probability that this man is one of the 30 engineers [or, for the other group of subjects: one of the 70 engineers] in the sample of 100 is ____%.

In this case subjects relied entirely on the base-rate; the median estimate was 30% for the first group of subjects and 70% for the second. In their discussion of these experiments, Nisbett and Ross offer this interpretation.

The implication of this contrast between the "no information" and "totally nondiagnostic information" conditions seems clear. When no specific evidence about the target case is provided, prior probabilities are utilized appropriately; when worthless specific evidence is given, prior probabilities may be largely ignored, and people respond as if there were no basis for assuming differences in relative likelihoods. People's grasp of the relevance of base-rate information must be very weak if they could be distracted from using it by exposure to useless target case information. (Nisbett & Ross, 1980, pp. 145-6)"
These analyses are incorrect.  In the Reagan and Linda cases, it is presumed that the test subjects think of the combined cases as independent, when in fact they likely do not -- particularly as shown by the answers they give.  And it is perfectly reasonable to think that in some cases a combined event is more likely to occur than either event individually without the other insofar as one believes that they are somehow related at least probabilistically.  For example, if you are asked which is more probable 1) lightning, 2) thunder, 3) thunder and lightning, it would be understandable for people to select 3 more than 1 or 2, if that is all the information they are given in the question.  In the Reagan case, for example, where respondents ranked "(d) Reagan will provide federal support for unwed mothers and cut federal support to local governments" more probable than "(b) Reagan will provide federal support for unwed mothers" it seems to me plausible to think they believed that if he were to be able to provide the support for unwed mothers, he would also cut support to local governments, as perhaps one way to keep a budget in balance, and that if he couldn't do both (or more precisely, if he couldn't find a way to keep the budget in balance and get those funds from somewhere else without raising taxes), he would be less likely to provide support for unwed mothers.
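
The difference between the two readings can be made concrete with made-up numbers. In this Python sketch the day counts are purely hypothetical; they are chosen only to show that a conjunction can never be more probable than "lightning, with or without thunder", yet can easily be more probable than "lightning without thunder", which is the reading suggested above.

```python
# Hypothetical counts over 1,000 imagined days (invented for illustration only).
both = 90            # days with thunder and lightning together
lightning_only = 10  # days with lightning but no thunder
thunder_only = 10    # days with thunder but no lightning

# Inclusive reading: "lightning" means lightning with or without thunder.
lightning_inclusive = both + lightning_only
thunder_inclusive = both + thunder_only

# On the inclusive reading the conjunction can never win.
conjunction_beats_inclusive = both > lightning_inclusive or both > thunder_inclusive

# Exclusive reading: "lightning" means lightning without thunder.
conjunction_beats_exclusive = both > lightning_only and both > thunder_only

print("Conjunction more frequent than a component (inclusive reading):",
      conjunction_beats_inclusive)
print("Conjunction more frequent than a component (exclusive reading):",
      conjunction_beats_exclusive)
```

With these counts the first line prints False and the second prints True. Which ranking counts as "correct" therefore depends on which reading the respondent took the question to intend.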

Or suppose you had to order the following in terms of probability in regard to some random person P, where P could be anyone anywhere in the world at any time:
       P died in New York City on 9/11 in the collapse of the World Trade Towers.
       P died in New York City.
       P died on September 11.
There is a perfectly good sense, though it is difficult to articulate, in which the first statement is more likely true than either of the others because there is only a 1 in 365.25 chance of dying on September 11, and a small chance of anyone’s dying in New York City (unless you know they live there or work there) -- just a little more than 1 chance in a thousand (based on a population of 8.5 million in New York City and a world population of 7.4 billion, in statistics from 2015) -- but there is a big chance that someone who died in New York City on September 11 was a victim in the terrorist attack on the Trade Towers. Surely more people died in that attack than otherwise died in New York City on that or most other days, since the average daily number of deaths in all of New York State in 2014 was 404, which was exactly the same in 2008, 2012, and 2013, and was 403 in 2011, and 397 in each of 2009 and 2010. If the second statement were P died in London and/or the third statement was P died on April 3, people would probably still be inclined to pick the first statement as more likely not only in spite of the combination but because of it.
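
The two rough base rates cited above follow from the figures given in the paragraph. Here is a minimal Python check, which, like the paragraph, treats New York City's share of world population as a crude stand-in for the chance that a random person died there; the variable names are illustrative.

```python
# Figures as stated in the text (2015 statistics).
NYC_POPULATION = 8_500_000
WORLD_POPULATION = 7_400_000_000
DAYS_PER_YEAR = 365.25

# Crude proxy: share of the world's people who live in New York City.
p_new_york = NYC_POPULATION / WORLD_POPULATION

# Chance that a given death falls on one particular calendar date.
p_september_11 = 1 / DAYS_PER_YEAR

print(f"New York City: about {p_new_york * 1000:.2f} in a thousand")
print(f"September 11:  about {p_september_11 * 1000:.2f} in a thousand")
```

This gives about 1.15 in a thousand for New York City and about 2.74 in a thousand for the date, consistent with "a little more than 1 chance in a thousand" and "1 in 365.25".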
 
So the question is ambiguous or misleading as to what probability is being sought.
 
And in the Jack and Dick, lawyer/engineer base-rate questions, it seems pretty obvious to me that respondents believed they picked up on personality or other behavioral cues that the test creators did not think existed in the descriptions.  When it is said that
"Some of the descriptions that were provided were designed to be compatible with the subjects' stereotypes of engineers, though not with their stereotypes of lawyers. Others were designed to fit the lawyer stereotype, but not the engineer stereotype. And one was intended to be quite neutral, giving subjects no information at all that would be of use in making their decision"
it seems clear to me that the design and intentions were not successful, and that Nisbett and Ross, for example, are mistaken in assuming that what they call “useless target case information”, “totally nondiagnostic information”, and “worthless specific evidence” is perceived in that way by respondents who, if Nisbett and Ross were right, should ignore it and give the same answers they would give with only the base-rate information. But clearly, respondents think that the 70/30 ratio between engineers/lawyers or between lawyers/engineers is insufficient to make them question their ability to discern traits they think apply to one or the other, whether psychologists think those traits in the descriptions are neutral or not. Just because Kahneman and Tversky believed they had descriptions neutral in regard to stereotypes of lawyers or engineers did not mean their test subjects perceived those descriptions that way. And apparently they did not. They perceived something they thought overrode the ratios given, because when there were no descriptions given, they went simply by the ratios. And the fact you know that only 30% of the group in question are lawyers is not going to deter you from guessing someone is a lawyer if you believe you perceive cues you think more likely indicative of a lawyer than an engineer. Now whether the cues you think indicate someone's being more likely a lawyer than an engineer are reasonable or not is perhaps open to conjecture, but insofar as one does have such a cue in mind, the base rate information is not relevant, and it is reasonable to ignore it. While it may not be reasonable to believe x implies y, if you do believe it, then it is reasonable to believe y is true when you perceive x.

All kinds of information can seem to indicate or stereotype people and professions. I was in a supermarket checkout lane one time next to a woman who was buying only M&Ms and a bottle of wine. I commented that at least she had two of the major food groups even though they didn’t go together. She said that her husband actually liked them together and that every night he had a glass of wine with nine M&Ms. I joked that it seemed seven or ten would be better with the wine, and she said “No, he always has precisely nine.” So I asked whether he was an accountant, and she said “No, he is an engineer.” I should have guessed that first because I had heard countless stories of engineers who were totally rigid in their thinking and had seen it first hand with some myself. And in fact, one engineer (who did not fit the stereotype) told me one time that “engineers are people who are good at math but who don’t have enough personality to be an accountant”. And another engineer told me that one joke popular among engineers is “The way you tell the difference between an engineer who is an introvert and one who is an extrovert is that the extrovert engineer looks at the shoes of the person he is talking to instead of his own.” And who would have thought that "looking at shoes when talking" is a stereotypical sign of an engineer? There are probably all kinds of elements in Kahneman and Tversky’s descriptions that respondents take to be more likely indicative of either lawyers or engineers than Kahneman and Tversky or other psychologists recognize. If I am right, then if they had asked the respondents why they chose the answers they did which went against the given base-rate odds, they would have seen what the perceived cues were and that they existed in the minds of the respondents.
As I point out later about Piaget’s experiments with children, normally one should at least ask respondents why they give the answers they do before jumping to speculative conclusions about why they did. I was giving a talk to a group of retired highly professional senior citizens one time and to illustrate some point I was making, I gave them a progression of numbers and asked what the next number should be in the progression. All but two gave the answer I expected, but two of them gave an answer that just seemed screwy to me. I started to just say the majority answer was the right one, but instead of doing that, I asked why the two people gave the number they did. It turned out they saw a different formula that also accounted for the progression of the numbers already given, and their formula generated a different next number than the formula the rest of us saw. That happily illustrated the main point of my talk to the group: that it is unfair to use student answers on most kinds of tests to grade them, unless you at least also find out the reasons the students gave those answers.
"Before leaving the topic of base-rate neglect, we want to offer one further example illustrating the way in which the phenomenon might well have serious practical consequences. Here is a problem that Casscells et al. (1978) presented to a group of faculty, staff and fourth-year students at Harvard Medical School.

"If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming that you know nothing about the person's symptoms or signs? ____%

"Under the most plausible interpretation of the problem, the correct Bayesian answer is 2%. I will argue that below that what the authors consider to be "the most plausible interpretation of the problem" is not the most plausible interpretation of it. But only eighteen percent of the Harvard audience gave an answer close to 2%. Forty-five percent of this distinguished group completely ignored the base-rate information and said that the answer was 95%.

...
Here is the problem identical in logic, but not content, to the letter/number problem above, but on which most people do better:
"...there are some versions on which performance improves dramatically. Here is an example from Griggs and Cox (1982). 


"From a logical point of view this problem is structurally identical to the [Letter/Number problem above], but the content of the problems clearly has a major effect on how well people perform. About 75% of college student subjects get the right answer on this version of the selection task, while only 25% get the right answer on the other version. Though there have been dozens of studies exploring this "content effect" in the selection task, the results have been, and continue to be, rather puzzling since there is no obvious property or set of properties shared by those versions of the task on which people perform well."
But the content makes a difference because it is easier for people to keep in mind what the connection is between the opposite sides of the cards, and thus apply the logic to it, in the drink/drinking age test than in the letter/number test.  People know the drinking age in many places is 21.  So it is easy to know that if you are 25, you can drink or not drink and that if you are drinking coke, it doesn't matter what age you are.  So the only two important cards are the drinking beer one and the 16 year old one, for if someone is drinking beer, the other side had better show they are over 20, and if someone is 16, the other side of the card had better not be that they are drinking beer.  But the relationship between one side being a letter of some sort (in this case, a vowel) and the other side being a number of some sort (in this case, an odd number) is not as easy to keep in mind or reason about.

The tests about which cards to turn over to test whether underage people are drinking or whether cards with vowels on one side have an odd number on the other, are probably intended to be tests of whether people correctly understand "if, then" statements and whether they understand what confirms the four basic kinds of arguments below that utilize them.  But they do not necessarily do that, although it is difficult for most students to know the difference among the four "if, then" argument forms when presented abstractly just using letters to represent statements, and the vowel/odd number test is one example of that.  Whenever you have a statement of the form "If statement A is true then statement B is true", A is called the antecedent and B is called the consequent, and the "if, then" statement says that "A implies B" or that "A's being true lets you know that B is also true".  One rough way to think of that is to interpret it as "A leads to B".  This is sometimes represented as "A > B".  The important thing to notice is that the implication is stated to be in only one direction: from A to B, not from B to A.  If the implication happens to be one that can go in both directions, then that would need to be stated separately as both "A > B and B > A"; "if A is true, then B is true and if B is true, then A is true."  There are some relationships which do go in both directions, but most only go in one direction.  Being a triangle with all equal angles and being a triangle with all equal sides imply each other.  But being a square and being a rectangle do not imply each other; being a square does imply being a rectangle but being a rectangle does not imply being a square.  In some cases the implication is easy to see and to remember; in others it is not, and this will be an important point in regard to the different results in the above two tests of supposedly the same reasoning skill, which is essentially the following.
There are four argument forms that start with the premise, 'if A, then B', depending on whether the second premise affirms or denies the truth of the antecedent, or affirms or denies the truth of the consequent.  Two of the forms are always valid, giving good deductions; and two are always invalid, giving flawed deductions -- fallacies:
     valid form: affirming the antecedent to derive the consequent; i.e., knowing the consequent is true because the antecedent is; this is always valid:
1) if A, then B.
2) A. 
Therefore 3) B.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is a dog.
Therefore 3) Annie is an animal.
     valid form: denying the consequent to derive that the antecedent is also not true; i.e., knowing the antecedent must be false because the consequent is; this is always valid:
1) if A, then B.
2) not B. 
Therefore 3) not A.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is not an animal.
Therefore 3) Annie is not a dog.
     fallacy of affirming the consequent (mistakenly believing the antecedent must be true because the consequent is; no argument in this form is valid):
1) if A, then B. 
2) B. 
Therefore 3) A.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is an animal.
Therefore 3) Annie is a dog.
     fallacy of denying the antecedent (mistakenly believing the consequent must be false because the antecedent is; no argument in this form is valid):
1) if A, then B. 
2) not A. 
Therefore 3) not B.

For example:
1) If Annie is a dog, then Annie is an animal.
2) Annie is not a dog.
Therefore 3) Annie is not an animal.
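The four forms above can also be checked mechanically. The following is a minimal Python sketch, not a standard library routine: the helper names `implies` and `is_valid` are made up for this illustration. It treats an argument form as valid when no assignment of true/false to A and B makes both premises true and the conclusion false, which is the sense of "always valid" used above.

```python
from itertools import product

def implies(a, b):
    """'If a, then b' is false only when a is true and b is false."""
    return (not a) or b

def is_valid(second_premise, conclusion):
    """Return True when no true/false assignment to A and B makes
    'if A, then B' and the second premise true but the conclusion false."""
    for a, b in product([True, False], repeat=2):
        premises_true = implies(a, b) and second_premise(a, b)
        if premises_true and not conclusion(a, b):
            return False  # found a counterexample
    return True

# Each form: (second premise, conclusion); the first premise is always 'if A, then B'.
forms = {
    "affirming the antecedent": (lambda a, b: a, lambda a, b: b),
    "denying the consequent": (lambda a, b: not b, lambda a, b: not a),
    "affirming the consequent": (lambda a, b: b, lambda a, b: a),
    "denying the antecedent": (lambda a, b: not a, lambda a, b: not b),
}

results = {name: is_valid(p2, c) for name, (p2, c) in forms.items()}
for name, valid in results.items():
    print(f"{name}: {'valid' if valid else 'invalid'}")
```

The counterexample the check finds for both fallacies is the case where A is false and B is true, which corresponds to Annie being an animal that is not a dog.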
In the abstract without an example, this is somewhat difficult for people to understand, but it is particularly difficult for most people to apply when they are not familiar with the content because it is difficult for them to keep in mind what is related to what and in 'which direction'.  There is an important difference in that regard between the vowel/odd number question and the beer/age question because most adults and older teens are familiar with the latter correlation, but the former one is arbitrary and unfamiliar.
(As an aside, though an important one, the logic of this, as one way implications, is also difficult for people to apply when they are very familiar with the content and they know or believe that the particular content goes in both directions, and that not only "if A, then B", but "if B, then A", which is the same as "A if and only if B", and which comes out to be the same as "if A, then B, and if not A, then not B".  For example in the test for whether a cake is finished baking or not: if a toothpick stuck into the center comes out clean, the cake is done, and if it comes out with particles stuck to it -- i.e., not clean -- the cake is not done.  The toothpick comes out clean if and only if the cake is done and it comes out dirty if and only if the cake is not done.  However, that is not the problem for the letter/number test and the drink/drinking age test.  The difference between those two test results can be explained by the familiarity/unfamiliarity distinction.)
For a different, more problematic, example, if your spouse says "I may go to the store later" that clearly lets you know that if s/he goes to the store, then s/he won't be home (at that time), but it also can imply that if s/he is not at home, s/he will be at the store, even though it does not definitively imply it because s/he is not saying that s/he won't go out otherwise or go somewhere else too.  So it may be understandable, if you come home and your spouse is not there, to assume s/he went to the store.  And it is also the case that if you guess wrong, your spouse will imply it was your fault.  E.g., if you ask whether s/he went to the store or not and s/he says "Well, I told you that I might go out to the store, and I was out, wasn't I!" that means s/he thought s/he had said s/he was only going out to the store if s/he went out.  And if s/he says, "I never said I was only going to the store if I went out" that means s/he didn't mean it both ways and it may even imply s/he didn't go to the store at all, but went somewhere else instead.

However, most people do not have problems with the above four argument forms when they are knowledgeable about and familiar with the relationship between the truth of A and the truth of B, and know that it only goes in one direction.  For example, you know that if someone was murdered, then they are dead, and you know that people can be dead without having been murdered.  Now there are four possibilities that you find out about Jones:
     Jones was murdered (it is accurately on the news).  In this case, you know Jones is dead.

     Jones is not dead (you see him working in his yard).  In this case, you know Jones was not murdered.

     Jones is dead (you find his body or his tombstone).  In this case if you think that means Jones was murdered, you can't really know that for sure, even if Jones seemed to be perfectly healthy earlier.

     Jones was not murdered.  In this case, if you think that means Jones is not dead, then you may be wrong or you may be right, but it is a mistake to think it must mean Jones is not dead.
But if you are told "If the card has a vowel on one side, it will have an odd number on the other" it is difficult to connect vowels with odd numbers, because that is not any kind of natural or familiar relationship and it easily could have gone the other way.  So it is easy to get confused about which card is necessary to turn over to disconfirm the statement that "If the card has a vowel on one side, it will have an odd number on the other".  That is a fairly convoluted problem, and failure to get it right doesn't show lack of logic any more than it shows you just got confused about which kinds of letters were lined up with which kinds of numbers.

On the other hand, knowing that Jones is dead because you found out he was murdered is not necessarily a result of your having made a deduction of the above sort either.  So even if you were given cards that on one side says "Jones is dead" or "Jones is alive" and on the other side says "Jones was murdered" or "Jones was not murdered" and you were asked which cards would confirm that the cards have the correct things on the opposite sides that match what is on the side you see, and you know which ones to turn over, that doesn't mean you are doing it from reasoning alone.  You can easily know that turning over a card that says "Jones is dead" won't tell you anything because even if the cards are done correctly the other side can say he was murdered or he was not, and neither will turning over a card that says "Jones was not murdered", because he could be dead or not be, and you can't tell the card is wrong from either way.  It is only the following cards that would show the cards are not labeled correctly:  cards that say "Jones was murdered" on one side and say "Jones is not dead" on the other.  So cards that say either of those two things are the only cards you need to turn over to see what is on the other side.  But people can know that or figure it out without understanding the general principles or the forms of "if then" arguments which are valid and which are not. 

I tell my students to keep the murdered/dead example in mind when looking at any if/then argument because it can help them analyze it by analogy if they cannot analyze it through the logic alone.  But they often cannot do that, if the subject matter is not familiar to them or if they have been taught the content incorrectly.  For example, many science textbooks say that scientists confirm or disconfirm hypotheses by seeing whether experimental results conform to predictions based on the hypothesis.  But that is a clear case of the fallacy of affirming the consequent in order to affirm the antecedent, so it cannot be how science works, or it would not work at all.  (See "The Nature of Scientific Confirmation").  But students will use every possible excuse they can think of to try to justify the truth of the textbook claim.  Or, there is a published article in which the author argues:
1) If we could theoretically predict human behavior accurately in the way we can theoretically predict the behavior of planets or other simply material objects, then that would show there is no free will.
2) We cannot theoretically predict human behavior accurately [for a flawed reason he gives].
Therefore, 3) there is free will.
Most students get all caught up in whether there is free will or not and whether human behavior can theoretically be accurately predictable or not instead of seeing quite simply that even if statements 1 and 2 are true, they do not show that statement 3 is true, because the form is the fallacy of denying the antecedent to derive the denial of the consequent.  It is exactly analogous to:
1) If we could prove Jones was murdered, we would know he is not alive.
2) We cannot prove Jones was murdered.
Therefore 3) Jones is alive.
But many students see that the latter argument is flawed but cannot see the former one is.  They can see the logical flaw in the second argument but not the identical flawed logic of the first one, because the subject matter content confuses them.  (By the way, the reason the first argument talks about being theoretically predictable is that although weather is difficult to predict, we do not therefore think it has free will.  We think that weather is theoretically predictable if we just knew and could keep track of all the variables, as computer modeling currently does to some extent with some success.)

There are many things we tend to know more by familiarity than by reasoning.  E.g., the understanding of “place value” in arithmetic.  Children have a difficult time learning it (see Understanding and Teaching Place-Value) but most adults can work with it pretty well even though it is unlikely they really understand its logic, which I explain in the essay The Socratic Method: Teaching by Questions and in response to which I have received emails from elementary school math teachers saying that reading it let them understand place value for the first time.  One can work with things and answer questions about them -- even teach them -- without understanding their logic or basis.  Another example of that is that many students can do arithmetic but have great difficulty with algebra.  They can work with familiar concrete manipulations but do not really understand numerical relationships they have not had to work with or have not discovered on their own.

In regard to the test about the false positives above, the concepts involved in the question are particularly unfamiliar (i.e., a 5% erroneous test about a .1% occurring condition), and that, combined with the ambiguity of the statement "The test gives 5% false positives" to mean either that 'the test is wrong 5% of the time when it says the condition is present' or mean (as the researchers take it to mean) that "the test will say positive in 5% of negative cases", is highly likely to make people give the wrong answer, regardless of their reasoning ability -- if they cannot ask for clarification of the question or have any reason to believe there is some computation involved in what you are asking, or that the .1% frequency of the disease is relevant to what is being asked.  E.g., if 5% of the tests are false positives, then 95% are not -- meaning that they either are positive and not false, or they are not positive.  And in fact, it is really difficult to see what it even means to ask "What is the chance of being right with a 5% false positive test for a .1% condition."  That anyone cannot do that while answering a set of questions they probably don't care about in a short amount of time, doesn't show they have difficulty reasoning.  Taking it to mean they do, gives false positives itself.
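The two readings of "5% false positives" described above can be put side by side in a short Python sketch. The prevalence and the 5% figure come from the problem; the assumption that the test never misses a real case (`sensitivity = 1.0`) is not stated in the problem and is added here only because the researchers' 2% answer appears to need something like it. The function name is made up for this illustration.

```python
prevalence = 1 / 1000        # given in the problem
false_positive_rate = 0.05   # given in the problem
sensitivity = 1.0            # ASSUMPTION, not stated in the problem: no real case is missed

def chance_disease_given_positive(prevalence, false_positive_rate, sensitivity):
    """Researchers' reading: 5% of people WITHOUT the disease test positive."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

researchers_reading = chance_disease_given_positive(
    prevalence, false_positive_rate, sensitivity)

# Other reading: 5% of POSITIVE results are wrong, so the rest are right,
# and the prevalence figure plays no role at all.
other_reading = 1 - false_positive_rate

print(f"researchers' reading: {researchers_reading:.1%}")  # about 2%
print(f"other reading: {other_reading:.0%}")
```

On the second reading the answer of 95% requires no computation, which fits the suggestion that respondents who gave it may have been interpreting the question differently rather than reasoning badly.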

In general if you give intelligent people who should be reasonably knowledgeable questions which many of them miss, it is perhaps a sign there is something wrong with the questions and how you are interpreting the results you get from using them.  And it seems to me you should normally ask people why they chose the answer they did instead of just speculating why they did from their answer.

To that end, I presented some students of mine with the letter/number question above as the first question in a three-question quiz, with the beer/age question being the second question on the quiz.  The third question asked them to explain their answers to the first two questions.  Their answers to the third question were much more indicative in many cases (than were their answers to the first two questions) of whether they were being rational or not in determining their answers to the first two.  The students’ answers to the third question are in blue font:

 

First student answer: 

1. If a card has a vowel on one side, then it has an odd number on the other side.

a.  Since E is a vowel, then it has an odd number on the other side.

b. Since 5 is an odd number, then it has a vowel on the other side.  [This is not what is given in the problem, and the student has made one of at least two possible errors in thinking it is true: 1) they assumed the converse of a true 'if then' proposition is true, which would be a logical error, or 2) they misunderstood or misremembered the rule to be that vowels and only vowels have odd numbers on the other side, which would be an error of understanding, interpretation, or memory, not of reasoning.]

c. Therefore, E and 5 should be turned over because the vowel will have an odd number on the other side, and the odd number will have a vowel on the other side. [But if they misunderstood the rule or thought the converse of a true 'if then' proposition is necessarily true, that does not explain why they then failed to say you need to turn over the card with the "4" on it or the one with the "C" on it, which would logically follow that you would have to do under that understanding.]

 

2. If a person is drinking a beer, then he must be over 20 years old.

a. Since you must be over 20 to drink beer, then the person drinking beer must be checked.

b. Since you must be over 20 to drink beer, then you must check to be sure to [check to see the] 16 y.o is not drinking beer.

c. Therefore,  anyone drinking beer must be over 20 and anyone 20 and under must not be drinking a beer.  [The student appears to have the logic of this one correct while missing the identical logic of the vowel/odd number one.  But I will argue and explain later that the student may not be using logic in the beer/age case, but simply pointing out known facts.]

 

Next Student

1: If you turned over the E (first card), the 5 (the third card), and the 4 (the last card); you would know if the vowel had an odd number and you would know if the odd number had a vowel [the part I underlined is irrelevant], and you would know from looking at the 4 if the statement was false if it had a vowel behind it. We would not need to worry about the c (the second card) because it doesn't matter what is behind that card, we are only looking for cards that are both odd numbers and vowels. [This student has reasoned through this in a partially correct way that has the right sort of idea for deciding, but has either got confused about the original given relationship, has done the reasoning incorrectly, or didn't keep track of her results accurately.  The statement I put in bold purple should have read "we are only looking for cards that are both even numbers and vowels", not odd numbers and vowels.  But given what she did say, that does not explain why she said to look at the "5" card, for under that (mistaken) view, you would only need to look at the E and the 4.  So in one way it is possibly a reasoning error but even if not, she did not see the significance of her reasoning.]

 

2:  I chose drinking beer card and the 16 years old card. The reason I chose these two is, the law is for people drinking beer, so, we would need to turn that card over to check to see if their age is over 20. You would also need to flip the last card because that person is under 20, and you can see if they are drinking beer. [Because she obviously understands the logic here, it seems most likely to be the content in the letter/number problem, and her inability to keep track of it, that made her make all the errors she did in answering it.]

Next student:

To verify it is true, you turn over the C card. If there is an even number, you know the claim to be true.  [This is incorrect, and shows either a misunderstanding or lack of logic, but it is not clear which.] But, we aren’t done. In order to make sure that the claim holds, we have to test the statement for falsity as well. This is the step most people don’t realize they must do.

if I turned over the 5 card and found a vowel on the other side, it would mean that the claim is false. I must turn over the 5 card in order to check this, or I cannot be sure that my claim holds strong in all possible scenarios presented.  [This is mistaken because she seems not to know that the vowels are supposed to have odd numbers, not even ones, so she has the right idea, but the facts wrong.]

We do not need to check the E card, because it has no vowel [clearly she lacks the knowledge that E is a vowel] and we are only talking about cards with vowels. We also do not need to turn over the 4 card,

The reason we don’t need to turn the 4 card over, is because our claim is not talking about what we find on the reverse side of even numbered cards. All we need to check is that there are no vowels paired up with odd numbered cards, [here she has clearly got confused about what is supposed to be on the back of vowels -- what you have to check is just that no vowels are paired up with even numbered cards] because those sort of combinations are in direct conflict with our original claim.

IF there was a VOWEL on one side, THEN it has an odd number on the other side. [Here she is saying the original directive correctly, but it has no carryover for showing her she was saying it incorrectly previously.]

It *does not* say that all even numbers must be paired with vowels. It *does not* say ‘if there is an even number on one side, then there is a vowel on the other side’.  [But it implies that if there is an even number card, it must have a consonant on the other side, so either she is just confused here or is making a logical error about what is relevant, but the prime directive is just really difficult to keep in mind, so it is difficult to tell whether someone has confused it or made a logical error, just from the answers they mark.] It only says that, if there is a vowel on one side, then there is an odd number on the other side.

 The beer problem is exactly the 1st problem,

“If a person is drinking beer, then they are over 21 years of age.”

In the beer card problem, we first need to turn over the beer card. Because if the age on the reverse of the card was less than 21, the person would be breaking the law. The other card we need to turn over is the 16 card because if the 16 year old is drinking beer, they are breaking the law. We can’t know if they are breaking the law or not until we turn over the card.

We do not need to turn over the soda card because it isn’t alcohol, anyone, both under and over 21 can drink it. We also do not need to turn over the 25 card because adults can drink soft drinks as well as alcohol there is no need to check what the person who is of legal age is drinking.  [She has this right and sees correctly it is the same problem as the letter/number problem, but she clearly could not keep the letter/number relationships in mind, and perhaps did not try to write them down or work them out in some diagrammatic  way to help her keep from confusing them.  Each time I myself do the letter/number problem or comment on the student answers, I can easily get it messed up too if I try to just intuit it strictly in my head without being more systematic in keeping track of the relationship and which combination is the one given initially.]

Another student:

Question 1: Says the card that has a vowel has an odd number on the other side. So the clear answer is card "E" the vowel, and card "5" the odd number. [This student is just stating what he chose, but not explaining it.  So I cannot tell whether he chose rationally or luckily or coincidentally.]

 

Another student:

In question 1, all of the cards have to be turned over to verify whether or not the statement is true. In question 2, the cards with "drinking beer" and "16 years old" must be turned over. [This student has the first one wrong and the second correct, but without knowing the reasons for either answer, it is impossible to tell whether they arrived at their answers through correct reasoning in the right one or incorrect reasoning in the wrong one or were just intuiting/guessing or what.]

 

Another student:

I picked E since it would be the most direct proof of whether or not the claim was true. If there is an odd number on the other side, then I know the claim is true and if not, then it would be evidence that it's false. In the case of number 4, it could help to indirectly test the claim. If it has a vowel on the backside, then that would show the statement to be false. I didn't pick C as the statement does not claim that odd numbers cannot be on the back of consonants, only that if the letter is a vowel the number will be an odd one.  [However, this student made a careless error by marking “C” even though saying it was not to be marked.  I asked her later and she had checked it originally by intuition and then after reasoning through it in writing, she intended to uncheck it and forgot.  And that means just seeing her marked answer would have been mistaken evidence she was not rational.  Yet she reasoned it out perfectly and just made a 'careless' error after doing so.]  The evidence gathered could be used as indirect evidence (by not refuting the claim), but I don't believe it to be in this case. With 5, I initially thought it'd be correct to turn it over, but the thing is, the logic used would be a converse of the original claim: "That if a card has an odd number, then a vowel will be on the opposite side." That isn't necessarily true because, as I pointed out with C, the letter on the back could very well be a consonant since the claim didn't state that odds can ONLY be on a vowel. So, it wouldn't exactly be concrete evidence for the claim either way.

 

Question 2.

 

I only picked "drinking beer" since the law specifically states that I would have check the card only if the person is drinking beer. I shouldn't go around checking every underage person about their drink unless I know that they are drinking beer. That is all law gives me to permission to check. I could check the 16 year old card just to be sure that they're not drinking beer, but the only one I absolutely have to check is the one I know is drinking beer. Drinking coke isn't against the law and if someone is over twenty, I have no reason to check them.  [In this one, she got it wrong because she brought into her reasoning the issue of invading privacy without probable cause in the legal sense.  She was reasoning about the logic of the problem correctly, but didn't realize it was only a logic puzzle, not a legal one as well.  She basically reasoned through the relevant part correctly, but misinterpreted the point of the question sufficiently to mark the wrong answer.]

 

There are a number of ways to determine the right answer in the letter/number question.  The Samuels, Stich, Tremoulet article gives one explanation; the students here give their explanations; the four murdered/dead -- antecedent/consequent forms above can work also.  But there is another way to do it too, using more general principles from derivations in propositional logic.  The original directive in the letter/number problem is essentially that "If you have a vowel, then you must have an odd number".  This is an "if then" conditional statement.  As explained above, in formal propositional logic, the "if" part of any conditional is called the antecedent, and the "then" part is called the consequent.  This was explained above in regard to the murdered/dead examples, but the more general principle follows from the fact that  "If then" statements are only false in formal logic when the antecedent is true and the consequent is false.  That means "if then" conditionals are only false when you have the combination of a "true antecedent and false consequent"; in all other cases they can be true.  That means or implies for "if then" statements to be true you must have "not-(true antecedent + false consequent)".  By logical principles that one can see if one works through them, that turns out to be the same as saying they are true whenever you have either a false antecedent or a true consequent, which in the case of the letter/number problem means having either a consonant (i.e., not a vowel) or an odd number.  Since either of those things will make the card fit the principle no matter what is on the other side, the consonant cards and the odd number cards do not need to be examined.  So in the letter/number question, the only combination that would make the directive false is a vowel with an even number, which is essentially what some of the students said though they were not taught the formal logic manipulations.  
That means you have to check to make sure no card has a vowel and an even number.  That means you only need to check the vowel cards to make sure they have no even number, and the even number cards to make sure they have no vowel.  The consonants don't matter and the odd number cards don't matter.  Consonants can have odd or even numbers, so checking them proves nothing.  And odd numbers can have vowels or consonants, so checking them proves nothing.  But that is harder to keep in mind than knowing that you don’t need to check people drinking coke because it doesn’t matter whether they are under 21 or not, and it doesn’t matter whether anyone at least 21 is drinking beer or not.  And you don’t need to go through the logic of it for beer because you know the factual parts.
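The "only one forbidden combination" approach just described can be sketched in Python. This is an illustrative sketch only; the function names and the way the cards are represented are assumptions made for this example. A card needs to be turned over exactly when something that could be on its hidden side would produce the forbidden combination.

```python
VOWELS = set("AEIOU")

def breaks_letter_rule(letter, number):
    """The only forbidden combination: a vowel paired with an even number."""
    return letter in VOWELS and number % 2 == 0

def breaks_beer_rule(drink, age):
    """The only forbidden combination: beer paired with an age of 20 or under."""
    return drink == "beer" and age <= 20

def must_turn_letter_card(face):
    if face.isalpha():
        # Hidden side is a number; try every digit.
        return any(breaks_letter_rule(face, n) for n in range(10))
    # Hidden side is a letter; one vowel and one consonant cover both cases.
    return any(breaks_letter_rule(letter, int(face)) for letter in "AB")

def must_turn_drink_card(face):
    if isinstance(face, str):
        # Hidden side is an age; one underage and one legal age cover both cases.
        return any(breaks_beer_rule(face, age) for age in (16, 25))
    # Hidden side is a drink.
    return any(breaks_beer_rule(drink, face) for drink in ("beer", "coke"))

letter_picks = [c for c in ["E", "C", "5", "4"] if must_turn_letter_card(c)]
drink_picks = [c for c in ["beer", "coke", 25, 16] if must_turn_drink_card(c)]
print(letter_picks)  # ['E', '4']
print(drink_picks)   # ['beer', 16]
```

The two checks have exactly the same shape, which is the sense in which the problems are logically identical; the difference is only in how easy the forbidden combination is to hold in mind.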


Now it seems to me that one is being rational no matter which of those methods (or possibly some other method that works correctly) one uses.  One does not need to know formal logic either using general principles and manipulations or even which antecedent/consequent forms are always valid and which are always invalid.  Those are just formal descriptions and reminders or shortcuts of the logic involved.  The physicist Richard Feynman once said that the rules of algebra were just ways to teach people who couldn't do math to get the right answer.  They were not math reasoning themselves but formal representations of some math reasoning.  I think the same is true for logic and formal logic manipulations.  In some cases people can get a right algebra or logic answer by doing the proper manipulations by recipe, but that does not make them 'rational' or 'mathematical' in the sense of being able to understand the reasoning or logic of what they are doing.  And not knowing the rules of logic or algebra does not mean one is not rational in determining the right answer through some other method. 


However, there are some methods of getting the right answer which do not show rationality of the sort in question.  Obviously, cheating by finding out someone else's right answer does not.  But neither would some cases of experience that show that a new problem is so similar to one whose answer you know that you can make a very simple analogy to get the answer.  Once you have seen the problem with one form of reasoning by thinking about it, you can often see that different arguments (i.e., different subject matter) of the same form have the same flaw -- even if you do not reason through the specific subject matter flaw itself.  For example, once you see that the truth of "All A are B" does not necessarily imply that "All B are A" because you see in cases such as these that "All dogs are animals" does not show "All animals are dogs" or that "All squares are rectangles" does not mean that "All rectangles are squares" or that "All murdered people are dead" does not mean that "All dead people were murdered", you can know immediately that even if "All good things are things that make people happy" that would not necessarily show "All things that make people happy are good things."  And you know that, even if you don't know, for some other reason, whether all things that make people happy are good things or not.  You only know it doesn't follow from the statement that "All good things are things that make people happy."  Some people have more trouble generalizing this sort of thing than others, and they have more trouble transferring it to different subject matter, but I don't think that makes them necessarily less rational.  And it certainly does not mean they cannot rationally solve problems of the same form by simply knowing or thinking more about the specific cases.  The relationship between knowledge and reasoning is not always straightforward.


In my essay "Words, Pictures, Logic, Ethics, and Not Being God" I point out that logic is unnecessary if one knows the truth of a statement directly.  My standard example is in playing poker -- you have to use logic and probabilities to try to determine as best you can what your opponents' cards are, but you don't need logic or probability to know what your own cards are, because you simply look at them.  Or if a spectator sees each player's hand, the spectator knows who has the winning hand if the cardholder does not fold, but plays it out.  Short of having the highest possible hand, the players themselves cannot know who has the highest hand but have to try to deduce it from whatever evidence they can muster.


And the more knowledge you have about any subject, the less you have to rely on logic to know things about it.  The movie Sully illustrates the difference between what might be called 'pure' logic and knowledge that comes into play in evaluating evidence.  In the movie, based on the true story of airline Captain Chesley "Sully" Sullenberger's safe landing of US Airways Flight 1549 on the Hudson River after losing power in both engines shortly after take-off, due to flying through a flock of geese, the National Transportation Safety Board (NTSB) held hearings that presented evidence that instead of the landing's being a life-saving miracle of skill and knowledge, it put all the passengers' and crew's lives in unnecessary jeopardy.  Their evidence for that claim was that the simulations they conducted after the event showed that, given the same altitude, speed, wind conditions, distance from two different airports with available open runways, etc., the plane could have safely landed at either without risking the incredibly dangerous water ('crash'-) landing.  The argument implied in the movie was basically:

1) In simulations using the data from the plane at the time of the engine failures from collisions with the geese, the 'plane' landed safely at La Guardia or at Teterboro airports.
2) The actual plane under those conditions would have been able to do the same thing as was accomplished in the simulators.
Therefore 3) the water landing was unnecessary.
4) The water landing was extremely hazardous and no commercial airliner previously ever survived such a landing.
Therefore 5) Captain Sullenberger unnecessarily jeopardized the crew and passengers on that flight.

It certainly appeared in the movie that Sully was guilty of an error of judgment that would have cost him his job and his pension.  Two of the simulations were conducted by the airplane manufacturer with live video feeds showing them to the NTSB hearing as they were being conducted, and in the first, the simulated plane landed safely at La Guardia and in the second, it landed safely at Teterboro.  This mirrored the previous results conducted by others for the NTSB, and it was impressive, dramatic evidence for the argument. 


But the above argument is invalid, and Sully was able to show almost immediately, dramatically in the movie, that it was not only invalid but that conclusions 3 and 5 were actually false.   However, he was able to do that from more than just the logic of them, which to most of us, and to the NTSB seemed airtight (no pun intended; well ... maybe a little bit intended).  In the movie, his initial response to the evidence of the simulator flights was "Can we now get serious!"  He knew that the simulated flights did not reflect the actual total conditions, though they reflected the technical flight conditions. But there was more to it than just the technical flight conditions.  The premises -- the evidence statements -- 1 - 4 above were all true, but were not the whole truth.  And what was missing invalidated the conclusion, and also would show it to be not just unproved, but actually false.  He pointed out that they were trying to assign human error to him but left out all the human elements from the simulations. 


First, he asked how many practice runs the simulator pilots made with the data before this live broadcast simulation.  The answer was 17.  Sully pointed out that he and his co-pilot didn't have that luxury or any such rehearsals and practice runs.  He also pointed out that in the simulations, immediately after contact with the geese and seeing both engines failed, the plane was immediately turned to La Guardia or to Teterboro, with no questions about what had happened and whether the engines could be restarted or whether that was a risky choice or not which might have put in jeopardy the lives of thousands of people over and above the lives onboard the aircraft.  It was essentially a test run with hindsight as to the technical problem and without regard for the catastrophe of failure.  Sully and his co-pilot didn't have that luxury either.  There were things to check and decisions to be contemplated that took time.  Sully asked to have the simulations re-run with a time factor built in for the decision making after the engine failures.  The NTSB directed 35 seconds to be waited after the engine failures before the plane was turned.  In actual fact, it took longer than that, but Sully was willing to see what happened with the simulations when just the 35 seconds were added.  In both those simulations, the plane crashed on the way to the airports.  They could not clear the buildings or make the distance.  So while it was true that had they immediately turned, they would have been able to land safely, there was no way to know that at the time, and making the decision to head to either airport could have been a disaster of a far greater magnitude than crashing catastrophically into the Hudson and losing everyone onboard.  The problem was not just one of what was safer for the passengers and crew but also what was the most reasonable option for other possible victims, on the ground or in buildings, as well.  
No matter what was chosen, there would be a point of no return from its consequences fairly soon (pun intended) but the actual consequences were unable to be known at the time, so the first decision made had to be pretty much correct.


Now 'pure' logic could have shown the above argument to be flawed, but it is unlikely that, without special knowledge of the situation, the logical flaw would have been apparent.  And, more important, I believe, Sully probably didn't use logic to discover the flaw, but 'immediately' saw the flaw (from his knowledge and experience) and could then show what was invalid about the logic.  I will give a different, far simpler, example of this sort of thing below.


An omniscient pilot would have turned back immediately (but also would not have taken off before geese got out of the way), but pilots aren’t omniscient and have to use reason, knowledge, experience, and intuition based on experience (or what Sully referred to in the movie as knowing that they couldn't make it to either airport by 'eyeballing' the flight paths required).


But the implied argument is actually more than 1 - 5 above.  It is:

1') In simulations using the data from the plane at the time of the crash the 'plane' landed safely at La Guardia or at Teterboro airports.
2') The actual plane under those conditions would have been able to do the same thing.
Therefore 3') the water landing was unnecessary.
4') The water landing was extremely hazardous and no commercial airliner ever survived such a landing.
Therefore 5') Captain Sullenberger unnecessarily jeopardized the crew and passengers on that flight, and he should have known that.

But you cannot derive what he should have known from 1 - 4 without making some of the premises include that he should have known their content.  E.g., for 2' you would have to add something like "The actual plane under those conditions would have been able to do the same thing, and Captain Sullenberger should have known that," the second part of which is patently false at the time of the problem.  Similarly, 3' would have to be "The water landing was unnecessary and Captain Sullenberger should have known that."  And the latter part of that is also patently false at the time of the problem.  And this still leaves out the question of whether avoiding the risk of a failed return that would hit a crowded building or block in "one of the most densely populated places on the planet" was worth the known hazard of a river landing.  The simulator crashes had no potential risk of killing actual New Yorkers in buildings or square block areas.  Simulations are safe no matter how they turn out (though I know of one humorous real-life exception); actual planes not being able to make it back to the airport in New York or that part of New Jersey, not so much.  So even if turning back immediately, as in simulations, would have allowed landing safely, you would have to add to the argument that he should have known that, in order to justify the claim that he should have known to turn back.  But arguably he couldn’t have known that, though the simulator pilots knew it after practice runs and after engineers made all sorts of calculations Sully didn't have the time or computing means to make.  Turning an invalid deductive argument into a valid one makes an argument unsound if the only way to do it requires adding at least one false (or uncertain/unknowable) premise. 
That will always be the case when the conclusion is false, because the argument then has to be either invalid or have at least one false premise.  Since the conclusion turned out to be false in both arguments, it meant any argument having that conclusion that had all true premises has to be incomplete and invalid, and that if premises were added to make it complete and valid, at least one of the premises would have to be false.  Either way the argument would be unsound and the conclusion not proved.  But it would be difficult to know that if one were not an experienced pilot, even if one were a logician.  And in fact, conclusion 5 (original argument) is true only in the same sense that he unnecessarily jeopardized the passengers and crew by taking off when he did, given that he was going to hit the geese and lose both engines.  It depends on what you count as 'unnecessarily jeopardizing' people.  The jeopardy was only unnecessary under a state of perfect knowledge that could not be had at the time and which could only be seen later once one already knew what would happen and immediately knew the best alternative.  But in that sense of unnecessarily jeopardizing people -- doing what (would have) turned out to be safe -- then Sully did not jeopardize anyone at all because he did get them all back alive and well.


But, getting back to determining whether people are rational or not based on whether they solve a particular problem or set of problems correctly, not being able to solve a logic problem at a particular time may show lack of the right creative inspiration, not lack of reasoning ability.  There are plenty of “brain teaser” logic puzzles that are extremely difficult to discover right answers about, but very easy to see when the right answer is pointed out (and explained, if necessary).  E.g., clearly the father of your father is not your father; and, though perhaps less obvious, the cousin of your cousin may or may not be your cousin also.  But is the sibling of your sibling also your sibling?  Most people would say "yes," but the correct answer is “not necessarily.”  The sibling of your sibling may be your sibling, but the sibling of your sibling may also be you; and you are indeed the only sibling of your sibling if your parents only had two children.  And, obviously, you are not your sibling.
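The sibling puzzle can be checked mechanically.  The following is only a sketch in Python, assuming a hypothetical two-child family (the names "you" and "sam" and the helper siblings_of_siblings are invented for the example): it lists who counts as whose sibling and then follows the relation two steps.

```python
from itertools import permutations

# Hypothetical two-child family; the names are for illustration only.
children = ["you", "sam"]

# Sibling pairs: every ordered pair of two *different* children of the same parents.
sibling_of = set(permutations(children, 2))


def siblings_of_siblings(person):
    """Everyone reached by taking a sibling of a sibling of `person`."""
    return {c
            for (a, b) in sibling_of if a == person
            for (b2, c) in sibling_of if b2 == b}


print(siblings_of_siblings("you"))    # {'you'}
print(("you", "you") in sibling_of)   # False: you are not your own sibling
```

In this family the only sibling of your sibling is you, and you are not in the sibling relation with yourself, so "sibling of a sibling" does not guarantee "sibling."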

Or consider this: A young boy, his father, and his grandfather are riding in a car that gets into a horrific accident.  The father and grandfather are killed immediately, and the boy is in bad shape but still alive; he needs surgery.  He is transported immediately to a nearby trauma center that has one of the country’s best surgeons on duty.  But upon seeing the patient, the surgeon says “I cannot operate on this patient because he is my son.”  How can that be?  Most people cannot figure out an answer to this, but everyone immediately recognizes how the correct answer  makes perfectly good sense once they are given it -- which illustrates once again the difference between the reasoning and the creativity of problem solving.  That one is not creative enough to find the logical solution does not mean one does not have the reasoning ability necessary to solve it or understand it.  The answer is that the surgeon is the boy's mother.  I have had even ardent feminists not be able to answer it and then, upon hearing the answer, be immediately really upset at themselves for being unconsciously biased in ways they demand others not be.

But there is one more wrinkle to this problem as well.  If the riddle had been that a girl, her mother, and grandmother were in the car, and there was that same horrific accident in which the mother and grandmother both were killed instantly and the girl was taken to the hospital, and the surgeon came out and said "I cannot operate on this patient because she is my daughter," the answer to "How can that be?" would be immediately obvious to most people -- not because they would use logic necessarily, but because being used to surgeons being men, they would immediately realize the surgeon must be her father.  If that even requires a 'deduction' at all, it is a very obvious one, tantamount to knowing (I think immediately) that if a man introduces someone as his child, that would be the same as saying he is the person's father (and that if he is wrong about either statement, he is also wrong about the other).
'Deducing' (if we are going to call it that) that the man is the child's father from his stating s/he is his child is certainly not the same kind of exercise in logic as deducing s/he is his child from their appearances and/or mannerisms being the same or from checking their DNA or from having seen the child with its mom and finding out the woman is the man's wife (though, of course, he could be her second husband and the child's stepfather).  Or suppose the hospital staff immediately recognizes the child, knows she is the on-call surgeon's daughter, and knows he would not operate on her; then it requires no deduction at all.  We would hardly call people rational merely because they knew an answer without having to deduce it. 

Or suppose one gets an answer to a 'logic' question correct because the problem presented is very similar to one that the person already knows the answer to.  The 'only' logic involved is seeing the similarity and then applying the techniques one knows.  If one figured out the answer by oneself the first time, that took creativity and logic most likely; but once one has done that, it does not (or it does not require quite as much).  If we asked the car accident riddle in America using three women in the car first, and then asked it with regard to the three men, most people would likely get both correct.  But even with as many women physicians as there are today in the U.S., very few people, even ardent feminists, get the answer to it right when it is simply asked about the boy, father, grandfather in the car.

So it is not clear to me that the people who give the wrong answer to the letter/number card problem are not rational or that the people who give the right answer to the beer/age one are.  And it is not clear to me that the people who cannot realize the surgeon is the boy's mother in the original riddle are irrational or that anyone who gives the right answer to it when it involves all women in the car, or who has been asked it about all men after being first asked it about all women in the car, is therefore rational.


Moreover, some problems have more than one answer, depending on how one understands them.  One such question was raised by Martin Gardner in Scientific American (see https://en.wikipedia.org/wiki/Boy_or_Girl_paradox), where his answer was challenged.  The question is “If a family has two children and you know one of the two is a boy, what are the odds that both children are boys?”  Many people responded the odds were 50/50, since there is a 50/50 chance the other child is a boy.  But Gardner’s answer was there is a 1 out of 3 chance.  It turns out that if you are talking about a particular family where you know one child is a boy, the odds are 50/50 that the other child is also.  But if you are just asking about families with two children in general that have one boy, 1/3 of all such families will have two boys because there are three equal possibilities: the older child is a boy and the younger one is a boy; the older child is a boy and the younger one is a girl; the younger child is a boy and the older one is a girl.  There are roughly two families with one boy and one girl for every family with two boys.  There are roughly equal numbers of two children families which have a) two girls, which have b) two boys, which have c) older girl younger boy, and which have d) older boy, younger girl.  But since the question precludes families with both girls, that leaves twice the number with one girl and one boy as the number with both boys.   So if a psychologist asks a question with only one interpretation and answer in mind, but the subject has a different interpretation and gives a correct answer to it, the psychologist will misread the knowledge or reasoning ability of the subject. 
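The two readings can be counted out directly.  This is a minimal sketch in Python under two assumptions that are mine rather than part of the puzzle as stated: that each child is equally likely to be a boy or a girl, independently of the other, and that "a particular family where you know one child is a boy" is modeled as knowing that one specific child (here, the older) is a boy.

```python
from fractions import Fraction
from itertools import product

# Assumption: boy and girl are equally likely, independently for each child.
# Each family is (older, younger): BB, BG, GB, GG, all equally likely.
families = list(product("BG", repeat=2))

# Reading 1: among all two-child families with at least one boy,
# what fraction have two boys?
at_least_one_boy = [f for f in families if "B" in f]
p_general = Fraction(sum(f == ("B", "B") for f in at_least_one_boy),
                     len(at_least_one_boy))

# Reading 2: one specific child (say the older) is known to be a boy;
# what fraction of those families have two boys?
older_is_boy = [f for f in families if f[0] == "B"]
p_particular = Fraction(sum(f == ("B", "B") for f in older_is_boy),
                        len(older_is_boy))

print(p_general)      # 1/3
print(p_particular)   # 1/2
```

Both numbers are correct answers; they are simply answers to different questions, which is the point about a subject and a psychologist having different interpretations in mind.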

 

Some of Piaget’s questions of children to see whether they understood concepts at different ages are examples of this, because he seemed to have particular interpretations of the problems that were possibly not the interpretations of the children, and it is entirely possible that they did not have his interpretations because their experience was limited, not because their conceptual ability was undeveloped. 

One such question is which of two glasses will hold more water, showing the child a tall glass and a shorter one that is fatter (i.e., has a larger circumference).  Younger children tend to say the taller glass.  But that does not show they do not have a concept of volume.  It may show that they think the question is about which will hold the taller amount of water, or it may mean they don’t realize that increasing the diameter of a cylinder by some proportion adds more volume to it than increasing its height/length by the same proportion does.  Most adults probably don't even know that, and don't know it is because the volume of a cylinder is πr²h, so that whatever factor you increase the radius by is squared but the factor you increase the height by is not.  Obviously children do not know that, but children also do not generally even have the experience of pouring liquids from one shape glass (or container) into another of a different shape but apparently similar size and being surprised that, or which, one holds more.  The fact children have no reason to know the taller glass might hold less does not mean they do not know what "less" means or that they do not have the concept of (greater) volume.  They simply could be thinking height matters more than width, since they have no experience to believe otherwise and height tends to be more noticeable or enticing.  The fact that most adults cannot tell you how many jelly beans are in a large jar, no matter what shape the jar, and the fact that guesses will probably be in a wide range among different adults, does not mean those adults do not know what volume means or what "number of jelly beans" means, or that they do not have the concept of either.
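To make the arithmetic concrete, here is a small sketch in Python of the cylinder formula, volume = π × radius² × height.  The glass dimensions are hypothetical numbers picked only to show the effect; they are not taken from any of Piaget's experiments.

```python
import math


def cylinder_volume(radius, height):
    """Volume of a cylinder: pi * r^2 * h."""
    return math.pi * radius ** 2 * height


base = cylinder_volume(1.0, 1.0)
print(cylinder_volume(2.0, 1.0) / base)   # 4.0: doubling the radius quadruples the volume
print(cylinder_volume(1.0, 2.0) / base)   # 2.0: doubling the height only doubles it

# Hypothetical glasses (centimeters): a tall thin one and a short fat one.
tall_thin = cylinder_volume(radius=3.0, height=12.0)
short_fat = cylinder_volume(radius=4.5, height=8.0)
print(short_fat > tall_thin)              # True: the shorter glass holds more
```

Here the shorter glass is only half again as wide as the tall one yet holds half again as much water despite being a third shorter, which is not something one would expect to know without either the formula or the experience of pouring.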

Similarly showing a child two strings of equal length -- one stretched out straight and the other in a serpentine configuration -- and asking which is longer, does not show that the children who pick the stretched out one do not have a concept of length.  It may show they don’t realize how much longer a curved line will look if straightened out.  You cannot likely choose shoe laces for your shoes on the basis of their length; typically you have to be told which lengths fit shoes with which number of eyelets.  And most of us could not even choose a belt that would fit us or someone else just by looking at different length belts without knowing their sizes in inches, and it is extremely difficult to guess the correct length of various curved lines. 

Or it may be that the child thinks “longer” means “further from the starting point” (as in ‘as the crow flies’), in which case the stretched out straight string is longer than the serpentine one (no matter how long the serpentine one would be if stretched out straight) whose other end does not extend as far from the starting point.  And the fact adults may not mean it that way in a case like this is something one learns simply from experience with adults, not necessarily from the acquisition of an intellectual or conceptual ability one didn't have.  Without understanding the child's reasoning, which may or may not be easy to ascertain, it is likely a mistake to assume it is deficient in some way. 


One should be very skeptical of any claim by a social scientist that some precise objective test captures what is represented or meant by any subjective quality and judgment, or that it measures how we make the judgment and whether we do it reliably or reasonably or consistently or not.

[1] He then goes on to explain why so many people will vote for Donald Trump because of this, in order to protest against and thwart both the Democratic and the Republican Establishment, because Trump seems to these voters to demonstrate understanding of the people’s frustration with lack of opportunity. [Return to text.]


[2] This omits for here how it is right to treat or take care of those who cannot contribute much because of youth, old age, disability, illness, etc.  Those are important issues, but separate from what it means to be an economically mobile society. [Return to text.]


One of the best bomber pilot instructors for the U.S. during WWII had some sort of certification about to expire, and he was ordered to take a simulator test immediately upon returning to the base, even though he had been flying real aircraft for so many hours that he was totally exhausted.  He fell asleep in the simulator and woke up to the sound of alarms sounding that he was in a fatal dive.  So, instinctively he bailed out and pulled his ripcord as he jumped what turned out to be the couple of feet down to the floor, to the never-ending delight of all his friends who reminded him of it every chance they got.  He may have been the only pilot ever to bail out of a simulator.  Fortunately he did not injure anything but his pride. [Return to text.]

