Geometers, Scribes, and the structure of intelligence

2017-07-17

When people talk about general intelligence in humans, they tend to talk about measured IQ. While a lot of variation in IQ is really just variation in brain health, and probably related to variation in general health, there are at least two distinct modes of general intelligence in humans: fluid intelligence and crystallized intelligence.

Fluid intelligence is pretty much anything you can use a spatial metaphor to think about, and is measured pretty directly by Raven's Progressive Matrices. It's used for puzzle-solving.

Crystallized intelligence, on the other hand, relies on your conceptual vocabulary. You can do analogical reasoning with it – so it lends itself to a fortiori style arguments.

I don't think it's just a coincidence that I know of two main ways people have discovered disjunctive, structural reasoning – once in geometry, and once in the courts.

Geometry and the rules of structured argument

Supposedly, in ancient Greece, it was common for temples or other religious sites to have an inscription above the door saying, “let no impious (or, sometimes, unjust) person enter.” But, according to legend, above Plato’s academy was the inscription, “let no ungeometrical person enter.” No one unschooled in geometry. Why? Because mathematical reasoning was believed to be a prerequisite for serious philosophical inquiry.

This legend has some origin in fact. The Greek philosophical tradition was originally not distinct from mathematics – the Pythagoreans were a cult of mathematician-musician-philosophers, and Plato himself, in the dialogue Meno, has Socrates demonstrate a proposition in geometry as a paradigmatic case of how one might come to truly learn.

Nor is it a coincidence that the Greeks gave the science of shape, quantity, and number, the name mathematics, Greek for “that which is learnable.”

Nor is it a coincidence that J.S. Bach, whose music is a regular favorite of mathematicians, intellectuals, and people on psychedelic trips, whose music is more like a structured argument, more symmetrical, than any other great music that's survived intact, who was one of the inspirations for the great modern mathematical artwork Gödel, Escher, Bach, was also a member of the Pythagorean Society.

How does spatial reasoning lead to formal, logical reasoning?

Perhaps at some point before the invention of mathematics as a subject, people needed to reason about the shapes of stones in order to do architecture, or to construct tools or devices. Eventually, to survey land for planning or tax assessment.

At first, your spatial intuitions will be good enough. You can rotate objects in your mind to see whether they'll fit together properly, before constructing them, as long as they're only a few, and not too complex. In Use the Native Architecture, Marcello Herreshoff points out that mathematical reasoning is much easier if we use our minds' native hardware for solving math problems, instead of algebraic formalism:

Some things the brain can do quickly and intuitively, and some things the brain has to emulate using many more of the brain’s native operations. […] In particular, visualizing things is part of the brain’s native architecture, but abstract symbolic manipulation has to be learned. Thus, visualizing mathematics is usually a good idea.

When was the last time you made a sign error?

When was the last time you visualized something upside-down by mistake?

I thought so.

But eventually you come up against problems that have too many parts to just directly use your spatial intuitions. So you come up with rules of thumb, like that the circumference of a circle is about thrice its diameter. But these will not always yield consistent results.

It’s not enough that this part fits, and separately that part fits – to build an archway, or even just a colonnade, all the parts have to fit together at the same time.

With enough investigation, you can start to develop rules that generalize better, by composing them from simpler rules that are easier to check in the general case. If your spatial intuitions are already sure that triangles have this property, and they're sure that things with this property have that property, then you know that triangles have that property too. This is enough for beautiful purely visual proofs like the classic visual demonstration of the Pythagorean theorem, which uses only one word: "Behold!".

Spatial reasoning has the nice quality that you're always already thinking structurally. A line connecting A and B also always connects B and A. An argument that one angle in a triangle is large is an argument that the sum of the other two must be small. It's just a matter of backing out the attributes of the thing, from your intuitions, and then using that to extend your predictive power to places your intuitions aren't big enough for.

If you pursue geometry long enough, and have enough of a tradition of verbal articulation – say, because you have a tradition of public deliberation through debate, like the Athenians – you eventually arrive somewhere very different. Euclid's Elements starts with a fairly tractable set of rules and definitions, and demonstrates a series of logical propositions, proceeding from shapes, to magnitudes, to multitudes. It's not obvious, starting out, that multitudes are at all the same sort of thing as the things your spatial reasoning works on – for one thing, they're discrete, not continuous – for another, they don't seem to have dimensionality – and yet, the same sort of reasoning that was tested on geometry and found valid, can prove things about numbers.

And then, from this, it's easy enough to generalize the idea of formal argument, predication, syllogism, analogy, and logical implication.

The actual sequence of developments seems to support this, at least among the Greeks; the Pythagoreans substantially predate Aristotle's formalization of the rules of logic.

Nearly two millennia after Euclid compiled the Elements, it was still directly inspiring philosophers to aspire to a higher standard of rigor. Aubrey relates this story about Thomas Hobbes:

He was forty years old before he looked on geometry; which happened accidentally. Being in a gentleman's library Euclid's Elements lay open, and 'twas the forty-seventh proposition in the first book. He read the proposition. 'By G ,' said he, 'this is impossible!' So he reads the demonstration of it, which referred him back to such a proof; which referred him back to another, which he also read. Et sic deinceps, that at last he was demonstratively convinced of that truth. This made him in love with geometry.

Note what went on here. Hobbes thought the proposition absurd. Then, he saw that it referred back to other claims. He granted, that if those things were true, then this thing would surely be true as well. He looked back at the prior proofs, and saw that they were well-structured, so long as you granted their premises. Then, finally, he got to the beginning of the book, and found the earliest proofs persuasive, so long as he accepted the axioms – which seemed fine. Then, and only then, did he assent – immediately and enthusiastically – to the final proposition.

This is very different from being hit with a lot of independent arguments or pieces of evidence for a point of view. It’s not that Hobbes was eventually beaten into submission by a great many arguments. There was a single argument, with a formal structure. Nor did the argument gradually make an impression on him over time – he went from total incredulity to total belief, perhaps within a few minutes. Rather, he could affirm the structure of the argument – assess its validity – before he had properly evaluated its soundness and truth. Once he'd traced it back to a beginning that he was persuaded was sound, the information cascaded like a line of dominos, to the final proposition that he had been investigating in the first place.

This is not well modeled by the “marketplace of ideas” alone, or by the notion of independent “memes” competing for mindshare. Something else was going on here. And I think it’s quite likely that this was what made his Leviathan such a carefully argued and well thought-out book, even though it never attained the sort of certainty one might find in geometry.

Rotating a shape or imagining how multiple shapes might fit together is more than just associative reasoning – there is a structure to it, and anyone can make inferences on this basis. Excepting ungeometrical persons.

The rules of law

The second way humans reached towards generalized principles of reasoning seems to have come from our innate talent for keeping track of social norms.

Social groups have an interest in enforcing norms and punishing people who break the rules. People have to track what the rules are and make inferences about what would violate the norm. Even if they want to secretly violate norms, they need to know what's forbidden, and be able to track who could know. And they need to keep an internal record of what's where.

Crows, ravens, and other corvids are clearly not generally intelligent the way humans are – there's a lot less they can do, and their brains are much, much smaller than ours are. And yet, they seem to have some facility for analogical reasoning and prospective planning. They hide food, and don't want other crows to steal it. They can hide the food better if they keep track of when they might be observed (h/t Corvid Research). This requires keeping track of particular known facts, but also being able to think from multiple perspectives in order to reason about consequences. As a result, they can use tools and plan ahead, both faculties that seem related to what makes humans so powerful, faculties that require recursive and structured thinking.

While corvids seem to have a specialized capacity for inference and analogy about things related to food storage, concealment, and retrieval, humans seem to have a specialized capacity for inference and analogy about social rules.

The Wason card task is a classic test of facility at logical inference, and most people are not very good at it. But people tend to perform much better when the same logic problem is filled in with content about social rules. Sociopaths, however, showed much less of an improvement. The Economist's summary is good here, as is The Last Psychiatrist's summary. I'll quote The Economist here:

[The] first presentation might be of four cards, each with a number on one side and a colour on the other. The cards are placed on a table to show 3, 8, red and brown. The rule to be tested is: “If a card shows an even number on one side, then it is red on the other.” Which cards do you need to turn over to tell if the rule has been broken?

That sounds simple, but most people get it wrong. Now consider this problem. The rule to be tested is: “If you borrow the car, then you have to fill the tank with petrol.” Once again, you are shown four cards, one side of which says who did or did not borrow the car and the other whether or not that person filled the tank:

Dave did not borrow the car

Helen borrowed the car

Brianne filled up the tank with petrol

Kirk did not fill up the tank with petrol

Once again, also, you have to decide which cards to turn to see if the rule was broken.

In terms of formal logic, the problems are the same. But most people have an easier time answering the second one than the first. (In both cases it is cards number two and four that need to be turned.)

[…] When the two researchers probed the prisoners' abilities on the general test, they discovered that the psychopaths did just as well—or just as poorly, if you like—as everyone else. In this case the average score for all was to get it right about a fifth of the time. For problems cast as social contracts or as questions of risk avoidance, by contrast, non-psychopaths got it right about 70% of the time. Psychopaths scored much less—around 40%—and those in the middle of the psychopathy scale scored midway between the two.

The Wason test suggests that analysing social contracts and analysing risk are what evolutionary psychologists call cognitive modules—bundles of mental adaptations that act like bodily organs in that they are specialised to a particular job. This new result suggests that in psychopaths these modules have been switched off.
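To make the shared logical form explicit, here is a minimal Python sketch of the card task. The encoding is my own assumption, not something taken from the study: each card is a (label, side, value) triple, where side "P" means the visible face tells you whether the antecedent holds, and side "Q" means it tells you whether the consequent holds. A card can only expose a violation of "if P then Q" when it shows P true (the hidden side might be not-Q) or Q false (the hidden side might be P).

```python
def cards_to_turn(cards):
    """Return labels of the cards that could reveal a violation of 'if P then Q'.

    Each card is a (label, side, value) triple (a hypothetical encoding):
      side  -- "P" if the visible face is about the antecedent,
               "Q" if it is about the consequent
      value -- whether that antecedent/consequent is true on the visible face
    """
    to_turn = []
    for label, side, value in cards:
        shows_p_true = side == "P" and value
        shows_q_false = side == "Q" and not value
        if shows_p_true or shows_q_false:
            to_turn.append(label)
    return to_turn


# Rule: "If a card shows an even number on one side, then it is red on the other."
abstract_cards = [
    ("3", "P", False),      # odd number: antecedent false
    ("8", "P", True),       # even number: antecedent true
    ("red", "Q", True),     # red: consequent true
    ("brown", "Q", False),  # not red: consequent false
]

# Rule: "If you borrow the car, then you have to fill the tank with petrol."
social_cards = [
    ("Dave did not borrow the car", "P", False),
    ("Helen borrowed the car", "P", True),
    ("Brianne filled up the tank with petrol", "Q", True),
    ("Kirk did not fill up the tank with petrol", "Q", False),
]

print(cards_to_turn(abstract_cards))
print(cards_to_turn(social_cards))
```

The same function handles both versions, and in both it picks out the second and fourth cards, which matches the answer given in the quoted passage. Only the labels differ between the two problems.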

But, again, the human mind has limited scope. Even with native hardware designed to help on a class of tasks, if you're managing a large enough group, you need to formalize the structure. So courts started noticing patterns. Once they've decided a case on a principle, all cases where the principle applies even more strongly are implicitly decided the same way.

Law students, at least in the United States, are familiar with oddities such as the rule against perpetuities. The reason this rule is famous is that it pops up in lots of cases it seems like it shouldn't, because rules have to be consistently applied – nonobvious conclusions are the result of applying and composing simple principles.

The rule is basically that you can't create a trust to hold assets for many generations. The point of this rule is that without such a limitation, typical rates of return would quickly lead to a situation where trusts that just kept reinvesting their assets would dominate the economy, and be able to impose their will on the comparatively asset-poor living. Since this consequence seemed bad, lawmakers forbade it.
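The compounding worry can be sketched with a toy calculation. This is only an illustration under loudly stated assumptions: the function name, the starting share, and both growth rates below are hypothetical numbers I picked, not figures from any legal or economic source, and the model simply assumes the trust reinvests everything at a steady return that is higher than the growth rate of everyone else's wealth.

```python
def trust_share(initial_share, trust_return, other_growth, years):
    """Fraction of total wealth held by a trust that reinvests everything.

    Toy model (all parameters are hypothetical):
      initial_share -- trust's starting fraction of total wealth
      trust_return  -- annual rate at which the trust's assets compound
      other_growth  -- annual rate at which all other wealth grows
    """
    trust = initial_share * (1 + trust_return) ** years
    others = (1 - initial_share) * (1 + other_growth) ** years
    return trust / (trust + others)


# Illustrative inputs only: a trust starting at 0.1% of all wealth,
# compounding at 5% a year while everything else grows at 2% a year.
for years in (0, 100, 200, 300):
    share = trust_share(0.001, 0.05, 0.02, years)
    print(f"after {years:3d} years: {share:.1%} of total wealth")
```

Under these made-up rates the trust's share stays small for a long time and then grows to a majority within a few centuries. If the trust's return did not exceed the growth of everything else, its share would not rise at all, so the argument depends on that gap.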

However, there are lots of cases where someone could accidentally set something like this up. For instance, a legacy left in trust for the not-yet-born children of someone still living could easily have an unlikely but possible loophole that lets it last beyond the maximum legal term. As a result, it's common practice to include weird stipulations that definitely satisfy the rule, as upper bounds, to stop the whole thing from being struck down. A trust created in my grandmother's will terminates no later than twenty-one years after the death of the last surviving member of the British royal family who was alive when the trust was created.

This is nuts. In fact, the common practice of copying large volumes of legal boilerplate word-for-word comes precisely from a desire to avoid the need to engage in novel structural thinking, in order to avoid introducing errors of this sort. This sort of application is what happens when people try to copy old magical spells without understanding magic.

But while this sort of structural inference about social rules may be something of a lost art – I've been involved in contracts where the legal team of the other side was literally unable to explain the meaning of a clause in the contract they'd asked me to sign – the faculty that created this kind of text, when generalized and put to the right sort of work, is extremely useful. It enables coordination over long stretches of time and space. It is necessary for the meaningful rule of law. It enables later scholars to build on the work of earlier ones. Many types of contracts would be infeasible without it; how could you possibly expect an insurance contract to mean anything without structured thinking including complex conditionals?

The Talmud, a record of a very different legal tradition, provides some more examples. Like other legal traditions, it admits of analogical “a fortiori” or “kal vakhomer” argumentation – if X motivates Y, then a stronger version of X must motivate Y at least as much. There's an attempt to apply precedents consistently. But there are some other oddities that seem to me to capture the flavor of this style of structural thinking very well.

For instance, when discussing the evidence for a legal opinion, the Talmud will often address a series of arguments for the proposition, only to point out the flaws of each argument in turn, rejecting them one by one. Then, at the end, an argument is provided for which there is no refutation, so it is accepted as a valid justification for the opinion.

This is not the sort of thing one does if one is just trying to figure out what the law is. In that case, the Talmud would only consider the strongest arguments.

Nor is this the sort of thing one does to maximize the rhetorical force applied towards the favored conclusion. If the Talmud were trying to do that, it would have knocked down a series of arguments against the position. Instead, it only makes sense if you care about what constitutes an acceptable argument, not just which legal conclusion happens to be true in this case. If you wanted to be clear, not just on which things you think are true, but which premises they depend on and which premises they don't. In short, if you cared about the structure, not just the content. Because the structure affects every part of the law.

There's another quirk of Talmudic discourse that I find illuminating, and that is illustrated by this example from the Babylonian Talmud, Berachot 40a:

MISHNAH. If one says over the fruit of the tree the benediction, 'who creates the fruit of the ground', he has performed his obligation, but if he said over the produce of the ground, 'who createst the fruit of the tree', he has not performed his obligation. If he says, 'by whose word all things exist' over any of them, he has performed his obligation.

GEMARA. What authority maintains that the essence of the tree is in the ground? – R. Nahman b. Isaac replied: It is R. Judah, as we have learnt: If the spring has dried up or the tree has been cut down,² he brings the first-fruits but does not make the declaration.³ R. Judah, however, says that he both brings them and makes the declaration.⁴

(2) If one has gathered first-fruits, and before he takes them to Jerusalem the spring which fed the tree dries up, or the tree is cut down. (3) V. Deut. XXVI, 5-10, because it contains the words 'of the land which Thou, O Lord, hast given me', and the land is valueless without the tree or the spring. (4) Because the land is the essence, not the tree; v. Bik. I, 6¹

Rabbi Judah was already identified with the opinion that the important thing about a tree is the potentially productive land it stands on, and not vice versa. Then, there was another, unattributed opinion supporting this position. The response was to attribute this opinion to Rabbi Judah. Why?

This is not the sort of thing a critical historian would do. If you were mainly interested in which historical individual said the thing, you'd never treat agreement in principle as evidence of personal identity. An historian would look to things like historical context to explain someone's opinion, or perhaps to matters of word usage to determine whether it might plausibly have been said by someone of that era.

But, to the Talmudic mind, the important thing is not to identify a single historical person, but a single line of argument. The enactive details of issuing an opinion on a particular law, in a particular time and place, are just accidental – what's real and important is the principle. And, when Rabbi Judah makes a judgment based on that principle, he is ipso facto making all the judgments implied by the principle. The attribution of the anonymous opinion to Rabbi Judah isn't to be taken literally, or enactively, but structurally. The law has no respect for persons:

Ye shall not respect persons in judgment; but ye shall hear the small as well as the great; ye shall not be afraid of the face of man; for the judgment is God's

–Deuteronomy 1:17

There are different ways law is organized – in some cases codes will try to account for everything and then judges are supposed to directly apply the code, in other cases there's gradual accretion of precedent. But generally, there's the implication that the law already has an opinion, even if we haven't yet fully worked out what it is.

This is literally orthodox Jewish doctrine: the "Oral Torah" was already given at Mount Sinai along with the written bible, even though it wasn't written down fully until the Talmud. The Talmud is, of course, a record of active investigation and debate. In what sense could it possibly have already been revealed? Perhaps in the same sense that while the practice of geometry was invented, its content was only discovered through that practice.

Thus, we again have the idea of logical implication working both forwards and backwards, timelessly. This is very different from the idea of argument as a way of building momentum for a case, racking up evidence on one side. The court has to decide, not just this case, but every case this one will be precedent for, and all the upstream implications of principle.

So, an adversarial system of law doesn't just have two important parts (the adversaries), it has three: a bias one way, a bias the other way, and a structural, symmetrist judge deciding what's allowable and what's not, what gets counted and what doesn't. And this third feature is what distinguishes a trial by law from a trial by combat, or a vulgar political debate.

Universal claims about the behavior of nature are often called descriptions of natural laws by analogy to this sort of reasoning – a law is the sort of thing that is true everywhere, so that you can reason disjunctively about it. Newton gave us three laws of motion in his geometric text, the Mathematical Principles of Natural Philosophy.

So, there's a second pathway to structural thinking, based on a more specialized faculty common to most humans. Perhaps excepting sociopaths.

On that day, the Lord shall be one, and his name one.

Both law and geometry start with cognitive domains humans evolved to be especially good at dealing with (spatial configurations of physical objects, social configurations of norms), a specialized faculty of structured interpretation, and then some process of generalization. They both end up using structural features of language like predication, negation, and recursion, to articulate and generalize abstract structures. In both cases, human beings had a sense that they were thus being connected with the divine.

Of course, there is in a sense only one such thing as a general intelligence. Just like any Universal Turing Machine can simulate any other with enough translational labor, any faculty that enables general intelligence can do anything in the domain of any other general-intelligence faculty, even if they have different specialties.

In the algebraic revolution, mathematics – once powered by our brains' spatial reasoning modules – was recast as a verbal formalism. This enabled qualitatively different advances, but could express everything geometry had already discovered.

Noam Chomsky is famous for his claim that there is a universal faculty of grammar among humans – that some basic language structures are universal, even if we have to learn how to fill them in. A key part of universal grammar is a way for parts of sentences to refer, not just to objects out there in the material world, but to other parts of sentences, recursively. This is necessary for things like complex conditionals.

It does seem like humans are unique in our ability to learn grammar. Other primates can learn some signs and meanings, parrots can learn spoken words, and cetaceans seem to be doing something, but as far as we know, only humans can conceive and describe abstract structural models to order our ideas. This is a likely explanation for humans’ unique ability to control our environment.

However, even we humans do not seem to have a uniform, fully general ability to implement universal grammar. At the least, skill in applying it to any given domain seems to vary quite a lot.

When writing about actors and scribes I mentioned how the President of the United States does not seem to be using properly recursive language. But this problem was already well-known to people trying to hire programmers. Coding Horror, in Why Can't Programmers.. Program?, provides a good overview:

I was incredulous when I read this observation from Reginald Braithwaite: Like me, the author is having trouble with the fact that 199 out of 200 applicants for every programming job can't write code at all. I repeat: they can't write any code whatsoever.

The author he's referring to is Imran, who is evidently turning away lots of programmers who can't write a simple program: After a fair bit of trial and error I've discovered that people who struggle to code don't just struggle on big problems, or even smallish problems (i.e. write a implementation of a linked list). They struggle with tiny problems.

So I set out to develop questions that can identify this kind of developer and came up with a class of questions I call "FizzBuzz Questions" […] An example of a Fizz-Buzz question is the following:

Write a program that prints the numbers from 1 to 100. But for multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". For numbers which are multiples of both three and five print "FizzBuzz".

Most good programmers should be able to write out on paper a program which does this in a under a couple of minutes. Want to know something scary? The majority of comp sci graduates can't. I've also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.

Dan Kegel had a similar experience hiring entry-level programmers: A surprisingly large fraction of applicants, even those with masters' degrees and PhDs in computer science, fail during interviews when asked to carry out basic programming tasks. For example, I've personally interviewed graduates who can't answer "Write a loop that counts from 1 to 10" or "What's the number after F in hexadecimal?" Less trivially, I've interviewed many candidates who can't use recursion to solve a real problem. These are basic skills; anyone who lacks them probably hasn't done much programming.

Speaking on behalf of software engineers who have to interview prospective new hires, I can safely say that we're tired of talking to candidates who can't program their way out of a paper bag. If you can successfully write a loop that goes from 1 to 10 in every language on your resume, can do simple arithmetic without a calculator, and can use recursion to solve a real problem, you're already ahead of the pack!

Between Reginald, Dan, and Imran, I'm starting to get a little worried. I'm more than willing to cut freshly minted software developers slack at the beginning of their career. Everybody has to start somewhere. But I am disturbed and appalled that any so-called programmer would apply for a job without being able to write the simplest of programs. That's a slap in the face to anyone who writes software for a living.

The vast divide between those who can program and those who cannot program is well known. I assumed anyone applying for a job as a programmer had already crossed this chasm. Apparently this is not a reasonable assumption to make. Apparently, FizzBuzz style screening is required to keep interviewers from wasting their time interviewing programmers who can't program.

Lest you think the FizzBuzz test is too easy – and it is blindingly, intentionally easy – a commenter to Imran's post notes its efficacy: I'd hate interviewers to dismiss [the FizzBuzz] test as being too easy - in my experience it is genuinely astonishing how many candidates are incapable of the simplest programming tasks.

[…]

It's a shame you have to do so much pre-screening to have the luxury of interviewing programmers who can actually program. It'd be funny if it wasn't so damn depressing.

These problems are, in some sense, very simple. They don't require detailed technical knowledge. As you can see, it is difficult for competent programmers to comprehend the arrangement of a mind that cannot solve such a problem trivially. But they do require an understanding that language can be something other than declarative: the ability to track conditionals and recursion. The ability to think with formal structure. To think in grammar.
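For reference, here is one way the FizzBuzz task from the quoted post could be written. This is a minimal sketch in Python, which is my choice of language since the quoted post does not name one, and the helper name fizzbuzz is hypothetical. The only real content is a loop and a few ordered conditionals, with the "both three and five" case checked first so that it is not swallowed by the other two.

```python
def fizzbuzz(n):
    """Return the FizzBuzz output for a single number n."""
    if n % 3 == 0 and n % 5 == 0:
        return "FizzBuzz"   # multiple of both three and five
    if n % 3 == 0:
        return "Fizz"       # multiple of three only
    if n % 5 == 0:
        return "Buzz"       # multiple of five only
    return str(n)           # everything else prints as the number itself


# Print the results for the numbers from 1 to 100, one per line.
for i in range(1, 101):
    print(fizzbuzz(i))
```

Nothing here depends on knowing a particular library or algorithm. What it asks for is keeping track of which condition applies when more than one could, which is the kind of structural tracking the quoted interviewers found missing.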

¹ The Babylonian Talmud: Seder Zera’im, ed. Rabbi Dr. I. Epstein, trans. Maurice Simon (The Soncino Press: London, 1961), p. 248

7 Comments

Milan Griffes 2017-07-17 at 10:17 am UTC

The last sentence in the corvids paragraph is incomplete:
"As a result, they can use tools and plan ahead, both faculties that mostly"

Benquo (post author) 2017-07-17 at 10:57 am UTC

Thanks! Fixed.

abstract gradient 2017-07-17 at 1:48 pm UTC

where does postrigor fit into this https://terrytao.wordpress.com/career-advice/there%E2%80%99s-more-to-mathematics-than-rigour-and-proofs/ ?
is the thing about the Wason card task why http://www.cheng.staff.shef.ac.uk/morality/morality.pdf ?

Benquo (post author) 2017-07-17 at 4:27 pm UTC

Postrigor would be somewhat surprising if we didn't have native cognitive modes for doing logical reasoning, and such that logical reasoning were made of following rules (and then learning to chunk them etc.). But it's not at all surprising if the rules are just scaffolding until you learn to make the right connections between the relevant mathematical objects and your native logic engine. Once you've made the connection, you don't need all the rules holding you in place.

abstract gradient 2017-07-17 at 4:30 pm UTC

i mean: postrigor usually feels like spatial reasoning / qualia blobs. sets of arbitrary qualia which combine in intuitive ways. much more like spatial reasoning than logical reasoning.

Romeo Stevens 2017-07-18 at 1:17 pm UTC

Law is the math of social reality is a fun sutra. Lately I've been describing the world as divided into causal world and social world and the various natives, immigrants, emigrants, export and import policies of each etc.

James 2018-12-12 at 9:14 pm UTC

In fact, with respect to the legal system, in those cases in which we are most concerned with outcome of the process, there are in fact four parties involved:
(1) a party biased in favour of one side, (2) an opposing party biased in favour of the other side, (3) a structurally symmetric judge who decides what sorts of facts are admissible and what sorts of inferences (or chains of inferences) are allowable by (1) and (2), and (4) a putatively neutral jury which listens to all of the admissible facts and allowable inferences and makes a decision about the case.
In a sense, the adversarial partisans present their premises and the rules of inference which, taken together, they believe will resolve the case on its merits. The judge controls the intellectual hygiene by ensuring the arguments on offer by both sides are valid and by excluding premises which are egregious and should be excluded for the sake of efficiency (why waste the jury's time?) or because the jury may be prone to thinking true these premises. And the jury is ultimately responsible for determining the soundness of the argument, which just amounts to adjudicating the truth of the premises on offer.
Interestingly, I have always thought that the skills used in law are somewhat different from those used in geometry. Aside from the fact that law involves a lot of distinction-drawing which is not in math (since the objects under manipulation in math tend to be individuated or defined with a sufficient degree of precision before the mathematician begins working with them), I have tended to view law as replete with disjunctive arguments (given the importance of enunciating limiting principles), while instead viewing mathematical arguments as conditionals or bi-conditionals.
