F.A.Z.-Column by Emanuel Derman

Little Big Data

By Emanuel Derman
Choosing what data to collect takes insight; making good sense of it requires the classic methods: you still need a model, a theory, or intuition to find a cause.

Seventy years ago cybernetics was a hot field; thirty years ago, it was catastrophe theory. Those Greek-inspired words for disciplines that once brought hope of explaining human behavior now evoke a quaint nostalgia, like Polaroids of long-haired young people in bell-bottomed jeans and tie-dyed T-shirts. The new buzzword nowadays is Big Data, the fashionable term for capturing and analyzing the vast collections of information that people reveal about themselves when shopping on Amazon, Travelocity, and Netflix, or when writing about themselves on Facebook and Twitter. Big Data utilizes a mix of computer science, information technology, mathematics, and applied statistics. It is increasingly used to sell you products or persuade you to vote for politicians by tailoring the product’s or politician’s image to your particular data-generated persona. Some talking heads like to say that computer-aided analysis of patterns will soon replace our traditional methods of discovering the truth, in medicine and the social sciences as well as in physics.

What are the classic ways of knowing? Recall the great triumph at the dawn of modern science, the understanding of gravitation and motion. How did that come about?


For millennia after the Greeks, scientists’ prejudices led them to describe all planetary movements in terms of circles about a stationary earth. But the motion of a planet, as seen from the orbiting earth itself, is too complicated for a single circle -- sometimes it seems to move backwards relative to the earth -- and so it needs circles moving on circles moving on circles, i.e. epicycles. Eventually, Galileo pointed out that the earth wasn’t stationary, that the earth and planets orbited the sun, and that the planets’ weird, apparently retrograde motions were not intrinsically theirs but rather a consequence of their being observed from the moving earth.

Intuition, followed by checking the data

In the early 1600s Kepler examined the data on planetary positions and formulated his three astonishing laws of planetary motion: planets move in ellipses (not circles) about the sun, the line between the sun and a planet sweeps out equal areas in equal times, and the square of a planet’s orbital period is proportional to the cube of its average distance from the sun.
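The third law is easy to check against numbers. The sketch below is only an illustration: it uses rounded modern values for each planet's average distance (semi-major axis, in astronomical units) and period (in Earth years), not Kepler's own data, and the names `PLANETS` and `third_law_ratio` are made up for this example. If the law holds, the period squared divided by the distance cubed should be the same for every planet.

```python
# Rounded modern values, assumed for illustration (not Kepler's own data):
# (semi-major axis in astronomical units, orbital period in Earth years).
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def third_law_ratio(axis_au, period_years):
    """Return T^2 / a^3, which the third law says is the same for every planet."""
    return period_years ** 2 / axis_au ** 3


ratios = {name: third_law_ratio(a, t) for name, (a, t) in PLANETS.items()}
for name, ratio in ratios.items():
    print(f"{name:8s} T^2 / a^3 = {ratio:.3f}")
```

In these units every ratio comes out within about one percent of 1, even though the periods themselves range from a quarter of a year to almost thirty years.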

If you want to glimpse the miracle of discovery, think about Kepler’s second law: the line between the sun and a planet sweeps out equal areas in equal times. This deep symmetry of planetary motion implies that the closer a planet is to the sun, the more rapidly it moves.


The astonishing thing is that there is no line between a planet and the sun for Kepler to observe. His data consisted of planetary positions in the night sky. How then did he decide to describe the motion of the planets in terms of an invisible imaginary line? No one knows exactly, but it involved long immersion, struggle, and strange associative thinking that arose from somewhere inside him, and then - Aha! - intuition, followed by checking the data.

How to discover theories

Intuition is the first means of knowing. The observer becomes so close to the object (or person) observed that he begins to experience their existence from both outside and inside them. Intuition is a merging of the observer with the observed. It’s almost quantum-like, the ability to be in two places at the same time.


Kepler’s laws described the patterns of the planets, but not their causes. Newton found a cause; he showed that Kepler’s laws were a mathematical consequence of Newton’s own theories of gravitation (the inverse square law of attraction) and motion (Force = mass times acceleration).
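What it means for Kepler's pattern to follow from Newton's cause can be seen in a small numerical experiment. The sketch below makes several assumptions that are not in the text: a single planet, a fixed sun at the origin, units chosen so that the sun's gravitational strength is 1, an arbitrary starting position and speed, and a standard step-by-step integration scheme (velocity Verlet). The function name `simulate_orbit` is hypothetical. The program only knows the inverse square law and force = mass times acceleration; the equal-areas law is never put in by hand.

```python
import math


def simulate_orbit(x, y, vx, vy, dt=0.001, steps=8000):
    """Step a planet forward under an inverse-square pull toward the origin.

    Returns a list of (distance, speed, areal_rate) tuples, where areal_rate
    is the area swept per unit time by the line from the sun to the planet.
    """
    def accel(px, py):
        # Inverse square law: magnitude 1 / r^2, pointing at the origin.
        r_cubed = (px * px + py * py) ** 1.5
        return -px / r_cubed, -py / r_cubed

    ax, ay = accel(x, y)
    history = []
    for _ in range(steps):
        distance = math.hypot(x, y)
        speed = math.hypot(vx, vy)
        areal_rate = 0.5 * (x * vy - y * vx)
        history.append((distance, speed, areal_rate))
        # Velocity Verlet: half kick, full drift, half kick.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return history


# Arbitrary starting point: distance 1 from the sun, moving sideways at 0.8.
history = simulate_orbit(1.0, 0.0, 0.0, 0.8)
areal_rates = [step[2] for step in history]
nearest = min(history, key=lambda step: step[0])
farthest = max(history, key=lambda step: step[0])

print("spread in areal rate:", max(areal_rates) - min(areal_rates))
print("speed when nearest the sun:", nearest[1])
print("speed when farthest from the sun:", farthest[1])
```

The areal rate stays constant along the whole path, which is Kepler's second law, and the planet moves fastest when it is closest to the sun. Neither fact was programmed in; both emerge from the assumed force law.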

How did Newton discover his theories? For sure, the orbiting planets and falling apples didn’t announce the laws that drove them. Wrote John Maynard Keynes about Newton: “I fancy his pre-eminence is due to his muscles of intuition being the strongest and most enduring with which a man has ever been gifted.” Keynes understood something about the discovery of truth which many of his more formal economist disciples have never learned.

Useful, picturesque, but not entirely true

Theories are descriptions of the laws of the world; they can be right, partially right or totally wrong. What all theories have in common is that, like God’s voice to Moses in the desert, they proclaim: I am what I am. Theories stand on their own feet.


Newton’s laws have been supplanted by Einstein’s, but that doesn’t mean that Newton is an approximation to Einstein. Newton is to Einstein as cursive is to typing, or as navigation by the stars is to the Global Positioning System. Two different approaches reach the same end by different means, with different accuracies. One doesn’t approximate the other. Both are theories that describe facts.

The third mode of understanding is a model. A model compares something we don’t understand to something we do. So, for example, the famous liquid drop model of the atomic nucleus pretends that the nucleus is a drop of water that can vibrate and rotate and even fission into two. Useful, picturesque, but not entirely true. Similarly, the Black-Scholes financial option model compares the uncertain movement of stock prices to the diffusion of smoke from a cigarette tip. Useful, up to a point -- but not fact. Models are metaphors, graven images of reality but not reality itself, analogies whose incautious use can unleash all the dangers of idolatry that God warned against in the second of his commandments.

Against the bewitchment

There’s one final mode of understanding: statistics, the statistical analysis that lies behind Big Data. Statistics seeks to find past tendencies and correlations in data, and assumes they will persist. But, in a famous unattributed phrase, correlation does not imply causation.
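A toy simulation shows how a correlation can be real and still say nothing about cause. Everything in the sketch below is invented for illustration: the variables `hidden`, `x` and `y`, the noise levels, and the helper `pearson` are assumptions, not data from any real study. A hidden factor drives both `x` and `y`, so the two move together; yet when `x` is set by hand, independently of the hidden factor, `y` does not respond.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# A hidden common cause drives both x and y; neither one influences the other.
hidden = [rng.gauss(0, 1) for _ in range(n)]
x = [h + rng.gauss(0, 0.5) for h in hidden]
y = [h + rng.gauss(0, 0.5) for h in hidden]
observed = pearson(x, y)

# Intervention: choose x ourselves, ignoring the hidden factor.
# y is left as it was, because in this toy world x never fed into y.
x_forced = [rng.gauss(0, 1) for _ in range(n)]
after_intervention = pearson(x_forced, y)

print("correlation in the observed data:", round(observed, 2))
print("correlation after setting x by hand:", round(after_intervention, 2))
```

The observed data alone cannot distinguish this world from one in which `x` really does drive `y`. Telling them apart takes knowledge of how the data were generated, which is exactly what a theory or model supplies.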


Big Data is useful, but is not a replacement for the classic ways of understanding the world. Data has no voice. There is no “raw” data. Choosing what data to collect takes insight; making good sense of it requires the classic methods: you still need a model, a theory, or intuition to find a cause.

“Philosophy is a battle against the bewitchment of our intelligence by means of language,” wrote Wittgenstein. I take that to mean that language can deceive our natural intuition, and we need philosophy to reclaim it.

In a similar sense, I would argue, science is a battle against the bewitchment of our intelligence by data.

Source: F.A.Z.