The Mathematics of Musical Harmony: Deriving the 12-Tone Equal Temperament
Introduction: Why 12?
Look at a piano keyboard. Within each octave, there are exactly 12 keys—7 white and 5 black. This isn’t arbitrary. It’s the result of a mathematical puzzle that took humanity over two thousand years to solve.
The question is deceptively simple: How do you divide an octave into equal parts so that you can play in any key and have it sound equally good (or equally “off”)?
The answer is the 12-Tone Equal Temperament (12-TET), the tuning system underlying virtually all Western music today. In this post, we’ll derive it from first principles, explore its historical origins, and understand the beautiful compromise it represents.
The Pizza Problem: A Taste of What’s Coming
Before we dive into the math, here’s a simple analogy.
Imagine you’re cutting a pizza into slices. You want exactly 12 equal slices that together make a whole pizza. Easy, right? Just divide by 12.
But what if the pizza insists on being cut using only “nice” fractions like 1/2, 2/3, or 3/4? Cuts of those sizes don’t line up with twelve equal slices. Keep cutting with these “perfect” sizes and, by the time you’ve gone around the pizza, you’ll have cut too much or too little, leaving one slice awkwardly big or small.
That’s the tuning problem in music.
Nature gives us “perfect” ratios (2:1 for an octave, 3:2 for a fifth). But when you stack 12 perfect fifths around the musical “pizza,” they don’t close up into a whole number of octaves. Equal temperament is the solution: we make every slice slightly imperfect, so they all fit together exactly.
The Tuning Problem: A Mathematical Conflict
The Perfect Octave
The most fundamental interval in music is the octave. But what is an octave, really?
Physically, an octave means doubling the frequency. If a guitar string vibrates 440 times per second (440 Hz), the note one octave higher vibrates at 880 Hz. Listeners perceive this 2:1 ratio as “the same note” so consistently that it is recognized in nearly every musical culture.
$$ f_{octave} = 2 \cdot f_0 $$
The Perfect Fifth
After the octave, the most consonant interval is the perfect fifth, a frequency ratio of 3:2. Starting from A4 = 440 Hz, the fifth above is E5 at 660 Hz.
$$ f_{fifth} = \frac{3}{2} \cdot f_0 $$
The Pythagorean Comma: Where It All Falls Apart
Here’s the problem. Imagine you’re an ancient instrument maker, and you want to derive all 12 notes of the scale using only the “perfect” 3:2 ratio.
The Ancient Tuning Method:
You start with a reference note C (normalized to frequency = 1). Then you keep multiplying by 3/2 to get the next “fifth.” But here’s the catch: whenever the frequency exceeds 2 (i.e., goes beyond one octave), you divide by 2 to bring it back into the same octave:
| Step | Calculation | Result | Note |
|---|---|---|---|
| 1 | Start | 1.000 | C |
| 2 | 1 × 1.5 | 1.500 | G |
| 3 | 1.5 × 1.5 = 2.25 → ÷2 | 1.125 | D |
| 4 | 1.125 × 1.5 | 1.688 | A |
| 5 | 1.688 × 1.5 = 2.531 → ÷2 | 1.266 | E |
| 6 | 1.266 × 1.5 = 1.898 | 1.898 | B |
| 7 | 1.898 × 1.5 = 2.847 → ÷2 | 1.424 | F# |
| … | … | … | … |
| 13 | After 12 steps… | 1.0136 | B# (should be C!) |
Notice step 13: after stacking 12 fifths (one full trip around the circle), you expect to land back on C (= 1.000). Instead, you get 1.0136, which is slightly sharp. This sequence of fifths is called the Circle of Fifths.
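To make the folding procedure concrete, here is a minimal Python sketch that rebuilds the table with exact fractions. The function name `stack_fifths` and the note labels are my own choices for illustration, not part of any historical method.

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)  # the pure 3:2 fifth


def stack_fifths(steps):
    """Return the ratios reached by stacking pure fifths, each folded into [1, 2)."""
    ratios = [Fraction(1)]
    for _ in range(steps):
        ratio = ratios[-1] * FIFTH
        if ratio >= 2:  # gone past the octave: fold back down
            ratio /= 2
        ratios.append(ratio)
    return ratios


# Labels for illustration only; the last one is B#, not C.
NAMES = ["C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#", "E#", "B#"]

ratios = stack_fifths(12)
for name, ratio in zip(NAMES, ratios):
    print(f"{name:<3} {str(ratio):<16} {float(ratio):.4f}")
```

Working in fractions avoids the rounding that creeps into the table, and the last row comes out as exactly 531441/524288 rather than 1.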
The Mathematical Proof:
The table above shows what happens within a single octave. But let’s zoom out and look at the raw frequency ratios without folding:
$$ \text{12 fifths} = \left(\frac{3}{2}\right)^{12} \approx 129.746 $$
$$ \text{7 octaves} = 2^7 = 128 $$
The mismatch is called the Pythagorean Comma:
$$ \text{Comma} = \frac{(3/2)^{12}}{2^7} = \frac{531441}{524288} \approx 1.0136 $$
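That’s about 23.46 cents, roughly a quarter of a semitone and a clearly audible error. (Cents measure intervals on a logarithmic scale: a frequency ratio $x$ spans $1200 \log_2 x$ cents, so 100 cents = 1 equal-tempered semitone and 1200 cents = 1 octave.)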
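If you want to check the cents figure yourself, the conversion is a one-liner. This is a small sketch; the helper name `cents` is just a convenient label.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


# Twelve pure fifths compared with seven octaves.
pythagorean_comma = (3 / 2) ** 12 / 2 ** 7

print(f"comma ratio: {pythagorean_comma:.6f}")
print(f"comma size:  {cents(pythagorean_comma):.2f} cents")
```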
The Wolf Interval: When Math Hurts Your Ears
In historical systems such as Pythagorean tuning, this error had to go somewhere. (Meantone temperaments, which narrow the fifths to sweeten the thirds, end up with a similar leftover of their own.) It was typically dumped into a single interval, usually the fifth between G♯ and E♭, which became so dissonant that it was called the “Wolf Interval” (because it howled like a wolf when played).
This wasn’t just a mathematical curiosity—it was a practical crisis. Composers wanted to write in all keys. Keyboard instruments couldn’t be re-tuned mid-performance. Something had to give.
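To put a number on the wolf, here is a rough sketch for the Pythagorean case only. It assumes eleven of the twelve fifths are tuned pure and the twelfth absorbs whatever is left over; meantone wolves are a different size and are not modeled here.

```python
import math
from fractions import Fraction


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


pure_fifth = Fraction(3, 2)

# Twelve fifths must span exactly seven octaves. If eleven of them are pure,
# the twelfth has to cover whatever remains.
wolf_fifth = Fraction(2) ** 7 / pure_fifth ** 11

print(f"pure fifth: {cents(pure_fifth):.2f} cents")
print(f"wolf fifth: {cents(wolf_fifth):.2f} cents ({wolf_fifth})")
print(f"shortfall:  {cents(pure_fifth) - cents(wolf_fifth):.2f} cents")
```

Under that assumption the wolf fifth comes out around 678.5 cents, narrower than a pure fifth by exactly the Pythagorean comma.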
The Solution: Equal Temperament
We’ve seen the pizza problem: “perfect” slices don’t tile into 12. The solution? Make every slice equally imperfect—sacrifice a little purity so they all fit together exactly. Let’s derive this mathematically.
The Mathematical Derivation
We want to divide one octave into 12 equal steps. Let $r$ be the frequency ratio of a single semitone (the smallest step).
Since 12 semitones must equal one octave:
$$ r^{12} = 2 $$
Solving for $r$:
$$ r = 2^{1/12} = \sqrt[12]{2} \approx 1.05946 $$
This is the equal-tempered semitone ratio. For any note $n$ semitones above a reference frequency $f_0$ (such as A4 = 440 Hz):
$$ \boxed{f_n = f_0 \cdot 2^{n/12}} $$
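Here is a short sketch of the boxed formula in action, using A4 = 440 Hz as the reference. The function name `tet_frequency` and the note labels are illustrative choices.

```python
A4 = 440.0  # reference pitch in Hz, as in the text


def tet_frequency(n, f0=A4):
    """Frequency n equal-tempered semitones above f0 (below it if n < 0)."""
    return f0 * 2 ** (n / 12)


# One octave of semitones starting on A4; the last entry is A5.
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A"]

for n, name in enumerate(NOTE_NAMES):
    print(f"{name:<2} {tet_frequency(n):8.2f} Hz")
```

Twelve steps up lands on exactly 880 Hz, and negative values of `n` walk downward: `tet_frequency(-9)` gives middle C at about 261.63 Hz.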
But who was the first to rigorously calculate this magical number $\sqrt[12]{2}$?
A Chinese Prince and His Abacus: Zhu Zaiyu (朱载堉)
In the West, the mathematical formula for equal temperament is often attributed to figures like Vincenzo Galilei (father of Galileo) or Simon Stevin around 1600.
But history tells a different story.
In 1584, a Ming Dynasty prince named Zhu Zaiyu (朱载堉) set out equal temperament in A New Account of the Science of the Pitch-Pipes (《律學新說》), and he worked through the numbers in full in his later treatise 《律呂精義》. There he presented the calculation of $\sqrt[12]{2}$, computed to 25 decimal places using a giant abacus.
$$ \sqrt[12]{2} = 1.0594630943592952645618252… $$
In his own words:
蓋十二律黃鐘為始,應鐘為終,終而復始,循環無端。……是故各律皆以黃鐘……為實,皆以應鐘倍數一零五九四六三……為法除之,即得其次律也。
(Translation: The twelve pitch-pipes begin with Huangzhong and end with Yingzhong, then cycle back endlessly. … Therefore, each pipe takes Huangzhong as the base, and divides by the Yingzhong multiplier 1.059463… to obtain the next pipe.)
Zhu Zaiyu didn’t just approximate it; he derived it rigorously. His work predates the European calculations of around 1600 by more than a decade.
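For the curious, the 25-decimal value quoted above is easy to check today. The sketch below uses Python's `decimal` module as a modern sanity check; it is not a reconstruction of Zhu Zaiyu's abacus procedure.

```python
from decimal import Decimal, getcontext

# Working precision in significant digits, comfortably more than 25 places.
getcontext().prec = 40

semitone = Decimal(2) ** (Decimal(1) / Decimal(12))

# "1." plus 25 decimal places is 27 characters.
print(str(semitone)[:27])

# Sanity check: multiplying twelve semitones together should give back 2.
print(semitone ** 12)
```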
The Price of Perfection: What Equal Temperament Sacrifices
Equal temperament solves the Wolf Interval problem, but at a cost. Let’s compare:
| Interval | Just Intonation (Pure Ratio) | Equal Temperament (12-TET) | Difference (cents, 12-TET minus just) |
|---|---|---|---|
| Perfect Fifth | 3/2 = 1.5000 | $2^{7/12}$ ≈ 1.4983 | −1.96 |
| Major Third | 5/4 = 1.2500 | $2^{4/12}$ ≈ 1.2599 | +13.69 |
| Minor Third | 6/5 = 1.2000 | $2^{3/12}$ ≈ 1.1892 | −15.64 |
The fifth is nearly perfect (less than 2 cents flat). But the thirds? The major third is almost 14 cents sharp and the minor third almost 16 cents flat, both quite noticeable to a trained ear.
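The differences in the table can be reproduced in a few lines. This is a sketch with illustrative names; it takes the sign convention to be 12-TET minus just, so a positive value means the tempered interval is sharp.

```python
import math
from fractions import Fraction


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


def deviation_cents(just_ratio, semitones):
    """12-TET size minus just size, in cents (positive = tempered is sharp)."""
    return 100 * semitones - cents(just_ratio)


# (name, just ratio, number of 12-TET semitones)
INTERVALS = [
    ("Perfect fifth", Fraction(3, 2), 7),
    ("Major third", Fraction(5, 4), 4),
    ("Minor third", Fraction(6, 5), 3),
]

for name, just, semitones in INTERVALS:
    print(f"{name:<14} {deviation_cents(just, semitones):+.2f} cents")
```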
This is why barbershop quartets and a cappella groups often sing in just intonation—they can adjust pitches in real-time. Pianos, however, are locked into equal temperament; they can’t adjust on the fly.
Summary: The Beautiful Compromise
Equal temperament is a deliberate trade-off: we sacrifice a little purity (especially in thirds) to gain the freedom to play in any key. It’s not perfect; it’s equally imperfect everywhere. And that’s exactly why it works.
Remarkably, this elegant solution was first rigorously derived not in Renaissance Europe, but in Ming Dynasty China—by a prince with an abacus.