The Mathematics of Musical Harmony: Deriving the 12-Tone Equal Temperament

Published: 2026/01/21 Category: RESEARCH/Acoustics

Summary

Why does a piano have 12 keys per octave? We derive the mathematical formula for equal temperament, explore its historical origins (including the Chinese prince Zhu Zaiyu), and understand the compromises it makes.

Introduction: Why 12?

Look at a piano keyboard. Within each octave, there are exactly 12 keys—7 white and 5 black. This isn’t arbitrary. It’s the result of a mathematical puzzle that took humanity over two thousand years to solve.

The question is deceptively simple: How do you divide an octave into equal parts so that you can play in any key and have it sound equally good (or equally “off”)?

The answer is the 12-Tone Equal Temperament (12-TET), the tuning system underlying virtually all Western music today. In this post, we’ll derive it from first principles, explore its historical origins, and understand the beautiful compromise it represents.

No Music Theory Required: You don’t need to read sheet music or play an instrument to follow this article. We’ll start from scratch, with vibrations and simple ratios. If you can understand fractions, you can understand this.

The Pizza Problem: A Taste of What’s Coming

Before we dive into the math, here’s a simple analogy.

Imagine you’re cutting a pizza into slices. You want exactly 12 equal slices that together make a whole pizza. Easy, right? Just divide by 12.

But what if the pizza insists on being cut using only “nice” fractions like 1/2, 2/3, or 3/4? These fractions don’t divide evenly into 12. If you keep cutting with these “perfect” sizes, after going around the pizza, you’ll find you’ve either cut too much or too little—one slice will be awkwardly big or small.

That’s the tuning problem in music.

Nature gives us “perfect” ratios (2:1 for an octave, 3:2 for a fifth). But when you try to tile 12 of these perfect pieces around the musical “pizza,” they don’t fit. Equal temperament is the solution: we make every slice slightly imperfect, so they all fit together exactly.

The Pizza Problem: Using 'perfect' slices (left) leaves a gap or overlap. Equal slices (right) fit exactly, but each is slightly 'imperfect.'

The Tuning Problem: A Mathematical Conflict

The Perfect Octave

The most fundamental interval in music is the octave. But what is an octave, really?

What’s an Octave? When a man and a child sing “Happy Birthday” together, they’re often singing the same melody but at different pitches—one high, one low. That difference is an octave. To our ears, they sound like “the same note,” just higher or lower.

Physically, an octave means doubling the frequency. If a guitar string vibrates 440 times per second (440 Hz), the note one octave higher vibrates at 880 Hz. This 2:1 ratio is so widely perceived as “the same note” that it is recognized in nearly every musical culture.

$$ f_{octave} = 2 \cdot f_0 $$

The Perfect Fifth

The next most pleasing interval is the perfect fifth—a frequency ratio of 3:2.

What’s a Fifth? Sing the opening of “Twinkle Twinkle Little Star.” The first “Twinkle” stays on a single note; the jump up to the second “twinkle” is a perfect fifth. It sounds inherently stable and “open.” The ancient Greeks, particularly Pythagoras, believed this ratio was divine.

$$ f_{fifth} = \frac{3}{2} \cdot f_0 $$
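As a quick worked example, take the 440 Hz string from the octave section as the starting note $f_0$. The two ratios then give:

$$ f_{octave} = 2 \cdot 440 = 880 \text{ Hz}, \qquad f_{fifth} = \frac{3}{2} \cdot 440 = 660 \text{ Hz} $$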

The Pythagorean Comma: Where It All Falls Apart

Here’s the problem. Imagine you’re an ancient instrument maker, and you want to derive all 12 notes of the scale using only the “perfect” 3:2 ratio.

The Ancient Tuning Method:

You start with a reference note C (normalized to frequency = 1). Then you keep multiplying by 3/2 to get the next “fifth.” But here’s the catch: whenever the frequency exceeds 2 (i.e., goes beyond one octave), you divide by 2 to bring it back into the same octave:

Step	Calculation	Result	Note
1	Start	1.000	C
2	1 × 1.5	1.500	G
3	1.5 × 1.5 = 2.25 → ÷2	1.125	D
4	1.125 × 1.5	1.688	A
5	1.688 × 1.5 = 2.531 → ÷2	1.266	E
6	1.266 × 1.5	1.898	B
7	1.898 × 1.5 = 2.847 → ÷2	1.424	F#
…	…	…	…
13	After 12 steps…	1.0136	B# (should be C!)

Notice step 13: after stacking 12 fifths, you expect to land back on C (= 1.000). Instead you get 1.0136, which is slightly sharp. This chain of fifths is what musicians call the Circle of Fifths, except that with pure 3:2 ratios the circle never quite closes.
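The folding procedure in the table can be sketched in a few lines of Python. This is only an illustration: the function name `pythagorean_chain` is made up for this post, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def pythagorean_chain(steps=12):
    """Stack perfect fifths (3/2), folding each result back into [1, 2)."""
    ratio = Fraction(1)          # the reference note C, normalized to 1
    chain = [ratio]
    for _ in range(steps):
        ratio *= Fraction(3, 2)  # go up a perfect fifth
        while ratio >= 2:        # past the octave: halve to fold back down
            ratio /= 2
        chain.append(ratio)
    return chain

chain = pythagorean_chain()
for step, ratio in enumerate(chain, start=1):
    print(f"{step:2d}  {str(ratio):>15}  {float(ratio):.4f}")
```

The last entry is the exact fraction 531441/524288 (about 1.0136) rather than 1, which is the step-13 overshoot from the table.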

The Mathematical Proof:

The table above shows what happens within a single octave. But let’s zoom out and look at the raw frequency ratios without folding:

$$ \text{12 fifths} = \left(\frac{3}{2}\right)^{12} = 129.746… $$

$$ \text{7 octaves} = 2^7 = 128 $$

The mismatch is called the Pythagorean Comma:

$$ \text{Comma} = \frac{(3/2)^{12}}{2^7} = \frac{531441}{524288} \approx 1.0136 $$

That’s about 23.46 cents (where 100 cents = 1 equal-tempered semitone, so an octave is 1200 cents). It amounts to roughly a quarter of a semitone, a clearly audible error.
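Cents are a logarithmic unit: a frequency ratio corresponds to 1200 times its base-2 logarithm. A minimal sketch of the conversion, with the helper name `cents` chosen just for this example:

```python
import math

def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

comma = (3 / 2) ** 12 / 2 ** 7   # the Pythagorean comma as a ratio
print(f"{cents(comma):.2f} cents")   # 23.46 cents
```

By construction, `cents(2)` is 1200 (one octave) and `cents(2 ** (1 / 12))` is 100 (one equal-tempered semitone), up to floating-point rounding.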

The Pythagorean Comma: 12 perfect fifths (blue spiral) overshoot 7 octaves (red line) by about 1.36%.
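A short side argument shows that this gap is not bad luck with the number 12: no whole number of pure fifths can ever equal a whole number of octaves. Suppose $m$ fifths matched $n$ octaves for some positive integers $m$ and $n$ (symbols introduced only for this sketch):

$$ \left(\frac{3}{2}\right)^{m} = 2^{n} \quad\Longrightarrow\quad 3^{m} = 2^{m+n} $$

The left side is always odd and the right side is always even, so the equation has no solution. Twelve fifths against seven octaves is simply a near miss: $3^{12} = 531441$ versus $2^{19} = 524288$.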

The Wolf Interval: When Math Hurts Your Ears

In historical tuning systems (like Pythagorean tuning or meantone temperament), the leftover mismatch had to go somewhere. It was typically dumped into a single interval, usually the one between G♯ and E♭, which became so dissonant that it was called the “Wolf Interval” (because it was said to howl like a wolf when played).

The Wolf’s Howl: Medieval and Renaissance musicians avoided certain keys specifically because they contained the Wolf Interval. If your piece modulated to the “wrong” key, the result was… painful.

This wasn’t just a mathematical curiosity—it was a practical crisis. Composers wanted to write in all keys. Keyboard instruments couldn’t be re-tuned mid-performance. Something had to give.

The Solution: Equal Temperament

We’ve seen the pizza problem: “perfect” slices don’t tile into 12. The solution? Make every slice equally imperfect—sacrifice a little purity so they all fit together exactly. Let’s derive this mathematically.

The Mathematical Derivation

We want to divide one octave into 12 equal steps. Let $r$ be the frequency ratio of a single semitone (the smallest step).

Since 12 semitones must equal one octave:

$$ r^{12} = 2 $$

Solving for $r$:

$$ r = 2^{1/12} = \sqrt[12]{2} \approx 1.05946 $$

This is the equal-tempered semitone ratio. For any note $n$ semitones above a reference frequency $f_0$ (such as A4 = 440 Hz):

$$ \boxed{f_n = f_0 \cdot 2^{n/12}} $$

On a logarithmic frequency axis, equal temperament divides the octave into 12 perfectly equal parts.
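To see the boxed formula in action, here is a small sketch that computes a few frequencies relative to A4 = 440 Hz. The function name `tet_frequency` and the choice of example notes are assumptions made for illustration.

```python
A4 = 440.0  # reference frequency in Hz

def tet_frequency(n, f0=A4):
    """Frequency of the note n equal-tempered semitones above f0 (n may be negative)."""
    return f0 * 2 ** (n / 12)

# Semitone offsets from A4 for a few familiar notes
for name, n in [("A3", -12), ("A4", 0), ("C5", 3), ("E5", 7), ("A5", 12)]:
    print(f"{name}: {tet_frequency(n):.2f} Hz")
```

Twelve steps up doubles the frequency to 880 Hz and twelve steps down halves it to 220 Hz, as the derivation requires. The E5 seven semitones up comes out near 659.26 Hz, slightly below the pure fifth at 660 Hz.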

But who was the first to rigorously calculate this magical number $\sqrt[12]{2}$?

A Chinese Prince and His Abacus: Zhu Zaiyu (朱载堉)

In the West, the mathematical formula for equal temperament is often attributed to figures like Vincenzo Galilei (father of Galileo) or Simon Stevin around 1600.

But history tells a different story.

In 1584, a Ming Dynasty prince named Zhu Zaiyu (朱载堉) published his masterwork, A New Account of the Science of the Pitch-Pipes (《律呂精義》). In it, he presented a calculation of $\sqrt[12]{2}$ carried out to 25 decimal places on a giant abacus.

$$ \sqrt[12]{2} = 1.0594630943592952645618252… $$

In his own words:

蓋十二律黃鐘為始，應鐘為終，終而復始，循環無端。……是故各律皆以黃鐘……為實，皆以應鐘倍數一零五九四六三……為法除之，即得其次律也。

(Translation: The twelve pitch-pipes begin with Huangzhong and end with Yingzhong, then cycle back endlessly. … Therefore, each pipe takes Huangzhong as the base, and divides by the Yingzhong multiplier 1.059463… to obtain the next pipe.)

Zhu Zaiyu didn’t just estimate the ratio; he derived it rigorously. His 1584 work predates the European calculations mentioned above, which date to around 1600, by more than a decade.
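The 25 decimal places quoted above can be checked with Python’s standard `decimal` module. This is a modern verification sketch, not a reconstruction of Zhu Zaiyu’s abacus procedure; the working precision of 40 digits is an arbitrary choice that leaves a comfortable margin.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40  # significant digits to carry (assumed margin beyond 25 decimals)

# The twelfth root of 2, computed as 2 raised to the power 1/12
semitone = Decimal(2) ** (Decimal(1) / Decimal(12))
print(semitone)  # begins 1.0594630943592952645618252...

# Sanity check: twelve semitones should multiply back to an octave
octave = semitone ** 12
print(abs(octave - 2))  # a tiny residual from rounding
```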

Why Was This Forgotten? Zhu Zaiyu’s work was theoretically brilliant but was not widely adopted in China at the time. Traditional Chinese music continued to use other tuning systems. Meanwhile, Europe—with its keyboard instruments and polyphonic traditions—had a more pressing need for equal temperament.

The Price of Perfection: What Equal Temperament Sacrifices

Equal temperament solves the Wolf Interval problem, but at a cost. Let’s compare:

Interval	Just Intonation (Pure Ratio)	Equal Temperament (12-TET)	Difference (cents)
Perfect Fifth	3/2 = 1.5000	$2^{7/12}$ ≈ 1.4983	−1.96
Major Third	5/4 = 1.2500	$2^{4/12}$ ≈ 1.2599	+13.69
Minor Third	6/5 = 1.2000	$2^{3/12}$ ≈ 1.1892	−15.64

The fifth is nearly perfect (less than 2 cents off). But the major third? It’s almost 14 cents sharp—quite noticeable to a trained ear.
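The “Difference” column can be reproduced with a short script. This is a sketch under the assumption that the difference is measured as equal-tempered size minus pure size, both in cents; the names `INTERVALS` and `deviation_cents` are illustrative.

```python
import math

# Interval name -> (pure frequency ratio, size in equal-tempered semitones)
INTERVALS = {
    "Perfect Fifth": (3 / 2, 7),
    "Major Third": (5 / 4, 4),
    "Minor Third": (6 / 5, 3),
}

def deviation_cents(pure_ratio, semitones):
    """Equal-tempered size minus pure size, in cents (positive means sharp)."""
    tempered = 100 * semitones            # each 12-TET semitone is 100 cents
    pure = 1200 * math.log2(pure_ratio)   # size of the pure interval in cents
    return tempered - pure

for name, (ratio, semitones) in INTERVALS.items():
    print(f"{name}: {deviation_cents(ratio, semitones):+.2f} cents")
```

A negative value means the equal-tempered interval is narrower (flat) compared with the pure ratio; a positive value means it is wider (sharp).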

Beat Frequency Comparison: A pure major third (red) produces no beats. An equal-tempered major third (blue) produces audible 'wobbles' (beats) due to the slight mistuning.

This is why barbershop quartets and a cappella groups often sing in just intonation—they can adjust pitches in real-time. Pianos, however, are locked into equal temperament; they can’t adjust on the fly.

A Note on Bach: You may have heard that J.S. Bach wrote The Well-Tempered Clavier to showcase equal temperament. Not quite! Bach most likely had in mind a “well temperament,” a system where all keys are playable but each has a subtly different “color.” Modern equal temperament makes every key sound the same apart from pitch, losing some of that nuance.

Summary: The Beautiful Compromise

Equal temperament is a mathematically optimized trade-off: we sacrifice a little purity (especially in thirds) to gain the freedom to play in any key. It’s not perfect—it’s equally imperfect everywhere. And that’s exactly why it works.

Remarkably, this elegant solution was first rigorously derived not in Renaissance Europe, but in Ming Dynasty China—by a prince with an abacus.
