# Music Understood By Numbers

If there is one topic I am quite ignorant about, it is probably music. One reason is that I never learned to play an instrument (other than the computer :), because traditional music notation has always seemed highly inelegant to me for something as elegant as music, so I refused to deal with it. I still wanted to understand music, but since no book talked about it in my “language”, I did what every programmer would: write some code to figure it all out on the basis of just numbers.

The initial goal is to find a justification for our current musical system (12 half notes, i.e. semitones, per octave).

For any musical system, our goal is to layer sine waves on top of each other to create more complex sounds.

To create sounds that are “pleasant”, we need to layer sine waves such that, even though they have different frequencies, their wavelengths are in fixed ratios to each other, so that they overlap in the same way again after very few cycles. In fact, a ratio of 2:1 (an octave) is easiest on the ear and repeats in the shortest amount of time (after 1 cycle of the lower tone). Other ratios, such as 2:3 and 3:4, are equally important. Two sine waves that don’t have such a ratio take many cycles to coincide again, by which time the ear has lost track, so they sound as if they don’t belong together or work against each other. This is what you hear as a dissonance, or just a sound that is “off”.
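To make the "few cycles" idea concrete, here is a minimal sketch for ratios that are exact whole numbers. The function name `cyclesOfLowerTone` is my own, not from any library: it reduces the ratio to lowest terms and takes the smaller term, which is how many cycles the lower tone completes before both waves line up again.

```haskell
-- Cycles of the lower tone needed before two sine waves line up again,
-- given their frequency ratio as two whole numbers (in either order).
-- Reduce the ratio to lowest terms and take the smaller term: for 3:2,
-- the lower tone completes 2 cycles while the upper completes 3.
cyclesOfLowerTone :: Integer -> Integer -> Integer
cyclesOfLowerTone a b = min a b `div` gcd a b
```

In GHCi, `map (uncurry cyclesOfLowerTone) [(2,1),(3,2),(4,3)]` evaluates to `[1,2,3]`.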

So we need a system whereby the notes we have chosen can most easily build these ratios. To start with, the concept of an octave is a given, once you see that we need to divide up notes in a logarithmic rather than linear fashion. The question that remains is: how do we split up the octave?

Let’s first look at our current 12 half note system, and then see if it gets any better with an N other than 12. If we divide an octave into 12 equal logarithmic steps, each a factor of about 1.05946 (the twelfth root of 2) above the last, we get the following ratios going from 1 to 2:

```
[1.00000,1.05946,1.12246,1.1892,1.25991,1.33482,1.41419,
1.49828,1.58736,1.68175,1.78175,1.88769,1.99993]
```
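These ratios can also be computed directly rather than hard-coded. A small sketch, with `stepRatios` being a name I made up: step k of an n-step octave has ratio 2^(k/n). The list above appears to come from repeatedly multiplying by the rounded factor 1.05946, which would explain why it ends at 1.99993 rather than exactly 2, so the last digits will differ slightly from what this gives.

```haskell
-- Frequency ratios of an octave split into n equal logarithmic steps:
-- step k has ratio 2 ** (k / n), so step 0 is 1 and step n is 2.
stepRatios :: Int -> [Double]
stepRatios n = [2 ** (fromIntegral k / fromIntegral n) | k <- [0 .. n]]
```

`stepRatios 12` yields 13 values, from 1 up to 2, with step 7 at roughly 1.4983.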

To get an impression of the layering capacity of a musical system, we can simply compute, for each note, how many wavelength cycles are needed before we are back at the same starting point relative to a governing octave. The lower this number, the better suited the note is for building pleasant ratios.

The problem is that waveforms getting back to exactly the starting point is almost impossible with our logarithmic sequence. We can assume that if the waveforms almost coincide, the ear won’t hear the difference. Exactly how much tolerance the ear has before a pleasant note turns dissonant I don’t know, and it probably differs per person, so I experimented with different error tolerances to be able to compare the results:

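| half note | tolerance 0.01 | tolerance 0.03 | tolerance 0.05 | tolerance 0.10 |
|---|---|---|---|---|
| 1 | 84 | 17 | 16 | 16 |
| 2 | 49 | 8 | 8 | 8 |
| 3 | 37 | 16 | 16 | 5 |
| 4 | 50 | 23 | 4 | 4 |
| 5 | 3 | 3 | 3 | 3 |
| 6 | 70 | 12 | 12 | 5 |
| 7 | 2 | 2 | 2 | 2 |
| 8 | 63 | 17 | 12 | 5 |
| 9 | 22 | 22 | 3 | 3 |
| 10 | 55 | 23 | 9 | 5 |
| 11 | 89 | 9 | 9 | 9 |
| 12 | 1 | 1 | 1 | 1 |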

The numbers in the table are the number of cycles before the waveforms coincide again, which is the number that matters. As you can see, even at the very low tolerance of 0.01, the 5th and 7th half notes stand out as being very close to perfect ratios with the octave. This is no surprise, as they correspond to well-known intervals in traditional music, i.e.:

| half note | cycles | ratio | name |
|---|---|---|---|
| 12 | 1 | 1:2 | octave |
| 7 | 2 | 2:3 | fifth (quint) |
| 5 | 3 | 3:4 | fourth (quart) |

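The code to compute the above table, by the way (in Haskell):

```
-- How close a multiple of (ratio - 1) must come to a whole number.
tolerance = 0.01
octave :: Int
octave = 12
-- Hard-coded approximations of 2 ** (1 / octave), indexed by octave (3..16).
factor = [0,0,0,1.25992,1.18920,1.14869,1.12246,1.10408,1.09050,1.08005,
          1.07177,1.06504,1.05946,1.05476,1.05075,1.04729,1.04437] !! octave
-- Pair each step number n with the first near-coincidence found for its ratio y.
f x n = (n, g y) : f y (n+1) where y = x*factor
-- First (cycles, value) pair whose value is close to a whole number in 1..100.
g y = take 1 (filter (close [1..100]) (h (y-1) 1))
-- True if x lies within the tolerance of one of the given whole numbers.
close [] _ = False
close (n:ns) (q,x) = (o<tolerance && o>(-tolerance)) || close ns (q,x) where o = n-x
-- Multiples of y, each tagged with its cycle count n.
h y n = (n, y*n) : h y (n+1)
main :: IO ()
main = mapM_ print (take octave (f 1 1))
```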
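For comparison, here is a more compact sketch of the same search as the program above. The names `cyclesFor` and `stepRatio` and the `limit` parameter are my own additions. It uses exact ratios 2^(k/n) instead of the rounded factor table, and it does not cap the whole number at 100, so the larger counts need not match the table exactly.

```haskell
-- Smallest n (up to a search limit) for which n * (ratio - 1) lies within
-- tol of a whole number >= 1, mirroring the test in the program above.
-- Returns Nothing if no such n is found within the limit.
cyclesFor :: Double -> Int -> Double -> Maybe Int
cyclesFor tol limit ratio =
  case [n | n <- [1 .. limit], nearWhole (fromIntegral n * (ratio - 1))] of
    (n : _) -> Just n
    []      -> Nothing
  where
    nearWhole x =
      let k = round x :: Integer
      in k >= 1 && abs (x - fromIntegral k) < tol

-- Exact equal-tempered ratio for step k of an n-step octave.
stepRatio :: Int -> Int -> Double
stepRatio n k = 2 ** (fromIntegral k / fromIntegral n)
```

With a tolerance of 0.01, `cyclesFor 0.01 1000 (stepRatio 12 7)` gives `Just 2` and `cyclesFor 0.01 1000 (stepRatio 12 5)` gives `Just 3`, matching the rows for half notes 7 and 5.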

So using these tables we can order the notes by “pleasantness”, depending on tolerance:

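| rank | tolerance 0.01 | tolerance 0.03 | tolerance 0.05 | tolerance 0.10 |
|---|---|---|---|---|
| 1 | 12 | 12 | 12 | 12 |
| 2 | 7 | 7 | 7 | 7 |
| 3 | 5 | 5 | 5 | 5 |
| 4 | 9 | 2 | 9 | 9 |
| 5 | 3 | 11 | 4 | 4 |
| 6 | 2 | 6 | 2 | 3 |
| 7 | 4 | 3 | 10 | 6 |
| 8 | 10 | 8 | 11 | 8 |
| 9 | 8 | 1 | 6 | 10 |
| 10 | 6 | 9 | 8 | 2 |
| 11 | 1 | 4 | 3 | 11 |
| 12 | 11 | 10 | 1 | 1 |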
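The ordering is just a sort on the cycle counts. A minimal sketch for the 0.01 column, with the data copied from the cycles table earlier in this post (`cyclesAt001` and `byPleasantness` are names I picked for illustration):

```haskell
import Data.List (sortOn)

-- (half note, cycles) pairs copied from the tolerance 0.01 column.
cyclesAt001 :: [(Int, Int)]
cyclesAt001 =
  [ (1, 84), (2, 49), (3, 37), (4, 50), (5, 3), (6, 70)
  , (7, 2), (8, 63), (9, 22), (10, 55), (11, 89), (12, 1) ]

-- Notes ordered from fewest to most cycles, i.e. most to least "pleasant".
byPleasantness :: [(Int, Int)] -> [Int]
byPleasantness = map fst . sortOn snd
```

`byPleasantness cyclesAt001` evaluates to `[12,7,5,9,3,2,4,10,8,6,1,11]`. Since `sortOn` is stable, notes that tie at the higher tolerances keep their original order, which may differ from how ties were broken in the other columns.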

The ones in red are notes that would be “not pleasant”. As you can see from the numbers, these are close to the ratios used in music, predicted purely statistically.

As it turns out, not every tuning follows the exact logarithmic progression. This makes sense, as we can see from these numbers that the important ratios like 2:3 don’t coincide exactly with any step. Modern equal temperament accepts that small error, but tunings such as just intonation fiddle with the numbers slightly so that 2:3 and 3:4 coincide exactly, making them sound even better, probably at the cost of notes like 2 and 11, but those sounded bad anyway.
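How much fiddling is needed can be put into numbers. The sketch below measures the gap in cents (hundredths of an equal half note, 1200 per octave), a unit this post has not used so far. The names `cents` and `deviation` are my own.

```haskell
-- Size of a frequency ratio in cents (1200 cents per octave).
cents :: Double -> Double
cents r = 1200 * logBase 2 r

-- How far step k of a 12-step equal division lies from a target ratio,
-- in cents. Negative means the equal step is slightly below the target.
deviation :: Int -> Double -> Double
deviation k target = 100 * fromIntegral k - cents target
```

`deviation 7 (3/2)` comes out at about -1.955 and `deviation 5 (4/3)` at about +1.955, so half notes 7 and 5 each miss their exact ratio by roughly two hundredths of a half note.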

Another interesting question remains: is a logarithmic subdivision of the octave into 12 steps the optimal system? An easy way to find out is to simply compute the above tables for systems other than 12, and see which one has the lowest number of cycles for 2:3 (the most important ratio after 1:2, which is present by definition):

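Other octaves (tolerance 0.01):

| divisor | 2:3 cycles |
|---|---|
| 16 | 11 |
| 15 | 11 |
| 14 | 25 |
| 13 | 13 |
| 12 | 2 <- |
| 11 | 7 |
| 10 | 8 |
| 9 | 6 |
| 8 | 6 |
| 7 | 26 |
| 6 | 49 |
| 5 | 27 |
| 4 | 22 |
| 3 | 50 |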

So maybe an 8 or 9 tone system could work, but not nearly as well as the 12 tone one.
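As a cross-check with a different and simpler measure than the cycle counts above, one can ask how close each divisor's best step gets to the ratio 1.5 itself. This is only a sketch under that assumption, and the names `closestToFifth` and `bestDivisor` are hypothetical.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- For an octave of n equal steps, the step whose ratio is closest to 3:2,
-- together with the absolute difference between that ratio and 1.5.
closestToFifth :: Int -> (Int, Double)
closestToFifth n =
  minimumBy (comparing snd)
    [ (k, abs (2 ** (fromIntegral k / fromIntegral n) - 1.5)) | k <- [1 .. n] ]

-- The divisor in the given list whose best step lands closest to 3:2.
bestDivisor :: [Int] -> Int
bestDivisor = minimumBy (comparing (snd . closestToFifth))
```

`bestDivisor [3 .. 16]` evaluates to `12`, with `fst (closestToFifth 12)` being step `7`, so this measure also favours 12. It ranks the other divisors by raw distance rather than by cycles, so it need not agree with the table on 8 and 9.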

So assuming we have just proven that 12 steps is the best subdivision, how did we ever get into this mess of having whole and half notes, shifted differently depending on sharp/flat?