👵🏽 🍸 🕴🏻 Examination of the type system for checking the correctness of music 🔚 🔁 🥅

Today, much is said about the presentation of music using programming languages, since on the one hand it is an interesting task for engineers, and on the other it is part of the task of universal description of music.

What does it look like? For many languages, music programming environments have been created. The most popular are TidalCycles for Haskell and Sonic Pi for Ruby on the Raspberry Pi. There is also a tool using the Leipzig composer library . Since it is written in Clojure, it lacks type checking.

(def row-row-row-your-boat
  (phrase [3/3 3/3 2/3 1/3 3/3]
          [  0   0   0   1   2]))

->> row-row-row-your-boat
  (canon (simple 4))
  (where :pitch (comp C major))
  (where :time (bpm 90))
  play)

The duration and pitch of sounds are represented as integers and literal coefficients, which is not very convenient. When it comes to music transformation, programming can be a great help. Say, in the above example, the key is set to the key of C major and the tempo is 90 beats per minute.

When programmers see music code, they most often ask whether it is possible to prevent fake execution using a type system. Fair question. If music can be represented as code, can errors in writing music be considered programming errors? And if so, why not improve the process of writing music using the techniques we use to write programs?

In particular, there is an obvious analogy between errors that the type system can prevent and common cases of musical inaccuracy. If your programming language is able to check whether you passed a string to a function that expects to get a number, then it can check that you do not use f-sharp in a passage written in C major, the key of which there are no signs of alteration.

In this article, we will find out what makes music correct and how it can be represented using a type system.

Prescriptivism

Theorists have been arguing for centuries about what makes music right. Most often, they operate within the framework of musical prescriptivism: evaluating music for compliance with some set of rules.

You can describe the dominant methodology for the formation of rules:

Learning the canon
Definition of patterns
Formulating them as a set of rules

Then it can be argued that:

Music that does not comply with these rules is incorrect.

For example, the composition rule states that it is not allowed to make a jump to a large septima from to to s in the melody. This approach has led to numerous valuable findings. Classical music has extremely interesting patterns, and a generalization of our observations as a set of rules helps in discussing the phenomenon of music.

In addition, following the rules is useful when mastering musical instruments. It’s good when teachers correct children who practice playing the violin in C major, but go astray. There are situations in which it is useful to adhere to the definitions of correct and incorrect music.

However, this approach has two main disadvantages.

First, the rules arising from this process are highly dependent on what the theorist considers canon. The bias of the dataset leads to the bias of the conclusions, and Western classical music does not reflect the entire musical experience of mankind. Creating rules based on the music of Bach and Mozart can be fun, but they won't tell you too much about music in general.

The situation is complicated when you realize that musical genres are often strongly associated with certain cultures and ethnic groups, and that prioritizing certain types of music you thereby unwittingly prioritize certain types of people. For example, in the academic theory of music, much attention is paid to the study of classical European traditions and relatively little to music originating in the diasporas of immigrants from Africa: blues, jazz, rock and roll and hip hop. If we evaluate the correctness of music through the traditional theory, this may lead to a culturological version of the problem when face recognition algorithms worked better with Caucasians than with Negroids.

The second big drawback of creating rules by identifying and generalizing patterns is that in this way we will always focus on the past. And music excites most when it breaks the rules. Before Jimmy Page used the distortion effect , engineers considered clipping the signal during amplification as a defect. And although the revealed patterns help to understand modern music, however, they are completely unsuitable for evaluating future music.

A good type system for music should avoid both described traps of prescriptivism. It should not be based on unacceptable or subjective criteria for correct music. At the same time, trying to make erroneous states unrepresentable , the type system should not interfere with the appearance of new states.

Descriptivism

An alternative to prescriptivism is descriptivism. If in the first case musical patterns are considered as binding rules, then in the second they are considered as patterns that are formed in the course of practice. A descriptive approach to the study of musical rules looks like this:

Exploring the body of music
Identify trends
We formulate them in the form of a model

It can be argued that:

Music deviating from the model is unusual.

To do this, we need a good model of musical structures that can describe the ordinary or the unusual, rather than the right or the wrong. David Huron, in his book Sweet Anticipation, offers a descriptive theory that can be reduced to three key aspects:

Music is graded through statistical training
You are pleased with the correct forecast
Novelty doesn't let you get bored

In our previous prescriptivistic example, jumping from a note to C is considered a mistake. And according to the theory of Huron, this will be very unusual. Those who have listened to Western music a lot will find out that when they hear a note to, the next note will most often be another note, or a note one step higher or lower in sound. Jumping up to the B will surprise the audience, and they may take this negatively.

One of the curious consequences of Huron's theory is that writing music is not an exercise in maximizing or minimizing fiction. Music balances on the border between order and chaos, and the location of the listeners depends on the ability of the musician to balance the shock of novelty with the satisfaction of receiving the expected. If you really want to check the correctness of the composition, you will have to make sure that it does not go too far from accepted traditions, but also does not observe them too strictly.

The following are data from a statistical analysis of melodic trends. Darker colors indicate more likely results. Select a line with one of the notes and find on it a cell with a note that is more likely to follow your choice - the intersection highlighted in dark with the column of another note. For example, in this dataset in 33.53% of cases after the note re comes the note to.

These data are taken from the book of Huron, they are the result of the analysis of more than 250,000 tone pairs from Germanic folk songs in major. Another array of data, say hip-hop or rock and roll, will demonstrate different probabilities.

All these probabilities can be equivalently represented as entropy bits: a probability of 50% corresponds to one bit, a probability of 25% to two bits, etc. To avoid dividing by zero, we assume that for transitions that never occur in the dataset, the probability is 1/100 000. Now the darker color shows transitions that are more expensive in terms of entropy - that is, less likely. For example, for re after equals 6 bits of entropy, that is, it has a probability of 1/64.

Representation in the form of entropy helps us to combine surprise from a series of notes in which each transition from one sound to another is an entropy counter, with surprise from the whole melody as the sum of transitions.

 ->  ->  ->  = 2,20 + 2,71 + 5,94 = 10,85 
 ->  ->  ->  = 2,48 + 7,43 + 5,94 = 15,85

This is very similar to the hidden Markov model , but instead of generating strings of notes, we use weighted probabilities to estimate the likelihood of a melody.

You may have noticed that the transition from note to C is only 2.48 bits, although at first glance this contradicts the fact that the transition from C to C was described above as very unusual. Most often, unusual transitions are approximately 10 bits of entropy. The reason for the discrepancy is that the Huron data does not distinguish between a jump of six notes from to to si (very unusual) and a transition to one note from to to a lower si (a fairly common situation). If the data allows us to separate these two cases, then the model will show that the chain of do-si-mi-fa will sound even more unconventional than before-re-mi-fa.

Typed notes

Ideally, a type system helps programmers make the right decisions and eliminates the possibility of errors. A good musical analogy is solfeggio. If you do not have a musical education, then you could get acquainted with solfeggio on the song Do-Re-Mi from the musical The Sound of Music . Each note of an octave has its own name, and any note outside the tonality of the name does not have:

To (lowest note from this set)
Re
Mi
F
Salt
La
Si
To (now an octave higher)

This system allows musicians to adhere to the correct notes, it sets a mini-language that prohibits playing notes outside the selected key. There are infinitely many frequencies between d and re, but they are not reproduced within the solfeggio.

This can be easily represented in the code as algebraic type data:

data Solfege = Do | Re | Mi | Fa | So | La | Ti

Now you can define musical functions that do not allow notes outside the keys. For example, this function repeats one note n times:

repeat : Nat -> Solfege -> List Solfege
repeat 0 s = [s]
repeat n s = s :: repeat (n - 1) s

With this definition, triple repetition of a note re is an expression that will be easily understood in the musical field. A three-fold repetition of a sharp sharp is a type check error, because the sharp sharp is not a type value Solfege.

This gives us a certain type of safety. Our system allows us to avoid the wrong notes, but it still does not protect against incorrect combinations of notes. If we played in C major, we would not play F sharp, but we could play C, then jump to a large septima on a B note, which is forbidden by (traditional) composition rules. Such contextual type checking is possible, but requires a more complex approach.

Typed Transitions

The Mezzo Haskell library uses related types to verify musical correctness. The description says that this is a "very strict test for music", the library can check for compliance with various rules of composition, and not just tonality. In Mezzo, the following code is compiled because the pre-re-mi-fa melody consists of the allowed intervals:

comp = defScore $ start $ melody :| c :| d :| e :| f :>> g

But if we break the rule prohibiting jumping from to to si, then Mezzo will not compile such a code:

comp = defScore $ start $ melody :| c :| b :| e :| f :>> g

It’s great that Mezzo will even point out a problem:

Major sevenths are not permitted in melody: C and B
In the first argument of ‘(:|)’, namely ‘melody :| c :| b’
In the first argument of ‘(:|)’, namely ‘melody :| c :| b :| e’
In the first argument of ‘(:>>)’, namely
    ‘melody :| c :| b :| e :| f_’

It may seem that the dream of presenting incorrect music as type errors has come true. But sometimes awkward situations arise. When the author of the library encoded Chopin's prelude, it turned out that on several occasions the composer did not follow the traditional rules:

The work was transcribed almost entirely, but sometimes it was necessary to skip a few notes because they created the forbidden intervals indicated by Mezzo.

You may ask, who undertook to ban it? If Chopin is not allowed to talk about what is permissible in Western music and what is not, then who then?

The problem is that although Mezzo is able to take context into account, type estimation remains a binary choice task. Either all the notes are correct and the work is compiled, whether one of the notes is knocked out and the work is not compiled. And despite the fact that Mezzo allows you to choose composition rules, the decision to apply or not apply the rule is global. If you disable the rule because some musical decision violates the usual agreement, then you will cancel the check of other parts of the work.

Typed entropy

One way to break or circumvent musical typing rules is to model the transitions between notes in terms of similarity on Huron’s advice. Instead of declaring a particular transition right or wrong, you can calculate the amount of entropy that reflects the audience’s surprise at the pairing combinations of notes.

The following code is written in the dependent typed functional language Idris . In the code, it is checked that the melody corresponding to the rules starts with to and ends with salt and that the path between these notes generates 8-16 bits of entropy. First you need to determine the type Melodythat reflects our representation of the melody with a certain similarity or magnitude of entropy. This type has three constructors:

Pure: Reflects the creation of a melody from a single note, with an upper and lower entropy level equal to zero.
(>>=): takes two melodies and adds up their entropy boundaries with the cost of switching from a note at the end of the first melody to a note to at the end of the second.
Relax: takes a melody and expands its boundaries of entropy.

data Melody : (Solfege, Solfege) -> (Nat, Nat) -> Type where

  Pure  : x -> Melody (x, x) (0, 0)

  (>>=) : Melody (w, x) (low, high) ->
          (() -> Melody (y, z) (low2, high2)) ->
          Melody (w, z) (cost x y + low + low2, cost x y + high + high2)

  Relax : Melody (x, y) (low + dl, high) ->
          Melody (x, y) (low, high + dh)

This type definition is more complicated, but thanks to it, Idris can compose a melody using notation do, and the compiler will track the boundaries of entropy for us. Here's the relevant tune with the low cost of all transitions:

conventional : Melody (Do, So) (8, 16)
conventional = Relax $ do
  Pure Do
  Pure Re
  Pure Mi
  Pure Fa
  Pure So

The following tune does not comply with the rules, because it jumps to a large septima. In Mezzo, it will either be prohibited, or you will have to completely disable the rule. Here we just expanded the boundaries of the complexity of the melody so that it fit into the range of 8-24 bits of entropy:

unconventional : Melody (Do, So) (8, 24)
unconventional = Relax $ do
  Pure Do
  Pure Ti
  Pure Mi
  Pure Fa
  Pure So

If instead Melody (Do, So) (8, 24)we would assign a type not conforming to the rules of the melody Melody (Do, So) (8, 16), then it would not compile!

The novelty of the approach is that it allows you to understand when the music becomes too boring. If the entropy of the melody does not reach the lower limit, a type error is generated. So if you assign a type to a melody that complies with the rules, Melody (Do, So) (16, 24)then it will not be compiled either, because its entropy does not reach the lower limit. This inspires us with faith in the Huron theory, according to which it will be too hard for people to listen to music that is too unusual or unpredictable.

In a degenerate case, transitions between notes inside a key do not increase entropy, and transitions outside a key give an arbitrarily large increase in entropy. Then it is advisable to revert to a binary evaluation, in which each note will be considered permitted or prohibited, without any intermediate statuses.

Important but difficult to verify

Checking types in music is difficult because it is difficult to determine the correctness of the music. If we cannot mathematically accurately model the definition of musical correctness, then we cannot transfer it to the type system. Even if we can authentically formalize our understanding, we will need to make an effort and apply a powerful type system so that the compiler can adequately evaluate the melody. However, this is possible with some definition and a few simplifying assumptions.

Is this just curiosity? Maybe. It is unlikely that the music industry will soon adopt a type system to prevent compositional errors. But if the study of computing in the field of music will help us to evaluate differently what plays such an important role in the life of almost all of us, then this alone justifies all efforts.

Perhaps no less important is the fact that music is only one of the areas where correctness is important, but difficult to verify. The more actively the social and cultural sphere is automated with the help of programming, the more often situations arise in which it is difficult to evaluate the answer, which is crucial for people. Take the use of a computer program that decides how long to imprison . The correctness of the sentences imposed depends on complex trade-offs between costs and probabilities, and if we cannot track the logic of decision-making by the system, then how can we evaluate the fairness of the result?

Every time we automate opaque decision-making processes, we need to ask ourselves: “How do we know that the system is working correctly?”