
Description-Based Design of Melodies

Author(s): François Pachet


Source: Computer Music Journal, Vol. 33, No. 4 (Winter 2009), pp. 56-68
Published by: The MIT Press
Stable URL: http://www.jstor.org/stable/25653513
Accessed: 14-07-2017 00:14 UTC

François Pachet
Sony CSL Paris
6 Rue Amyot
75005 Paris, France
pachet@csl.sony.fr

Description-Based Design of Melodies

Most current approaches in computer-aided composition (CAC) are based on an explicit construction paradigm: users build musical objects by assembling components using various construction tools. Virtually all technologies developed by computer science and artificial intelligence have been applied to CAC, thereby progressively increasing the sophistication of music-composition tools. Composers can choose between many programming paradigms to express the compositions they "have in mind," from the now-standard time-lined sequencers (e.g., Steinberg's Cubase) to advanced programming languages or libraries (e.g., OpenMusic; Assayag et al. 1999). Although these explicit constructions do benefit from abstractions of increasing sophistication (objects, constraints, rules, flow diagrams, etc.), CAC always remains based on an explicit construction paradigm: users must give the computer a clear and complete definition of their material. This approach has the enormous advantage of letting users control all dimensions of their work. However, it also requires from users a fine understanding of the technicalities at work. For instance, composing music with object orientation requires an understanding of objects, classes, and message passing. Using constraints requires an understanding of constraint satisfaction, filtering, the basic constraint libraries, etc.

An interesting attempt to escape these technical requirements is the Elody system (Letz, Orlarey, and Fober 1998), in which the user can create arbitrary abstractions by selecting musical material together with a specific dimension of music (e.g., pitch or rhythm). These abstractions can then be applied to other musical material to create yet more complex objects. But here again, the user must mentally maintain a model of the abstraction algorithm at work, a task that can be particularly difficult as the complexity of the composition grows. Other approaches propose construction tools that do not require explicit programming skills. For instance, Hamanaka, Hirata, and Tojo (2008) propose a morphing metaphor in which melodies can be created as interpolations between two given melodies. But this approach is limited to the context of the generative theory of tonal music (Lerdahl and Jackendoff 1983), and is not extensible to arbitrary categories, as we will show later.

We propose here a novel approach to music composition, called description-based design, that attempts to remove the need for the user to understand anything technical related to the target objects. In this article, we focus on the creation of simple musical objects (unaccompanied melodies) as a working example, but our paradigm is general and can be applied to many other fields of design. First, we introduce the general description-based design mechanism; then we describe the type of melodies we target. Finally, we describe experiments demonstrating the functionality of the algorithm and its potential.

Description-Based Design

Description-based design stems from the paradigm of Reflexive Interaction (Pachet 2008). The idea is to let users manipulate images of themselves, produced by an interactive machine-learning component. The creation of objects (musical objects in our case) is performed as a side-effect of the interaction, as opposed to traditional interactive systems in which target objects are produced up-front as the result of a controlled process. A typical example of a reflexive interactive system is the Continuator (Pachet 2004), a system that continuously learns stylistic information coming from the user's performance and generates music "in the same style" in the form of real-time answers to, or continuations of, the music performed by the user. The Continuator was shown to trigger spectacular interactions with professional jazz musicians (Pachet 2004) as well as with children (Addessi and Pachet 2005) involved in free, unstructured improvisation. However, this type of interaction shows limitations when users want to structure their production, in other words,

Computer Music Journal, 33:4, pp. 56-68, Winter 2009. © 2009 Massachusetts Institute of Technology.

when they want to shift from improvisation to composition.

Description-based design adds a further component to the Continuator-like interaction by introducing an explicit linguistic construct, precisely aimed at addressing this "structure" problem inherent in free-form improvisation systems. The idea is as follows. In the first phase, the system generates objects (melodies, in our case) randomly or according to specific generators. The user can then freely tag these objects with words: "jumpy," "flat," "tonal," "dissonant," etc. Each object can be tagged by several words, or by none. In the second phase, the user selects a starting object (say, a flat melody) and one of the previously assigned tags (say "jumpy"). The user can then ask the system to produce a new object that will be "close" to the selected one, but "more jumpy" or "less jumpy." More generally, the user can reuse any of the tags to modify a given object in the semantic direction of the tag. The system will then attempt to generate a new object that optimally satisfies two conditions: being "close" to the starting object, and increasing (or decreasing) the probability of being of a certain tag. The new object (in our case, a progressively "jumpier" melody) is then added to the palette of objects created by the system. It can, in turn, be tagged or refined at will. The design activity is therefore strictly restricted to tagging objects and creating variations using these tags. The implementation of this scheme requires a combination of components that we briefly describe here.

Object Generator

We call the objects that the user wants to produce target objects. In our context, these objects are defined by a set of technical features that describe these objects and a generator that produces sets of objects. The generator is a program that should be able to randomly generate every possible object of interest. The choice of the feature set will influence the capacity of the system to faithfully learn user tags, but identifying a reasonable feature set is usually straightforward. These two ingredients are therefore easy to design. We give the details for the particular case of unaccompanied melodies subsequently.

A Machine-Learning Tagging System

The second component is a tagging system, associated with a machine-learning algorithm that learns a mapping between user tags and the feature set. In our case, this machine-learning component is a support vector machine (SVM). SVMs are automatic classifiers routinely used in many data-mining applications (Burges 1998). In the training phase, the SVM builds an optimal hyperplane that separates two classes, so as to maximize the so-called "margin" between the classes. This margin can then be used to classify new points automatically using a geometrical distance, as illustrated subsequently. SVMs have been used extensively to learn music information, in particular in the audio domain, typically using spectral features (Mandel, Poliner, and Ellis 2006). In our case, we use them to learn classes from symbolic features, as described herein.

The tags are entered by the users as free text. To each tag is associated an SVM classifier that is retrained each time a user adds or removes a tag for an object. To avoid undesirable effects such as overfitting, a feature-selection algorithm is applied prior to the training phase. We use the IGR (information gain ratio) algorithm (Quinlan 1993), by which only a limited number of features are kept, maximizing the "information gain" of each retained feature. Once trained, this classifier can compute a probability for that tag to be true for any object. Similarly, the classifier computes the probabilities, for a selected melody, of all learned tags, according to the current state of the system in the session. More precisely, for a given tag, we use Boolean classifiers trained on positive and negative examples. Positive examples are all the objects having been tagged by the user with this tag. Negative examples are chosen automatically by the system, according to various heuristics. (Notably, it chooses approximately the same number of negative examples as positive ones, and it chooses only objects that have been tagged, obviously with tags other than the one under consideration.)

Of course, the accuracy of this prediction depends on many factors, including the feature set, but also the number of examples (i.e., objects having been tagged by the user).
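The IGR feature-selection step can be sketched in plain Python. This is an illustrative reconstruction, not the article's implementation: the function names are invented here, feature values are assumed to be already discretized, and the gain ratio is computed in Quinlan's textbook form (information gain divided by split information).

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy of a sequence of discrete values, in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain_ratio(feature_values, labels):
    """Gain ratio of one discretized feature with respect to Boolean tag
    labels: (H(labels) - H(labels | feature)) / H(feature)."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(feature_values)   # entropy of the feature itself
    return gain / split_info if split_info > 0 else 0.0

def select_features(examples, labels, k=4):
    """Keep the k feature names with the highest gain ratio.
    `examples` is a list of dicts mapping feature name -> value."""
    names = examples[0].keys()
    scored = {f: information_gain_ratio([e[f] for e in examples], labels)
              for f in names}
    return sorted(scored, key=scored.get, reverse=True)[:k]
```

A feature that perfectly separates the tagged from the untagged examples scores 1, while an uninformative feature scores near 0, which is how only a few meaningful features survive the selection.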

Figure 1. A support vector machine defines a class separation in an n-dimensional space as a hyperplane of dimension n - 1, dividing the space into two regions. In this figure, the heavy lines are defined by particular "margin" points called support vectors and define the margin. The lighter line, in the middle of the margin, represents the class separation itself. The distance between a point and the hyperplane is traditionally interpreted as the inverse of the probability for the item to belong to the class outside the boundary. In this case, variation2 is closer to A than variation1, so its probability of belonging to A is greater. Note that the distance between an item and its variations (variation1 and variation2) is not necessarily meaningful.

In the experiment described in this article, we show empirically that the predictions are satisfactory after about 70 examples for each category, but this result is not general.

A crucial aspect for the classifier to work in our context is that it should yield, for a given item, not only a class membership but a probability of membership. SVMs are an appropriate framework in this case, as they precisely transform a classification problem into a geometrical distance problem. Once trained to classify between two classes, say A and B, the SVM identifies a set of support vectors that define an optimal margin between A and B, as illustrated in Figure 1. The classification decision for an item I is then made on the basis of the distance of I to the hyperplane defined by the support vectors. As a consequence, one can interpret this distance as the inverse of the probability for I to belong to A (or B). This is particularly true for items that are "outside" the hyperplane, namely, those that are not classified as belonging to the class under consideration.

A Combinatorial Generator

The task of the combinatorial generator is to generate variations of an object that maximize two properties: being as close as possible to the initial object, and increasing (or decreasing) the probability of a given tag/classifier by a given ratio. A naive, combinatorial version of this algorithm is given in Figure 2. The generation of variations is domain-specific; we describe a simple variation generator for melodies herein. Increasing the probability of the variation to belong to the class represented by the tag is done in our case by exploiting the probability given by the SVM, as described in the previous section. Sorting the generated variations according to their distance to the initial object is a crucial step, as it ensures that the resulting variation will be as close as possible to the starting object.

There are several ways to implement such a distance. One is to use the feature set described in the previous section, which defines a natural distance between two objects (e.g., a Euclidean distance). The feature space can also be transformed, e.g., by the kernel used for classification. In our example, a standard radial basis function kernel was used. However, the distance between two items given by the SVM is not necessarily appropriate, as kernel transforms aim primarily at optimizing class separation and not at defining a meaningful distance between items of the training set.

A better option is to introduce a domain-dependent distance. In the case of melodies, we use a Levenshtein (1965) distance on the pitch sequence, as described subsequently. A simple extension of this algorithm is the use of "compound commands": arbitrary Boolean expressions can be formed from basic "more" or "less" commands, such as "more T1 AND less T2 AND as T3." (The tags Tn are typically adjectives.) Such an extended Boolean expression can easily be substituted for the test of line 6 in the pseudocode of Figure 2. An example is given in the section entitled "Stretching a Tonal Melody," where we generate a "more long AND as tonal" melody.

We will now describe an application of our scheme to the construction of melodies.
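The Levenshtein distance on pitch sequences mentioned above can be implemented with the standard dynamic-programming recurrence. The article does not give its exact formulation, so this sketch simply treats each melody as a list of MIDI pitch numbers and uses unit costs for insertion, deletion, and substitution:

```python
def levenshtein(a, b):
    """Edit distance between two pitch sequences (lists of MIDI numbers):
    minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))   # distances from a[:0] to each prefix of b
    for i, pa in enumerate(a, 1):
        curr = [i]
        for j, pb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete pa
                            curr[j - 1] + 1,             # insert pb
                            prev[j - 1] + (pa != pb)))   # substitute
        prev = curr
    return prev[-1]
```

Candidate variations would then be ordered by `sorted(variations, key=lambda v: levenshtein(source_pitches, v))`, so that the closest ones are examined first.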

The Case of Unaccompanied Melodies

Melodies are a good example to illustrate our approach because they are both technical objects and subjective ones.

Five Types of Melodies

There are many known technical features to describe melodies, notably related to pitch

Figure 2. The combinatorial search algorithm. The "FindLess" algorithm is similar, and the extension to compound commands is straightforward. Here, N is a predetermined number of variations to be generated.

FindMoreOfTag(Source, Tag, ratio)
1. initial_Tag_Prob := probability that Source is classified as Tag
2. Variations := N randomly generated variations of Source
3. Sorted_Variations := Variations sorted according to distance to Source
4. For all V in Sorted_Variations do
5.     Prob_V := probability that V is classified as Tag
6.     If Prob_V > initial_Tag_Prob * (1 + ratio) then return V
7. If no variation was found with a probability of Tag greater than initial_Tag_Prob, then report failure, or restart with a greater value of N

Figure 3. A typical tonal melody (My Rifle, My Pony and Me, originally sung by Dean Martin and Ricky Nelson), here stylized.
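The algorithm of Figure 2 can be sketched as runnable Python. Everything domain-specific here is a stand-in: a hand-written "fraction of pitches in C major" score replaces the SVM's tag probability, a crude positional mismatch count replaces the Levenshtein distance, and the variation operator uses the three random edits the article describes later (pitch change, insertion, removal).

```python
import random

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}

def tag_probability(melody):
    """Stand-in for the SVM probability: fraction of pitches in C major."""
    return sum(1 for p in melody if p % 12 in C_MAJOR) / len(melody)

def random_variation(melody):
    """One random edit: change, insert, or remove a note (pitches only)."""
    m = list(melody)
    ops = ("change", "insert", "remove") if len(m) > 1 else ("change", "insert")
    op, i = random.choice(ops), random.randrange(len(m))
    if op == "change":
        m[i] = random.randint(60, 80)
    elif op == "insert":
        m.insert(i, random.randint(60, 80))
    else:
        m.pop(i)
    return m

def distance(a, b):
    """Crude stand-in for the article's Levenshtein distance."""
    return abs(len(a) - len(b)) + sum(x != y for x, y in zip(a, b))

def find_more_of_tag(source, ratio=0.1, n=2000):
    """Figure 2: return the variation closest to `source` whose tag
    probability exceeds the source's by at least `ratio`, else None."""
    threshold = tag_probability(source) * (1 + ratio)
    variations = [random_variation(source) for _ in range(n)]
    for v in sorted(variations, key=lambda m: distance(source, m)):
        if tag_probability(v) > threshold:
            return v
    return None  # failure: the caller may retry with a larger n
```

Because candidates are examined in order of increasing distance, the first one that clears the probability threshold is also the closest acceptable variation, which is exactly the trade-off lines 3-6 of Figure 2 encode.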

distribution, repetition, or tonality. There are also many subjective appreciations one can think of to talk about melodies (simple, jumpy, linear, annoying, dissonant, etc.). There is, furthermore, no simple way to associate these subjective expressions with the technical features, especially for non-musicians. Even for trained musicians, finding the "right melody" can sometimes be an extremely difficult task. So melodies are an ideal playground for description-based design.

In this experiment, we restrict ourselves to the composition of four-bar unaccompanied melodies, with a maximum of four possible note durations (quarter, half, dotted half, and whole notes) and a pitch range of [60, 80] (in MIDI pitch). Figure 3 gives an example of such a melody. Although these restrictions may appear drastic compared to real melodies, these constraints still define a search space of more than 20^16 possible items, large enough to justify the use of our framework.

The aim of the experiment described here is to demonstrate that the algorithm proposed and described previously essentially works, i.e., does produce "close variations" that increase the probabilities of melodies being of an arbitrary subjective category. To this end, we chose not to consider arbitrary subjective categories, but to limit the experiment to five "controlled" categories: tonal, brown, and serial, as well as long and short. The justification for this choice comes from the clarity of the definition of each of them, which allows us to test the results non-ambiguously. More precisely, we introduce the following (possibly overlapping) categories.

Tonal Melodies

Tonal melodies are melodies having a clear tonal center. Although the notion of tonality has long been an object of debate in musicology as well as in cognitive science (Temperley 2007), it is quite easy to produce melodies with a clear tonal center, and we give a simple algorithm to do so herein. An example of a tonal melody is given in Figure 3.

Figure 4. The melody of With a Little Help from My Friends (here, stylized) is a typical brown melody.
Figure 5. Examples of (a) tonal, (b) serial, and (c) brown melodies generated by our three generators.

Brown Melodies

Brown melodies are melodies with only small intervals. The term "brown" is borrowed from the famous experiment of Voss and Clarke (1978), who compared random, Brownian, and 1/f melodies. We give later the description of a simple algorithm that generates brown melodies. A typical example of a brown melody is the song With a Little Help from My Friends by John Lennon and Paul McCartney (see Figure 4).

Serial Melodies

Serial melodies are defined here to be melodies in which all pitches occur with equal frequency. Serial melodies are rarely used in popular music, but this category is useful for our demonstration, as it bears an unambiguous definition. Typical serial melodies are dodecaphonic melodies in which all twelve pitch classes are used.

Additionally, we introduce two categories: long and short. These categories are simply related to the number of notes. Like the preceding ones, they will be defined only by a set of examples.

Simple Experiment

We generated 72 examples of each of the three primary categories (tonal, serial, and brown) and provided them to the system with their corresponding tags. Examples of these generated melodies are given in Figure 5. The generators are defined as follows. For each generator, the basic operation described is repeated nb times, where nb is a random number between 0 and 16.

The tonal generator randomly draws notes from a random scale (e.g., C major). Each note falling on a beat is chosen from the triad of the scale. The other notes are chosen randomly from the scale. The duration of each note is randomly selected among quarter and half notes. The serial generator randomly draws a pitch from an initial list of all possible pitches. It then removes it from the list and repeats the operation. When the list is empty, the list is filled again. This ensures a "maximally" serial melody given the pitch range. The brown generator starts from a random pitch. It then randomly draws an interval in [-1, +1] and adds it to the preceding pitch. In all three cases, the notes' MIDI velocity values (corresponding roughly to loudness) are random integers taken in the range [70, 100].

The variation generator we use is a deliberately simple algorithm. Starting from a set containing only the initial melody, it generates a variation by randomly applying one of the following three modifications: (1) modify the pitch of a randomly selected note; (2) insert a random note with a random pitch and velocity; or (3) remove a randomly selected note. The resulting variation is then added to the set, and the process is repeated by randomly selecting a new "seed" melody from the updated set. This procedure creates, by definition, variations of various "depths," i.e., similar as well as different melodies. In the experiments described here, the number of variations produced and explored is set to 20,000.

The set of features we use for describing melodies is shown in Table 1. Many other melody features could be introduced, but we restrict ourselves to this list in the context of this experiment. As we will see subsequently, the velocity feature is not used in this particular experiment and is inserted here just to show the robustness of the algorithm.
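The three generators can be sketched as follows. This is a reconstruction from the descriptions above, not the authors' code: melodies are reduced to pitch lists (durations and velocities omitted), lengths are drawn from 1 to 16 to avoid empty melodies, "strong beat" is interpreted as every other quarter note, and clamping the brown walk to the pitch range is an assumption.

```python
import random

PITCHES = list(range(60, 81))         # MIDI pitch range [60, 80]
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]  # degrees of a major scale
TRIAD = [0, 4, 7]                     # degrees of its tonic triad

def tonal_melody():
    """Notes drawn from a major scale on a random root; notes falling on a
    (strong) beat come from the triad. Time is counted in quarter notes."""
    root = random.randint(60, 69)     # keep root + 11 inside [60, 80]
    notes, time = [], 0
    for _ in range(random.randint(1, 16)):
        degrees = TRIAD if time % 2 == 0 else MAJOR_SCALE
        notes.append(root + random.choice(degrees))
        time += random.choice((1, 2))  # quarter or half note
    return notes

def serial_melody():
    """Draw pitches without replacement, refilling the pool when empty, so
    all pitches occur with (near-)equal frequency."""
    pool, notes = [], []
    for _ in range(random.randint(1, 16)):
        if not pool:
            pool = PITCHES[:]
            random.shuffle(pool)
        notes.append(pool.pop())
    return notes

def brown_melody():
    """Random walk whose intervals are drawn from [-1, +1]."""
    notes = [random.choice(PITCHES)]
    for _ in range(random.randint(1, 16) - 1):
        notes.append(min(80, max(60, notes[-1] + random.randint(-1, 1))))
    return notes
```

Each generator enforces its category by construction: the serial pool guarantees balanced pitch counts, and the brown walk can never produce an interval larger than a semitone.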

Figure 6. Initial melody created by the serial melody generator. The melody is perfectly serial according to our definition (all pitches are different, though here not all pitch classes). The serial classifier yields a probability of 1.0. The tonal classifier yields a probability of 8 × 10^-3.

As mentioned previously, a feature-selection algorithm is applied on the feature set prior to learning, to select the most meaningful features given the set of examples and counterexamples for a given tag. This feature selection has the extra advantage of giving an indication about how the classifier has generalized from the examples.

Experiments

We first generate a set of examples using our three primary melody generators: tonal, serial, and brown. We then train the corresponding classifiers on these examples. Then, we introduce the "long" and "short" categories by tagging the generated melodies accordingly, and we also train the corresponding long and short classifiers. After this step, the system is able to predict each of these five categories for any (possibly untagged) melody. We then perform a series of experiments using these generated melodies as starting points and the classifiers as modifiers.

Table 1. The Set of Features Used to Represent and Learn Melodies

Feature                          Notes
Number of notes
Mean pitch                       Mean value of the pitch sequence
Mean pitch interval              Mean value of the "pitch interval" sequence
Mean velocity                    Mean value of the velocity (MIDI information)
Tonal weight                     Computed using a "pitch profile": for each of the twelve possible tonal centers, the notes of the melody that fit it are counted; the feature is the maximum value of this count
Pitch compressibility ratio      Computed by applying a data-compression algorithm to the pitch sequence
Interval compressibility ratio   The same ratio, computed on the sequence of intervals rather than pitches

Table 2. The Feature-Selection Mechanism: The Four Most Significant Features Retained for Each Tag, with Their IGR Weights

Tonal: meanPitch (0.791), tonalWeight (0.788), meanPitchInterval (0.546), pitchCompRatio (0.49)
Brown: meanPitchInterval (1), intervalCompRatio (0.948), meanPitch (0.437), tonalWeight (0.139)
Serial: meanPitch (0.795), pitchCompRatio (0.649), meanPitchInterval (0.489), tonalWeight (0.399)
Long: nbNotes (1), meanP… (0.67), tonalWeight (0.586), pitchCompRatio (0.264)
Short: nbNotes (1), … (0.75), … (0.723), … (0.634)
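The features of Table 1 can be approximated in a few lines of Python. This is a sketch, not the article's implementation: melodies are reduced to pitch lists (so velocity is omitted), intervals are taken as absolute values, the major-scale pitch profile and the choice of zlib as the data-compression algorithm are assumptions, and the tonal weight is normalized to [0, 1].

```python
import zlib

MAJOR = {0, 2, 4, 5, 7, 9, 11}  # major-scale pitch classes (assumption)

def features(pitches):
    """Approximate feature vector of Table 1 for a list of MIDI pitches."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]

    # Tonal weight: best pitch-profile match over the 12 possible roots.
    tonal_weight = max(
        sum(1 for p in pitches if (p - root) % 12 in MAJOR)
        for root in range(12)) / len(pitches)

    def comp_ratio(seq):
        """Compressed size over raw size (zlib stands in for the
        compressor cited in the article)."""
        raw = bytes(x % 128 for x in seq) or b"\x00"
        return len(zlib.compress(raw)) / len(raw)

    return {
        "nbNotes": len(pitches),
        "meanPitch": sum(pitches) / len(pitches),
        "meanPitchInterval": (sum(map(abs, intervals)) / len(intervals))
                             if intervals else 0.0,
        "tonalWeight": tonal_weight,
        "pitchCompRatio": comp_ratio(pitches),
        "intervalCompRatio": comp_ratio(intervals),
    }
```

A repetitive or scale-bound melody compresses well and scores a high tonal weight, while a serial melody scores low on both, which is what lets the per-tag classifiers separate the categories of Table 2.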

Figure 7. The same melody, a bit "more tonal." The differences are highlighted. The tonal classifier has increased its probability to 5.7 × 10^-2.
Figure 8. Still a bit "more tonal." Probability is now 1.4 × 10^-1.
Figure 9. Again, "more tonal." Probability is now 3.11 × 10^-1.
Figure 10. "More tonal," again. Probability is 5.3 × 10^-1.
Figure 11. Probability is now 7.7 × 10^-1.

Training Phase

After the training phase, each of the five categories has been trained on approximately 72 positive examples and 72 negative examples. The negative examples are chosen automatically by the system for the brown, tonal, and serial categories, by picking random melodies that are not tagged with the corresponding tag. In the case of the long and short categories, we help the system by telling it to use long examples as negative examples for short, and conversely. This trick is used to avoid having "bad" counterexamples, as some generated melodies could turn out to be long (or short) without being tagged as such.

As a result, we give here the result of the feature-selection process applied for each tag. (Only the four most significant features are kept.) This gives an indication of which features were selected by the classifier and with which weight. These numbers indicate how "well" the classifiers have understood the semantics of each generator. The most important features for brown, long, and short do fit with the corresponding semantics of the generator. It can be observed that the "tonal" classifier did use the feature "tonalWeight," but not in the first position. This slight discrepancy is owing to the limited number of examples given for training. It has a small incidence on the process, as shown in Table 2.

In a second step, we now consider a series of test cases, illustrating the use of classifiers as melody constructors using the description-based algorithm. In particular, we show that the algorithm is able to produce variations that would otherwise be found only by very specific programs.

"Tonalizing" a Serial Melody

The first example consists in starting from a serial melody (see Figure 6) and making it progressively more tonal. Figures 7-13 illustrate the process step by step. At each step, a new melody is generated that is both similar to the preceding one and slightly more tonal. The initial melody is generated with the serial generator. The last one is optimally tonal while still being "close" to the original. We indicate the probability of each classifier (tonal and serial) as well as the effective measure of "tonalness." Note that such a measure is usually impossible to get with arbitrary categories, hence the use of control categories for this experiment. These figures and the resulting melodies indicate clearly that the system has correctly learned the notions of tonal and serial and, more importantly, that it is able to use these classifiers as melody generators controlled by the tags.

Table 3 indicates the progressive increase in "tonalness" at each step of the process. This increase is confirmed by the increase of real "tonalness" of the melody as computed by the tonalWeight feature (described in Table 1).

Stretching a Tonal Melody

The second experiment consists of using description-based design to stretch a melody, i.e., adding more notes. Of course, an easy solution to

Figure 12. Final step; the melody is now perfectly tonal (in B-flat major, when notes are spelled enharmonically), with a probability of the tonal classifier of 9.6 × 10^-1, yet "similar" to the initial serial melody. The accidentals are displayed without correction and thus do not reflect the tonality, which contains flats rather than sharps.
Figure 13. A slightly more tonal version of the melody. The melody is, strictly speaking, not more tonal than the preceding one. However, because the notion of "tonalness" was not learned perfectly by the classifier, the algorithm found an artificial way of improving its "understanding" of "tonalness" by reducing the number of notes.

this problem consists of explicitly programming a function to add notes to a given melody. But we can again avoid the use of such explicit programming. Instead, we can reuse the two tags long and short, trained with examples generated with the other three generators. By convention, we tag melodies with fewer than eight notes as short, and melodies with more than twelve notes as long. As a consequence, the system automatically learns long and short, with examples coming from all three generators. We can check that the system has correctly learned the tags long and short by observing the selected features, as indicated in Table 2.

The feature "number of notes" was selected as a primary feature for long and short. This feature-selection process also shows incidentally that the system has not simply associated long and short with the number of notes, but rather with a more complex configuration of features. For instance, it turns out that most of the long melodies also have more repetition in their interval sequence. The system has no way to generalize better in our context ("better" would be to consider only the feature nbNotes). But as we will see, this approximation does not prevent it from producing "meaningful" variations.

We now consider a melody that is tagged both as tonal and short, illustrated in Figure 14 (the first melody). In a first step, we make it longer as in the previous experiment, i.e., through the command more long. As we can observe, this command indeed results in a similar melody with more notes. The sequence of "more long" commands is illustrated in Figure 14, and it can be noted that the melodies indeed have progressively more notes (from seven initially up to ten).

However, it can also be noted that the notes added to the melody by the combinatorial algorithm make it no longer tonal: The initial melody is in C major, but the added notes (D-sharp, C-sharp, and F-sharp) are not in the C-major scale. This is highlighted by the fact that the corresponding probabilities of being tonal have shifted from 0.99 to 0.07 (see Table 4). This phenomenon is not unexpected, however, as the system has just been asked to make the melody "more long," but it was not given any constraint on tonality or any other property.

A natural way to address this problem consists in issuing a compound conjunctive command of the form "more long AND as tonal." Such a query is made by selecting the tags and the corresponding modifiers through a specific interface (see Figure 15). In this case, starting from the same melody, we obtain the melodies in Figure 16. We can observe that the algorithm has now progressively added only tonal notes (F and G). Most importantly, these added notes have been chosen as a "side effect" of the compound command, and not through the introduction of an explicit representation of tonality in the program.

Note that this compound query triggers a non-trivial search. Figure 17 illustrates the search process corresponding to a "more long AND as tonal" query. At each iteration (x axis), the probabilities for the tags long (dashed line) and tonal (plain line) are displayed. A solution is found when the tonal (plain) line is within the two horizontal dashed lines (which represent the bounds of ±10 percent of the starting "tonalness") and, simultaneously, the long (dashed) line is above the horizontal large line (which represents "more long" by 15 percent). It can be seen that the system explores about 300 melodies before finding a solution.

Making a Tonal Melody More Brown

The last example starts from a tonal melody that is progressively made more brown. We illustrate again the process step-by-step in Figure 18; note also the increase in brownness in Table 5.

Figure 14. A short and tonal melody as a starting point for repeated "stretching" operations. The probability of being long is initially (a) 1.06 × 10⁻⁷. Successive probabilities of being long are (b) 1.25 × 10⁻³, (c) 9.77 × 10⁻¹, and (d) 1.0. The latest melody cannot be stretched any farther, as its probability of being long is 1.0.

Figure 15. An interface for specifying conjunctive compound queries holding on several tags simultaneously. For each tag the user can specify the type of action (as, more, less, or ignore) and the corresponding ratios. Here, a query to produce an item that is "more long by 15 percent AND as tonal by 10 percent," while the other tags are ignored.

[Figure 14: four melody staves, panels (a)-(d).]

[Figure 15: query dialog with one row per tag (e.g., "any serial," "any short," "as tonal by 10%") and OK/Cancel buttons.]

Table 3. The Progressive Increase in "Tonalness" at Each Step of the Process Leading to Sequences in Figures 7-14

Melody Versions    Serial Classifier    Tonal Classifier    "Tonalness"
1: Initial         1.0                  8 × 10⁻³            0.54
2: More tonal      1.0                  5.7 × 10⁻²          0.625
3: More tonal      1.0                  1.4 × 10⁻¹          0.66
4: More tonal      9.95 × 10⁻¹          3.11 × 10⁻¹         0.70
5: More tonal      8.15 × 10⁻¹          8.15 × 10⁻¹         0.75
6: More tonal      1.82 × 10⁻²          7.7 × 10⁻¹          0.79
7: More tonal      3.87 × 10⁻⁵          9.63 × 10⁻¹         0.87
8: More tonal      1.72 × 10⁻⁷          9.98 × 10⁻¹         1.0
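The stepwise behavior recorded in Table 3 — each "more tonal" command accepting a variation whose tonal probability has strictly increased — can be sketched as follows. The scale-membership score stands in for the learned tonal classifier, and the one-note alteration stands in for the article's variation generator; both are illustrative assumptions, not the paper's actual implementation:

```python
import random

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}

def p_tonal(m):
    """Toy 'tonal' score: fraction of pitches in C major (a stand-in for
    the learned classifier's probability, not the article's actual model)."""
    return sum(1 for n in m if n % 12 in C_MAJOR) / len(m)

def more(tag_prob, melody, rng, steps=8):
    """Apply a 'more <tag>' command repeatedly: at each step, blindly search
    for a one-note alteration that strictly increases the tag probability."""
    history = [tag_prob(melody)]
    for _ in range(steps):
        for _ in range(1000):  # blind search for an improving variation
            cand = list(melody)
            i = rng.randrange(len(cand))
            cand[i] += rng.choice((-2, -1, 1, 2))
            if tag_prob(cand) > tag_prob(melody):
                melody = cand
                break
        else:
            break  # no improving variation found: the probability peaked
        history.append(tag_prob(melody))
    return melody, history

final, hist = more(p_tonal, [60, 61, 63, 66, 68, 70], rng=random.Random(1))
print(hist)  # increases step by step, as in Table 3's tonal column
```

As in the experiment, the loop stalls once the probability saturates at 1.0: no variation can further increase the score.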

This experiment again shows that the probability of a given tag (here, brown) does increase after each modification query. It can be observed that the resulting melody is indeed more Brownian, in the sense that intervals are getting smaller on average. There is a limit obtained by the system, which cannot increase brownness further after step 12, as the probability reaches 1.0, although one could imagine further small modifications of the melody to make it more Brownian (e.g., lowering the initial G-sharp). This can be explained by the fact that the brown classifier either has too few examples (and counter-examples), or that the features chosen in this experiment are not able to fully grasp the notion of brownness. However, the classifier is good enough to produce meaningful "small variations." Furthermore, the user can tag the resulting new melodies at any step and retrain the classifier to continuously fine-tune the system.

Discussion

We have introduced description-based design as a novel way of building musical objects that does not require any form of programming knowledge

64 Computer Music Journal

Figure 16. The (a) short and tonal melody now progressively made "more long and as tonal." Respective probabilities of being long and tonal are given in Table 4. The final melody is (b) close to the original, (c) longer, and (d) still as tonal as the starting one.

[Figure 16: four melody staves, panels (a)-(d).]

from the user. The only programming constraint lies in the variation generators: They must be designed in such a way that they generate at least the target objects (together with possibly many unwanted objects). However, this is a relatively weak constraint, as these generators can be designed once and for all for a particular domain (here, melodies). Of course, more- or less-efficient generators could be considered, but default naive generators are easy to design.

This article only aims at demonstrating the nature of the underlying algorithm using simple, well-understood examples. The experiment presented here used controlled categories to illustrate the algorithm and to show its capacity to produce musical objects without explicit programming or editing. The approach is, in essence, more suited to the use of subjective, non-controlled descriptions, and reaches its full potential when these descriptions are collected massively from social tagging systems. Such an experiment is currently under way, using tags collected from a melody-competition Web site. In this context, users can produce melodies, tag the melodies of others, and reuse tags for modifying melodies. Another application of this paradigm is to use the system to control the generation of jazz improvisations, as an extension of the Continuator system (Pachet 2004). In this latter case, subjective tags are used as control handles to influence, in real time, the quality of generated solos.

Note that the working example shown here is based on the SVM, but other classifiers could be used. Decision trees, for instance, have recently been shown to exhibit better geometrical properties than SVMs (Alvarez, Bernard, and Deffuant 2007). They could be substituted for SVMs without changing the framework described here.

The algorithm we propose is blind, bearing some similarity to other blind-search algorithms such as genetic algorithms (GAs), often used in music generation (Biles 1994). GAs could indeed be used to build our variations, instead of specifically designed random generators. However, GAs are more difficult to control than random generators. Most importantly, we believe that our naive search algorithm can be optimized by exploiting information about the features used by the classifier, and this constitutes a current avenue of research. Such an optimization is not possible by definition in GAs, which operate on

Table 4. Variations on the Starting "Short and Tonal" Melody

Melody Versions               long           tonal          Number of Notes
1: Short and tonal            1.0 × 10⁻⁷     9.9 × 10⁻¹     7
2: 1 More long                1.25 × 10⁻³    9.2 × 10⁻¹     8
3: 2 More long                9.77 × 10⁻¹    6.13 × 10⁻¹    9
4: 3 More long                1.0            7.0 × 10⁻²     10
5: 1 more long AND as tonal   3.8 × 10⁻⁵     9.9 × 10⁻¹     8
6: 5 more long AND as tonal   1.4 × 10⁻¹     9.96 × 10⁻¹    9
7: 6 more long AND as tonal   1.0            9.87 × 10⁻¹    10

Variations 2-4 are obtained by applying the "more long" command. Variations 5-7 are obtained by applying the "more long AND as tonal" command to the same initial melody.
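A default naive generator of the kind discussed here — one able to reach any target melody given enough edits, at the price of producing many unwanted candidates — might look like the sketch below. The edit operations and pitch range are illustrative assumptions, not the article's implementation:

```python
import random

def variations(melody, rng, n=10):
    """Yield n random one-edit variations of a melody (list of MIDI pitches):
    insert a random note, remove a note, or nudge a note by a small interval."""
    for _ in range(n):
        m = list(melody)
        op = rng.choice(("insert", "remove", "alter"))
        i = rng.randrange(len(m))
        if op == "insert":
            m.insert(i, rng.randint(48, 84))  # add one random note
        elif op == "remove" and len(m) > 1:
            del m[i]                          # drop one note
        else:
            # nudge one note, clipped to the working pitch range
            m[i] = max(48, min(84, m[i] + rng.choice((-2, -1, 1, 2))))
        yield m

rng = random.Random(0)
for v in variations([60, 62, 64, 65, 67], rng, n=3):
    print(v)
```

Such a generator clearly satisfies the weak constraint stated above: repeated edits can reach any melody, and the classifiers are left to filter out the unwanted candidates.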


Figure 17. The evolution of the search process during the "more long by 15 percent AND as tonal by 10 percent" query.

[Figure 17: plot of the probabilities of long and tonal at each iteration, with horizontal threshold lines for the "more long" target and the ±10 percent tonal bounds.]
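The search that Figure 17 depicts is a blind generate-and-test loop: draw variations and accept the first whose tag probabilities satisfy every constraint of the compound query. The sketch below uses toy stand-ins (a note-count "long" score and a scale-membership "tonal" score) instead of the article's learned SVM probabilities; all names and thresholds are illustrative:

```python
import random

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}

# Toy stand-ins for learned per-tag classifiers (illustrative only):
def p_long(m):  return min(1.0, len(m) / 12)
def p_tonal(m): return sum(1 for n in m if n % 12 in C_MAJOR) / len(m)

def vary(m, rng):
    """Naive generator: insert one to three random notes."""
    m = list(m)
    for _ in range(rng.randint(1, 3)):
        m.insert(rng.randrange(len(m) + 1), rng.randint(55, 79))
    return m

def compound_query(melody, constraints, rng, max_iter=10000):
    """Return the first variation whose tag probabilities satisfy every
    constraint, each judged relative to the starting probabilities."""
    start = {tag: clf(melody) for tag, (clf, _) in constraints.items()}
    for _ in range(max_iter):
        cand = vary(melody, rng)
        if all(ok(clf(cand), start[tag]) for tag, (clf, ok) in constraints.items()):
            return cand
    return None  # search failed within the iteration budget

constraints = {
    "long":  (p_long,  lambda p, p0: p >= 1.15 * p0),           # "more long by 15%"
    "tonal": (p_tonal, lambda p, p0: abs(p - p0) <= 0.10 * p0), # "as tonal by 10%"
}
seed_melody = [60, 62, 64, 65, 67, 69, 71]  # C-major scale fragment
result = compound_query(seed_melody, constraints, random.Random(0))
print(result)
```

The acceptance tests mirror the query of Figure 17; in the article's run, roughly 300 candidates were explored before such a test first succeeded.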

Table 5. The Progressive Increase in Brownness, Starting from an Initial Tonal Melody

Melody Versions      Brown
1: tonal             0.0
2: 1 More brown      0.0
3: 2 More brown      6.65 × 10⁻³⁵
4: 3 More brown      4.67 × 10⁻³³
5: 4 More brown      4.14 × 10⁻³²
6: 5 More brown      1.15 × 10⁻²⁹
7: 6 More brown      2.88 × 10⁻²⁸
8: 7 More brown      4.71 × 10⁻²²
9: 8 More brown      3.31 × 10⁻²⁰
10: 9 More brown     1.41 × 10⁻⁸
11: 10 More brown    9.95 × 10⁻¹
12: 11 More brown    1.0

chromosomes, which are independent of the feature sets used by the classifiers.

Description-based design attempts to bridge the gap between description and construction, thereby reducing the need for users to learn the technical languages of the objects they have in mind. This approach is well suited to domains in which the features to describe objects are known and accurate, and users have the capacity to express subjective judgments in a consistent way. This applies to most of the musical objects created in the context of computer-assisted composition. For instance, description-based design is currently being applied to other types of musical objects, in particular chords, chord sequences, and harmonized melodies.

Audio synthesis is also being investigated. For instance, programming FM sounds requires notoriously complex knowledge of FM synthesis. Several approaches have attempted to provide users with more subjective means of programming sound synthesizers (Rolland and Pachet 1996; Sarkar, Vercoe, and Yang 2007). But these approaches are always based on a fixed, pre-programmed representation of supposedly universal subjective judgments.


Figure 18. Various steps in making a tonal melody "more brown." Note that at (f), the algorithm finds no other solution to increase brownness than to remove a note, to later add another note back (h). At the last step (l), the only way to improve brownness (slightly) is to remove a note.

[Figure 18: twelve melody staves, panels (a)-(l).]

Description-based design allows users to express personal subjective judgments about sound textures and reuse these judgments to explore sound spaces in an intuitive and personal way. In the case of audio loops, our approach can benefit from two sets of technologies: the large corpus of studies in the domain of audio features, which yield efficient representations of audio objects, and the emerging technologies of concatenative sound synthesis (Schwartz 2006), which provide us with the variation generators needed for description-based design.

References

Addessi, A.-R., and F. Pachet. 2005. "Experiments with a Musical Machine: Musical Style Replication in 3/5 Year Old Children." British Journal of Music Education 22(1):21-46.
Alvarez, I., S. Bernard, and G. Deffuant. 2007. "Keep the Decision Tree and Estimate the Class Probabilities Using Its Decision Boundary." Proceedings of the 20th IJCAI. Rochester Hills, Michigan: International Joint Conferences on Artificial Intelligence, pp. 654-659.
Assayag, G., et al. 1999. "Computer Assisted Composition at IRCAM: PatchWork and OpenMusic." Computer Music Journal 23(3):59-72.
Biles, J. 1994. "GenJam: A Genetic Algorithm for Generating Jazz Solos." Proceedings of the 1994 International Computer Music Conference. San Francisco, California: International Computer Music Association, pp. 131-137.
Burges, C. J. C. 1998. "A Tutorial on Support Vector Machines for Pattern Recognition." Data Mining and Knowledge Discovery 2:121-167.
Hamanaka, M., K. Hirata, and S. Tojo. 2008. "Melody Morphing Method Based on GTTM." Proceedings of ISMIR 2008. Philadelphia, Pennsylvania: Morgan Kauffman, pp. 107-112.
Krumhansl, C. 1990. Cognitive Foundations of Musical Pitch. New York: Oxford University Press.
Lerdahl, F., and R. Jackendoff. 1983. A Generative Theory of Tonal Music. Cambridge, Massachusetts: MIT Press.
Letz, S., Y. Orlarey, and D. Fober. 1998. "The Role of Lambda-Abstraction in Elody." Proceedings of the 1998 International Computer Music Conference. San Francisco, California: International Computer Music Association, pp. 377-384.


Levenshtein, V. I. 1965. "Binary Codes Capable of Correcting Deletions, Insertions, and Reversals." Cybernetics and Control Theory 10(8):707-710.
Mandel, M., G. Poliner, and D. Ellis. 2006. "Support Vector Machine Active Learning for Music Retrieval." Multimedia Systems 12(1):3-13.
Pachet, F. 2004. "Beyond the Cybernetic Jam Fantasy: The Continuator." IEEE Computer Graphics and Applications 4(1):31-35.
Pachet, F. 2008. "The Future of Content Is in Ourselves." ACM Computers in Entertainment 6(3). Available at www.acm.org/pubs/cie/.
Quinlan, J. R. 1993. C4.5: Programs for Machine Learning. Los Altos, California: Morgan Kaufmann.
Rolland, P.-Y., and F. Pachet. 1996. "A Framework for Representing Knowledge about Synthesizer Programming." Computer Music Journal 20(3):47-58.
Sarkar, M., B. Vercoe, and Y. Yang. 2007. "Words that Describe Timbre: A Study of Auditory Perception Through Language." Paper presented at the Language and Music as Cognitive Systems Conference (LMCS-2007), Cambridge, UK, 11-13 May.
Schwartz, D. 2006. "Concatenative Synthesis: The Early Years." Journal of New Music Research 35(1):3-22.
Temperley, D. 2007. Music and Probability. Cambridge, Massachusetts: MIT Press.
Voss, R. F., and J. Clarke. 1978. "'1/f Noise' in Music: Music From 1/f Noise." Journal of the Acoustical Society of America 63(1):258-261.


