Heaps’ law and vocabulary richness in the history of classical music harmony
1 Centre de Recerca Matemàtica, Edifici C, Campus Bellaterra, 08193, Barcelona, Spain
2 Dolby Laboratories, Diagonal 177, P10, 08018, Barcelona, Spain
3 Departament de Matemàtiques, Universitat Autònoma de Barcelona, 08193, Barcelona, Spain
4 Barcelona Graduate School of Mathematics, 08193, Barcelona, Spain
5 Complexity Science Hub Vienna, Josefstädter Straße 39, 1080, Vienna, Austria
Accepted: 30 June 2021
Published online: 18 August 2021
Music is a fundamental human construct, and harmony provides the building blocks of musical language. Using the Kunstderfuge corpus of classical music, we analyze the historical evolution of the richness of the harmonic vocabulary of 76 classical composers, covering almost 6 centuries. This corpus comprises about 9500 pieces, resulting in more than 5 million tokens of music codewords. The fulfilment of Heaps’ law for the relation between the size of a composer’s harmonic vocabulary (in codeword types) and the total length of his works (in codeword tokens), with an exponent around 0.35, allows us to define a relative measure of vocabulary richness that has a transparent interpretation. Applied to this corpus, the measure allows us to quantify harmonic richness across centuries, unveiling a clear increasing linear trend. We are thus able to rank the composers by richness of vocabulary, as can also be done with other related metrics, such as entropy. We find that entropy is particularly highly correlated with our measure of richness. Our approach is not specific to music and can be applied to other systems built from tokens of different types, such as natural language.
Key words: Heaps’ law / Entropy / MIDI scores / Harmonic richness / Culturomics
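As a rough illustration of the relation referred to in the abstract, Heaps’ law is commonly written as V ≈ K L^α, with V the number of codeword types, L the number of codeword tokens, and α the exponent. The sketch below counts types and tokens in a sequence and estimates K and α by ordinary least squares on log-transformed values. The function names, the least-squares procedure, and the synthetic data are assumptions made for illustration only; they are not taken from the article and need not match the authors’ fitting method or their richness measure.

```python
import math


def types_and_tokens(codewords):
    """Return (V, L): number of distinct codewords and total number of codewords."""
    return len(set(codewords)), len(codewords)


def fit_heaps(points):
    """Fit log V = log K + alpha * log L by least squares.

    `points` is a list of (V, L) pairs, e.g. one pair per composer.
    Returns (K, alpha). This is an illustrative procedure, not
    necessarily the one used in the article.
    """
    xs = [math.log(L) for _, L in points]
    ys = [math.log(V) for V, _ in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    K = math.exp(mean_y - alpha * mean_x)
    return K, alpha


# Synthetic (V, L) pairs generated exactly from V = 10 * L**0.35.
# These are made-up numbers, not data from the corpus.
lengths = [1_000, 5_000, 20_000, 100_000, 400_000]
synthetic = [(10 * L ** 0.35, L) for L in lengths]

K_hat, alpha_hat = fit_heaps(synthetic)
print(f"K = {K_hat:.3f}, alpha = {alpha_hat:.3f}")
```

With real data, each (V, L) pair would come from calling `types_and_tokens` on one composer’s full sequence of codewords, and the scatter around the fitted line is what a relative richness measure would have to account for.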
© The Author(s) 2021
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.