In my last post, I wrote about using Burnside’s lemma to count the number of unique ways beads could be arranged if strung on a bracelet. After writing the post, I had two other questions: how many unique chords can be played using the Western 12-note scale? And, as someone who listens to a lot of music, is it likely that I have heard every possible chord that can be composed?
The question of how many unique chords could be played using the 12-note scale was fairly easy. I simply considered each chord to be a necklace (allowing rotations but not reflections) with 12 bead positions, one per note, so that rotating the necklace corresponds to transposing the chord. I considered notes that were played to be of one color, and notes that were not played to be of a different color.
using Combinatorics
function comb(n, k)
return factorial(n) / factorial(k) / factorial(n - k)
end
comb (generic function with 1 method)
n_orientations_0_1_11_12 = 1 ## there is only one way to play no notes, one note, all but one note, or all notes.
print("There are ", Int(n_orientations_0_1_11_12), " unique orientations in which 0, 1, 11, or 12 notes can be played")
## 2/10
n_orientations_2_10 = (comb(12, 2) + 6)/ 12 # at 6 o clock, there are 6 different fixed points
print(", ", Int(n_orientations_2_10), " unique orientations in which 2/10 notes can be played")
## 3 / 9
n_orientations_3_9 = (comb(12, 3) + 4*2)/ 12
print(", ", Int(n_orientations_3_9), " unique orientations in which 3/9 notes can be played")
## 4/8
n_orientations_4_8 = (comb(12, 4) +
comb(6, 2) + #rotation at half
3*2 #at 3 and 9 o clock there are 3 possible combinations
)/ 12
print(", ", Int(n_orientations_4_8), " unique orientations in which 4/8 notes can be played")
## 5/7
n_orientations_5_7 = (comb(12, 5))/ 12 ## no way for there to be any fixed points beyond the null
print(", ", Int(n_orientations_5_7), " unique orientations in which 5/7 notes can be played")
n_orientations_6 = (comb(12, 6) +
2 * 2 + #2, and 10 o clock both have 2 choices
comb(4, 2) * 2 + #4 and 8 o clock rotations can distribute 1/3 the beads (2) among 4 possible slots
comb(6, 3) # 6 o clock can distribute 3 beads among 6 possible slots
)/ 12
print(", and ", Int(n_orientations_6), " unique orientations in which 6 notes can be played")
println("")
println("In total, there are ", Int(n_orientations_0_1_11_12 * 4 + (
n_orientations_2_10
+ n_orientations_3_9
+ n_orientations_4_8
+ n_orientations_5_7) * 2 + n_orientations_6), " orientations in which any number of notes can be played ")
There are 1 unique orientations in which 0, 1, 11, or 12 notes can be played, 6 unique orientations in which 2/10 notes can be played, 19 unique orientations in which 3/9 notes can be played, 43 unique orientations in which 4/8 notes can be played, 66 unique orientations in which 5/7 notes can be played, and 80 unique orientations in which 6 notes can be played
In total, there are 352 orientations in which any number of notes can be played
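As a cross-check on the hand-counted fixed points above, here is a minimal sketch of the general Burnside count for two-color necklaces with a fixed number of played notes. The helper names `totient` and `necklaces_fixed_density` are made up for this illustration and do not come from any package; the sketch assumes rotations only, as in the calculation above.

```julia
# Euler's totient, written out by hand so the sketch needs no packages
totient(n::Int) = count(k -> gcd(k, n) == 1, 1:n)

# Number of rotation-distinct necklaces of length n with exactly k beads
# of one color. A rotation of order d fixes a coloring only if d divides
# both n and k, in which case it fixes binomial(n ÷ d, k ÷ d) colorings,
# and there are totient(d) rotations of that order.
function necklaces_fixed_density(n::Int, k::Int)
    g = gcd(n, k)
    total = sum(totient(d) * binomial(n ÷ d, k ÷ d) for d in 1:g if g % d == 0)
    return total ÷ n
end

counts = [necklaces_fixed_density(12, k) for k in 0:12]
println(counts)       # [1, 1, 6, 19, 43, 66, 80, 66, 43, 19, 6, 1, 1]
println(sum(counts))  # 352
```

The per-size counts and the total agree with the numbers printed above.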
My second question, “is it likely that I have heard every possible chord that can be composed?”, was more difficult to answer. Eventually I found the MusicNet dataset, which was originally designed for training musical ML models, but which had some labels that could be used to answer my question.
First, I’ll write a function to read the labels from the musicnet dataset into one dataframe:
using CSV
using DataFrames
"""
Read musicnet notes data from all songs into one dataframe
"""
function read_all_notes_data(labels_input_dir::String = "/Volumes/SanDisk/julia_testing/frequency_chords/musicnet",
metadata_input_dir::String = "/Volumes/SanDisk/julia_testing/frequency_chords/musicnet_metadata.csv")
dataframe_list = []
track_names = CSV.read(metadata_input_dir, DataFrame)
for labels_dir = ["train_labels", "test_labels"]
current_dir = labels_input_dir * "/" * labels_dir
for csv_file in readdir(current_dir)
if string(csv_file)[1] == '.'
continue
end
current_df = CSV.read(current_dir * "/" * csv_file, DataFrame)
insertcols!(current_df, "id" => parse(Int, csv_file[1:4]))
push!(dataframe_list, current_df)
end
end
full_dataframe = dataframe_list[1]
for i in 2:length(dataframe_list)
full_dataframe = vcat(full_dataframe, dataframe_list[i])
end
full_dataframe = leftjoin(full_dataframe, track_names, on = :id)
return full_dataframe
end
read_all_notes_data
Our resulting dataframe has a column “note”, which contains an integer corresponding to the note that was played at a given time. “start_time” and “end_time” tell us when the note started and stopped being played. A “chord” may be defined as a group of notes that all start at the same time. However, it could also be defined as a group of notes that are all being played at the same time. We’ll look at chord frequencies under both definitions.
However, before we can easily count up the frequencies of chords from this dataframe, we first need to turn a set of integer notes into a single “chord”. We want to make sure that a set of notes is transformed into the same chord regardless of whether the chord is transposed (i.e., a single integer is added to every note in the list) or modulated (i.e., a note in the chord is replaced with the same note from a different octave).
I’ll do this by defining a function get_aligned_chord that takes a set of notes and returns the chord in the same orientation every time, regardless of transpositions or modulations. I’ll need a couple of other helper functions to do this.
"""
Take a list of notes, and get the number of "unique" notes,
so notes in different octaves are considered the "same" note.
Also, account for transpositions of the same chord by making each returned note set start at 0
"""
function get_unique_notes(note_set::Vector{Int64})
unique_notes = unique(note_set .% 12)
return sort(unique_notes .- minimum(unique_notes))
end
"""
Get the "gaps" between each note and the next note in a set.
"""
function get_note_gaps(note_set::Vector{Int64})
## Get the gaps that happen after each note
vcat(note_set[2:length(note_set)], [12]) - note_set
end
"""
Score the gaps of a note list.
"""
function gap_score(gap_list::Vector{Int64})
score = 0
for gap in gap_list
score = score * 10 + gap
end
return score
end
"""
Get an aligned chord orientation corresponding to a note set.
This orientation will be the same even if a note set were transposed
(i.e., by adding an integer to each note), and it only contains notes in one octave starting at zero
"""
function get_aligned_chord(note_set::Vector{Int64})
note_set_unique = get_unique_notes(note_set)
possible_orientations = [get_unique_notes((note_set_unique .+ i)) for i in 1:12]
gaps_lists = [get_note_gaps(orientation) for orientation in possible_orientations]
gap_scores = [gap_score(gap_list) for gap_list in gaps_lists]
return string(possible_orientations[argmin(gap_scores)])
end
get_aligned_chord
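As a sanity check on the idea, here is a compact standalone sketch of the same kind of normalization. It is not the code used for the results in this post: `canonical_form` is a hypothetical helper that picks the lexicographically smallest rotation of the gap list directly instead of going through `gap_score`, and it assumes a non-empty list of integer note numbers.

```julia
# Hypothetical helper (not part of the post's pipeline): reduce a chord to
# pitch classes and return the rotation whose gap list is smallest.
function canonical_form(notes::Vector{Int})
    pcs = sort(unique(mod.(notes, 12)))   # fold all octaves together
    n = length(pcs)
    # gap after each pitch class, wrapping around the octave
    # (mod1 makes the gap of a single note 12 rather than 0)
    gaps = [mod1(pcs[mod1(i + 1, n)] - pcs[i], 12) for i in 1:n]
    # every rotation of the gap list is a transposition of the same chord
    rotations = [circshift(gaps, -s) for s in 0:n-1]
    best = minimum(rotations)             # vectors compare lexicographically
    # rebuild pitch classes starting at 0 from the chosen gaps
    return cumsum(vcat(0, best[1:end-1]))
end

println(canonical_form([60, 64, 67]))  # C major, root position: [0, 3, 8]
println(canonical_form([64, 67, 72]))  # same chord, C moved up an octave: [0, 3, 8]
println(canonical_form([62, 66, 69]))  # transposed up two semitones: [0, 3, 8]
println(canonical_form([60, 63, 67]))  # C minor: [0, 3, 7]
```

All three versions of the major triad collapse to the same form, while the minor triad stays distinct.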
And lastly, I’ll define a function that creates a frequency table dataframe from a list of chords. These chord frequencies will be the thing that we will eventually use to answer our question.
using DataFrames, FreqTables
function freqtable_as_df(notes_list::Vector{String}, song_id::Int)
temp_freqtable = freqtable(notes_list)
output_df = DataFrame(values = names(temp_freqtable)[1], frequencies = Vector(temp_freqtable))
output_df[!,:song_id] .= song_id
return output_df
end
Finally, we can look at the different chords that have appeared in pieces of music. We will take the notes, start times, and end times for each song in our dataset, and build chords in two ways: from notes that share the same start time, and from notes that are still sounding at the moment another note starts.
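To make the two chord definitions concrete, here is a toy sketch that uses plain named tuples instead of the dataset. The note values and times are invented for illustration, and `starting_at` and `sounding_at` are hypothetical helpers, not functions used elsewhere in this post.

```julia
# Toy notes (invented values, not taken from MusicNet)
toy_notes = [
    (note = 60, start_time = 0, end_time = 10),
    (note = 64, start_time = 0, end_time = 4),
    (note = 67, start_time = 5, end_time = 10),
]

# Definition 1: the chord is every note that starts at exactly time t
starting_at(notes, t) = [n.note for n in notes if n.start_time == t]

# Definition 2: the chord is every note that is sounding at time t
sounding_at(notes, t) = [n.note for n in notes if n.start_time <= t <= n.end_time]

println(starting_at(toy_notes, 5))  # [67]
println(sounding_at(toy_notes, 5))  # [60, 67]
```

At time 5 the first definition sees a single tone, while the second also picks up the note that has been held since time 0.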
using DataFrames
## load the labels for every song (uses the default paths defined above)
notes_df = read_all_notes_data()
chords_df = DataFrame()
chords_df_overlapping = DataFrame()
for song_id in unique(notes_df.id)
## defining a chord as a set of notes that all start at the same time
unique_chord_list = [
get_aligned_chord(notes_df[(notes_df.id .== song_id) .& (notes_df.start_time .== start_time),:note])
for start_time in
unique(notes_df[(notes_df.id .== song_id) ,:start_time])
]
chords_df = vcat(chords_df, freqtable_as_df(unique_chord_list, song_id))
## defining a chord as a set of notes that are all played at the same time, but which may not all start at the same time
unique_chord_list_overlapping = [
get_aligned_chord(
notes_df[(notes_df.id .== song_id) .&
(notes_df.start_time .<= start_time) .&
(notes_df.end_time .>= start_time),:note]
)
for start_time in
unique(notes_df[(notes_df.id .== song_id) ,:start_time])
]
chords_df_overlapping = vcat(chords_df_overlapping, freqtable_as_df(unique_chord_list_overlapping, song_id))
end
## helper for reloading saved results: re-align chord strings read back from CSV, which saves time when making the graphs
function transform_ids(df)
df[!,"old_chords"] = df.values
df.values = [
get_aligned_chord(
[parse(Int, replace(x, r"\D"=> "")) for x in split(chord_str, ",")]
)
for chord_str in df[!,"old_chords"]]
return df
end
transform_ids (generic function with 1 method)
using CSV
musicnet_metadata = CSV.read("/Volumes/SanDisk/julia_testing/frequency_chords/musicnet_metadata.csv", DataFrame)
chords_df = leftjoin(chords_df, musicnet_metadata, on = :song_id=>:id)
chords_df_overlapping = leftjoin(chords_df_overlapping, musicnet_metadata, on = :song_id=>:id)
values_dict = Dict()
values_dict[get_aligned_chord([0, 4, 10])] = "Augmented Sixth (Italian)"
values_dict[get_aligned_chord([0, 4, 6, 10])] = "Augmented Sixth (French)"
values_dict[get_aligned_chord([0, 4, 7, 10])] = "Augmented Sixth (German)"
values_dict[get_aligned_chord([0, 4, 7, 10, 2, 6])] = "Augmented Eleventh"
values_dict[get_aligned_chord([0, 4, 8, 11])] = "Augmented Major Seventh"
values_dict[get_aligned_chord([0, 4, 8, 10])] = "Augmented Seventh"
values_dict[get_aligned_chord([0, 3, 6, 11])] = "Diminished Major Seventh"
values_dict[get_aligned_chord([0, 3, 6, 9])] = "Diminished Seventh"
values_dict[get_aligned_chord([0, 4, 7])] = "Dominant"
values_dict[get_aligned_chord([0, 4, 7, 10, 2, 5])] = "Dominant Eleventh"
values_dict[get_aligned_chord([0, 4, 7, 10, 1])] = "Dominant Minor Ninth"
values_dict[get_aligned_chord([0, 4, 7, 10, 2])] = "Dominant Ninth"
values_dict[get_aligned_chord([0, 3, 7])] = "Dominant Parallel"
values_dict[get_aligned_chord([0, 4, 7, 10])] = "Dominant Seventh"
values_dict[get_aligned_chord([0, 4, 6, 10])] = "Dominant Seventh Flat Five"
values_dict[get_aligned_chord([0, 4, 7, 10, 3])] = "Dominant Seventh Sharp Nine"
values_dict[get_aligned_chord([0, 4, 7, 10, 2, 5, 9])] = "Dominant Thirteenth"
values_dict[get_aligned_chord([0, 5, 6, 7])] = "Dream"
values_dict[get_aligned_chord([0, 7, 9, 1, 4])] = "Elektra"
values_dict[get_aligned_chord([0, 8, 11, 4, 9])] = "Farben"
values_dict[get_aligned_chord([0, 3, 6, 10])] = "Half Diminished Seventh Chord"
values_dict[get_aligned_chord([0,4, 7, 10])] = "Harmonic Seventh Chord"
values_dict[get_aligned_chord([0,3, 6])] = "Leading Tone Triad"
values_dict[get_aligned_chord([0,4, 7, 11, 6])] = "Lydian"
values_dict[get_aligned_chord([0, 1, 5, 6, 10, 0, 3, 5])] = "Magic"
values_dict[get_aligned_chord([0, 4, 7, 11, 2, 5])] = "Major Eleventh"
values_dict[get_aligned_chord([0, 4, 8, 11, 6])] = "Major Seventh Sharp Eleventh"
values_dict[get_aligned_chord([0, 4, 7, 9])] = "Major Sixth"
values_dict[get_aligned_chord([0, 4, 7, 9, 2])] = "Major Sixth Ninth"
values_dict[get_aligned_chord([0, 4, 7, 11, 2])] = "Major Ninth"
values_dict[get_aligned_chord([0, 4, 7, 11, 2, 6, 9])] = "Major Thirteenth"
values_dict[get_aligned_chord([0, 3, 7])] = "Mediant"
values_dict[get_aligned_chord([0, 3, 7, 10, 2, 5])] = "Minor Eleventh"
values_dict[get_aligned_chord([0, 3, 7, 11])] = "Minor Major Seventh"
values_dict[get_aligned_chord([0, 3, 7, 10, 2])] = "Minor Ninth"
values_dict[get_aligned_chord([0, 3, 7, 9])] = "Minor Sixth"
values_dict[get_aligned_chord([0, 3, 7, 9, 2])] = "Minor Sixth Ninth"
values_dict[get_aligned_chord([0, 3, 7, 10, 2, 5, 9])] = "Minor Thirteenth Chord"
values_dict[get_aligned_chord([0, 6, 10, 4, 9, 2])] = "Mystic"
values_dict[get_aligned_chord([1, 5, 8])] = "Neapolitan"
values_dict[get_aligned_chord([0, 4, 8, 10, 2])] = "Ninth Augmented Fifth"
values_dict[get_aligned_chord([0, 4, 6, 10, 2])] = "Ninth Flat Fifth"
values_dict[get_aligned_chord([1, 2, 8, 0, 3, 6, 7, 10, 11, 4, 7])] = "Northern Lights"
values_dict[get_aligned_chord([0, 1, 4, 5, 8, 9])] = "Ode To Napoleon"
values_dict[get_aligned_chord([0, 1, 4, 6, 7, 10])] = "Petrushka"
values_dict[get_aligned_chord([0, 7])] = "Power Chord"
values_dict[get_aligned_chord([0, 3, 7])] = "Psalms Chord"
values_dict[get_aligned_chord([0, 4, 7])] = "Secondary Dominant"
values_dict[get_aligned_chord([0, 3, 6])] = "Secondary Leading Tone"
values_dict[get_aligned_chord([0, 3, 7])] = "Secondary Supertonic"
values_dict[get_aligned_chord([0, 4, 7, 9, 10])] = "Seven Six"
values_dict[get_aligned_chord([0, 5, 7, 10])] = "Seventh Suspension"
values_dict[get_aligned_chord([0, 5, 7, 10, 3])] = "So What"
values_dict[get_aligned_chord([0, 5, 7])] = "Suspended"
values_dict[get_aligned_chord([0, 4, 7])] = "Subdominant"
values_dict[get_aligned_chord([0, 3, 7])] = "Subdominant Parallel"
values_dict[get_aligned_chord([0, 4, 7])] = "Subtonic"
values_dict[get_aligned_chord([0, 3, 7])] = "Supertonic"
values_dict[get_aligned_chord([0, 4, 7])] = "Tonic"
values_dict[get_aligned_chord([0, 3, 7])] = "Tonic parallel"
values_dict[get_aligned_chord([0, 3, 6, 10])] = "Tristan"
values_dict[get_aligned_chord([0, 6, 7])] = "Viennese Trichord 2"
values_dict[get_aligned_chord([0, 1, 6])] = "Viennese Trichord"
values_dict[get_aligned_chord([0,3,5])] = "Blues trichord"
## These ones come last because their names seem more common
values_dict["[0]"] = "Single Tone"
values_dict["[0, 1]"] = "Minor Second"
values_dict["[0, 2]"] = "Major Second"
values_dict["[0, 3]"] = "Minor Third"
values_dict["[0, 4]"] = "Major Third"
values_dict["[0, 5]"] = "Fourth"
values_dict["[0, 6]"] = "Diminished Fifth"
values_dict[get_aligned_chord([0, 4, 9])] = "Minor Triad"
values_dict[get_aligned_chord([0, 5, 7])] = "Suspended 4"
values_dict[get_aligned_chord([0, 2, 7])] = "Suspended 2"
values_dict[get_aligned_chord([0, 4, 7])] = "Major Triad"
values_dict[get_aligned_chord([0, 4, 8])] = "Diminished Triad"
values_dict[get_aligned_chord([0, 5, 10])] = "Augmented Triad"
values_dict[get_aligned_chord([0, 4, 7, 11])] = "Major 7th"
values_dict[get_aligned_chord([0, 4, 7, 10])] = "Dominant 7th"
values_dict[get_aligned_chord([0, 3, 7, 10])] = "Minor 7th"
values_dict[get_aligned_chord([0, 3, 6, 9])] = "Diminished 7th"
values_dict[get_aligned_chord([0, 4, 7, 9])] = "Add 6"
"Add 6"
function most_common_chords(input_df)
aggregated_df = combine(groupby(input_df, :values), :frequencies => sum => :sum_freq)
aggregated_df[!,"proportion_of_chords"] = aggregated_df.sum_freq ./ sum(aggregated_df.sum_freq)
sort!(aggregated_df, :sum_freq, rev = true)
aggregated_df[!,"chord_name"] = [
if chord in keys(values_dict) values_dict[chord] else chord end
for chord in aggregated_df.values
]
return aggregated_df
end
most_common_chords (generic function with 1 method)
aggregated_df = most_common_chords(
chords_df_overlapping[chords_df_overlapping[!,"composer"] .== "Bach",:]
)
aggregated_df[1:10,:]
aggregated_df = most_common_chords(
chords_df_overlapping[chords_df_overlapping[!,"composer"] .== "Beethoven",:]
)
aggregated_df[1:10,:]
aggregated_df = most_common_chords(
chords_df_overlapping[chords_df_overlapping[!,"composer"] .== "Mozart",:]
)
aggregated_df[1:10,:]
chords_df_overlapping_thirds = chords_df_overlapping[
[count(",", notes_string) for notes_string in chords_df_overlapping.values] .== 2,:]
chords_df_overlapping_thirds = chords_df
for composer in unique(chords_df_overlapping[!,"composer"])
aggregated_df = most_common_chords(
chords_df_overlapping_thirds[
(chords_df_overlapping_thirds[!,"composer"] .== composer) ,:]
)
println(composer)
println(aggregated_df[1:10,:])
end
Schubert
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼──────────────────────────────────────────────────────────────────────
1 │ [0] 35420 0.553965 Single Tone
2 │ [0, 3, 8] 5512 0.0862072 Major Triad
3 │ [0, 3] 4705 0.0735858 Minor Third
4 │ [0, 4] 4301 0.0672672 Major Third
5 │ [0, 3, 7] 2358 0.0368789 Minor Triad
6 │ [0, 5] 2275 0.0355808 Fourth
7 │ [0, 2, 6, 9] 1934 0.0302476 Dominant 7th
8 │ [0, 2] 1152 0.0180172 Major Second
9 │ [0, 3, 6] 859 0.0134347 Secondary Leading Tone
10 │ [0, 2, 9] 783 0.012246 Blues trichord
Mozart
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼──────────────────────────────────────────────────────────────────────
1 │ [0] 17885 0.493693 Single Tone
2 │ [0, 3] 3926 0.108372 Minor Third
3 │ [0, 4] 3092 0.0853507 Major Third
4 │ [0, 3, 8] 2897 0.079968 Major Triad
5 │ [0, 5] 1256 0.0346703 Fourth
6 │ [0, 3, 7] 1137 0.0313854 Minor Triad
7 │ [0, 2, 6, 9] 817 0.0225522 Dominant 7th
8 │ [0, 3, 6] 809 0.0223314 Secondary Leading Tone
9 │ [0, 2] 593 0.016369 Major Second
10 │ [0, 2, 9] 506 0.0139675 Blues trichord
Dvorak
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼──────────────────────────────────────────────────────────────────────
1 │ [0] 6203 0.44565 Single Tone
2 │ [0, 3] 1626 0.116819 Minor Third
3 │ [0, 3, 8] 1355 0.0973489 Major Triad
4 │ [0, 4] 1220 0.08765 Major Third
5 │ [0, 3, 7] 773 0.0555356 Minor Triad
6 │ [0, 5] 664 0.0477046 Fourth
7 │ [0, 3, 6] 278 0.0199727 Secondary Leading Tone
8 │ [0, 2] 253 0.0181766 Major Second
9 │ [0, 2, 9] 235 0.0168834 Blues trichord
10 │ [0, 2, 6, 9] 209 0.0150154 Dominant 7th
Cambini
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼──────────────────────────────────────────────────────────────────────
1 │ [0] 5537 0.48337 Single Tone
2 │ [0, 3] 1432 0.125011 Minor Third
3 │ [0, 4] 1337 0.116718 Major Third
4 │ [0, 3, 8] 906 0.0790921 Major Triad
5 │ [0, 5] 497 0.0433872 Fourth
6 │ [0, 3, 7] 347 0.0302924 Minor Triad
7 │ [0, 2] 232 0.0202532 Major Second
8 │ [0, 3, 6] 161 0.014055 Secondary Leading Tone
9 │ [0, 2, 6, 9] 137 0.0119598 Dominant 7th
10 │ [0, 2, 9] 118 0.0103012 Blues trichord
Haydn
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼─────────────────────────────────────────────────────────────────────────
1 │ [0] 1401 0.482438 Single Tone
2 │ [0, 4] 297 0.102273 Major Third
3 │ [0, 3, 8] 286 0.0984848 Major Triad
4 │ [0, 3] 273 0.0940083 Minor Third
5 │ [0, 3, 6] 118 0.0406336 Secondary Leading Tone
6 │ [0, 3, 7] 101 0.0347796 Minor Triad
7 │ [0, 2, 6, 9] 87 0.0299587 Dominant 7th
8 │ [0, 5] 54 0.018595 Fourth
9 │ [0, 2, 6] 44 0.0151515 Augmented Sixth (Italian)
10 │ [0, 2, 9] 41 0.0141185 Blues trichord
Brahms
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼──────────────────────────────────────────────────────────────────────
1 │ [0] 24311 0.500329 Single Tone
2 │ [0, 3] 4975 0.102387 Minor Third
3 │ [0, 3, 8] 3901 0.080284 Major Triad
4 │ [0, 4] 3848 0.0791932 Major Third
5 │ [0, 5] 2273 0.0467792 Fourth
6 │ [0, 3, 7] 1964 0.0404198 Minor Triad
7 │ [0, 3, 6] 1075 0.0221239 Secondary Leading Tone
8 │ [0, 2] 894 0.0183988 Major Second
9 │ [0, 2, 6, 9] 816 0.0167936 Dominant 7th
10 │ [0, 2, 9] 582 0.0119778 Blues trichord
Faure
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼───────────────────────────────────────────────────────────────────
1 │ [0] 6939 0.623899 Single Tone
2 │ [0, 3] 676 0.0607804 Minor Third
3 │ [0, 4] 668 0.0600611 Major Third
4 │ [0, 5] 666 0.0598813 Fourth
5 │ [0, 2] 343 0.0308398 Major Second
6 │ [0, 3, 8] 342 0.0307499 Major Triad
7 │ [0, 3, 7] 225 0.0202302 Minor Triad
8 │ [0, 6] 175 0.0157346 Diminished Fifth
9 │ [0, 1] 129 0.0115986 Minor Second
10 │ [0, 3, 6] 112 0.0100701 Secondary Leading Tone
Ravel
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼──────────────────────────────────────────────────────────────────────
1 │ [0] 4579 0.455124 Single Tone
2 │ [0, 5] 802 0.0797137 Fourth
3 │ [0, 4] 750 0.0745453 Major Third
4 │ [0, 3] 639 0.0635126 Minor Third
5 │ [0, 3, 8] 526 0.0522811 Major Triad
6 │ [0, 3, 7] 417 0.0414472 Minor Triad
7 │ [0, 6] 417 0.0414472 Diminished Fifth
8 │ [0, 2] 238 0.0236557 Major Second
9 │ [0, 3, 6] 206 0.0204751 Secondary Leading Tone
10 │ [0, 2, 6] 142 0.0141139 Augmented Sixth (Italian)
Bach
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼───────────────────────────────────────────────────────────────────
1 │ [0] 40158 0.807228 Single Tone
2 │ [0, 3] 3430 0.0689475 Minor Third
3 │ [0, 4] 2234 0.0449063 Major Third
4 │ [0, 5] 1073 0.0215687 Fourth
5 │ [0, 2] 535 0.0107542 Major Second
6 │ [0, 3, 8] 486 0.00976924 Major Triad
7 │ [0, 3, 7] 416 0.00836215 Minor Triad
8 │ [0, 3, 6] 323 0.00649272 Secondary Leading Tone
9 │ [0, 6] 276 0.00554796 Diminished Fifth
10 │ [0, 1] 180 0.00361824 Minor Second
Beethoven
10×4 DataFrame
 Row │ values  sum_freq  proportion_of_chords  chord_name
     │ String  Int64     Float64               String
─────┼──────────────────────────────────────────────────────────────────────
1 │ [0] 170155 0.587471 Single Tone
2 │ [0, 3] 25082 0.0865972 Minor Third
3 │ [0, 4] 20964 0.0723795 Major Third
4 │ [0, 3, 8] 17302 0.0597362 Major Triad
5 │ [0, 5] 10996 0.0379644 Fourth
6 │ [0, 3, 7] 7475 0.0258079 Minor Triad
7 │ [0, 2] 5763 0.0198971 Major Second
8 │ [0, 2, 6, 9] 4954 0.017104 Dominant 7th
9 │ [0, 3, 6] 4766 0.0164549 Secondary Leading Tone
10 │ [0, 2, 9] 3054 0.0105441 Blues trichord
chords_df_overlapping_thirds = chords_df_overlapping[
[count(",", notes_string) for notes_string in chords_df_overlapping.values] .== 2,:]
#chords_df_overlapping_thirds = chords_df_overlapping
proportion_major_list = []
proportion_minor_list = []
proportion_major_of_all_list = []
proportion_minor_of_all_list = []
composer_list = []
song_id_list = []
for composer in unique(chords_df_overlapping[!,"composer"])
for song_id in unique(chords_df_overlapping[chords_df_overlapping[!,"composer"] .== composer, "song_id"])
aggregated_df = most_common_chords(
chords_df_overlapping_thirds[
(chords_df_overlapping_thirds[!,"composer"] .== composer) .&
(chords_df_overlapping_thirds[!,"song_id"] .== song_id),:]
)
if ("Major Triad" in aggregated_df.chord_name ) & ("Minor Triad" in aggregated_df.chord_name )
major_triad_prop = aggregated_df[(aggregated_df.chord_name .== "Major Triad"), "proportion_of_chords"][1]
minor_triad_prop = aggregated_df[(aggregated_df.chord_name .== "Minor Triad"), "proportion_of_chords"][1]
push!(proportion_major_list, major_triad_prop / minor_triad_prop)
push!(proportion_major_of_all_list, major_triad_prop)
push!(proportion_minor_of_all_list, minor_triad_prop)
push!(proportion_minor_list, minor_triad_prop/ major_triad_prop )
push!(composer_list, composer)
push!(song_id_list, song_id)
end
end
end
prop_major_df = DataFrame(proportion_major = proportion_major_list,
composer = composer_list,
song_id = song_id_list,
proportion_major_of_all = proportion_major_of_all_list,
proportion_minor_of_all = proportion_minor_of_all_list,
proportion_minor = proportion_minor_list)
using Gadfly
plot(prop_major_df, x = :composer, y = :proportion_major,
Stat.x_jitter(range=0.5),
Geom.point#, Scale.y_log10
)
using Gadfly
plot(prop_major_df, x = :composer, y = :proportion_minor,
Stat.x_jitter(range=0.5),
Geom.point#, Scale.y_log10
)
using Gadfly
plot(prop_major_df, x = :composer, y = :proportion_major_of_all,
Guide.xlabel( "Composer"),
Guide.ylabel( "Proportion Major Triads"),
Guide.title( "Proportion of all chords in songs that are major triads"),
#alpha = [.3 for x in prop_major_df.composer],
#Stat.x_jitter(range=0.5),
#Geom.point,
Geom.violin,
)
using Gadfly
plot(prop_major_df, x = :composer, y = :proportion_minor_of_all,
Guide.xlabel( "Composer"),
Guide.ylabel( "Proportion Minor Triads"),
Guide.title( "Proportion of all chords in songs that are minor triads"),
#alpha = [.3 for x in prop_major_df.composer],
#Stat.x_jitter(range=0.5),
#Geom.point,
Geom.violin,
)
plot(prop_major_df[prop_major_df.composer .== "Bach",:], x = :proportion_major, Geom.histogram)
sum(prop_major_df[prop_major_df.composer .== "Bach","proportion_major"] .< 1)
22
println(prop_major_df[prop_major_df.proportion_major .< 0.5,:])
8×3 DataFrame
 Row │ proportion_major  composer  song_id
     │ Any               Any       Any
─────┼──────────────────────────────────────
1 │ 0.361111 Bach 2208
2 │ 0.232558 Bach 2209
3 │ 0.448276 Bach 2230
4 │ 0.318182 Bach 2282
5 │ 0.333333 Bach 2304
6 │ 0.375 Bach 2305
7 │ 0.448617 Beethoven 2390
8 │ 0.44227 Beethoven 2393
println(prop_major_df[prop_major_df.proportion_major .> 20,:])
most_common_chords(
chords_df_overlapping_thirds[
## use in.() so the filter also works if more than one song passes the threshold
in.(chords_df_overlapping_thirds[!,"song_id"], Ref(prop_major_df[prop_major_df.proportion_major .> 20,:song_id])),:]
)
1×3 DataFrame
 Row │ proportion_major  composer  song_id
     │ Any               Any       Any
─────┼──────────────────────────────────────
1 │ 25.0 Beethoven 2538
sum(prop_major_df[prop_major_df.composer .== "Bach","proportion_major"] .> 1)
26
unique(chords_df_overlapping[!,"composer"])
10-element Vector{Union{Missing, String15}}:
"Schubert"
"Mozart"
"Dvorak"
"Cambini"
"Haydn"
"Brahms"
"Faure"
"Ravel"
"Bach"
"Beethoven"
most_common_chords(chords_df)
using CSV
chords_df_overlapping = CSV.read("chords_overlapping_df.csv", DataFrame)
chords_df = CSV.read("chords_df.csv", DataFrame)
chords_df_overlapping = transform_ids(chords_df_overlapping)
chords_df = transform_ids(chords_df)