Last Hottest 100 Revision

This is my last revised list for the Hottest 100.

This list is generated using a more sophisticated method (yet again) after a discussion over Twitter I had with @chrisjrn about the kinds of bias there might be in the sample.

I refined the list with three techniques.

Firstly, I addressed a line-wrap problem with the images. Because of the way the votes are displayed when you finish entering them, songs wrap onto a second line if they’re longer than about 30-40 characters. That unfairly penalised long song/artist names, because the matching algorithm uses line breaks as a delimiter (it needs some sort of stop character to know when to stop counting text as a possible song name to match). I wrote some code to “unwrap” these lines most of the time, though some will still be left wrapped, because the unwrapping itself relies on character recognition to get part of the way there.
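A minimal sketch of one way the unwrapping could work, assuming a list of known song titles is available to match against. This is an illustration rather than the code behind these results: KNOWN_SONGS, best_score, unwrap_lines and the 0.8 threshold are all hypothetical, and difflib's SequenceMatcher stands in for whichever similarity measure the real program uses.

```python
from difflib import SequenceMatcher

# Hypothetical catalogue of "Artist - Song" strings to match against.
KNOWN_SONGS = [
    "Arctic Monkeys - Why'd You Only Call Me When You're High?",
    "Vance Joy - Riptide",
    "Lorde - Royals",
]


def best_score(text, songs=KNOWN_SONGS):
    """Return the highest similarity between text and any known song."""
    return max(
        SequenceMatcher(None, text.lower(), song.lower()).ratio()
        for song in songs
    )


def unwrap_lines(lines, songs=KNOWN_SONGS, threshold=0.8):
    """Merge a line with the next one when the joined text matches a known
    song well, and better than the first line does on its own."""
    merged = []
    i = 0
    while i < len(lines):
        current = lines[i]
        if i + 1 < len(lines):
            joined = current + " " + lines[i + 1]
            joined_score = best_score(joined, songs)
            if joined_score >= threshold and joined_score > best_score(current, songs):
                merged.append(joined)
                i += 2  # both halves of the wrapped line are consumed
                continue
        merged.append(current)
        i += 1
    return merged
```

Because the decision to join two lines depends on how well the OCR read them, a badly garbled line can fail the threshold and stay wrapped, which is the limitation described above.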

Secondly, I moved to only counting “complete” ballots. In the list of images, some of the song/artist titles get munged as part of the OCR process. In my previous code, I was counting any song I could unmunge enough to find a match, but that meant I was discarding songs semi-randomly.

But songs aren’t independent of each other: a person’s taste in music isn’t completely random, so their choices of songs won’t be random, and the songs on each “ballot” will have some similarity. If I only count complete ballots, the sample is biased towards ballots that are readable as a whole, rather than against individual song names that happen to be harder to OCR than others.

The new code counts a ballot as “complete” if the program can match between 8 and 10 of its songs. The rest are marked for human intervention and could be fixed up by hand if I wanted to, which I don’t.
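The difference between the two counting methods can be sketched as follows. This is a simplified illustration, assuming each ballot has already been reduced to the list of songs the matcher recognised; the function name tally and the data layout are hypothetical.

```python
from collections import Counter

# From the post: a "complete" ballot is one with 8 to 10 matched songs.
MIN_MATCHED = 8
MAX_MATCHED = 10


def tally(ballots):
    """Count votes two ways.

    ballots is a list of lists; each inner list holds the song names
    matched on one ballot image (unmatched lines are simply absent).
    """
    ballot_votes = Counter()  # "ballot only": complete ballots only
    simple_votes = Counter()  # "all matched songs": every match counts
    needs_review = []         # indices of ballots left for a human to fix

    for index, matched in enumerate(ballots):
        simple_votes.update(matched)
        if MIN_MATCHED <= len(matched) <= MAX_MATCHED:
            ballot_votes.update(matched)
        else:
            needs_review.append(index)

    return ballot_votes, simple_votes, needs_review
```

In this sketch a song's ballot count can never exceed its simple count. A few rows in the table below do not follow that pattern, so the real programs evidently differ in more than this one filter.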

Finally, I swapped the locality-sensitive hash for a much faster matching method: a Levenshtein distance ratio. This is an edit-distance measure of string similarity rather than a hash, and the Python implementation here is much, much faster than the pure-Python Nilsimsa hash I was using. Each run now completes in 19 seconds instead of about 20 minutes.
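For anyone unfamiliar with the measure, here is a small pure-Python sketch of Levenshtein distance and one way to turn it into a 0 to 1 similarity score. It is illustrative only: the normalisation (one minus distance over the longer length) is an assumption and may not be the formula the library uses, the helper names and the 0.8 threshold are hypothetical, and a pure-Python loop like this will be far slower than a compiled implementation.

```python
def levenshtein(a, b):
    """Edit distance: the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the distance to 0..1, where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(ocr_text, songs, threshold=0.8):
    """Return the closest known song, or None if nothing is close enough."""
    score, song = max(
        (similarity(ocr_text.lower(), s.lower()), s) for s in songs
    )
    return song if score >= threshold else None
```

A lightly munged OCR line such as "Vance J0y - Riptlde" is two substitutions away from the real title, so it still clears the threshold, while unrelated text does not.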

I came across the idea of a Levenshtein distance via the PyPI page for Fuzzy while looking into Soundex implementations, which another colleague of mine had suggested in passing earlier today.

I ran both the “ballot only” and the “all matched songs” matching programs, and the results are beyond the click-through. They’re remarkably close, but there are a couple of significant differences that suggest to me that the ballot method will prove a better predictor.

We’ll find out in a little over 24 hours. I’m looking forward to it!

Here are the rankings for both methods in one table for a side-by-side comparison.

Ballot Rank Ballot Votes Simple Rank Simple Votes Song
1 368 1 405 Vance Joy – Riptide
2 325 2 369 Arctic Monkeys – Do I Wanna Know?
3 280 3 313 Flume & Chet Faker – Drop The Game
4 275 4 300 Violent Soho – Covered In Chrome
5 211 5 239 Preatures, The – Is This How You Feel?
6 197 6 230 Arctic Monkeys – Why’d You Only Call Me When You’re High?
7 188 7 207 James Blake – Retrograde
8 177 8 203 Rufus – Take Me
9 170 9 187 Lorde – Royals
10 165 10 185 Kite String Tangle, The – Given The Chance
11 162 11 181 Kanye West – Black Skinhead
12 162 12 178 Foals – My Number
13 161 13 174 London Grammar – Strong
14 144 14 166 Daft Punk – Get Lucky
15 142 16 163 Matt Corby – Resolution
16 141 17 161 Thundamentals – Smiles Don’t Lie
17 138 15 164 Rufus – Desert Night
18 132 18 150 Touch Sensitive – Pizza Guy
19 128 19 150 Arctic Monkeys – Arabella
20 126 20 140 Haim – The Wire
21 117 23 128 Grouplove – Ways To Go
22 117 21 133 Disclosure – When A Fire Starts To Burn
23 115 25 119 Chvrches – Gun
24 114 22 129 Lana Del Rey – Young And Beautiful
25 109 27 117 Flight Facilities – Stand Still {ft. Micky Green}
26 107 29 114 Wombats, The – Your Body Is A Weapon
27 106 32 114 Bloc Party – Ratchet
28 105 28 115 Lorde – Team
29 104 24 127 Vampire Weekend – Step
30 104 31 114 Chvrches – Recover
31 102 30 114 Lorde – Tennis Court
32 102 26 118 Childish Gambino – 3005
33 101 35 104 Vampire Weekend – Diane Young
34 99 33 113 Cloud Control – Scar
35 98 36 104 Haim – Falling
36 97 39 101 Goldroom – Embrace
37 91 40 98 Sticky Fingers – Australia Street
38 91 37 104 Amity Affliction, The – Born To Die
39 90 38 101 Safia – Listen To Soul, Listen To Blues
40 90 34 107 Arcade Fire – Reflektor
41 84 41 91 Dustin Tebbutt – The Breach
42 83 44 87 Max Frost – White Lies
43 81 46 84 Chet Faker – Melt {ft. Kilo Kish}
44 80 42 90 Panama – Always
45 76 43 88 Kanye West – Bound 2
46 75 50 79 Robert Delong – Global Concepts
47 75 47 82 Daft Punk – Lose Yourself To Dance
48 73 45 86 Remi – Sangria
49 70 49 79 San Cisco – Get Lucky {like A Version}
50 70 51 77 Kingswood – Ohio
51 70 52 77 Andy Bull – Keep On Running
52 69 57 71 National, The – Graceless
53 69 48 80 Disclosure – You & Me {flume Remix}
54 67 58 70 Rudimental – Free {ft. Emeli Sande}
55 66 56 72 Illy – Youngbloods {ft. Ahren Stringer}
56 65 59 69 London Grammar – Hey Now
57 65 54 75 Empire Of The Sun – Alive
58 65 147 29 Dillon Francis – Without You {ft. Totally Enormous Extinct Dinosaurs}
59 65 53 76 Big Scary – Luck Now
60 64 61 67 Mikhael Paskalev – I Spy
61 61 60 68 St Lucia – Elevate
62 61 63 64 Ms Mr – Fantasy
63 58 55 74 London Grammar – Wasting My Young Years
64 58 62 66 Daft Punk – Instant Crush
65 57 70 59 British India – Summer Forgive Me
66 55 71 58 Illy – Ausmusic Month Medley {like A Version}
67 54 66 60 Cloud Control – Dojo Rising
68 54 65 62 Arcade Fire – Afterlife
69 53 68 59 Vampire Weekend – Unbelievers
70 53 74 56 Haim – If I Could Change Your Mind
71 53 67 60 Bring Me The Horizon – Sleepwalking
72 52 64 64 Daft Punk – Doin’ It Right
73 52 72 57 Cold War Kids – Miracle Mile
74 51 75 56 Dune Rats – Red Light, Green Light
75 50 69 59 Rudimental – Waiting All Night {ft. Ella Eyre}
76 50 76 56 Danny Brown – Dip
77 50 78 55 Chvrches – Lies
78 50 79 55 Boy & Bear – Harlequin Dream
79 49 82 53 Two Door Cinema Club – Changing Of The Seasons
80 49 77 55 Jake Bugg – What Doesn’t Kill You
81 49 86 50 Horrorshow – Dead Star Shine
82 49 83 53 Disclosure – White Noise {ft. Alunageorge}
83 48 85 51 Fidlar – No Waves
84 48 163 26 A$ap Rocky – F**kin’ Problems {ft. Drake, 2 Chainz & Kendrick Lamar}
85 47 107 40 Queens Of The Stone Age – If I Had A Tail
86 47 84 51 King Krule – Easy Easy
87 47 73 56 Kanye West – Blood On The Leaves
88 47 80 55 Andy Bull – Baby I Am Nobody Now
89 46 90 48 Daughter – Youth
90 46 89 49 Bring Me The Horizon – Shadow Moses
91 44 87 50 Boy & Bear – Southern Sun
92 43 88 49 Rufus – Tonight
93 43 91 47 Jackie Onassis – Smoke Trails
94 40 94 46 Mr Little Jeans – Oh Sailor
95 40 97 44 Jungle Giants, The – Skin To Bone
96 40 92 47 Disclosure – F For You
97 40 95 45 Cloud Control – Promises
98 39 98 44 Josh Pyke – Leeward Side
99 39 99 44 A Day To Remember – Right Back At It Again
100 38 103 42 Broods – Bridges
101 37 105 40 Violent Soho – In The Aisle
102 37 106 40 Smith Street Band, The – Ducks Fly Together
103 37 108 40 Jungle Giants, The – I Am What You Want Me To Be
104 37 115 38 Frightened Rabbit – The Woodpile
105 37 81 54 Disclosure – Help Me Lose My Mind {ft. London Grammar}
106 37 225 18 Busta Rhymes – Thank You {ft. Q-tip, Kanye West & Lil Wayne}
107 37 96 45 Bliss N Eso – Act Your Age
108 36 102 42 Northlane – Quantum Flux
109 36 101 43 Blink-182 – When I Was Young
110 36 111 40 Alunageorge – Attracting Flies
111 36 104 41 Allday – Claude Monet
112 35 114 39 Flux Pavilion – Do Or Die {ft. Childish Gambino}
113 35 100 43 Fatboy Slim & Riva Starr – Eat, Sleep, Rave, Repeat
114 35 121 36 Born Ruffians – Needle
115 34 123 34 Duke Dumont – Need U (100%) {ft. A*m*e}
116 34 122 36 Birds Of Tokyo – Lanterns
117 33 119 36 Meg Mac – Every Lie
118 33 113 39 John Butler Trio – Only One
119 32 112 39 Ta-ku – I Miss You
120 32 93 46 Something For Kate – Sweet Nothing {like A Version}
121 32 151 28 Major Lazer – Jessica {ft. Ezra Koenig Of Vampire Weekend}
122 32 118 37 Kanye West – New Slaves
123 32 116 38 Bring Me The Horizon – Can You Feel My Heart