And now for the dramatic conclusion to The Great MP3 Bitrate Experiment you've all been waiting for! The actual bitrates of each audio sample are revealed below, along with how many times each was clicked per the goo.gl URL shortener stats between Thursday, June 21st and Tuesday, June 26th.
| Sample | Encoding | Clicks |
| Limburger | ~160kbps VBR | 10,265 |
| Cheddar | 320kbps CBR | 7,183 |
| Gouda | raw CD | 6,159 |
| Brie | ~192kbps VBR | 5,508 |
| Feta | 128kbps CBR | 5,567 |
During that six day period, my overall Amazon CloudFront and S3 bill for these downloaded audio samples was $103.72 for 800 GB of data, across 200k requests.
Based on the raw click stats, it looks like a bunch of folks clicked on the first and second files, then lost interest. Probably because of, y'know, Starship. Still, it's encouraging to note that the last two files were both clicked about 5.5k times by those who toughed their way out to the very end. Of those listeners, 3,512 went on to contribute results. Not bad at all! I mean, considering I made everyone listen to what some people consider to be one of the worst "rock" songs of all time. You guys are troopers, taking one in the ear for the team in the name of science. That's what I admire about you.
I belatedly realized after creating this experiment that there was an easy way to cheat. Simply compress all the samples with FLAC, then sort by filesize.
10,836,505 We+Built+This+City+-+Excerpt+(Feta).flac
11,054,288 We+Built+This+City+-+Excerpt+(Limburger).flac
11,294,757 We+Built+This+City+-+Excerpt+(Brie).flac
11,731,999 We+Built+This+City+-+Excerpt+(Cheddar).flac
11,816,415 We+Built+This+City+-+Excerpt+(Gouda).flac
The higher the bitrate, apparently, the less compressible the audio files are with lossless FLAC compression. It's a small difference in absolute file size, but it's enough to sort exactly with quality. At least you can independently verify that I wasn't tricking anyone in this experiment; each sample was indeed different, and the bitrates are what I said they were.
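The cheat is easy to check from the numbers alone. Here's a minimal Python sketch, assuming the five FLAC sizes listed above and the bitrates revealed in this post. The 1,411 kbps figure for raw CD audio is the standard 44.1 kHz x 16-bit x 2-channel rate, not something measured in this experiment, and the variable names are just for illustration.

```python
# FLAC file sizes in bytes, copied from the listing above.
flac_sizes = {
    "Feta": 10_836_505,
    "Limburger": 11_054_288,
    "Brie": 11_294_757,
    "Cheddar": 11_731_999,
    "Gouda": 11_816_415,
}

# Nominal bitrates in kbps as revealed in the post. Raw CD audio is
# 44,100 Hz * 16 bits * 2 channels = 1,411.2 kbps (assumed, not measured).
bitrates_kbps = {
    "Feta": 128,
    "Limburger": 160,
    "Brie": 192,
    "Cheddar": 320,
    "Gouda": 1411,
}

# Sort sample names by FLAC size and by bitrate, smallest first.
by_size = sorted(flac_sizes, key=flac_sizes.get)
by_bitrate = sorted(bitrates_kbps, key=bitrates_kbps.get)

print("by FLAC size:", by_size)
print("by bitrate:  ", by_bitrate)
print("same order:  ", by_size == by_bitrate)
```

The two orderings agree, which is all the cheat needs: nobody has to know the actual bitrates, only that a bigger FLAC file means a higher-quality source.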
But you guys and gals wouldn't do that, because you aren't dirty, filthy cheaters, right? Of course not. Let's go over the actual results. Remember each sample was ranked in a simple web form from 1 to 5, where 1 is worst quality, and 5 is highest quality.
The summary statistics for the 3,512 data points:
| Encoding | Sample | Avg | Std dev |
| 160kbps VBR | (Limburger) | 3.49 | 1.38 |
| 320kbps CBR | (Cheddar) | 3.30 | 1.34 |
| raw CD audio | (Gouda) | 3.34 | 1.26 |
| 192kbps VBR | (Brie) | 3.27 | 1.29 |
| 128kbps CBR | (Feta) | 2.95 | 1.40 |
(If you'd like to perform more detailed statistical analysis, download the Excel 2010 spreadsheet with all the data and have at it.)
Even without busting out hard-core statistics, I think it's clear from the basic summary statistics graph that only one audio sample here was discernibly different from the rest – the 128kbps CBR. And by different I mean "audibly worse". I've maintained for a long, long time that typical 128kbps MP3s are not acceptable quality. Even for the worst song ever. So I guess we can consider this yet another blind listening test proving that point. Give us VBR at an average bitrate higher than 128kbps, or give us death!
But what about the claim that people with dog ears can hear the difference between the higher bitrate MP3 samples? Well, first off, it's incredibly strange that the first sample – encoded at a mere 160kbps – does better on average than everything else. I think it's got to be bias from appearing first in the list of audio samples. It's kind of an outlier here for no good reason, so we have to almost throw it out. More fuel for the argument that people can't hear a difference at bitrates above 128kbps, and even if they do, they're probably imagining it. If we didn't throw out this result, we'd have to conclude that the 160kbps sample was somehow superior to the raw CD audio, which is … clearly insane.
Running T-Test and Analysis of Variance (it's in the spreadsheet) on the non-insane results, I can confirm that the 128kbps CBR sample is lower quality with an extremely high degree of statistical confidence. Beyond that, as you'd expect, nobody can hear the difference between a 320kbps CBR audio file and the CD. And the 192kbps VBR results have a barely statistically significant difference versus the raw CD audio at the 95% confidence level. I'm talking absolutely wafer thin here.
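For a rough sanity check that doesn't need the spreadsheet, here's a sketch that works from the summary table alone. It assumes 3,512 ratings per sample and treats the samples as independent groups, which they aren't (every listener rated all five), so take it as an approximation and not as the test in the spreadsheet. The means are also rounded to two decimals, and the function name is mine.

```python
import math

N = 3512  # responses per sample, as reported above

# (mean, standard deviation) from the summary table above.
summary = {
    "Limburger": (3.49, 1.38),
    "Cheddar": (3.30, 1.34),
    "Gouda": (3.34, 1.26),
    "Brie": (3.27, 1.29),
    "Feta": (2.95, 1.40),
}


def z_vs_reference(name, reference="Gouda", n=N):
    """Unpaired z statistic and two-sided p-value (normal approximation)."""
    mean_a, sd_a = summary[name]
    mean_b, sd_b = summary[reference]
    standard_error = math.sqrt(sd_a**2 / n + sd_b**2 / n)
    z = (mean_a - mean_b) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal
    return z, p


for name in summary:
    if name != "Gouda":
        z, p = z_vs_reference(name)
        print(f"{name:10s} vs Gouda: z = {z:+.2f}, p = {p:.3g}")
```

Under those assumptions, |z| above roughly 1.96 corresponds to the 95% confidence level. The 128kbps sample lands far beyond it, the 320kbps sample stays inside it, and the 192kbps sample only just clears it.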
Anyway, between the anomalous 160kbps result and the blink-and-you'll-miss-it statistical difference between the 192kbps result and the raw CD audio, I'm comfortable calling this one as I originally saw it. The data from this experiment confirms what I thought all along: for pure listening, the LAME defaults of 192kbps variable bit rate encoding do indeed provide a safe, optimal aural bang for the byte – even dogs won't be able to hear the difference between 192kbps VBR MP3 tracks and the original CD.
Very interesting. Good experiment. For your next one, you can eliminate the advantage of the first and second links by randomizing the order each time you serve the page.
Jpotisch on June 27, 2012 11:39 AM
If you account for the fact that you did 5 comparisons instead of 1, you may find that the anomalous results suddenly become statistically insignificant, à la http://xkcd.com/882/. (It's possible your spreadsheet accounted for that; I didn't check.)
If you're interested in these kinds of things, check out HydrogenAudio and the Listening Tests forum there. They run these kinds of public blind listening tests regularly, and the result you got certainly isn't a surprise. (There are also some tools like ABC/HR that make running them much easier.)
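To make the multiple-comparisons point concrete, here's a small sketch using a Bonferroni correction. That is one simple correction among several, and nothing here says it's what the spreadsheet does. The counts are assumptions too: 5 is the figure from the comment above, and 10 is the number of distinct pairs among five samples.

```python
from statistics import NormalDist


def bonferroni_critical_z(num_comparisons, alpha=0.05):
    """Two-sided critical z after dividing alpha across the comparisons."""
    per_test_alpha = alpha / num_comparisons
    return NormalDist().inv_cdf(1 - per_test_alpha / 2)


for m in (1, 5, 10):
    threshold = bonferroni_critical_z(m)
    print(f"{m:2d} comparisons: |z| must exceed {threshold:.3f}")
```

The bar rises as the number of comparisons grows, so a result that only barely clears the single-test threshold of about 1.96 can stop counting as significant once the correction is applied.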
Gian-Carlo Pascutto on June 27, 2012 11:40 AM
I stopped listening partway through the first sample, because I realized that I was hearing harsh digital artifacts & crappy-sounding digital reverb tails that were almost certainly present in the original "CD-quality" master. That album was recorded in the mid-80s, a particularly bad time to be making a big-budget album, because all the studios were buying early digital gear.
Might I suggest for round two that you pick a record that actually sounds really *good* on the CD master?
Rossgrady on June 27, 2012 11:48 AM
You need to do ANOVA and then multiple-comparisons corrections using Tukey HSD.
R code looks roughly like this:
myaov <- aov( rating ~ quality, data=mydata )
summary(myaov) # See if anything's sig.
TukeyHSD(myaov) # See what's driving the significance with correction for multiple comparisons.
You also haven't accounted for what equipment people were listening on. There is quite a difference in what you'll hear between el cheapo Apple earbuds (probably what most people used) and a decent pair of headphones/speakers. Bass response is usually the first to go. I have 3 (two Sennheiser, 1 Sony) pairs of phones on my desk and they all sound very different.
David Hayes on June 27, 2012 12:06 PM
It isn't insane to think that a compressed version sounds better than the CD. Isn't this the gist of why recording artists prefer analog tape over digital recordings? It smooths the sound out, eliminating certain harsh transitions.
Corey Henderson on June 27, 2012 12:07 PM
I agree with Rossgrady. You should do another round with a better quality song. And randomize the order of the samples.
About the first sample being highest ranked:
If the source material isn't so good there is a chance that the compression actually makes it better by filtering out bad stuff. Have you considered this?
Not that I disagree with your conclusion, but I do have to take issue with the statement, "nobody can hear the difference between a 320kbps CBR audio file and the CD." All this experiment proves is that on average people can't tell the difference. To prove your original statement you'd need to show that given a large sample set of people there are none who can reliably select CD quality audio from 320kbps CBR audio given a selection of samples in different formats. A small number of people who have "dog ears" could easily be lost in the static. For what it's worth, I think you are likely correct.
Mike King on June 27, 2012 12:11 PM
I'd like to add another vote to the "better quality round" suggestion, as this is interesting subject matter but the choice of song kinda ruins the experiment up front.
Perhaps some precision-crafted electronic music would better reveal the subtle artifacts introduced by compression than some ancient and, by your own admission, awful cheesy pop?
mdsharpe on June 27, 2012 12:34 PM
What @Mike said, precisely. Only averages are shown here. Exceptionally good ears are completely disregarded, as are exceptionally bad ears.
> even dogs won't be able to hear the difference between 192kbps VBR MP3 tracks and the original CD
Well, I did. I admit it would have taken me more than the 15 seconds of listening to distinguish the 320kbps from the flac, but 192kbps was significantly unpleasant to listen to. But I had the two top-quality clips right within mere seconds of listening. Without a shred of doubt.
I'm absolutely convinced that if you did an actual test to see how much more accurate some people would classify this than others, you'd find that there are a lot of better-than-dogs persons.
Regardless, this _was_ very enlightening, because I would have guessed everybody would spot the quality differences without a hitch.
Wow.
Seth Heeren on June 27, 2012 12:56 PM
I was almost right: you didn't sneak in anything lower than 128 Kbps, but you used CBR and it's VBR that I did my test with.
Other than that, I guess the other four are transparent, since I rated Brie top and my notes say I couldn't figure a difference between Cheddar and Limburger. =)
Skyborne on June 27, 2012 12:58 PM
Hellooooo confirmation bias!
Stannius on June 27, 2012 12:58 PM
You'll notice the number of downloads is almost directly proportional with the length of the cheese name given to each sample. Since longer words mean longer (and thus larger) links on the page, I think you've definitively proven we're all just monkeys randomly clicking the screen, giving us a greater chance of hitting the bigger targets.
Keith Grant on June 27, 2012 1:04 PM
I'd like to see whether there's a set of people who did rank these in the correct order, and to see whether they could do it again, and again, on some more blind tests.
If they failed in accordance with pure chance on subsequent tests, then I'd believe the claimed result. :)
For now, I take it as a provisional likelihood. :)
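The "pure chance" baseline here is simple to work out. A sketch, assuming every respondent submitted a strict 1-to-5 ordering and guessed uniformly at random (neither assumption is established by the post); the names are illustrative only.

```python
import math

orderings = math.factorial(5)  # distinct ways to order five samples
p_correct_by_chance = 1 / orderings

respondents = 3512  # responses reported in the post
expected_lucky = respondents * p_correct_by_chance


def p_repeat_by_chance(times):
    """Chance of hitting the exact order on every one of `times` tests."""
    return p_correct_by_chance ** times


print(f"possible orderings:       {orderings}")
print(f"expected lucky guessers:  {expected_lucky:.1f}")
print(f"same person, twice:       {p_repeat_by_chance(2):.6f}")
```

So a few dozen perfect orderings would be expected from guessing alone, but the odds of the same person repeating the feat by luck fall off very quickly, which is why retesting those people would be informative.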
Interesting experiment, but unsurprising results. I have just two things to say - first: the choice of the song was bad, you really can't tell audio quality of a rock song (many of the instruments use distortions, are intentionally badly recorded etc.). You should choose something with high dynamic range - either something classical or electronic (and I don't mean club dance music by that).
Second - most of the listeners won't have speakers/monitors/headphones with sufficient sound quality to distinguish the difference of bitrates over 192kbps. I personally think that 192kbps is absolutely good for listening if you play it at standard volume. Only at high volumes and on good equipment are you able to tell the difference.
So why struggle to get better quality? Because people are emotional - it is technically almost identically hard to produce low or high bitrate files (or lossless for that matter) once you have a clean master, so you naturally want to have the best thing you can. (And if you are a musician then you might want to sample it someday.) P.S.: Sorry for my english.
"precision-crafted electronic music"
Die horribly.
No, what you want for this is an all-digital master of one of the best modern performances of one of the world's greatest works of art -- http://www.amazon.com/Beethoven-Symphony-No-9-Choral/dp/B000001G6W -- or, for those unwilling to follow links, the Berlin Philharmonic, under von Karajan, performing Beethoven's Ninth. (If I'm reading this right, it's the 1977 recording, digitally remastered in 1990. Pace whoever, with probably good reason, talked about early digital recordings -- I've got a halfway decent ear, I guess, for someone not rich enough to get away with 'audiophile', and I can't pick out any artifacts in this one. I bet you can't either.)
Ever wonder why CDs hold as many minutes of music as they do? Because Sony wanted to make sure this performance would fit on one disc. If they thought it was so worthwhile that it was worth designing the entire format around, who are you to say otherwise? No one, that's who. Put that "Skrillex" garbage down for an hour and treat your abused ears to one of the most transcendently beautiful experiences this planet has to offer, and get you some desperately needed culture besides.
Aaron Em on June 27, 2012 1:08 PM
STEREO SEPARATION and dolby pro logic encoding. The higher bitrates have a greater degree of stereo separation and retain more dolby pro-logic encoding.
This goes to personal preference. Do you want the sound to be more ambient or do you want similar sounds to seem like they are more centered? So, with less stereo separation, a lower bitrate can have slightly better vocals of the lead singer because more of those vocals are produced in mono at both the left and right speaker.
Sharad 4ocious on June 27, 2012 1:19 PM
Everyone who claims that they need more than 192kbps will happily admit that they are part of a small minority and also that it may not apply to all types of music.
Therefore, the question is not really whether the population at large can hear the difference, but whether these individuals can hear the difference.
The current experiment design can not answer this question. In the spirit of Popper and falsification you must try your hardest to prove your own theory wrong by giving these individuals the best possible chance of showing that they can hear a difference.
So a better experiment would mean:
1. The listener should get to choose the songs.
2. The songs must be somewhat representative of the listener's music collection (not a collection of sounds artificially designed to reveal encoding artefacts).
3. Each listener should be considered on their own, not in aggregate, since we are measuring a human ability. This means that statistical significance has to come from the same individual performing the test several times for several songs.
At the end of this experiment you may find that a small proportion of people can hear a difference with statistical significance. These people would then be within their rights to argue against encoding at 192kbps.
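Per-listener significance of this kind is usually a binomial question. Here's a minimal sketch, assuming a forced-choice trial format in which a pure guesser is right half the time (an ABX-style setup; the comment above doesn't specify the format). The function name and the 12-of-16 example are illustrative, not taken from this experiment.

```python
from math import comb


def p_at_least(correct, trials, p_guess=0.5):
    """Probability of `correct` or more hits in `trials` by pure guessing."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# One listener, 16 trials: how surprising are 12 correct answers?
print(f"12 of 16 by luck: {p_at_least(12, 16):.4f}")
print(f" 8 of 16 by luck: {p_at_least(8, 16):.4f}")
```

A listener who scores well under the 0.05 mark across enough trials has a reasonable claim to hearing a difference, whatever the population average says.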
I don't blame you for doing a simple experiment. I wouldn't have thought of the above points the first time either, and there may be some more to consider. But you should accept that this experiment doesn't conclusively answer the relevant question. The best experiment would be much more difficult and expensive to do. Unfortunately, it's quite common in some areas of science for the cheap and convenient experiment to be deemed conclusive when it isn't.
(Personally, I probably can't tell the difference beyond 192kbps VBR but I wouldn't be surprised if some people could.)
Prashant Pathak on June 27, 2012 1:22 PM
One more thing - some mp3 players (almost all Cowons, some SanDisks, some Sonys) have reconstructive algorithms which try to make the sound quality of mp3 better, so it is possible that MP3s enhanced by these algorithms might sound nicer to a human listener than dry CD quality audio (which is not enhanced by the algorithms because of its sufficient bitrate). It has nothing to do with your experiment though.
Milan Jirkovsky on June 27, 2012 1:22 PM
I still rip all my CDs to FLACs since it gives me the freedom to transcode albums in the future without worrying about artifacts from lossy compression.
Daniel Cheng on June 27, 2012 1:27 PM
Jeff: You're drawing way too many conclusions from a test with only a single sample.
Milan: "Enhancement" is a misnomer. It does not make the audio better, it only makes it different.
Gordon on June 27, 2012 1:43 PM
I really, REALLY hope you won't try using this experiment to justify ripping your entire CD collection to a lossy format.
First, there's the obvious issue of choosing a pretty bad track, and a pretty bad method of doing the test. Even if you want to make the tests "realistic" by including crappy headphones and the like, you at least need a decent variation in music.
Second, there's the obvious issue of "what happens if I need another lossy playback format for my portable player sometime in the future?" - transcoding between lossy formats sucks.
Yup, I'm fully aware that for the majority of my music, I might not be able to hear the difference between 192kbps MP3 and FLAC on a portable music player while jogging near a crowded road, but listening to certain tracks at home I just might.
And when storage is as cheap as it is, why would you rip your collection to a lossy format? Especially when you're doing it for archival purposes, with the intent of disposing of the physical media?
snemarch on June 27, 2012 2:24 PM
The problem with these blind tests is that accurately picking the difference between songs is a very difficult task, but not necessarily because people can't hear the difference.
Unless people have practice in this type of test, they start thinking too much and the results become biased for all types of unimportant reasons. That's why you have such an outlier in the first sample.
For those of you who don't know much about wine - think of those times when you've been wine tasting. You know that there is difference between the wines, but can you actually discern which is the 'best' when you're at the cellar door sipping on 10 different choices? How often have you bought something, only to find when you get home that it wasn't what you expected? Whilst the best wine will almost certainly be much more enjoyable when it comes time to drink it, actually picking the best wine is best left to the pros.
cbp on June 27, 2012 3:36 PM
I'm pretty surprised I got everything right. The 320kbps and the uncompressed ones might have been a lucky guess. They were nearly indistinguishable, and I'd hate to resort to unquantifiable quack terms to describe them.
I wouldn't say I have dog ears, though.
I'll wager that spotting bad compression is an acquired skill, much like spotting bugs and code smells. As a musician and producer, I can reliably hear MP3 compression at 192kbps and below, given decent headphones and source material. It's mostly about knowing what to look for. Percussion with a lot of random high frequency content is usually the easiest giveaway. I did a blind test on myself ten years ago or so, and found I was unable to discern between 192kbps and the original CD.
Another possible bias is the fact that a study at Stanford shows that more students each year actually prefer the MP3 compressed sound.
Elektronaut on June 27, 2012 3:46 PM
I don't take issue with your results. I do take issue with your interpretation. Using your results to state that "people can't hear a difference at bitrates above 128kbps" and "nobody can hear the difference between a 320kbps CBR audio file and the CD" while accurate for the average person (based on your statistics) cannot be said to be true 100% of the time.
As you discovered in your experiment, there are always outliers.
As for the people who make a living off of audio and audio quality (and I don't mean DJs, I mean audio engineers, mastering engineers and audio technicians), I guarantee that any of these people worth their salt will be able to accurately assess each sample.
You have conducted a good experiment and collected good data, on the AVERAGE person. It is important to make this clear when reporting your results. Because if your sample included more of the above-mentioned professionals, your results would look very different.
Four10510134414 on June 27, 2012 4:11 PM
For the love of god, please stop with the:
"I know this one dude that can tell the difference between flac and 320kbps, so your test is totally wrong"
Yes, there will always be audiophiles, there will always be experts. I'm sure there is someone who can tell the difference between a Maryland squirrel and a Pennsylvania squirrel, but to everyone else, they just look like squirrels. The outliers don't matter, and experts are outliers.
Brendan Abel on June 27, 2012 4:59 PM
I had them correctly rated. On board laptop sound card and Sennheiser 595 headphones. Semi-audiophile.
Played them all full length, and entered the grades while listening.
Did some additional testing using random order, and arrived at the same grades.
My primary tell was the ease of listening next to the (lack of) dynamic range. I often listen to streaming radio, and sometimes it takes 15 minutes to figure out that the quality stinks. I think this is a subconscious thing; it irks me to hear very low quality music. So I wonder about the effect of lossy compression on non-dogs' minds...
Facebook on June 27, 2012 5:06 PM
Looks like I was right on the mark other than thinking Gouda was the worst.
muntoo on June 27, 2012 5:26 PM
There are at least two "industry standards" AFAIK: ATRAC at 292kbps (Sony), AAC at 256kbps (Apple iTunes Plus). The bitrates are chosen carefully by audio engineers given the efficiency of the codecs they used. MP3 is inferior to both codecs, especially AAC, so it is not surprising people prefer 320kbps "to be safe". All these references are designed to be perceptually indistinguishable from the original according to human "psychoacoustics". So if it is easy to tell the difference, the codecs at such bitrates are simply a failure in audio research and engineering. And I don't think this is true!
Of course the choice of sample makes a big difference too. The reference is supposed to be transparent to even the trickiest music we would ever hear most of the time. The sample we have here has no quality and we don't even know how the original should sound (it's highly processed and distorted anyway). I can tell Feta (128kbps) is worst from the drums at around 0:15. The pre-echo noise is a signature of bad MP3. I can also tell Limburger (160kbps) sounds different from the rest. Given I don't believe I can tell the difference between a 320kbps file and the original in such a bad sample, logically Limburger (160kbps) should be the 2nd worst. As for the other three, I think it is completely reasonable for me not to be able to tell the difference!
The result from the poll is not surprising at all. I am one of the average who can tell Limburger and Feta are different. But I don't rate Limburger best simply by logic!
Yathei36 on June 27, 2012 6:42 PM
The thing is that while it's not out of the question that someone might tell the difference, a large number of the posters here claim they can. These are the same people who would have taken the test. The fact that the results came out like they did means that a lot of people are simply fooling themselves.
Brickmaster32000 on June 27, 2012 7:35 PM
Interesting to me (as an amateur): all of the compressions induced clipping, but the CBR samples had substantially less clipping than the VBR samples. However, since the first reliable clipping is about 0:56, I doubt it was significant.
Misterjericho on June 27, 2012 9:05 PM
We have 3TB hard drives nowadays. The question isn't why compress with FLAC, it's "Why not?"
I use FLAC for all my rips on the PC, but then transcode to Ogg when I copy to my mobile. Banshee and Rhythmbox both make this entirely transparent. I just needed to set a setting however many years ago. Surely everything does this.
Sunny Kalsi on June 27, 2012 10:02 PM
I studied Audiology and I have read a few studies on these topics: One thing that could explain the abnormally good score of the 160 VBR is that this is the sound that people are used to. Thus, they tend to rate familiar sounds higher than objectively better ones.
On a different note, I wrote my thesis on a somewhat related theme. There, I discovered that different kinds of music indeed performed differently when evaluated for sound quality. Overall trends were constant over genres, but some music showed them more pronounced than other.
Paperflyer on June 27, 2012 10:19 PM
May I suggest a few quibbles with your data and analysis, Jeff? I'll explain here, but you can follow along with my updated spreadsheet if you like.
My first observation is that fully 58% of your respondents scored one sample a 1, one sample a 2, etc., all the way through 5. In other words, they most likely misread the instructions and ranked the samples by quality rather than rating them independently. I don't think you have 3511 sets of ratings; you have 2045 sets of ranking data and 1466 sets of rating data.
Both data sets are useful and informative, but you can't freely intermix the numbers and analyze them conjointly. The data types are not commensurate, and they require different statistical techniques for analysis.
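The split described above comes down to one check per response: did the five scores use each of 1 through 5 exactly once? A sketch with made-up rows (the real responses are in the spreadsheet); the function name is hypothetical.

```python
def is_strict_ranking(scores):
    """True if the five scores are exactly 1, 2, 3, 4, 5 in some order."""
    return sorted(scores) == [1, 2, 3, 4, 5]


# Made-up rows for illustration only; these are NOT the real responses.
rows = [
    [5, 3, 4, 2, 1],
    [4, 4, 5, 3, 2],
    [3, 3, 3, 3, 3],
    [1, 2, 3, 4, 5],
]

rankings = [row for row in rows if is_strict_ranking(row)]
ratings = [row for row in rows if not is_strict_ranking(row)]

print(f"{len(rankings)} presumed rankings, {len(ratings)} presumed ratings")
```

This is a heuristic: someone rating each sample independently could still happen to use every score once, so the ranking group is "most likely" rankings and nothing stronger.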
As it happens, though, removing the ranked data and looking only at the 1466 sets of presumed true ratings doesn't really change the patterns you mentioned above. A t-Test matrix on the revised data shows that Feta (128kbps) is clearly distinct from all other samples. And no pairing from the pool of Cheddar (320kbps), Gouda (raw), and Brie (192kbps) shows a statistically significant difference.
But... Limburger (160kbps) is in fact rated higher than all other samples, and each of those pairings has a p value far smaller than 0.05. The largest p value is 0.00003. That is very strong and consistent statistical evidence, and you can't just wave it away because it's "clearly insane" (i.e., it doesn't agree with your preconceptions).
I agree with you that it's highly unlikely that 160kbps MP3s actually sound better than their higher- and lower-bit-rate counterparts. My theory is that you've demonstrated that the order of presentation of the samples influences the responses. In other words, respondents may tend to interpret the first-heard sample as a reference baseline.
As near as I can tell, you didn't randomize the order of presentation in your original post (or at least, the samples keep coming up in the same order for me...). I bet that if you reran the test with Gouda (raw) listed first, the results would directly (though probably erroneously) contradict your original thesis.
It would be interesting to take a look at the ranked data as well to see what it has to say. This is getting beyond my level of statistical knowledge, but I suspect that something like the following treatment would be appropriate:
1. Restrict the data set to 1-5 rankings.
2. Drop the Feta column (since we all agree that Feta is distinguishable; we want to see if Limburger can be distinguished from the others).
3. Recode the rankings to the range 1-4; in other words, for each person's rankings, assign the lowest value the "1", the next-highest value the "2", etc.
4. Prepare a summary table of cheese vs. 1-4 with a count of the number of the appropriate responses in each cell.
5. Prepare a reference table similar to #4, but with 1466/4 in each cell (even distribution).
6. Use a chi-squared test to test whether the distribution observed in #4 is distinguishable from the reference distribution in #5.
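Those steps can be sketched roughly as follows. This is a simplification under stated assumptions: it tests one cheese at a time (a goodness-of-fit test with 3 degrees of freedom) instead of the full cheese-by-rank table, it takes the expected count from the data instead of a fixed number, and the four responses are made up purely to show the mechanics. Function names are hypothetical.

```python
from collections import Counter

# Chi-squared critical value for 3 degrees of freedom at p = 0.05.
CRITICAL_DF3_P05 = 7.815


def recode_without_feta(ranks):
    """Drop Feta, then renumber the remaining ranks 1..4 in the same order."""
    kept = {name: rank for name, rank in ranks.items() if name != "Feta"}
    ordered = sorted(kept, key=kept.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


def chi_squared_uniform(counts):
    """Chi-squared statistic against an even split across the categories."""
    expected = sum(counts) / len(counts)
    return sum((count - expected) ** 2 / expected for count in counts)


# Made-up rankings (sample name -> rank 1..5), for illustration only.
responses = [
    {"Limburger": 5, "Cheddar": 3, "Gouda": 4, "Brie": 2, "Feta": 1},
    {"Limburger": 4, "Cheddar": 5, "Gouda": 3, "Brie": 1, "Feta": 2},
    {"Limburger": 5, "Cheddar": 2, "Gouda": 4, "Brie": 3, "Feta": 1},
    {"Limburger": 2, "Cheddar": 4, "Gouda": 5, "Brie": 3, "Feta": 1},
]

# How often Limburger lands on each recoded rank 1..4.
tally = Counter(recode_without_feta(r)["Limburger"] for r in responses)
limburger_counts = [tally.get(rank, 0) for rank in (1, 2, 3, 4)]
statistic = chi_squared_uniform(limburger_counts)

print("Limburger rank counts:", limburger_counts)
print("chi-squared statistic:", statistic)
print("distinguishable at 0.05:", statistic > CRITICAL_DF3_P05)
```

With only four invented responses the result means nothing, of course; the point is the shape of the calculation, which would be run on the real ranked responses from the spreadsheet.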
As Barbie says, survey design is hard...
Some people can discern between bitrates if they originally listened to the song in high quality and then listen to it in a lower quality format. My father does that often: I play a song he knows on my computer as a 192 kbps mp3 and he tells me it sounds a little off. These are not the kind of differences you can notice on mediocre audio equipment, which is something these kinds of tests cannot account for. Obviously someone with crappy headphones won't be able to tell the difference. It would be like watching a 1080p video on a 640x480 screen and saying there's no difference between 1080p and 720p.
Kalium on June 28, 2012 2:33 AM
Woohoo! I got all five in the right order. Do I get a cookie or something of equivalent value?
I could definitely tell the raw CD from the <320kbps samples (although the drop in quality was admittedly slight in all but the 128kbps, which sounded really nasty) but discerning the difference between uncompressed and the 320kbps was much harder.
Ultimately this test has reaffirmed my personal beliefs that 1) FLAC isn't worth it purely in terms of sound quality and 2) 320kbps CBR or V0 VBR are still superior to 192kbps if you have good quality headphones. The music just sounds richer and more detailed; I can't explain it beyond that.
Tested on a Samsung netbook with Sennheiser HD 25-1 IIs (no amplifier).
Hughcorner on June 28, 2012 4:11 AM
There is some research that suggests that people now prefer the "sizzling sounds" of mp3s to RAW CD quality. Which could be the reason why 160 kbps does so well.
You'd think 128 kbps would do even better, but it's probably SO bad that it's easy to detect.
http://radar.oreilly.com/2009/03/the-sizzling-sound-of-music.html
Congratulations on completely confirming that untrained ears can tell the difference between them!
There's an interesting bit of psychology that states that the first thing we hear is treated as the baseline. And you didn't randomize the order on page load.
So for most of us, the "Limburger" file (160 kbps) was treated as the baseline, anomalies and all. Then in all the other files, the changes in audio quality were treated as anomalies - and they got a lower rating.
All your test proves is that most of us are not familiar with the exact notes of a single arbitrary song.
Izkata Paklena on June 28, 2012 5:02 AM
Must say- routers and mp3s seem like commodity things these days that don't warrant so much attention. Is it just nostalgia?
Paul Whitaker on June 28, 2012 5:47 AM
Interestingly I failed the test. I felt pretty certain as I took it. At least I'd figured I'd get a better result... Oh well. I made sure not to bias myself by listening in a particular order, I downloaded all files and just listened in a random order and sorted them during listening through 5 times.
Thanks for making the test Jeff!
Niklas Winde on June 28, 2012 5:50 AM
We're talking about a low quality item, with (relatively) high quality encodings. It's like comparing the conductivity of silver, copper and gold for transmitting electricity for your cheap radioshack light dimmer. Sure they make a difference, but can you really tell when it's just light?
Most can't, some experts probably can. A lot more would notice the difference if you were transmitting a tremendous amount of electricity over a long distance to run a huge spotlight, power a car, or (ironically) to run audio equipment. The variances would be noticeable, the light would be brighter on higher quality, the car would be faster.
But here, there is nothing that needs perfect tone quality: Good test items would be a perfectly balanced choir, or a full orchestra with various solos jumping out throughout the song.
But yeah, if you're just running a light in your house, cheap wires work fine. Give us something that has quality and we might be able to tell you if it degrades.
If what you are listening to is garbage to begin with, poor encoding isn't going to hurt it much.
Jeffrey Davis on June 28, 2012 7:06 AM
Here's how I voted:
5 - Gouda (raw CD)
4 - Limburger (160)
3 - Cheddar (320)
2 - Brie (192)
1 - Feta (128)
So, excluding Limburger, they were in the right order. I'll think about what conclusions I should make. This was an interesting experiment.
I don't have a special audio card and I didn't use a headset to test this. I listened to the music through the speakers integrated in my Asus 24T1E monitor (it also works as TV). It was connected directly to my ASRock motherboard.
The 160kbps VBR sample scored best overall. Why is this?
Some hinted this, but nobody really pursued this logic:
If you compress something, you lose information.
If you lose information, the information that remains becomes more prominent.
What if the info remaining in the 160kbps VBR sample just happened to catch the essentials of the song?
In other words, another song with the same collection of bit rates might produce another "best" sample, because the information remaining may be best suited for that song.
If this is true, experiments like this are totally futile, because they cannot show a best bit rate in general (only for a particular song). I scored 160kbps VBR better than the 320kbps CBR and about equal to the uncompressed CD.
The Internet makes people think they are suddenly experts in every field they can imagine. Jeff Atwood also falls into this category. He is an official smartass :).
Manci Panci on June 28, 2012 8:25 AMThe computer boots up and starts Windows. The 1st snap/click of the amp sounds off. Let there be light. The 2nd snap/click of the Denon AVR tells me that there is sound waiting to be heard. Heard at very high levels compared to what the average troll listens to audio at. Silly people with ½" plug 3" desktop speakers. Oh, that nasty hollow crackle as you turn your tiny knob to adjust your volume. Or the user with the other 3 to 4 plug ½" PC surround sound kit. With a hand-sized "powered subwoofer". (But can it hit 30Hz and below? NO!) Better yet, the headphones guy. They can hear eeeevvvvverything, right? The headphones are "surround sound" (WTF? 2-speaker surround? Next you will tell me Jesus made them just for you). Then there are the men, the men who live with sound. Creators, producers, masters, audiophiles. Those who do not live with mother and father. Those who do not rent where they stay. Those of us who can play songs so loud you can feel it in the fibers of your clothing. So loud you can hear when the song that you are playing was ruined by compression via some space-saving chump with no sense of hearing. You can hear someone's iPod crap mix. Crackle crackle fizzle fizzle. And don't let there be any cymbals. Oh my, did anyone hear that tweeter fry? I'm sure those who can't/couldn't hear the difference really could not because their equipment could not reproduce the entire "experience". My system automatically changes its settings for the best listening experience "if you want it to" depending on the source. It will display for you the original file/source's info, on either the face of the unit, the remote, or whatever monitor you are viewing. Now obviously, a 16-bit, 128kbps song will not utilize anywhere near the full spectrum of sound my AVR is capable of performing. Not even close to, say, half, which is 7.2 surround. It would simply click, and route the song to the front 2 channels.
32-bit, 356kbps, now we might get somewhere around 3 channels with subwoofer if the song is properly coded. Then there are the audio DVDs that I own. Oh, what a treat when you have properly encoded blast beats, and double kicks, and insane hammer-ons disturbing airwaves and watering eyes.
I have hated the mp3 codec since forever. FLAC, AIFF, even the bloated .WAV is acceptable. But user beware, go to your restrooms, grab a Q-tip, and clean your ears out. Not too far in, might lose something there. Go buy a component of caliber, and finally bask in the embrace of sound.
Trustnowoman on June 28, 2012 8:34 AM@David Hayes
In addition to headphones sounding different: my pair of Sony earbuds (I wanna say EX-51s but I'm not sure, cheaper IECs) have virtually no bass response if you're listening to mono and only have one earbud in.
That and my home stereo (fairly old stereo receiver, Yamaha Natural Sound RX-7) sounds very different based on the setting of its "variable loudness." - Basically you set the loudness to its maximum, set the volume to your maximum desired listening level, and then adjust the loudness.
Adjusting the maximum volume adds bass very quickly, whereas increasing the loudness adds more treble than bass.
So really, imho, to get accurate results I'd need to use tone bypass (no treble/bass equalizer adjustments) as well as not using the variable loudness on my receiver.
tl;dr: It's incredibly complex to account for listening equipment! It's as unique as a fingerprint.
Robert Straw on June 28, 2012 11:08 AM2 things:
Jon Hoffman on June 28, 2012 12:48 PMI used to sell high end audio gear. While most of the people who worked there were convinced of the audiophile nonsense (now expressed via Gold Plated USB Cables and CAT 5, because bits can *tell* how they got transmitted), I was skeptical.
So I screwed with things. I went to the back panel and changed the connections on the demo boards so the wrong gear was being demonstrated. Of those who claimed to be "audiophiles" (the kind of guys who bragged about spending $3000 on each component of their system) I *never* had anyone figure out that they were glowing with praise for mid-range Sony gear, even though our demo room was soundproofed from the outside and had a precisely located chair at the sweet spot for the speakers (which were, admittedly, really good surround speakers).
I don't dispute that you can hear differences in audio quality *to a point*, but the reality is that as long as you avoided absolute junk that caused amp clipping and other obvious artifacts, the difference was negligible, even when listening to classical music with a massive set of spectral and volume ranges over time.
Personally, I'm hanging onto my CDs: they make an excellent backup for my computer's music collection and eventually I will rip them uncompressed because my drive will simply be large enough that I won't care. In the meantime, 192kbps variable bitrate is my choice for "good enough".
John Lopez on June 28, 2012 12:53 PMDisregard previous comment...
2 Things:
- I tried this experiment twice on different days. The first day, I used my high-quality Sennheiser HD515's. On the second day I used my Apple iPod earbuds. Using the Sennheisers, I almost perfectly ordered the tracks (I swapped the 192 VBR for 320 CBR). However, with the iPod earbuds, all of the tracks sounded roughly identical (I actually ordered the 128kbps track SECOND best). So, I wonder if the listening setup of the respondents caused some of the anomalies in the results.
- Also, I think the reason for the 160kbps track being highest could be due to the specific qualities of the song chosen. As others mentioned, this song contained a lot of compressed, synthesized sounds. It is possible that the compression artifacts were "masked" somewhat in the 160kbps track, or the resulting sound was most pleasing, but the artifacts could be clearly heard at higher bitrates, confounding the results. It would be very interesting to know the compression characteristics of the synthesized instruments used in the original.
"I'm comfortable calling this one as I originally saw it."
Well, it is hardly conclusive when the methodology does not eliminate all confounding variables that would likely work in favour of the initial hypothesis.
Still, the whole "no one can tell the difference" meme makes some people feel superior to those fancy-schmancy engineers with their edumacation and those rich bastards with expensive stereos (obvious parallels to the psychology of conspiracy theories about 9/11 and the moon landing). Yes, we are expected to believe highly trained engineers went ahead and spent time and money creating high-resolution audio deliberately knowing (or being too stupid to realise) that it makes no difference. Please.
I suspect the average person has never heard real music and has no idea how it should sound and how much more musically involving it does sound when reproduced well.
"The first principle is that you must not fool yourself and you are the easiest person to fool." - Richard Feynman
For programmers your posts used to be interesting and useful but you have clearly lost it now. Unsubscribing, sorry Jeff, had good times. It is still a good blog for general readers and parents now I guess.
Harendra Bhandari on June 28, 2012 6:03 PMPeople are more used to 160kbps - 192kbps than 320kbps. Give everyone 320kbps MP3s and they'll start saying how much 192kbps sucks. The quality matters a lot, but habits matter most.
Just my .02
fardelian on June 28, 2012 10:13 PM"Well, first off, it's incredibly strange that the first sample – encoded at a mere 160kbps – does better on average than everything else. I think it's got to be bias from appearing first in the list of audio samples. It's kind of an outlier here for no good reason, so we have to almost throw it out."
You probably know this, now, but it's fairly common in this sort of "blind taste test" thing to not always offer the same product 1st, to "shuffle the deck" between survey takers. Even for in-person tests, there's a bias if you always serve the Coke first, then the Pepsi, or vice versa.
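The "shuffle the deck" idea can be sketched in a few lines. This is only an illustration of per-participant randomization, not how the original survey was built; `presentation_order` and the use of a participant ID as the random seed are assumptions made for the example.

```python
import random

SAMPLES = ["Limburger", "Cheddar", "Gouda", "Brie", "Feta"]


def presentation_order(participant_id, samples=SAMPLES):
    """Return a shuffled copy of the samples for one participant.

    Seeding with the participant ID (a hypothetical identifier) keeps each
    person's order reproducible while varying it between people.
    """
    rng = random.Random(participant_id)
    order = list(samples)
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    for pid in range(3):
        print(pid, presentation_order(pid))
```

With the order varied like this, no single sample is always heard first, so a first-position bias would be spread across all five samples instead of landing on one of them.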
It must be a coincidence that the samples were rated in the order of the length of their respective codenames; as if such benign things as a name could alter the results. It's nonsense, right?
Or was it the order of their appearance in the article? It's the same as the order of the results.
*No flamewar intended.*
I think MP3 has serious limitations after 192kbps. It never sounds as full as an OGG or WMA file to me.
Miguel Silva Rodrigues on June 29, 2012 1:49 AMA flawed test shows the expected results! Classic.
But you'll never admit it, because of the pervasive confirmation bias.
Honestly don't really care about the results, but as a scientist, I have to lambast you for your procedure and your bias. Really atrocious. Sorry.
Tristan Harward on June 29, 2012 9:49 AMJesus, I can't even pick the 128kbit recording out. Either I need a new set of headphones, or a new set of ears. I'm not ruling out a new auditory cortex, either.
I guess that means I'll be able to save a bit of space on my iPhone.
Db2 on June 29, 2012 10:54 AMI have not taken seriously all those "dog's talks" about audio quality, compression, digital vs. analog and so on since I read an article in the authoritative Russian "Stereo" magazine with a review and comparison of 5 optical TOS-link cables. They used so many words to describe characteristics of sound of each cable: one was warmer, another was metallic... you know the stuff.
I understand that there are people who can notice the difference, but they are not the people who talk about it.
Pablomedok on June 29, 2012 12:33 PMTo the people saying "Yes, but what if I need to transcode my collection some time in the future?" - do you people *really* believe that there is going to come a time when portable audio players don't play mp3 files? I really, *really* doubt that is going to *ever* happen.
NotMe on June 29, 2012 1:22 PMTo Pablomedok above, you see how people can really make things up. TOS-link cables transmit a digital signal, so it is only a yes/no question (signal/no signal). The difference between a good TOS-link and a bad one is the amount of signal error. All you would hear is glitches and jitter, never how warm or how sweet or how much bass or treble etc. If someone says he can tell these kinds of audio characteristics out of a TOS-link, he is just fooling himself and the readers.
Back to MP3. As I mentioned in an earlier comment here, the industry standards (AAC@256kbps etc) are developed to be transparent in most cases, and I presume MP3@320kbps can do that too. Of course I don't rule out exceptions and also people with good ears. If you know the technical details enough you can even generate test tones to trip up these codecs, say a square wave or some harmonic combinations. But most of the time it is very hard for most people to tell the difference.
On the other hand, when Jeff claimed "that difference should be audible in any music track", that is also flawed. Take the extreme case of a pure 1kHz tone: it is accurately reproduced down to 64kbps or even less.
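A pure tone like that is easy to produce for your own listening tests. The following is a minimal sketch using only the Python standard library; the function name `write_tone`, the output filename, and the default amplitude are assumptions for illustration. It writes a 16-bit mono WAV at the CD sample rate, which could then be fed to any encoder.

```python
import math
import struct
import wave


def write_tone(path, freq_hz=1000.0, seconds=1.0, rate=44100, amplitude=0.5):
    """Write a mono 16-bit PCM sine tone to `path` and return the frame count."""
    n_frames = int(seconds * rate)
    peak = int(amplitude * 32767)  # scale to the signed 16-bit range
    frames = bytearray()
    for i in range(n_frames):
        sample = int(peak * math.sin(2 * math.pi * freq_hz * i / rate))
        frames += struct.pack("<h", sample)  # little-endian signed 16-bit
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return n_frames


if __name__ == "__main__":
    write_tone("tone_1khz.wav")
```

A square wave or a stack of harmonics, as suggested above for tripping up codecs, would only need a different expression for `sample`.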
Nevertheless, I still believe someone with good ears and equipment can tell the difference for this bad sample although there is little audio quality. Indeed many here claimed they can, and I don't doubt that. I use average gear (Macbook + Sennheiser PX200II) and yet I can pick up the worst two. Which sample is better than the other is the big question. But hang on, aren't we talking about telling the difference? We should not be rating which one sounds better. But we don't even have a reference, where is the difference??? I think Jeff should think of a more proper methodology for conducting his next experiment!
Yathei36 on June 29, 2012 11:55 PMJust another note. Jeff is the lucky one who is satisfied with 192kbps files. I once kept all my music at this bitrate and one day I hated them because I could hear the artifacts. Then I had to redo all of them in 256kbps and that was a lot of work! I don't keep lossless simply because they are large. I might as well just store all the CDs under my bed!
Yathei36 on June 30, 2012 12:01 AMThe only sad thing is that most songs in online stores are only available in lossy mp3 format, a step backwards in comparison to CDs. And the more people are convinced that mp3 is enough for all times, the less likely lossless songs will be offered by the stores.
Will we again buy the same songs in 10 years (as we now do for those we bought on tape or vinyl) when we discover that mp3 is not the best format after all?
With storage being so cheap and music at top quality taking so little bandwidth anyway, I don't see the point in trying to be efficient. It takes all of 10 min to rip a full CD in full FLAC glory with EAC. I'd much rather have the lossless version that I can re-encode as I please if there was ever a need.
Similar to diamonds, most people can't tell apart a great one from a good one, but if you were relatively rich and could easily afford the great one, why settle?
Thank you to Xentrax, who posted the Gizmodo bitrate test music in the previous thread:
http://gizmodo.com/5251247/the-great-mp3-bitrate-test-my-ears-versus-yours
The Carmen and 'Feel Good' samples display vivid differences even to my old ears, using a Behringer DAC and Grado headphones. I could not hear anything in the cheesy samples at all, they were indifferently horrid.
Douglas Kretzmann on July 3, 2012 12:17 PMFascinating. I had a hard time hearing a difference between any of them. (Or the Gizmodo samples.) I was surprised even though I already knew I wasn't an audiophile. But that's good - it means I can spend less money on audio equipment and more on books!
Philip-sharman on July 4, 2012 9:50 AMI've recently gone through the experience of converting my CD collection to both FLAC and variable bit rate mp3s. One of the things I learned about (sadly halfway through the process) was VortexBox (http://vortexbox.org/content/134-About-VortexBox). Basically it's a headless OS you put discs into (CD, DVD, or Blu-ray) and it automatically rips and tags them. Music goes to FLAC, movies to MKVs. You can also mirror those formats to others. It's pretty slick, it would have saved me a lot of time.
Andrew Walker on July 10, 2012 4:10 PMWe made a similar test at home, after investing in a nice DAC. We have a decent amp and nice vintage speakers. We used a range of music, but I think most of our test was conducted with a track of Sibelius (Finnish classical composer) which has a nice dynamic range, and subtle wind and percussion which can be hard to render (and which we both happen to love). We are both amateur musicians, and love music, but we don't consider ourselves hardcore audiophiles.
Here's what we found out, in a blind test:
- Plugging an iPod directly into the amp, through the audio jack, was awful. My husband's laptop, bought for the quality of its soundcard, was only marginally better. But in both cases, that setup was not only distinguishable, but annoying. Ok for party playing, but not for any serious listening. Unfortunately, that's how 99% of people listen to digital music, no?
- There was a slight difference between MP3 and CD, both driven through the DAC. In all honesty, I could hear a slight difference, but I couldn't decide which one I preferred. But my husband systematically distinguished the MP3 and CD (he has better ears than I do, apparently). But we both agreed that for everyday listening, both were acceptable. But we kept the CDs.
My point is that for most people, the quality of their audio system is so poor that trying to optimize the data source is a moot point.
When we were shopping for an iPod dock for our baby's nursery (just a cheap little something to play lullabies), we were appalled at the quality of what's offered. Most of those sold in large electronic stores sounded only marginally better than my iPod's built-in speaker. It took quite a while before I could find something reasonably priced and suitably compact that I wouldn't cringe listening to.
And this, along with car stereos and cheap earbuds, is what most of the population listens to.
Catherine on July 11, 2012 11:27 AMI don't know if you mentioned this earlier, but I think this is more a test of people's hardware than their ears or the bitrate. I recently bought a pair of AKG headphones and noticed a lot of detail I had missed in the past using my speakers or in the car. So your test is more about whether people have good enough audio equipment to hear the difference between bit rates.
Codemonkx on July 11, 2012 8:11 PM
Very cool, now I can tell that I heard correctly but interpreted it incorrectly. I listened to the samples in order (on the first listening) so I unwittingly took Limburger as the baseline. I correctly recognized that Brie and Feta were closest to Limburger, with Feta being worse. I also recognized that Cheddar and Gouda were almost identical but different from the other three.
The compression did noticeably change the sound, but I interpreted that as a sort of "punchiness" that was supposed to be there (because it was in Limburger) which the Cheddar and Gouda "failed" to capture, as if a bit of the high frequencies was missing. So my ranking was
1. Limburger
2. Brie (very close 2nd)
3. Feta (more distant 3rd)
4. Cheddar
5. Gouda (near tie but a hair "worse" than Cheddar)
Even knowing that Gouda is best and Feta is worst quality, and having read comments about "listen out for sibilants and hi-hats", I can't hear a difference between the two. Maybe it's time to de-wax my ears.
And to those saying that "storage is cheap, I can fit all my FLACs in a 500GB hard drive", I say "show me a portable MP3 player with 500GB of storage".
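The storage trade-off is simple arithmetic. The sketch below is only a rough guide: it uses the raw CD rate (44.1 kHz, 16-bit, stereo) as a stand-in for lossless, which is roughly an upper bound since FLAC compresses below that, and it ignores container overhead. The 16 GB player capacity and the function names are assumptions for illustration.

```python
CD_KBPS = 44100 * 16 * 2 / 1000  # 44.1 kHz x 16 bits x 2 channels = 1411.2 kbps


def size_mb(kbps, seconds):
    """Approximate audio size in megabytes (10^6 bytes) at a given bitrate."""
    return kbps * 1000 * seconds / 8 / 1e6


def hours_that_fit(kbps, capacity_gb):
    """Hours of audio at `kbps` that fit in `capacity_gb` gigabytes (10^9 bytes)."""
    bytes_per_hour = kbps * 1000 / 8 * 3600
    return capacity_gb * 1e9 / bytes_per_hour


if __name__ == "__main__":
    for label, kbps in [("128", 128), ("160", 160), ("192", 192),
                        ("320", 320), ("raw CD", CD_KBPS)]:
        print(f"{label:>6} kbps: 4-minute track ~ {size_mb(kbps, 240):.1f} MB, "
              f"16 GB holds ~ {hours_that_fit(kbps, 16):.0f} hours")
```

A four-minute track works out to about 3.8 MB at 128kbps versus about 42 MB uncompressed, so the same player holds roughly eleven times as much 128kbps audio as raw CD audio.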
Richierocks on July 16, 2012 5:24 AMThe problem I had with the test is that that particular song has lots of audible processing in it normally. What I hear as artifacts in compressed music sometimes sound like the processing. Higher frequencies don't tend to have a stable stereo location, I find, and dance around in your head. With audio as highly processed as this was, it was hard to tell.
Mike on July 16, 2012 12:07 PMThe problem that I see with this experiment is that not everybody used speakers/earphones capable of making the difference distinguishable. With some speakers or earphones you can hear the difference, but with others you can't. I tried using my cheap speakers and I couldn't hear any difference. Then I tried with my high-quality earphones and the difference was pretty clear.
As an experiment I think it is great, but the speaker/earphone quality is an important factor that shouldn't be ignored. Of course, in an online experiment it's not really possible to control this.
Kyze on July 24, 2012 4:55 AMYou need to vary your audio samples and use music that has a greater tendency to display artifacts (e.g. Live Music). I've personally done extensive audio bitrate tests and I've found that I can hear artifacts in live music up to 256K CBR, thus I prefer either 320CBR or V0 VBR.
Other than that. Great choice of topic!
Matias Nino on July 24, 2012 8:00 PM+1 for Kyze; the quality of the speakers matters.
Cristian Ciupitu on July 31, 2012 11:16 AMRe: Aaron Em
You appear to have missed my point entirely.
I suggested electronic music because conducting this test with digitally produced music could potentially eliminate any inaccuracies introduced by recording anything with microphones. You could produce a sample containing any conceivable sounds you wanted, and the lossless version would contain completely perfect reproduction, sample rate issues aside.
Perhaps you should stop and think for a minute before replying with such violent gestures. Not all electronic music is Skrillex, nor are all people who listen to electronic music uncultured.
mdsharpe on August 23, 2012 1:22 AMSorry I'm a little late to the party, but this is an example of laypeople skewing the data. I work with audio, and I sharply disagree with everything about this experiment. I mean, I'm a programmer by trade, but music is my life-long hobby, and nobody disses my hobby :D
On earbuds and common "PC speakers", you can't hear compression artifacts very well - heck, you can't hear the music very well either - because the equipment itself is very noisy, underpowered, and non-linear. Did you ever notice, on cheap speakers you tend to turn up the volume a lot more? You're subconsciously trying to push over the noise floor! On a fancier system (or good headphones with an amp), there is significantly less noise in the electronics themselves, so you hear more of the signal. In other words, you hear more detail at the same volume level.
If you think about that for a moment, it means high-bitrate, high-fidelity MP3 files are wasted on low-power equipment, because those compression artifacts you're trying to avoid are almost entirely drowned out by your equipment. It's like trying to see through a foggy window, doesn't matter what's behind it, all you perceive is a blurry mess.
Along that same vein, many people have noted that the particular track you chose was very poorly mixed. You're feeding crappy audio in, and getting crappy audio out. If the source material had a wider dynamic range, the differences would be far more audible, even to untrained ears. The weakest part of the signal chain dictates the output quality, and in this case it was the original audio itself.
Twelve years ago, when I thought my Altec Lansing ACS54 were the bee's knees, 160kbps was sufficient. Those things had a constant hiss that I had tuned out of my consciousness. Years later, when I upgraded to proper home studio equipment, all those old MP3s sounded absolutely vile.
On my current setup, in a regular house with typical noise like fridges humming and my PC's fans, 256kbps is "CD quality" to my ears on most material, but 192kbps is noticeable in all but brickwalled pop or metal (again, crappy source material masks the encoder noise). In a club, on a kilowatt sound system, I can readily hear the muddiness in a 256kbps encode, because those tiny little aberrations and pumping effects are scaled up to a level where they stand out.
The same is true of earbuds vs sealed headphones. Those little white earbuds that come with iPhones? Yeah, I'd rather sit in silence than use that junk, the stuff they output does not resemble music to me. You could play a 96kbps track through those and be none the wiser. Swap in my 300ohm Beyerdynamic 990 Pros, and you can actually hear the bass again, as well as the fingers barely squeaking down the guitar strings. If those subtle details get smudged by an encoder, you can be sure I'll notice.
Billco on August 23, 2012 1:41 AMMaybe someone here can help me. I am extremely picky about the audio quality of my music. I'm running into the problem lately that when downloading music some of it comes out sounding distorted - very faint popping, crackling noises similar to static. This stays true on every audio device I play these tracks on. Is this simply a bitrate issue or might there be something else involved?
So far I have tried to compare bitrates of the songs that sound distorted to the songs that do not, but I have soooo much music I'd rather not go through all of it! My findings have been that 320kbps tends to sound fine and the distorted songs range anywhere below that. My boyfriend doesn't seem to be able to hear these crackles and pops that drive me crazy and we actually end up having fights about it when trying to listen to music and I finally snap and can't take it anymore and have to stop listening to that album altogether - since it tends to be an entire album with a few exceptions.
I know I'm not imagining these distortions, they are not on every track I own and like I said it carries over to sound distorted on every device I play them on.
Shawnaj3 on October 29, 2012 12:07 PMOkay, I'm convinced. Any chance of showing how to convert WAV to the 160kbps/192kbps VBR mp3?
lame --preset standard "cd-track-raw.wav" "cd-track-encoded.mp3"
Saw the above from your other article, presume that's the magic formula or some additional options in command line?
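For anyone scripting this, here is a hedged sketch that builds the LAME command line for a rough target bitrate. It relies on LAME's `-V` VBR quality switch, for which `--preset standard` is the documented equivalent of `-V 2`. The bitrate-to-quality mapping below uses commonly cited averages (around 190kbps for `-V 2`, around 165kbps for `-V 4`), which vary by track and are an assumption here, and the helper `lame_vbr_command` is made up for the example. The post does not say which exact settings produced the test samples.

```python
import shlex

# Commonly cited approximate average bitrates for LAME's -V quality levels.
# These are rough targets that vary per track (assumption), not guarantees.
APPROX_KBPS_TO_V = {160: 4, 192: 2}


def lame_vbr_command(src_wav, dst_mp3, approx_kbps=192):
    """Build (but do not run) a LAME VBR command line as an argument list."""
    quality = APPROX_KBPS_TO_V[approx_kbps]
    return ["lame", "-V", str(quality), src_wav, dst_mp3]


if __name__ == "__main__":
    cmd = lame_vbr_command("cd-track-raw.wav", "cd-track-encoded.mp3", 160)
    print(shlex.join(cmd))
    # To actually encode (requires LAME on PATH):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to check the command before pointing it at a whole music collection.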
Mich on November 8, 2012 3:18 AM
I'm one of those people who claim to hear a difference, but... I think it has to do mostly with some kinds of music that have a very broad, detailed spectrum (drum n' bass and IDM). Also, the difference is the largest when listening on studio monitors, or my studio headphones, not to mention what a huge difference it makes when a DJ plays mp3s: the squelchy character of mp3 really shines through when playing on club systems, which for me means screeching resonances that hurt my ears. I know I'm a minority; when I go to certain shopping malls I have to leave because high frequency noises from fluorescent tubes are giving me a headache. Now, I've studied acoustics-oriented mathematics and data compression with audio applications at university, and I can say that mp3 is most definitely optimized for the general population, and for them, it undoubtedly works.
Göran Sandström on April 14, 2013 3:00 AM