The Great 89-90 Divide: Statistics and Explanations
by Joe Czerwinski
From time to time, we receive inquiries about why and how we rate wines. As I’ve already addressed some of the questions surrounding individual tastings and reviews in a previous blog post, I thought I would examine some of the macro issues here.
As illustrated in the following figure, Wine Enthusiast’s wine ratings are approximately normally distributed, forming a roughly bell-shaped curve.

Notice the strange “tail” on the left side of the graph, where it reads NR (Not Rated). That uptick indicates wines that were deemed unacceptable by our reviewers. Those wines are not scored, so that bar actually represents all of the wines that would have rated from 50-79. It’s not so much a breakdown in the distribution as an artifact of our policy not to waste time on detailed reviews of bad wines.
The other notable feature of the graph is the apparent inversion of the 89- and 90-point bars. In a textbook distribution, we’d expect to see more 89-point ratings than 90-point scores. In this case, I suspect the explanation is simply that our reviewers are human and inclined to give borderline cases the benefit of the doubt. Paul Gregutt, who reviews wines from the Pacific Northwest for us, held forth at length on the topic not long ago.
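For readers who want to check a ratings histogram for this kind of inversion themselves, here is a minimal Python sketch. It assumes a simple rule of thumb: in a single-peaked distribution, counts should not rise again once you move above the most common score. The function name and the counts are hypothetical, invented purely for illustration; they are not Wine Enthusiast's actual data.

```python
def inversions_above_mode(counts):
    """Return (lower, higher) pairs of adjacent scores above the mode
    where the higher score has MORE wines than the lower one.

    `counts` maps each score to a number of wines. Scores are assumed
    to be consecutive integers with no gaps.
    """
    scores = sorted(counts)
    # The mode is the score with the largest count (first one wins on ties).
    mode = max(scores, key=lambda s: counts[s])
    pairs = []
    for lower, higher in zip(scores, scores[1:]):
        # Only look to the right of the peak, where counts should fall.
        if lower >= mode and counts[higher] > counts[lower]:
            pairs.append((lower, higher))
    return pairs


# Hypothetical counts, made up for this example only.
example_counts = {
    84: 900, 85: 1100, 86: 1250, 87: 1300, 88: 1200,
    89: 950, 90: 1050, 91: 700, 92: 450,
}

print(inversions_above_mode(example_counts))
```

With these made-up numbers, the only pair flagged is 89 and 90, which is the shape of the anomaly described above. A real check would also need to consider whether a bump that size could be ordinary sampling noise.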
That 89-90 point divide is perceived among consumers and the trade as a huge break point, and it’s clear that our reviewers are affected by it as well. That statistical abnormality is something that we’ve consistently observed over time here at Wine Enthusiast and I have brought it to our reviewers’ attention on several occasions, yet it persists. I can’t help but feel curmudgeonly just for pointing it out, and it makes me wonder if it’s even worth worrying about.
What do you think? Do we need to toughen up?
Filed under: Wine Ratings
9 Responses to “The Great 89-90 Divide: Statistics and Explanations”
April 13th, 2010 at 3:11:51 PM
Thanks for this very interesting analysis of ratings distribution! Fascinating stuff.
The thing that strikes me more than the 89/90 discrepancy is the fact that the median is 87 points, which seems inordinately positive. Nearly all wines are rated good or better, and I don’t think this aligns with what consumers experience when they drink the same wines. A lot of the wines that we as consumers drink are average or even below average, and yet they seem to garner 84-88 point ratings, which imply good to very good.
I’ve plotted a similar chart for Wine Spectator ratings, and a similar trend appears. What do you think of this contradiction? Wines that we’d describe to one another, as friends tasting together, as “meh/average” are scored as good/very good. Why is that?
April 13th, 2010 at 4:09:16 PM
Robert,
Thanks for the comment. It’s worth keeping in mind that our data reflect what we taste, not necessarily the entire wine market, so there is a certain level of bias built in. Many wines are not submitted to us and I confess that when we exert energy to supplement what we receive it’s unlikely to be on wines that we feel won’t score well. We tend to go out of our way to find wines that we feel we’ll be able to recommend, not consign to oblivion below 80. I imagine this would apply equally to any of the major reviewing publications.
Specifically relating to our rating, given our scale, I guess I don’t interpret the data the same way. The Wine Enthusiast 100-point-scale gradations are as follows:
Classic 98-100: The pinnacle of quality.
Superb 94-97: A great achievement.
Excellent 90-93: Highly recommended.
Very Good 87-89: Often good value; well recommended.
Good 83-86: Suitable for everyday consumption; often good value.
Acceptable 80-82: Can be employed in casual, less-critical circumstances.
As you can see, we don’t have any “Average” as a quality descriptor in our rating system–it’s a relative expression that relies on a data set to have meaning. If I were to say I recently tasted an “average” Marlborough Sauvignon Blanc, that wouldn’t mean I had just tasted a 75-point wine, because on our scale 75 points would be Unacceptable. An average Marlborough SB on our scale probably rates in the mid to high 80s, because of the region’s generally sound winemaking techniques and benign climate for the growth and ripening of SB.
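To make the gradations above concrete, here is a small Python sketch that maps a score to its band. The band names and cutoffs come from the list above; the function name, the table layout, and the choice to label everything below 80 as "Unacceptable" (following the 75-point example) are assumptions made for illustration.

```python
# Lower bound of each band, highest first, taken from the scale above.
BANDS = [
    (98, "Classic"),
    (94, "Superb"),
    (90, "Excellent"),
    (87, "Very Good"),
    (83, "Good"),
    (80, "Acceptable"),
]


def band_for(score):
    """Return the quality band for a score on the 100-point scale."""
    if score > 100:
        raise ValueError("scores cannot exceed 100")
    for lower_bound, name in BANDS:
        if score >= lower_bound:
            return name
    # Anything under 80 is not given a published score.
    return "Unacceptable"


for s in (75, 86, 89, 90, 100):
    print(s, band_for(s))
```

Note how 89 and 90 land in different bands ("Very Good" versus "Excellent") even though they are a single point apart.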
You might try thinking of it in terms of academic grades: Anything less than a B is unacceptable (at least in my house growing up, and still today for my kids). Above that, the praise and privileges rise with increased levels of performance–as long as the older one stays on the A Honor Roll he keeps his texting privileges–and when wines really wow us they get the big scores, article mentions and photo ops.
Then it’s up to consumers to let us know whether or not they agree with our critics.
April 15th, 2010 at 11:34:07 AM
Thank you for the explanation; it makes perfect sense to me. I tend to follow the ratings of basically three “experts”: WE, WS, and RP. Generally, my palate seems to mesh almost perfectly with the scores that these experts give. On occasion, there might be a disconnect between my palate and the professional raters, especially if the wine is not a cab or high cab blend, which is to be expected, I guess. But I have one question. I generally think that 90 points is a more reasonable threshold for “good” wines; 87-89 is probably “meh,” and anything less is not very good. Is my thinking here unreasonable?
April 15th, 2010 at 2:57:11 PM
Yeah, I’d say you do need to toughen up, but not on the 89/90 issue. You call it as you see it. If consumers haven’t figured out that there are fantastic 89-pointers aplenty, then there’s only so much you can do. Where I would love to see you toughen up is in the NR category. Totally agree that you can’t afford to waste time on detailed reviews of bad wines, but how about some numbers? Knowing and not telling your readers suggests an insider’s secret.
It brings up an interesting ethical question. Which master do you serve? Readers or producers? To publish ratings only above a certain number implies a, well, strange curve in your graph.
April 19th, 2010 at 3:09:51 PM
I loved reading the old reviews in the Wine Spectator of the worst wines. They were the most fun and unique, with a whole new vocabulary: skunky, funky, shoe polish, nail polish, motor oil, putrefied eggs, volatile acidity, Brettanomyces. How is the next generation going to learn what wine can really be when it is messed up by experts? Mousy is not just like a mouse, but like a mouse urinal in a cheap bar. If you never mention this in polite company, who is ready to proclaim with authority a wine mousy and give me a good bottle?
The buying public deserves to know from authorities which wines are cursed with fiends such as ethyltetrahydropyridine (ETHP), acetyltetrahydropyridine (ATHP), acetylpyrroline (APY), and other unsavory compounds. Are we to assume that all reviewed but unrated wines have these compounds? Or just think they must be mousy?
Does “not rated” mean the critic is a mouse and not a man? Or woman?
Advertisers be damned: protect the public, give us some good, fun, honest writing, save us some money, and let the vintners that produce true swine slop receive the credit they have earned in their filthy, infected, infested cellars.
There is a dark side to the wine business. It is your job to expose it and keep these wines from the market.
April 19th, 2010 at 3:24:39 PM
Amen, Gary.
April 20th, 2010 at 12:12:57 AM
What does toughen-up mean?
Does it mean increasing the precision? That would require spreading the scale out.
There is an interesting upshot of your WE numbers. The bell curve shows that the center of the WE scoring is 87 points. The upshot is that WE is actually using a 26-point scoring system: the 87-point center plus 13 points equals 100 points, and likewise 13 points below it.
This implies that the precision is modest.
So maybe WE could toughen up, ….
To test the toughness, have an independent third party submit wines to WE that are labeled differently but are actually the same wine.
Naturally, magazines do not do this, but it is a tough standard, and you may want to hire an outside company to run such a test.
April 21st, 2010 at 4:19:14 AM
I am most interested in convergence of price/quality/point, not so much the point “toughness.” In my family, we first look for the best buys that hit our buttons below $10 (and at that price, I can hardly hope the wine will be above 90; actually at that price, I usually like the 80s better). Of course, then we hope retailers have the wine!
I am curious, is there any indication that a brand that is “Not Rated” actually changes winemaking if exposed? Or does it just not send the wine anymore…therefore making publication of wines in this category somewhat false? Or, if it does clean up its act and make a better wine, does the original NR haunt the brand? I would assume there is a difference between most NR wines and a high-profile brand that ends up in this category.
May 3rd, 2010 at 7:56:03 PM
The reality then, Joe, is that you are marking wines out of 20, not out of 100. You reject all below 80 … Using your own criterion, you classify everything as a B plus to A minus on the standard academic scale. It makes a nonsense of the system: are there no C plus wines, ever?
Quite frankly I find all numeric scales irrelevant, even the wine-industry one of 20 (which in turn is only 13 to 20 … out of 7 … which all tasters then decimalize).
I acknowledge that some form of ranking is necessary. I would like to see a change to perhaps a double scale: one to five for quality within its class (so that yellowtail doesn’t come up against a great Rhone or Grange), then a similar starred scale for your appreciation of value, again within its class. Were you to add a scale for your projected idea of cellaring potential, that too would be useful to those of us who are less practiced and experienced consumers.