The meaning of scores

Debates on whether scores are a good way to summarize a review have been floating around for as long as reviews themselves, and each position has its merits. The Verge opted for the easiest path, which is to provide one. The advantages are fairly obvious: many readers won't actually read the whole piece, or may want a quick way to compare a bunch of products; a score gives a sense of objectivity; and, at the end of the day, a scored review is generally more appealing to people than an unscored one.

The problem

Is knowing what a score means. If a score merely means a writer's opinion, in the sense of:

10. Love it, I would physically make love to this product if I could find a way.
9. I only disliked a couple of things, but nothing major.
...
1. I hate it, I will find whoever made this and kill him with my own bare hands.

Then we can dispense with them, as they serve little purpose, if any: the phrase itself is clearer than the number. So, if we opt for scores, we surely want to say a bit more with them than "I like it" or "I hate it".

Moreover, The Verge chose an even more intricate option: a weighted scoring system. Why would they do that? Why would they further complicate an already delicate issue? Simple: they're after objectivity.
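For readers who want to see the mechanics, a weighted score is just a weighted average of per-category scores. Here's a minimal sketch in Python; the categories and weights are hypothetical, not The Verge's actual ones:

```python
# A minimal sketch of a weighted scoring system.
# Categories and weights are hypothetical, for illustration only.

def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-category scores (0-10) into a single overall score (0-10)."""
    total_weight = sum(weights.values())
    return sum(scores[cat] * weights[cat] for cat in scores) / total_weight

scores = {"screen": 7, "battery": 8, "design": 9}
weights = {"screen": 3, "battery": 2, "design": 1}  # screen counts three times as much as design

print(round(weighted_score(scores, weights), 1))  # (7*3 + 8*2 + 9*1) / 6 = 7.7
```

The weights are where the editorial judgment hides: two reviewers with identical category scores can still produce different overall numbers if they weight the categories differently.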

What are scores anyway?

From a mathematical point of view, scores are just a measure, and almost everything can be measured in one way or another. Let us imagine we want to measure distances, in particular the distance between the Eiffel Tower and the Statue of Liberty. As soon as you raise the question, somebody answers with confidence: "It's 1,234". Faced with such an odd answer, an immediate response would probably be "That's not true"; however, if you're being a little more insightful (and I'm sure you are), you would probably ask "1,234 what?". A scoring system is a measure of quality, just like kilometers are a measure of distance, but unlike the latter, the former is defined over a closed interval between 0 and 10. In other words, we need a scale (since Palm died I can't write that word without laughing, go figure); otherwise the numbers are meaningless.

If the numbers are meaningless, they don't showcase objectivity, and therefore fail to fulfill what we assumed is the point of using such a complex scoring system. One of the simplest requirements for objectivity is being able to swap people without affecting the results: it shouldn't matter whether you measure the distance from Paris to New York or I do, and this is only possible if we have a scale. Completely fulfilling this requirement is beyond the scope of any review. Still, the problem stands: those numbers are meaningless unless we enrich them with some kind of at least partially objective scale.
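To make "enriching a number with a scale" concrete, here's a minimal sketch, assuming made-up anchor points for battery life; once the endpoints of the scale are pinned down, anyone who takes the same measurement gets the same score, which is exactly the swappability described above:

```python
# A number is meaningless without a scale; a scale is just a declared mapping
# from raw measurements onto the closed interval [0, 10].
# The anchors here (0 hours -> 0, 12 hours -> 10) are made up for illustration.

def to_score(hours: float, worst: float = 0.0, best: float = 12.0) -> float:
    """Map measured battery hours onto [0, 10] using explicit anchors."""
    clamped = min(max(hours, worst), best)
    return 10 * (clamped - worst) / (best - worst)

# Whoever measures 9 hours gets the same 7.5 -- the reviewer is swappable.
print(to_score(9.0))  # 7.5
```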

Real life implications

Or how your numbers are meaningless

Let us look at an example. Screen quality can be objectively measured in various ways, from color reproduction to brightness to resolution to DPI. There's obviously some room for personal preference (e.g. IPS LCD vs Super AMOLED Plus color reproduction), but many important aspects of it can easily be quantified objectively. Here's a real-world example:

The Samsung Galaxy S II has a 4.3" Super AMOLED Plus screen with a resolution of 800x480; the Samsung Focus Flash has a 3.7" Super AMOLED screen with the same resolution. As we have all come to know, Super AMOLED uses a PenTile structure, which means it has a third fewer subpixels (8 subpixels per 4 pixels, where an RGB stripe uses 12), and it lacks Gorilla Glass. However, both screens got a 7. Does this mean that both are equally pleasing (or unpleasing) to look at? Moreover, the Lumia 800 got a 9 in the screen department despite having merely an AMOLED screen (also PenTile). As should be clear, the scores are not reflecting any objective features of these displays. This case is particularly troublesome because they all use the same technology, which greatly reduces the subjective factors (all of them, being *OLED, are heavily saturated, etc.).
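For instance, pixel density, one of the objective measures mentioned above, follows directly from the resolution and diagonal quoted in that paragraph; a quick sketch:

```python
# Pixel density (PPI) is fully objective: it follows from resolution
# and diagonal size alone.
import math

def ppi(width_px: int, height_px: int, diagonal_inches: float) -> float:
    """Pixels per inch: diagonal pixel count divided by diagonal length."""
    return math.hypot(width_px, height_px) / diagonal_inches

print(round(ppi(800, 480, 4.3)))  # Galaxy S II: ~217 PPI
print(round(ppi(800, 480, 3.7)))  # Focus Flash: ~252 PPI
```

Two screens with measurably different densities (and different subpixel layouts) received the exact same score, which is the problem in a nutshell.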

How to partially solve it...

Without making reviews a scientific publication

Stick to your own objective measures, and prioritize them over your own personal feelings. Set an internal scale: what's a 10? Instead of reserving it for unicorns, postulate which product gets a 10 in each category (which is just another way of saying "this is the best on the market in this category"). If the iPhone's battery life is the best of the bunch, then make its battery life a 10, scale the rest accordingly, and so on. If possible, make that scale public, so people know a 9 means "almost as good as the iPhone's battery life" instead of "I dig this". Don't mix exclusively objective categories with mostly subjective ones; e.g. design and build quality shouldn't be lumped into a single category. Take price into account: if a device is the best on all counts but costs 3,000 dollars on contract, give it a 10 everywhere it deserves one, and weigh price against quality afterwards. If you use your imagination you could even build a morphing system, where scores are adjusted on the fly whenever a new product claims the 10 in any category, as sketched below.
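Here's a minimal sketch of that morphing idea, assuming each category's 10 is anchored to the best raw measurement on the market and everything else is scaled linearly against it; the products and battery figures are made up for illustration:

```python
# A sketch of a "morphing" scale: the best product in a category defines the 10,
# the rest scale linearly, and every score is re-derived on the fly when a new
# product takes the top spot. All measurements below are made up.

class Category:
    def __init__(self, name: str):
        self.name = name
        self.measurements: dict[str, float] = {}  # product -> raw objective measure

    def add(self, product: str, value: float) -> None:
        """Record a raw measurement (e.g. battery hours); scores shift automatically."""
        self.measurements[product] = value

    def scores(self) -> dict[str, float]:
        """The best product anchors the 10; the rest scale linearly."""
        best = max(self.measurements.values())
        return {p: round(10 * v / best, 1) for p, v in self.measurements.items()}

battery = Category("battery life")
battery.add("iPhone 4S", 8.0)    # hours of continuous use (made-up)
battery.add("Galaxy S II", 7.0)
print(battery.scores())  # {'iPhone 4S': 10.0, 'Galaxy S II': 8.8}

battery.add("Droid RAZR MAXX", 12.0)  # a new champion appears...
print(battery.scores())  # ...and every score morphs:
# {'iPhone 4S': 6.7, 'Galaxy S II': 5.8, 'Droid RAZR MAXX': 10.0}
```

The obvious cost is that published scores go stale: a 9 from last year silently becomes a 6 today. That is arguably more honest than a 9 that pretends the market never moved.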