How Not to Create an Index


We built this list using a proprietary algorithm that mines CNET's huge, industry-leading repository of internal and external data, then blends that data with our editors' ratings to identify the tech that is making the biggest waves.

That's how CNET describes their new "CNET Hundred" list, which has the sub-description of "the top 100 products you should care about based on your clicks, our editors' scores, and more."

Ugh. Proprietary. Blends. Internal data. In other words, you and I can't actually verify the rankings in the list, because CNET doesn't give us access to the data or even tell us how they use it to create the list. Moreover, the list is distorted by their own rankings of products (talk about self-referential), and could be distorted externally (user clicks).

This is classic Garbage In, Garbage Out (GIGO) statistics. Using what's likely a Garbage algorithm (proprietary) to boot. 

Let's see how that works out. The number one product in the initial list is the Nexus 5 phone from Google. CNET editor rating? 4 out of 5. Average user rating? 4 out of 5. Number of comments? 2.4k. Number two product is the Apple iPad Air. CNET editor rating? 5 out of 5. Average user rating? 4.5 out of 5. Number of comments? 3.1k.

Okay, maybe a little further down the list, then? Number 6 is the Sony PlayStation. CNET editor rating? 3.5 out of 5. Average user rating? 3.5 out of 5. Number of comments? 2.6k. Number 18 on the list is its competitor, the Xbox One. CNET editor rating? 3.5 out of 5. Average user rating? 3.5 out of 5. Number of comments? You guessed it, 2.6k ;~).

Hmm. So "our editors' scores" seem to be weighted very lightly in their algorithm, yet they're one of the few things CNET points to as being used in the calculation. The CNET claim that "these are the gadgets that everyone is noticing and that the world cares about" can't actually be verified by a third party from the information they've disclosed, thus you have to seriously worry about what the list actually tells you.
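To make that comparison concrete, here's a minimal sketch in Python that uses only the four sets of ratings quoted above. The tuple layout and function names are my own, purely for illustration, and four data points prove nothing statistically. All the sketch does is show that sorting by either rating doesn't reproduce CNET's order, and that two products with identical ratings sit a dozen places apart.

```python
# Figures quoted above: (CNET list rank, product, editor rating, user rating).
# The layout is an illustrative assumption, not anything CNET publishes.
products = [
    (1, "Nexus 5", 4.0, 4.0),
    (2, "iPad Air", 5.0, 4.5),
    (6, "PlayStation", 3.5, 3.5),
    (18, "Xbox One", 3.5, 3.5),
]


def order_by(column):
    """Product names sorted best-first by a rating column (ties keep list order)."""
    return [p[1] for p in sorted(products, key=lambda p: -p[column])]


def tied_pairs():
    """Pairs with identical editor and user ratings, plus their gap in list rank."""
    pairs = []
    for i, a in enumerate(products):
        for b in products[i + 1:]:
            if a[2:] == b[2:]:
                pairs.append((a[1], b[1], b[0] - a[0]))
    return pairs


cnet_order = [p[1] for p in sorted(products, key=lambda p: p[0])]
editor_order = order_by(2)
user_order = order_by(3)

print("CNET order:", cnet_order)
print("By editors:", editor_order)
print("By users:  ", user_order)
print("Same ratings, different rank:", tied_pairs())
```

If the ratings carried real weight, you'd expect the rating orders to resemble CNET's order and the tied pair to land close together. Neither happens here, which is all this small sample can show.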

Bottom line: I think CNET's new list is no better than someone's subjective list in telling you anything useful. So why did they do it? Maybe because it creates automated content that doesn't require hiring a writer? ;~). The usual next step with such indices is trying to get others to use or quote it. 

But the reason I mention it at all has to do with cameras. Top camera on the list? The GoPro Hero3 at number 16. (Note that the Hero3 is not the most current model GoPro makes; that would be the Hero3+.) Other cameras on the list:

  • Canon PowerShot SX280HX is number 44
  • Canon EOS Rebel T3i is number 46
  • Nikon D5200 (also not the current model) is number 53
  • Canon EOS 70D is number 65
  • Sony Cyber-Shot RX100II is number 74
  • Canon EOS Rebel T3 is number 79
  • Nikon D7000 is number 81
  • Nikon Coolpix P520 is number 85
  • Canon EOS Rebel SL1 is number 89
  • Sony Cyber-Shot HX50V is number 100

What those rankings and numbers tell me is that retail sales data is playing a huge part in CNET's rankings. Given how many of these products are older ones, I also wonder whether there's a bias towards products that have been on the market longer.

Still, the list seems meaningless in terms of sorting anything useful out. It certainly is "one of a kind" as CNET asserts. But as a "leaderboard" for tech's hottest products, it's not even close to getting the job done right. I'm sure Nikon will be happy that two of the three models they managed to place on the list are ones they don't even make any more, and that an American upstart is the perceived camera tech leader according to CNET.

One final thing: by not publishing the algorithm, CNET also allows themselves the possibility of "tweaking" it without really having to say anything publicly. 

In the spirit of CNET's new rankings, I've put together a proprietary algorithm, based upon data sources that I won't disclose, that ranks the various tech gadget sites on the Internet. CNET is number 6 by my calculations. (Yes, that was tongue firmly in cheek.)

text and images © Thom Hogan 2015 -- all rights reserved