Monday, 3 December 2012

Lies, Damned Lies and Statistics

The British Prime Minister Benjamin Disraeli (1804-1881) is often credited with the phrase "Lies, Damned Lies and Statistics", which characterises the way that statistics are often used to prove whatever the protagonist wishes. It's a very incisive phrase and one that my childhood training at the hands of my statistics teacher, Mr Sutton, taught me well. Mr Sutton was on a mission not just to educate us, but to train us for life. As a statistics teacher he took it upon himself to help us to always question numbers used in the press. His point was that the selective use of statistics can prove anything you want. A simple example might be a stock market graph that shows an apparent crash in the price of assets - until you realise that the graph's y-axis for asset price starts not at 0, but at 5,300. Those apparently large price swings are just variability between 5,300 and 6,000, drawn in a way that makes the fluctuations look far more dramatic than they really are. However, the press needs dramatic stories in order to sell newspapers and attract website visits. Few newspapers are sold with headlines like "gentle fluctuations in stock prices making analysts mildly concerned". Instead we get "dramatic stock crash over the past month reduces average pension value by $1m". Mr Sutton trained me well to always look behind the headlines and question statistics, as they are so often used to distort the truth rather than to explain it.
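To put a rough number on the truncated-axis trick, here is a minimal sketch in Python. It uses the 5,300 and 6,000 values from the example above; the function name `apparent_drop` is my own, and it assumes the top of the chart sits at the starting price.

```python
def apparent_drop(start, end, axis_min=0.0):
    """Fraction of the plotted height taken up by a fall from start to end.

    Assumes the top of the chart is at `start` and the bottom at `axis_min`.
    """
    return (start - end) / (start - axis_min)


high, low = 6000.0, 5300.0  # the range quoted in the example

true_drop = apparent_drop(high, low)               # y-axis starts at 0
truncated_drop = apparent_drop(high, low, 5300.0)  # y-axis starts at 5,300

print(f"Fall with the axis starting at 0:     {true_drop:.1%}")
print(f"Fall with the axis starting at 5,300: {truncated_drop:.1%}")
```

Under those assumptions the same fall of 700 points fills under 12% of an honest chart, but the full height of the truncated one.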

When I see so many headlines about smartphone sales and market share it intrigues me. It seems that the old "publish some dramatic headlines" trick might be at play again. Here are some observations on how we are spun by the media.

So what are the real facts? What do we know for sure, rather than what the headlines are shouting at us? Let's examine a few points...

  • To my knowledge none of the Android licensees report sales figures (I qualify that statement not because there is doubt, but because there are a LOT of Android licensees so it's always possible there is an exception somewhere - but certainly it's true for the large licensees). Thus, any published sales figures are based on assumptions and estimates.
  • Apple reports quarterly sales figures for iPhone, but doesn't break them down by individual device. Characterisation of relative sales of a particular iPhone model is therefore based on assumptions and estimates.
  • Many manufacturers have quite lumpy smartphone sales performances - typically with large sales around the launch of a new device that tail off over the device's lifetime. The quarter before a big launch is often characterised by lacklustre sales as customers hold off their buying decisions awaiting the new product. Thus, it's very easy to take a snapshot view of a particular quarter and prove anything your headline writers wish - depending on the quarter chosen.
  • Actual relative sales vary quite significantly between geographical regions. Emerging markets like India and China tend to have a much higher proportion of lower-cost devices, whereas Western markets like the USA and UK tend towards a higher proportion of "premium" devices. Estimates are that Android/iOS share in the west is broadly 50/50, but lower-cost Android phones dominate much more in emerging markets. The truth is subtle and complex. A global average masks these real differences and can be very misleading.

Whenever we see assumptions and estimates we should be on guard - these are areas where the choice of assumption or estimate can radically change the output. Essentially we can prove whatever we want when we get to choose our underlying assumptions.

Respected analysts like IDC and Gartner often publish seemingly authoritative statistics and market share reports. However, it's often impossible to make those estimates without a lot of extrapolation from the available facts. For example, how do you estimate Android market share if there are no reported sales statistics from licensees? There are of course a number of methods. One is to look at reported revenues, make assumptions about the device mix and calculate sales numbers from there. But the choice of underlying assumption can distort the numbers. You might think that firms with such reputations would have a sound basis for their estimates? But they get things wrong - for example, IDC estimated sales of 2.3 million worldwide for the Galaxy Tab in 2Q 2012. However, during the infamous Apple v Samsung patent trial Samsung ended up releasing figures that revealed Galaxy Tab sales of only 37,000 in the US for the same period. Whilst it's possible that the Galaxy Tab had enormous sales success everywhere other than the US, it's perhaps more likely that the IDC estimates were very wrong.
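A quick sanity check makes the gap concrete. The sketch below uses only the two figures quoted above (the variable names are mine) and works out what share of worldwide sales the US would have to represent for both numbers to be true at once.

```python
idc_worldwide_estimate = 2_300_000  # IDC estimate, Galaxy Tab, 2Q 2012
us_sales_from_trial = 37_000        # US figure revealed during the trial

# If both figures were right, this is the US slice of worldwide sales...
implied_us_share = us_sales_from_trial / idc_worldwide_estimate

# ...and this is how many units must have been sold outside the US.
implied_rest_of_world = idc_worldwide_estimate - us_sales_from_trial

print(f"Implied US share of worldwide sales: {implied_us_share:.1%}")
print(f"Implied sales outside the US: {implied_rest_of_world:,}")
```

For both figures to hold, the US would have to account for under 2% of worldwide Galaxy Tab sales. That's not impossible, but it's the kind of result that should make you go back and question the estimate.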

Another example, this time relating to press reports that "Samsung Galaxy S III dethrones iPhone as world's top seller". There are a few problems with this headline, namely:

  1. Samsung hasn't released figures for S3 sales and Apple hasn't broken out 4S sales from the total iPhone statistics. So both the Samsung and Apple figures were based on estimates rather than actuals. So, what underlying assumptions were made?
  2. Here we go... the estimates relate only to Q3 2012, when iPhone sales had stalled awaiting the imminent release of the iPhone 5. Add both Q2 and Q3 figures together, to get a more representative view, and you get estimates of iPhone 4S at 35.6m and Galaxy SIII at 23.4m. And when the Q4 numbers were released we saw actual iPhone sales in one quarter of 25.9m. You see, the real story is quite complicated - you can prove any position you want by selecting the timeframe and statistics to suit your personal bias. Being 'selective' in the choice of statistics is another lesson Mr Sutton taught me.
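The timeframe effect in point 2 is easy to reproduce. The sketch below uses made-up quarterly numbers for two hypothetical phones ("Phone A" and "Phone B"); they are invented purely for illustration and are NOT real sales figures for any device. The point is only that the "top seller" changes with the window you choose.

```python
# Hypothetical unit sales in millions, invented for illustration only.
# Phone A dips in Q3 as buyers wait for its successor to launch.
sales = {
    "Phone A": {"Q1": 10.0, "Q2": 12.0, "Q3": 6.0},
    "Phone B": {"Q1": 4.0, "Q2": 7.0, "Q3": 9.0},
}


def leader(quarters):
    """Return the best-selling phone, and all totals, over the given quarters."""
    totals = {
        name: sum(by_quarter[q] for q in quarters)
        for name, by_quarter in sales.items()
    }
    return max(totals, key=totals.get), totals


for window in (["Q3"], ["Q2", "Q3"], ["Q1", "Q2", "Q3"]):
    winner, totals = leader(window)
    print(f"{'+'.join(window)}: top seller is {winner} {totals}")
```

With these invented numbers, Phone B "dethrones" Phone A if you look at Q3 alone, yet Phone A leads over any longer window. Same data, opposite headlines.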

Another complicating factor is that of supply constraints. Superficial analysis often assumes an infinite supply, with sales representing consumer demand. But what if supply is constrained, so that consumer demand for a particular device is much higher than sales might imply? Over time demand and supply tend to balance, unless there are serious imbalances, but in a given quarter supply constraints can easily distort the market. We know that the iPhone 5 has had serious supply constraints during the past quarter; similarly, Google's Nexus devices are difficult to come by. How this distorts the market, and what the actual underlying demand might be, is difficult to tell. It's another reason why trying to draw conclusions from sales stats is tricky.

So, the facts are complicated and spotty, with big holes in the available statistics. Analysts make estimates but sometimes get them spectacularly wrong. The press use very selective statistics to support dramatic headlines that probably aren't true. And the smartphone market isn't one market, but a series of very different regional markets, each with their own dynamics and different winners/losers.

My point here isn't to offer an opinion of relative sales, but to highlight how and why some of the more hysterical headlines should be treated with extreme caution. Statistics are tools that can be, and are, bent to the writer's requirements.

I have opinions on the smartphone market, but my opinions will be affected by my environment. A lot of my friends and colleagues have bought into iOS, and I see a lot of iOS devices every day on the train to work. Based on those experiences I might easily conclude that iOS is massively dominant. But I'm conscious that I move in certain circles, live in a certain region and self-select certain friends and acquaintances. If I lived somewhere else, had different friends or worked in a different industry I might have different experiences and perspectives. So we need to be conscious that our own personal views might not necessarily be representative of the whole market. It's sometimes hard to think that we live in a cocoon, but most of us do - the world is a big place. So be cautious that your instant reaction is often coloured by your own personal bias.

How do you make any sense at all of what's going on? You should perhaps be suspicious of anything that quotes press headlines, and ignore hyperbole. And you should do your own research. Question numbers and check sources. Be aware of which data matters to you - do you need a view of data based on regions, demographics or usage, rather than sales? There's no substitute for gathering your own data. Survey your customers, check your weblogs. Look for real data, rather than opinions. Cross-reference sources and look for explanations that dig into the detail, rather than shouting a headline. Remember: there are lies, damned lies and statistics. Statistics tend to be the worst of the three, so be cautious of those quoting numbers to justify their position!

