Manish Bhatia, President, Advanced Digital Client Services
The television industry is about to sail into a vast ocean of granular insights on TV viewing. The ocean we refer to is, of course, Set Top Box (STB) data. Other industries have also entered their own uncharted waters, with access to multiple sources of more granular information - manufacturers have shipment data and store data, magazines have circulation figures and sales data, the Internet has log files. In most of these cases, the industry players have found a way to navigate through huge, sometimes disparate, sets of data to make more informed business decisions. And where they have not, confusion has ensued.
We in the television business can learn from the experiences of other industries and add value and deeper insights to our currency measurements by incorporating complementary data, like STB data. Let's take a deeper dive into STB data to understand what insights STB can and cannot provide.
STB data has significant potential. The sheer size and granularity of STB data enable a much deeper dive into audience behavior than is possible with the audience samples used for currency measurement. We can provide insights into networks that currently are not reported, due to a combination of smaller audience and sample size, enabling what is known as 'Long Tail Reporting'. We can start looking at smaller and smaller geographies, demographics and time periods; second-by-second viewing analysis, anyone? Another big benefit of STB data is the ability, with access to purchase data or some other data set, to measure ROI on advertising campaigns with much greater fidelity than is currently possible. Not only does the size allow for deeper analysis, it also improves stability and reduces standard error in the analysis. As TV gets more advanced and interactive, viewers will engage in new behaviors. STB data will be an invaluable source of insights into these behaviors.
All of this is well documented and discussed at almost every conference where STB data is reviewed. There are whitepapers, PowerPoint presentations and articles galore written about all this.
Nielsen is extremely excited about extending the insights it currently provides the industry. We view STB data as an extremely valuable source of information that, once adjusted for gaps, can provide deeper insights on TV viewing, advertising delivery and effectiveness. We bring to this endeavor our expertise in processing and integrating multiple data sets to create consistent and meaningful insights for our clients. We already do this for our Consumer Packaged Goods clients, our Internet clients and our Mobile clients.
Discussions around the usage of STB data are not merely academic. A currency decision based on pure STB data would have unwarranted financial implications for ad buyers and sellers. The truth is that viewers on different viewing platforms (over-the-air, cable, satellite, telco) watch different shows. Winners and losers would be inappropriately determined if ratings were based only on cable or satellite data. Ratings can only be based on an inclusive and representative source of audience estimates that takes into account viewing across all viewing platforms within households.
Let's take a look at some data.
There are significant swings in ratings for networks and popular shows when we compare viewership from the Nielsen National People Meter (NPM) sample to viewership from various cut-back NPM samples representing Wired Digital Cable-only homes or Satellite-only homes. Some examples from the '08-'09 season:
If all these types of changes were aggregated, the financial impact would be profound, with hundreds of millions of dollars changing hands. For example, at the June 24 Advertising Research Foundation conference on audience measurement, Alan Wurtzel, NBC's President of Research and Media Development, reported that NBC had asked multiple set top box providers to generate ratings for the final episode of Heroes. According to Mr. Wurtzel, "We gave them the same request. What we got back were different answers." The difference in ratings from the same data source was six percent, which translated into a variance of $400,000 in ad sales for a single episode.
To look at it in a broader perspective, we looked at ratings and ad spending across multiple platforms, using Nielsen's NPM and Monitor Plus services. Our estimate of the inappropriate result would be as follows:
These differences are a direct result of different viewing patterns between STB homes and non-STB homes. And that is all before one factors in the various gaps associated with STB data.
The facts and myths of Set Top Box data
Over the last few years, we have been analyzing STB data from multiple sources and would like to share some of our insights with you.
1. Larger samples are not always better than smaller samples: Usually when comparing a larger sample to a smaller sample, increased size provides a more stable estimate and a lower standard error around the estimate. HOWEVER, this is true only if the smaller and larger data sets being compared are of comparable quality - in terms of the completeness in representing the population and the accuracy of the data. A high quality smaller sample will provide more accurate information than a larger sample with systematic biases.
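To make this first point concrete, here is a minimal sketch (not Nielsen's methodology) that compares the expected error of a small unbiased sample with that of a much larger biased one. It assumes simple random sampling and uses the textbook decomposition of mean squared error into squared bias plus sampling variance. The function name, the 5% rating, the sample sizes and the half-point bias are all hypothetical values chosen for illustration.

```python
import math


def rmse_of_rating(true_rating: float, sample_size: int, bias: float = 0.0) -> float:
    """Root mean squared error of a rating estimated from a random sample.

    true_rating and bias are proportions (0.05 means a 5% rating).
    bias is the systematic offset of the sample from the true population.
    """
    expected = true_rating + bias                      # value the sample converges to
    variance = expected * (1.0 - expected) / sample_size  # sampling variance of a proportion
    return math.sqrt(bias ** 2 + variance)             # MSE = bias^2 + variance


# Hypothetical inputs: a show with a true 5% rating.
small_unbiased = rmse_of_rating(0.05, sample_size=10_000)
large_biased = rmse_of_rating(0.05, sample_size=1_000_000, bias=0.005)

print(f"small unbiased sample RMSE: {small_unbiased:.4%}")
print(f"large biased sample RMSE:   {large_biased:.4%}")
```

Under these assumed inputs the larger sample has the larger error: extra sample size shrinks the variance term, but the bias term stays put no matter how many homes are added.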
2. STB data is NOT Census Data: STB data is simply not available from all TV households. 11% of US homes have no cable, satellite or telco service. They continue to have their entertainment and information needs adequately met by free, digital, over-the-air TV (obviously without set top boxes). Another 19% have analog cable with no set top boxes. Altogether, non-STB households account for about a quarter of all TV viewing. We also know that even in homes with set top boxes, the reported STB data does not capture all TV viewing in the home. On average, one TV set in such a home currently has no attached set top box, so viewing on that set is not reported. This non-STB viewing in STB homes accounts for nine percent of total TV viewing.
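The coverage figures above can be tallied in a few lines. This is a rough sketch: it reads "about a quarter" as 25%, and it assumes the two viewing shares can simply be added because both are stated as shares of total TV viewing. The variable names are illustrative only.

```python
# Shares of US TV homes with no set top box (figures from the text).
ota_only_homes = 0.11       # no cable, satellite or telco service
analog_cable_homes = 0.19   # analog cable, no set top box
non_stb_home_share = ota_only_homes + analog_cable_homes

# Shares of total TV viewing that STB data cannot see.
viewing_in_non_stb_homes = 0.25   # ASSUMPTION: "about a quarter" read as 25%
viewing_on_non_stb_sets = 0.09    # sets without a box inside STB homes

# ASSUMPTION: the two viewing shares do not overlap, so they add.
viewing_outside_stb = viewing_in_non_stb_homes + viewing_on_non_stb_sets
viewing_inside_stb = 1.0 - viewing_outside_stb

print(f"homes with no STB:           {non_stb_home_share:.0%}")
print(f"viewing outside STB data:    {viewing_outside_stb:.0%}")
print(f"viewing visible to STB data: {viewing_inside_stb:.0%}")
```

On these assumptions, roughly a third of all TV viewing falls outside STB data before any of the data-quality gaps are considered.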
3. Homes that view television through cable, satellite or telco are different from other homes: Although STB data is a very large data set covering a major portion of US households, using it to deduce overall viewing can be misleading because viewing in these households is different from viewing in other households.
As mentioned above, even in homes that have cable, on average one set per household does not have a set top box. Viewing on these non-STB sets is different. Kids' networks (Disney, Adult Swim, Cartoon, Nick) have higher viewing levels on non-STB sets than on STB sets, suggesting that while the main TV set in the household may have a digital set top box, the TV in the kids' room may not.
Now let's get to the STB data gaps. STB data represents tuning from the set top box, not the TV. Set top boxes are frequently kept on, even when the TV set may be off (consider your own habits at home in this regard). We are finding that about 10% of boxes stay on continuously for over a month. About 30% of boxes stay on for 24 hours on any given day. This varies from system to system and from box to box - which creates another interesting challenge in terms of harmonizing and standardizing the STB data. And all of this comes before we try to figure out who is actually watching TV and being exposed to a program or a commercial message - STB data does not tell us who is watching.
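As an illustration of how the "on for 24 hours" share might be measured from raw tuning data, here is a small sketch. It assumes each box reports total tuned hours per day. The function name, the box identifiers and the sample values are hypothetical, and real STB feeds differ from system to system.

```python
from typing import Dict


def share_always_on(daily_tuned_hours: Dict[str, float], full_day: float = 24.0) -> float:
    """Fraction of boxes that report tuning for the entire day."""
    if not daily_tuned_hours:
        return 0.0
    always_on = sum(1 for hours in daily_tuned_hours.values() if hours >= full_day)
    return always_on / len(daily_tuned_hours)


# Hypothetical one-day extract: box id -> hours the box reported as tuned.
one_day = {
    "box01": 24.0, "box02": 3.5, "box03": 24.0, "box04": 0.0, "box05": 6.0,
    "box06": 24.0, "box07": 1.0, "box08": 8.5, "box09": 2.0, "box10": 4.0,
}

print(f"boxes on all day: {share_always_on(one_day):.0%}")
```

A box flagged this way has not necessarily been watched all day, which is why tuning hours alone overstate viewing.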
Industry players are trying to address these issues by developing sophisticated models to account for these gaps. The question then becomes - what information is informing the models to take into account the gaps? How do we find out what the STB data may be missing? How do we know what we don't know? And how do we validate the models to determine their accuracy? In the world of finance, didn't Wall Street hire a lot of smart mathematicians and statisticians to develop models for them? How much did we lose due to the overconfidence and overreliance on those models that failed to incorporate the dynamics of the real world?
There is a way to complete the data set and harness the full potential of STB data in order to learn more and make better programming and advertising decisions. Nielsen's NPM sample tracks TV viewing for approximately 50,000 people across all platforms - cable, non-cable, satellite, new sets, old sets, sets in living rooms and bedrooms, in basements and kitchens. Nielsen's NPM sample is the currency that informs the US TV industry and is the gold standard for TV audience measurement. The NPM data set allows us to build a model linking panel and STB data - enabling accurate persons-level viewing and extending it to a much larger data set for granular analysis and stable estimates. The model was developed and presented at the ARF conference in New York in 2007.
The key steps to creating audience estimates using STB data are:
1. Set on/off. Use Nielsen's NPM sample to determine the gaps between STB tuning records and actual TV set on/off status, and use those gaps to adjust the STB data. This ensures that the overall tuning levels in the STB data are accurate. The industry refers to this as 'cap and edit' rules.
2. Viewers. In a very simplistic example, if 40% of the viewers of a show in Nielsen's NPM sample are men 18-34 making between $100K and $150K, then that probability, demo and income level can be assigned to tuning records coming from the STB data.
3. Viewing for environments not covered by the STB data. This would include TV sets within cable homes that don't send data back (which tend to show higher levels of viewing to kids' networks) and over-the-air only homes (which constitute about 10% of total homes and, by definition, have much higher viewing to national broadcast and local programming). Again, the NPM panel data would provide the basis for modeling these gaps from STB data.
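The three steps above can be sketched in deliberately simplified form. This is an illustration of the general idea, not the model presented at the ARF conference: the function names, the 240-minute cap, the demo shares and the panel totals are all hypothetical, and a production model would derive each of them from panel data per network, daypart and platform.

```python
from typing import Dict


def cap_tuning(minutes_tuned: float, cap_minutes: float) -> float:
    """Step 1: truncate a tuning session that runs longer than the cap."""
    return min(minutes_tuned, cap_minutes)


def assign_demos(tuned_minutes: float, demo_shares: Dict[str, float]) -> Dict[str, float]:
    """Step 2: split tuned minutes across demos using panel-derived shares."""
    return {demo: tuned_minutes * share for demo, share in demo_shares.items()}


def expand_to_all_sets(stb_minutes: float, panel_total: float, panel_stb: float) -> float:
    """Step 3: scale STB viewing by the panel's ratio of total viewing to
    viewing on STB-connected sets for the same network."""
    if panel_stb <= 0:
        raise ValueError("panel_stb must be positive")
    return stb_minutes * (panel_total / panel_stb)


# Hypothetical inputs for one box tuned to one show.
raw_minutes = 600.0                       # box left on for ten hours
capped = cap_tuning(raw_minutes, 240.0)   # ASSUMED cap of four hours

# ASSUMED panel shares, echoing the 40% example in step 2.
shares = {"M18-34 $100K-$150K": 0.40, "all other viewers": 0.60}
by_demo = assign_demos(capped, shares)

# ASSUMED panel totals: 125 minutes overall per 100 minutes on STB sets.
expanded = expand_to_all_sets(capped, panel_total=125.0, panel_stb=100.0)

print(capped, by_demo, expanded)
```

The expansion ratio in step 3 is kept as an input per network because, as noted above, non-STB sets and over-the-air homes skew toward different programming, so a single flat multiplier would reintroduce the bias the model is meant to remove.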
As NBC's Alan Wurtzel said, "We're learning that there is less and less there, there. This is really hard stuff." We agree and we can help with the solution. Nielsen has the relevant assets to inform the STB discussion and extend the insights on television viewing. We look forward to working with the industry to take advantage of the promise and avoid the perils of set top box data.