The Promise and Perils of Set Top Box Data

Manish Bhatia, President, Advanced Digital Client Services

The television industry is about to sail into a vast ocean of granular insights on TV viewing. The ocean we refer to is, of course, Set Top Box (STB) data. Other industries have also entered their own uncharted waters, with access to multiple sources of more granular information – manufacturers have shipment data and store data, magazines have circulation figures and sales data, the Internet has log files. In most of these cases, the industry players have found a way to navigate through huge, sometimes disparate, sets of data to make more informed business decisions. And where they have not, confusion has ensued.

We in the television business can learn from the experiences of other industries and add value and deeper insights to our currency measurements by incorporating complementary data, like STB data. Let’s take a deeper dive into STB data to understand what insights STB can and cannot provide.

The Promise

STB data has significant potential. The sheer size and granularity of STB data enable a much deeper dive into audience behavior than is possible with the audience samples used for currency measurement. We can provide insights into networks that currently are not reported, due to a combination of smaller audience and sample size, enabling what is known as ‘Long Tail Reporting’. We can start looking at smaller and smaller geographies, demographics and time periods; second-by-second viewing analysis, anyone? Another big benefit of STB data is the ability, with access to purchase data or some other data set, to measure ROI on advertising campaigns with much greater fidelity than is currently possible. Not only does the size allow for deeper analysis, it also improves stability and reduces standard error in the analysis. As TV gets more advanced and interactive, viewers will engage in new behaviors. STB data will be an invaluable source of insights into these behaviors.

All of this is well documented and discussed at almost every conference where STB data is reviewed. There are whitepapers, PowerPoint presentations and articles galore written about all this.

Nielsen is extremely excited about extending the insights it currently provides the industry. We view STB data as an extremely valuable source of information that, once adjusted for gaps, can provide deeper insights on TV viewing, advertising delivery and effectiveness. We bring to this endeavor our expertise in processing and integrating multiple data sets to create consistent and meaningful insights for our clients. We already do this for our Consumer Packaged Goods clients, our Internet clients and our Mobile clients.

The Perils

Discussions around the usage of STB data are not merely academic discussions. A currency decision based on pure STB data would have unwarranted financial implications for ad buyers and sellers. The truth is that viewers of different viewing platforms (over-the-air, cable, satellite, telco) watch different shows. Winners and losers would be inappropriately determined if ratings were based only on cable or satellite data. Ratings can only be based on an inclusive and representative source of audience estimates that takes into account viewing across all viewing platforms within households.

Let’s take a look at some data.

There are significant swings in ratings for networks and popular shows when we compare viewership from the Nielsen National People Meter (NPM) sample to viewership from various cut-back NPM samples representing Wired Digital Cable-only homes or Satellite-only homes. Some examples from the ’08-’09 season:

Cable networks would do much better in Digital Cable-only homes – as we would expect – with some networks getting a lift of 20+% in audiences.

The Fox broadcast network would do 4% better with Digital Cable-only homes, while the CBS broadcast network would lose 6% of its audience.

American Idol would do 12% better with Digital Cable-only homes, but 7% better with Satellite homes.

Ratings for Desperate Housewives would be 12% higher with Digital Cable-only homes, but 6.5% lower with Satellite-only homes. That is a swing of 18.5 percentage points for a single show.

The Mentalist gets slightly lower ratings with either Digital Cable-only or Satellite-only homes.
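The swing quoted for Desperate Housewives is simply the gap between its highest and lowest lift across the cut-back samples. A minimal Python sketch of that arithmetic follows, using only the percentages quoted in the examples above; the function name and the dictionary layout are illustrative assumptions, not part of any Nielsen tool.

```python
def swing_points(lifts):
    """Gap, in percentage points, between the highest and lowest lift."""
    return max(lifts.values()) - min(lifts.values())

# Percent change versus the full NPM sample, as quoted in the examples above.
desperate_housewives = {"digital_cable_only": 12.0, "satellite_only": -6.5}
american_idol = {"digital_cable_only": 12.0, "satellite_only": 7.0}

print(swing_points(desperate_housewives))  # 18.5
print(swing_points(american_idol))         # 5.0
```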

If all these types of changes were aggregated, the financial impact would be profound, with hundreds of millions of dollars changing hands. For example, at the June 24 Advertising Research Foundation conference on audience measurement, Alan Wurtzel, NBC’s President of Research and Media Development, reported that NBC had asked multiple set top box providers to generate ratings for the final episode of Heroes. According to Mr. Wurtzel, “We gave them the same request. What we got back were different answers.” The difference in ratings from the same data source was six percent, which translated into a variance of $400,000 in ad sales for a single episode.

To look at it in a broader perspective, we looked at ratings and ad spending across multiple platforms, using Nielsen’s NPM and Monitor Plus services. Our estimate of the inappropriate result would be as follows:

If C3 ratings estimates were based upon viewing only from Digital Cable homes, it would have cost the broadcast networks approx. $340 million in ad revenue so far this season. If ratings estimates were based only on Satellite homes, the broadcast networks would have 4% lower ratings and would be $730 million poorer.

Cable networks, as we would expect, would benefit from using Digital Cable homes only – to the tune of $2.5 billion in additional ad revenue. If Satellite homes only were the basis for rating, it would have generated $600 million in additional ad revenue for cable networks.

These differences are a direct result of different viewing patterns between STB homes and non-STB homes. And that is all before one factors in the various gaps associated with STB data.

The facts and myths of Set Top Box data

Over the last few years, we have been analyzing STB data from multiple sources and would like to share some of our insights with you.

1. Larger samples are not always better than smaller samples: Usually when comparing a larger sample to a smaller sample, increased size provides a more stable estimate and a lower standard error around the estimate. HOWEVER, this is true only if the smaller and larger data sets being compared are of comparable quality – in terms of both the completeness with which they represent the population and the accuracy of the data. A high quality smaller sample will provide more accurate information than a larger sample with systematic biases.
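To make the point concrete, the total error of a sample-based rating can be split into a sampling part, which shrinks as the sample grows, and a systematic part, which does not. The sketch below uses the standard formula for the root-mean-square error of a biased proportion estimate. All of the numbers in it (the true rating, the bias and both sample sizes) are hypothetical values chosen purely for illustration and do not come from Nielsen data.

```python
import math

def rmse(true_rating, bias, n):
    """Root-mean-square error of a sample proportion with a fixed bias.

    true_rating: true share of homes tuned in (0..1)
    bias: systematic error of the sample, in the same units
    n: number of homes in the sample
    """
    p = true_rating + bias               # what the biased sample measures
    sampling_variance = p * (1 - p) / n  # shrinks as n grows
    return math.sqrt(bias ** 2 + sampling_variance)

# Hypothetical numbers, chosen only for illustration.
TRUE_RATING = 0.05
small_clean = rmse(TRUE_RATING, bias=0.0, n=10_000)
large_biased = rmse(TRUE_RATING, bias=0.005, n=1_000_000)

print(f"small representative sample: {small_clean:.4f}")
print(f"large biased sample:         {large_biased:.4f}")
```

Under these assumed values the sample that is 100 times larger has more than twice the error (roughly 0.0050 versus 0.0022), because adding homes does nothing to reduce the bias term.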

2. STB data is NOT Census Data: STB data is simply not available from all TV households. 11% of US homes have no cable, satellite or telco service. They continue to have their entertainment and information needs adequately met by free, digital, over-the-air TV (obviously without set top boxes). Another 19% have analog cable with no set top boxes. Altogether, non-STB households account for about a quarter of all TV viewing. We also know that even in homes with an STB, not all TV viewing in the home is captured in the reported STB data. There is on average one TV set in such STB homes that currently does not have an attached set top box, and therefore viewing on that set would not be reported. This non-STB viewing in STB homes accounts for nine percent of total TV viewing.
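Putting these figures together gives a rough sense of how much viewing STB data cannot see. The short Python sketch below only adds up the shares quoted above. It assumes that the two viewing shares are both expressed as a share of total TV viewing and do not overlap, which is our reading of the text rather than a published Nielsen calculation.

```python
# Shares quoted in the text above.
NO_PAY_TV_HOMES = 0.11       # homes with no cable, satellite or telco service
ANALOG_CABLE_HOMES = 0.19    # analog cable homes with no set top box
NON_STB_HOME_VIEWING = 0.25  # "about a quarter" of all TV viewing
NON_STB_SET_VIEWING = 0.09   # viewing on non-STB sets inside STB homes

# Assumption: the two viewing shares are disjoint and share the same base.
non_stb_homes = NO_PAY_TV_HOMES + ANALOG_CABLE_HOMES
unseen_viewing = NON_STB_HOME_VIEWING + NON_STB_SET_VIEWING

print(f"homes without any set top box: {non_stb_homes:.0%}")
print(f"viewing invisible to STB data: {unseen_viewing:.0%}")
print(f"viewing captured by STB data:  {1 - unseen_viewing:.0%}")
```

On this reading, roughly a third of all TV viewing never reaches a set top box.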

3. Homes that view television through cable, satellite or telco are different from other homes: While STB data is a very large data set covering a major portion of US households, using it to deduce overall viewing can be misleading because viewing in these households differs from viewing in other households (HHs).

STB homes have more TV sets – 2.8 sets per HH vs. 2.4 for the average home

STB homes watch more TV – approximately nine hours a day vs. 8 ½ for the average home or approximately six hours for broadcast only homes.

STB homes are larger HHs and make more money

STB homes do more time-shifting

As mentioned above, even in homes that have cable, on average one set per HH does not have a set top box. Viewing on these non-STB sets is different. Kids’ networks (Disney, Adult Swim, Cartoon, Nick) have higher viewing levels on non-STB sets than on STB sets, suggesting that while the main TV set in the HH may have a digital set top box, the TV in the kids’ room may not.

Data Gaps

Now let’s get to the STB data gaps. STB data represents tuning from the set top box, not the TV. Set top boxes are frequently kept on, even when the TV set may be off (consider your own habits at home in this regard). We are finding that about 10% of boxes stay on continuously for more than a month. About 30% of boxes stay on for 24 hours on any given day. This varies from system to system and from box to box – which creates another interesting challenge in terms of harmonizing and standardizing STB data. And all of this comes before we get to trying to figure out who is actually watching TV and being exposed to a program or a commercial message – STB data does not tell us who is watching.
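One simple way to picture the box-on, TV-off problem is as an edit rule applied to raw tuning records. The Python sketch below is purely illustrative: the four-hour cap, the function names and the record layout are all assumptions made for the example, and it is not a description of how Nielsen or any STB provider actually processes the data.

```python
from datetime import datetime, timedelta

# Hypothetical cap: tuning beyond this with no channel change is treated as
# "box on, TV probably off". The value is an assumption, not a standard.
MAX_UNATTENDED = timedelta(hours=4)

def capped_duration(start, end, cap=MAX_UNATTENDED):
    """Credit a tuning event with at most `cap` of viewing time."""
    return min(end - start, cap)

def box_left_on_all_day(events, day_start):
    """True if the tuning events cover the full 24 hours from day_start.

    `events` is a list of (start, end) tuples for one box, sorted by start.
    """
    covered_until = day_start
    for start, end in events:
        if start > covered_until:  # a gap: the box was off for a while
            return False
        covered_until = max(covered_until, end)
    return covered_until >= day_start + timedelta(days=1)

# Made-up example: one box tuned to two channels back to back all day.
day = datetime(2009, 6, 1)
events = [(day, day + timedelta(hours=10)),
          (day + timedelta(hours=10), day + timedelta(hours=24))]

credited = sum((capped_duration(s, e) for s, e in events), timedelta())
print(box_left_on_all_day(events, day))  # True
print(credited)                          # 8:00:00
```

The cap itself is the hard part. Any fixed value will discard some real viewing and keep some empty-room tuning, which is why such rules need to be validated against a source that observes the TV set and the viewers directly.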

Industry players are trying to address these issues by developing sophisticated models to account for these gaps. The question then becomes – what information is informing the models to take into account the gaps? How do we find out what the STB data may be missing? How do we know what we don’t know? And how do we validate the models to determine their accuracy? In the world of finance, didn’t Wall Street hire a lot of smart mathematicians and statisticians to develop models for them? How much did we lose due to the overconfidence and overreliance on those models that failed to incorporate the dynamics of the real world?

The Solution

There is a way to complete the data set and harness the full potential of STB data in order to learn more and make better programming and advertising decisions. Nielsen’s NPM sample tracks TV viewing for approximately 50,000 people across all platforms – cable, non-cable, satellite, new sets, old sets, sets in living rooms and bedrooms, in basements and kitchens. Nielsen’s NPM sample is the currency that informs the US TV industry and is the gold standard for TV audience measurement. The NPM data set allows us to model the relationship between panel and STB data – enabling accurate person-level viewing estimates and extending them to a much larger data set for granular analysis and stable estimates. The model was developed and presented at the ARF conference in New York in 2007.

The key steps to creating audience estimates using STB data are:

As NBC’s Alan Wurtzel said, “We’re learning that there is less and less there, there. This is really hard stuff.” We agree and we can help with the solution. Nielsen has the relevant assets to inform the STB discussion and extend the insights on television viewing. We look forward to working with the industry to take advantage of the promise and avoid the perils of set top box data.