Tuesday, July 19, 2011

The new WP7 App Hub reporting is great – and it’s even better with analytics!

Warning – this is a cliffhanger post. If you don’t like mysteries, come back in two weeks…

Like anyone else who has an app in the WP7 App Marketplace, I noticed that the App Hub was down most of yesterday with the promise of a functional upgrade in the works – and today I was very pleasantly surprised to see the result: a streamlined experience with expanded capabilities.

One of the first things that caught my attention was the exception reporting by app and by date – very useful indeed. Of course, MSFT is quick to point out that (and I quote) “Crash count alone isn’t a direct measure of app quality. Popular apps may have higher crash counts due to higher usage.”

Well, that seems self-evident, but without usage metrics how can I evaluate the severity of my exception report counts? … (and now, unless this is the first post of mine that you have ever read, you must know what’s coming).

To the cloud! (Sorry, I couldn’t resist.) Using Runtime Intelligence for Windows Phone, I’m able to measure total sessions – by extracting these counts by day and mashing them up with exception counts from the marketplace, I can now supply the missing ingredient to make the exception count on the App Hub meaningful. (NOTE – I had to manually transcribe exception counts from the App Hub, as there is no tabular option and the detailed download drops the daily count as it de-dupes the exceptions.)
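If you want to reproduce the mash-up, the arithmetic is simple once both daily series are in hand. Here is a minimal sketch in Python; the dates, counts, and layout below are placeholders, not the actual Runtime Intelligence export format:

```python
from datetime import date

# Daily session counts pulled from the Runtime Intelligence export
# (placeholder values and layout; substitute your own data).
sessions_by_day = {
    date(2011, 6, 12): 0,  # fill in per day
    # ...
}

# Daily exception counts transcribed by hand from the App Hub, since
# the detailed download drops the per-day totals when it de-dupes.
exceptions_by_day = {
    date(2011, 6, 12): 0,  # fill in per day
    # ...
}

# Sessions-per-exception ratio for every day both series cover.
for day in sorted(sessions_by_day.keys() & exceptions_by_day.keys()):
    sessions, exceptions = sessions_by_day[day], exceptions_by_day[day]
    if exceptions > 0:
        print(f"{day}: {sessions / exceptions:.1f} sessions per exception")
    else:
        print(f"{day}: no exceptions reported")
```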

The App Hub is careful to point out that only apps running NODO (or Mango) can report exceptions, so I first had to remove the Runtime Intelligence session data coming from earlier versions of WP7 (an interesting statistic on its own).
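The filtering step itself is straightforward. Here is a sketch, assuming each session record carries the OS version string the phone reported, and assuming the commonly cited build numbers (7390 for NODO, 7720 for Mango); verify those against your own data:

```python
# Keep only sessions reported from phones running NODO or later,
# assuming each session record includes the OS version string.
# Builds 7390 (NODO) and 7720 (Mango) are the commonly cited numbers;
# treat them as an assumption, not gospel.
NODO_BUILD = 7390

def build_number(os_version):
    """Extract the build number from a version string like '7.0.7390.0'."""
    parts = os_version.split(".")
    return int(parts[2]) if len(parts) >= 3 else 0

def is_nodo_or_later(session):
    return build_number(session["os_version"]) >= NODO_BUILD

# Hypothetical session records:
sessions = [
    {"os_version": "7.0.7004.0"},    # RTM - filtered out
    {"os_version": "7.0.7390.0"},    # NODO - kept
    {"os_version": "7.10.7720.68"},  # Mango - kept
]
print(len([s for s in sessions if is_nodo_or_later(s)]))  # prints 2
```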

Here is what I see… (and a warning here – the numbers aren’t pretty)

I took two apps of mine, Yoga-pedia and A Pose for That, looked at their respective usage on NODO+ phones via Runtime Intelligence and their exception reports from the App Hub, and then calculated the ratio of sessions to exceptions.

The time period I used for this test was the two weeks from June 12 to June 25. During that time, this is what I observed:
  • 66% of A Pose for That sessions were run on NODO.
  • 58% of Yoga-pedia sessions were run on NODO.
Here is the ratio of exceptions reported by MSFT to sessions from Runtime Intelligence…

Ratio of session counts and exception counts by day

Now there are three likely scenarios here.
  1. Over this two-week period, both apps were crashing once in every 10 runs (HORRIBLE). I don’t think this is the case, because I have run these apps myself on multiple phones hundreds of times and they have NEVER crashed.
  2. The App Hub is over-reporting exceptions (or somehow incorrectly associating exceptions with these apps). This is a beta feature on the App Hub – it’s certainly possible.
  3. Runtime Intelligence is way under-reporting the total number of sessions in a given day. Certainly possible, but given the unit testing I have done, I don’t see this as a major contributing factor to these ratios.

Now, I had already put a “feature tick” on the default unhandled exception handler to count how many times it was invoked during this same period. The counts I have are well below the App Hub numbers (which might suggest number 2 above is the culprit – BUT NOT SO FAST). It is more than likely that certain exceptions (perhaps a majority) would interrupt the normal feature-tracking transmission mechanism, so I would expect that count from Runtime Intelligence to be artificially LOW.

As is often the case when managing an application "in the wild", an unanticipated question has arisen and I find that I don’t have enough data. That’s why it’s ALWAYS so important to
  • plan in advance what data is worth collecting to minimize the likelihood that you will end up in this situation and
  • be sure that your analytics solution supports rapid and easy iterations and refinements to compensate for when your planning falls short.

So how am I going to determine if
  1. my apps offer a LOUSY customer experience everywhere except on my personal phones, or
  2. one or both of the exception reporting and session tracking counts are flawed?
Easy – I’m going to post an update of my apps to the marketplace this weekend with Runtime Intelligence exception reporting turned on. What?

Runtime Intelligence for Windows Phone includes its own exception tracking capabilities. It does require that the developer activate it (that’s why I don’t have that data now), but it offers a lot more data, and it can be invoked for unhandled, handled, and thrown exceptions. Further, it can be configured to collect additional information (custom to the app), AND it can be extended to offer the user a dialog for providing additional feedback if they like.

I will post my results over the next few weeks – meanwhile, if anyone has any suggestions or ideas – please let me know… I honestly have no idea how this little mystery will play itself out.

Before I sign off – here is one more tantalizing clue (although it may also be a red herring). When I look at the limited unhandled exception data currently being returned by Runtime Intelligence (I can see tower location, device manufacturer, OS, etc.), I see that well over 50% of the phones that had an exception were localized to a language OTHER THAN en-US – and that is way out of proportion to the actual usage trends that I have been tracking (and posted in earlier entries). Further, the locales with the most disproportionate share of unhandled exceptions were de-DE and de-AT. Coincidence? Conspiracy? We don’t need to guess – we will soon have the facts!
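For the curious, “way out of proportion” is easy to quantify: compare each locale’s share of exceptions against its share of sessions. A minimal sketch, with made-up counts standing in for my real data:

```python
from collections import Counter

# Hypothetical per-locale tallies; in practice these come from the
# Runtime Intelligence exception records and session records.
exception_locales = Counter({"de-DE": 30, "de-AT": 12, "en-US": 25, "fr-FR": 8})
session_locales = Counter({"de-DE": 500, "de-AT": 150, "en-US": 4000, "fr-FR": 600})

total_exceptions = sum(exception_locales.values())
total_sessions = sum(session_locales.values())

# A locale is over-represented when its share of exceptions exceeds
# its share of sessions; the ratio of the two shares quantifies it.
for locale, count in exception_locales.most_common():
    exc_share = count / total_exceptions
    ses_share = session_locales[locale] / total_sessions
    print(f"{locale}: {exc_share:.0%} of exceptions vs. {ses_share:.0%} of sessions"
          f" (ratio {exc_share / ses_share:.1f}x)")
```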

Enjoy!