Thursday, July 30, 2015

Can your Application Analytics APIs do this?

Writing data to disk is easy – developing a database is not.

Posting data to a URL is easy – developing an application analytics ingestion pipeline is not.

If you’ve written even a single line of code (in any language), I probably don’t have to explain why writing data to disk is easy but developing a database is not. For those who have never written any code: it’s the extra database “machinery” required to handle scale, concurrency, resilience, security, etc. that demands a horde of PhDs and rock-star developers.

…and so it is with application analytics…

Posting data to a URL is easy – developing an application analytics ingestion pipeline is not.

Unlike the well-understood database scenario described above (which, ironically, includes analytics repositories - really little more than a specialized database use case), I still find development organizations that don’t have the same respect for application analytics instrumentation and ingestion stacks.

…just yesterday I was speaking with a senior developer from a large, extremely successful ISV (who shall remain anonymous); he confessed that their homegrown application analytics solution (built off of an analytics ISV acquisition of theirs) was generating WAY too much data, wreaking havoc on their infrastructure, alarming their clients, and yielding very few actionable insights.

Coincidentally, while I was hobnobbing away at this conference, our development team released updates to our Linux and Win32 (C++, etc.) application analytics APIs – reminding me once again how deep you need to go if you want to have commercial-grade application analytics.

As with database development – unless it is actually your core business – you should not get into the business of developing this kind of “machinery.” Here’s a sampling of new features included in these latest API releases – with references to the existing “machinery” already in place:

1. New app analytics API capability: Cached message envelopes are automatically deleted if they are deemed “too old” – the aging threshold is configurable by development at build time or at runtime.

Does your API
  • Provide automatic offline caching? 
If yes…
  • How is the size of that cache managed and how will it behave should the cache hit capacity?
  • Can your application “prune” a growing cache based upon the aging of its data to avoid the cache growing too large because of prolonged isolation?
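To make the caching questions concrete, here is a minimal sketch of age-based pruning combined with a capacity limit. It is not the PreEmptive Analytics API; the names `Envelope`, `OfflineCache`, `max_age_seconds`, and `max_envelopes` are hypothetical, and the "drop the oldest envelope when full" policy is just one assumption among several reasonable choices.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Envelope:
    created_at: float  # seconds since epoch when the envelope was cached
    payload: bytes


class OfflineCache:
    """Hypothetical offline cache; names and policies are illustrative only."""

    def __init__(self, max_age_seconds: float, max_envelopes: int):
        self.max_age_seconds = max_age_seconds
        self.max_envelopes = max_envelopes
        self._items = deque()  # assumes envelopes arrive oldest-first

    def prune(self, now: float) -> int:
        """Delete envelopes older than the threshold; return how many were removed."""
        removed = 0
        while self._items and now - self._items[0].created_at > self.max_age_seconds:
            self._items.popleft()
            removed += 1
        return removed

    def add(self, envelope: Envelope, now: float) -> None:
        """Cache an envelope, pruning stale data first and evicting if still full."""
        self.prune(now)
        # Assumed capacity policy: evict the oldest envelope to make room.
        while self._items and len(self._items) >= self.max_envelopes:
            self._items.popleft()
        self._items.append(envelope)

    def __len__(self) -> int:
        return len(self._items)
```

Even this toy version has to decide what gets thrown away when the cache fills up during prolonged isolation, which is exactly the kind of decision a homegrown solution tends to discover in production.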
2. New app analytics API capability: Application analytics message envelopes are split if/when they exceed a configurable size.

Does your API
  • Bundle telemetry packets into larger envelopes and queue them for asynchronous transmission away from your production app?
If yes...
  • Can the frequency of transmission be optimized to minimize bandwidth requirements (important for mobile connections)?
  • Can the timing of transmission be optimized to minimize network and CPU contention (important for real time systems like simulators, games, transaction processing, etc.)?
  • Can developers further control the size of envelopes (and specifically the size of custom data payloads)? 
  • …can your API accommodate custom payloads at all?
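As a rough illustration of the envelope-splitting idea, the sketch below greedily packs messages into envelopes that stay within a configurable byte limit. The function name `split_envelope` and the parameter `max_bytes` are hypothetical, and the greedy strategy is an assumption; a real API would also have to deal with headers, compression, and retry state.

```python
from typing import List


def split_envelope(messages: List[bytes], max_bytes: int) -> List[List[bytes]]:
    """Greedily pack messages into envelopes whose total size stays within max_bytes.

    A single message larger than max_bytes is placed in an envelope of its own
    rather than being dropped (an assumed policy for this sketch).
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")

    envelopes: List[List[bytes]] = []
    current: List[bytes] = []
    current_size = 0

    for message in messages:
        # Close the current envelope if adding this message would exceed the limit.
        if current and current_size + len(message) > max_bytes:
            envelopes.append(current)
            current, current_size = [], 0
        current.append(message)
        current_size += len(message)

    if current:
        envelopes.append(current)
    return envelopes
```

For example, with a limit of 6 bytes, messages of 4, 2, 3, and 10 bytes end up in three envelopes: the first two together, the third alone, and the oversized fourth alone.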
3. New app analytics API capability: Development can set user-defined HTTP headers to better support more intelligent routing and distribution of incoming telemetry PRIOR to unpacking the larger envelopes.

Does your application analytics ingestion pipeline 
  • Support dynamic routing of incoming production telemetry without having to re-instrument or redeploy your application?
If yes...
  • Under what conditions can you re-direct incoming telemetry to new analytics endpoints?
  • Can incoming telemetry be distributed to multiple analytics endpoints in parallel?
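Here is one way header-based routing might look on the ingestion side, sketched under stated assumptions: the header name `X-Telemetry-Route`, the routing table, and the endpoint URLs are all made up for illustration. The point is that the routing decision reads only the HTTP headers, so the envelope itself never has to be unpacked, and the table can change without touching the instrumented application.

```python
from typing import Dict, List, Optional

# Hypothetical header name and endpoints, for illustration only.
ROUTING_HEADER = "X-Telemetry-Route"
DEFAULT_ENDPOINTS: List[str] = ["https://default.analytics.example/ingest"]
ROUTES: Dict[str, List[str]] = {
    "eu": ["https://eu.analytics.example/ingest"],
    # A route may list several endpoints to receive the same envelope.
    "beta": [
        "https://beta.analytics.example/ingest",
        "https://archive.analytics.example/ingest",
    ],
}


def select_endpoints(
    headers: Dict[str, str],
    routes: Optional[Dict[str, List[str]]] = None,
    default: Optional[List[str]] = None,
) -> List[str]:
    """Pick destination endpoints from HTTP headers alone, without opening the envelope."""
    routes = ROUTES if routes is None else routes
    default = DEFAULT_ENDPOINTS if default is None else default
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    route = normalized.get(ROUTING_HEADER.lower())
    return list(routes.get(route, default))
```

A request carrying `X-Telemetry-Route: beta` would be handed to both the beta and archive endpoints in this sketch, while a request with no routing header falls back to the default.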
These “deep” capabilities are required to safely and effectively scale application analytics implementations inside applications that may run on-premises and/or in 3rd party environments and/or be isolated from networks and/or subject to regulatory or compliance obligations, etc.

And I have not even scratched the surface on this topic – does your API
  • Enforce opt-in policies? - dynamically?
  • Consistently capture data across devices, OSes, and application tiers?
  • Capture runtime stack, application session, all manner of exception, feature and workflow, and custom data dimensions?
  • Integrate into your SDLC and DevOps tooling and processes?
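Taking just the first of those questions, dynamic opt-in enforcement could be sketched roughly as follows. The class `OptInGate` and its methods are hypothetical, and a real implementation would also need to persist the user's choice and decide what happens to data that is already cached.

```python
from typing import List


class OptInGate:
    """Hypothetical gate that discards telemetry unless the user has opted in."""

    def __init__(self, opted_in: bool = False):
        self._opted_in = opted_in
        self.accepted: List[str] = []  # stands in for the real send queue

    def set_opt_in(self, opted_in: bool) -> None:
        """Change the policy at runtime, e.g. from a settings dialog."""
        self._opted_in = opted_in

    def record(self, event: str) -> bool:
        """Queue the event only if the user has opted in; report whether it was kept."""
        if not self._opted_in:
            return False
        self.accepted.append(event)
        return True
```

The default here is opted out, so nothing is collected until the application explicitly calls `set_opt_in(True)`.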
The bottom line – no matter how awesome your developers are – and even if they can literally build anything – nobody can do everything; respect the stack.

Want more information on PreEmptive Analytics APIs (for Linux, Win32, iOS, Android, Windows Phone, WinRT, Java, and/or JavaScript)?