I’ve always been dubious about the over-reliance on the number of downloads of a piece of software as a way of stating how many active users it has. Sure, an increase in downloads of a specific version can imply an overall increase in usage of the software, but is it really going to be a 1:1 mapping between downloads and active users?
Almost definitely not, is how I’ve come to see this matter. But how can that be, you may ask? Well, the following are a few common cases I’ve come across directly where download numbers don’t match up with known usage (and there are more cases not covered here).
- you can have one download which is then re-deployed on multiple machines from a local cache (as I’ve often seen happen with server software)
- you can have multiple downloads on the same machine (e.g. due to setup issues), irrespective of whether any of them then get installed (e.g. the “I’ll try it later” and then forget option)
- an auto-updater error may cause repeated downloads despite a valid copy already being present in the update cache
- issues with a website crawler which cause repeated downloads (e.g. HTTP HEAD requests being incorrectly handled and processed as GET requests)
- users moving to another machine and re-installing the software (e.g. due to a machine migration)
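To make a couple of the cases above concrete, here is a minimal sketch that compares the raw hit count for a file against the number of distinct, non-bot clients that actually fetched it with a GET. The `Request` record, the `BOT_MARKERS` list, the file name and the sample log entries are all made up for illustration; a real server log will need its own parsing and a far better bot check.

```python
from collections import namedtuple

# Hypothetical, simplified log record; real server logs carry more fields.
Request = namedtuple("Request", "client method user_agent path")

# Assumed markers for spotting crawlers; nowhere near exhaustive.
BOT_MARKERS = ("bot", "crawler", "spider")


def count_downloads(requests, path):
    """Return (raw_hits, distinct_clients) for one download path."""
    raw_hits = 0
    clients = set()
    for req in requests:
        if req.path != path:
            continue
        raw_hits += 1  # what a naive download counter would report
        if req.method != "GET":
            continue  # e.g. a HEAD request transfers no file
        agent = req.user_agent.lower()
        if any(marker in agent for marker in BOT_MARKERS):
            continue  # skip anything that looks like a crawler
        clients.add(req.client)  # repeat downloads collapse to one client
    return raw_hits, len(clients)


log = [
    Request("10.0.0.1", "GET", "Mozilla/5.0", "/app-1.2.exe"),
    Request("10.0.0.1", "GET", "Mozilla/5.0", "/app-1.2.exe"),  # retry after a setup issue
    Request("10.0.0.2", "HEAD", "ExampleBot/1.0", "/app-1.2.exe"),
    Request("10.0.0.2", "GET", "ExampleBot/1.0", "/app-1.2.exe"),
    Request("10.0.0.3", "GET", "Updater/3.1", "/app-1.2.exe"),
    Request("10.0.0.3", "GET", "Updater/3.1", "/app-1.2.exe"),  # updater fetching again
    Request("10.0.0.4", "GET", "Mozilla/5.0", "/other.zip"),
]

raw, distinct = count_downloads(log, "/app-1.2.exe")
print(raw, distinct)  # prints: 6 2
```

Even the smaller number is only a count of clients that fetched the file. It still cannot see one download being re-deployed to many machines from a local cache, and it says nothing about whether anything was installed.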
But even if you get a valid download which can be determined to come from a potential real user, what’s to say that they will even install it and become an active user?
Well, you cannot be sure that will even happen. The alternative is that it’s installed and then quickly uninstalled, or that it’s just left installed and only used a handful of times (which I’ve done a number of times with media players and broadcasting software over the years).
In such cases, can you really count those as users if it’s only a token usage? Most likely not if you’re being 100% truthful, but that then leads on to how you can be sure of who is even using the software.
Unless the software phones home in some manner (e.g. during the install process, on start-up or via update checks, etc.) and that information is correctly received, detected and processed (which assumes that the user is allowing that information to be obtained to begin with), how can you ever be sure of how many people are actually using the software?
After all, what is to stop a user having multiple installs / running instances (e.g. I’ve done that with Winamp over the years to have a player install and a development install)? That leads to the big issue of how you categorise that usage into a number of users, as you need to define what a user is to begin with (which can get messy very quickly).
Looking at my Winamp usage example, should I be counted as 1 user or 2? I’m the only person using them in such a setup, but there’s nothing to stop me having both (or more) of the copies running at the same time.
It can get further complicated if you’ve got desktop and mobile instances of the software, as you then really need to decide what the criteria for a user are, and that’s going to vary based on what those in charge of the software end up deciding a user is.
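As a rough illustration of how much the definition matters, the sketch below counts the same set of phone-home check-ins three ways: per install, per machine and per person. The `Ping` record and its fields (`install_id`, `machine_id`, `account`, `platform`) are hypothetical, and it assumes the software reports an account at all, which plenty of software does not.

```python
from collections import namedtuple

# Hypothetical check-in record; the field names are assumptions for illustration.
Ping = namedtuple("Ping", "install_id machine_id account platform")


def usage_counts(pings):
    """Count the same check-ins under three different definitions of 'user'."""
    installs = {p.install_id for p in pings}
    machines = {p.machine_id for p in pings}
    # An install with no known account can only be counted per install.
    people = {p.account or ("anonymous", p.install_id) for p in pings}
    return {
        "installs": len(installs),
        "machines": len(machines),
        "people": len(people),
    }


pings = [
    Ping("a1", "desktop-1", "user-a", "desktop"),  # everyday player install
    Ping("a2", "desktop-1", "user-a", "desktop"),  # development install, same machine
    Ping("a3", "phone-1", "user-a", "mobile"),     # same person on mobile
    Ping("b1", "desktop-2", None, "desktop"),      # no account known
    Ping("a1", "desktop-1", "user-a", "desktop"),  # repeat check-in
]

print(usage_counts(pings))
# prints: {'installs': 4, 'machines': 3, 'people': 2}
```

Same data, three different answers, and none of them is wrong. It just depends on which definition of a user has been agreed upon.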
So in summary, I’m not saying that download numbers are not helpful, as they can definitely be used as one of the possible data metrics to help see what traction a version is potentially getting (or just what the most common URL among bots, etc. happens to be at that time :) ).
But relying solely upon downloads as the way to measure user numbers, and even going as far as to make the naive assumption that your active users are a 1:1 mapping of the downloads, is just not a good thing to do, as it can very much give you a false impression of active usage.
-dro