On being data driven in wicked environments

For the past two years we have been working on a digital car subscription venture, Juicar – electric mobility by monthly subscription – which was regretfully sunset in September 2020.

One of the ongoing challenges has been the interpretation of data, the classic 42 problem: you have some numbers, but how do you interpret them, and are they reliable? Being a digital venture in, at the time, uncharted territory, we relied heavily on data and analytics to understand success and plot our way forward.

Concerning reliability, there are methodologically simple calculations you can perform to find out whether the amount of data is sufficient to detect differences that are interesting to look at (i.e. significantly different from pure chance). Nabler has a nice page explaining the ins and outs and giving a simple formula to calculate the required sample size. Of course it is great to know that you need 10,000 visitors, but collecting them also takes a certain amount of time, and when you are running a start-up, at the start of the development life cycle, time is a luxury. So, if the impact of an improvement takes that amount of data to validate, it may not be too relevant to pursue in the larger scheme of things.
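To make this concrete, here is a minimal sketch of the kind of sample-size calculation the Nabler page describes, using the standard two-proportion formula. The baseline conversion rate, uplift and traffic figures are invented for illustration; they are not Juicar numbers.

```python
from statistics import NormalDist

def required_sample_size(p_baseline: float, p_variant: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors needed per variant to detect the difference between two
    conversion rates at the given significance level and power."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided test
    z_beta = z.inv_cdf(power)
    variance = p_baseline * (1 - p_baseline) + p_variant * (1 - p_variant)
    n = (z_alpha + z_beta) ** 2 * variance / (p_baseline - p_variant) ** 2
    return int(n) + 1

# Illustration: detecting an uplift from a 2% to a 2.5% conversion rate.
n_per_variant = required_sample_size(0.02, 0.025)
daily_visitors_per_variant = 250   # assumed traffic per variant
days_needed = n_per_variant / daily_visitors_per_variant
print(f"{n_per_variant} visitors per variant, about {days_needed:.0f} days of traffic")
```

With these assumed numbers the answer comes out at roughly 14,000 visitors per variant, i.e. close to two months of waiting before the test can say anything.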

Basically any effect will be statistically significant if the sample size is large enough, which obviously does not always mean it is practically significant. Especially in early phases I am looking for large effects detectable in small data sets.
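A small illustration of why that focus on large effects pays off: the required sample size falls roughly with the square of the effect size, so a big improvement shows up in far less data. The baseline rate and uplifts below are assumptions, not measured figures.

```python
from statistics import NormalDist

z = NormalDist()
z_total = z.inv_cdf(0.975) + z.inv_cdf(0.80)   # alpha = 0.05 (two-sided), power = 0.80
baseline = 0.02                                # assumed baseline conversion rate

for uplift in (0.005, 0.01, 0.02):             # absolute uplifts of 0.5, 1 and 2 points
    variant = baseline + uplift
    variance = baseline * (1 - baseline) + variant * (1 - variant)
    n = z_total ** 2 * variance / uplift ** 2
    print(f"uplift {uplift:.3f}: about {n:,.0f} visitors per variant")
```

With these assumptions the sample needed drops from roughly 14,000 visitors per variant for a half-point uplift to just over 1,000 for a two-point uplift.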

How to interpret data is a bit more complicated, because in real life (i.e. not in the research lab) your control is limited. Typically, if you test or do an experiment in a lab, you control every single detail. In the field you have much less control, and that concerns not only the ‘experimental’ set-up but also contextual factors you cannot influence. For example, we were initially quite excited about our updated website when traffic went up, until we realised there were school holidays, so people probably had more time to browse and explore. At least, we initially assumed this could be a factor, and it was later ‘confirmed’ when school started again.

Collecting reliable insights in real-life environments is often a challenge. When testing, you preferably control everything so you can understand cause-and-effect relationships, which even in a lab environment is often difficult. The further you move out of the lab environment, and the more wicked the problem at hand is, which is typically the case with start-up companies, the more difficult the test is going to be. One way to compensate for the lack of control is to increase the sample size and collect more data, which goes hand-in-hand with increased elapsed time, typically in limited supply at start-up companies.

Awareness is fundamental to learning, which translates into expectations (or hypotheses) that can be validated by controlling the relevant parameters.

Faced with the challenge of a wicked problem under real-life conditions, we contacted an expert for proper advice, which basically translated into: “Just throw everything at the wall, check if something sticks, and if something sticks, try to discover why.” It reminded me to trust my instincts and academic education more.

Throwing things at the wall to see if something sticks is an approach in the realm of ‘failing fast’, invented to motivate early testing with the purpose of discovering and focussing on what works without losing time on what does not. Contrary to common understanding, this does not mean ‘shooting from the hip’. The purpose is to learn fast (a much better ambition to strive for than ‘fail fast’), and to learn means to discover cause-and-effect relationships. This demands thinking and contemplation: defining expectations (hypotheses) and finding out whether they are correct (enough).

In the end, what worked well was an approach of sufficient control, combined with a focus on major effects. Controlled were, for example, the online advertising budget, the platforms used, the timing and the target segmentation. Varied, in a controlled and systematic way, were the messaging and presentation (visualisation and wording). Measured was the funnel performance. Every week the data was analysed (looking for convincing trends rather than statistical significance) and discussed. Within a couple of weeks we had a sense of what we could control, and within a few months the numbers were where we wanted them to be.
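As a rough sketch of what that weekly routine could look like in code: a fixed set-up, a handful of message variants, and a simple per-step funnel comparison. The variant names, funnel steps and figures below are invented for illustration; they are not Juicar data.

```python
from dataclasses import dataclass

@dataclass
class FunnelWeek:
    variant: str         # message / visual variant shown
    visitors: int        # landed on the site
    configurations: int  # started configuring a subscription
    subscriptions: int   # completed a subscription

    def rates(self) -> dict:
        return {
            "visit->configure": self.configurations / self.visitors,
            "configure->subscribe": self.subscriptions / self.configurations,
            "overall": self.subscriptions / self.visitors,
        }

# One week of (invented) numbers for two message variants shown under the
# same budget, platforms, timing and target segments.
week = [
    FunnelWeek("price-led message", visitors=4200, configurations=310, subscriptions=12),
    FunnelWeek("freedom-led message", visitors=4100, configurations=520, subscriptions=21),
]

# Look for convincing, consistent differences week over week rather than for p-values.
for entry in week:
    steps = ", ".join(f"{step}: {rate:.1%}" for step, rate in entry.rates().items())
    print(f"{entry.variant:<20} {steps}")
```

Keeping the comparison this simple was a deliberate choice: with everything else held constant, a clear and repeated difference between variants is usually more informative at this stage than a formal significance test on a small sample.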