How to See Areas in Your Organization Where Data Can Make a Difference

Written by damjan | Published 2019/12/02


What is the first thing that you do when you start a new data science or analytics role?
You know, besides installing all the necessary tools and packages that you will need, and getting access to data?
If you haven’t already, it is wise to run a data question audit to identify all the areas in your organisation where data can make a difference.
A data question audit is simply a series of interviews that surfaces the questions within your organisation that can be answered with data.
Surfacing all these questions is valuable whether you are an individual, a team leader, or even someone who has been entrusted with developing a data strategy. In my experience doing all of the above, four categories of questions regularly come up.
They are:
Tier 1 analytics (Knowing): “Tell me the value of <x>.”
Tier 2 analytics (Understanding): “Don’t just report the trend, tell me why it happened.”
Tier 3 analytics (Mastering): “I love the trends, but can we conclusively answer <y>, once and for all?”
Tier 4 analytics (Scaling Mastery): “I love everything that you have done to date. Can we take you out of the equation and build something that works in perpetuity and at scale?”
An example of a Tier 1 question is being asked what the delta is between actual and target. It should take an analyst less than a day to answer this type of question. The main skills needed are data mining and the ability to develop visualisations.
A Tier 2 question asks you to look into the factors that contributed to that delta. This could take anywhere from a few hours to a few days to answer, depending on the complexity of the question.
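To make the difference between the first two tiers concrete, here is a minimal sketch in Python. The regions and figures are made up for illustration, and breaking the delta down by segment is only one possible first step towards a Tier 2 answer, not a complete analysis.

```python
# Hypothetical sales figures per region; all names and numbers are invented.
actual = {"north": 120.0, "south": 80.0, "west": 95.0}
target = {"north": 110.0, "south": 100.0, "west": 100.0}


def total_delta(actual, target):
    """Tier 1: a single data point, total actual minus total target."""
    return sum(actual.values()) - sum(target.values())


def delta_by_segment(actual, target):
    """A first step towards Tier 2: the delta per segment, largest shortfall first."""
    deltas = {segment: actual[segment] - target[segment] for segment in target}
    return sorted(deltas.items(), key=lambda item: item[1])


print("Total delta:", total_delta(actual, target))
for segment, delta in delta_by_segment(actual, target):
    print(f"{segment}: {delta:+.1f}")
```

The Tier 1 answer is the single number. The per-segment view only tells you where to start digging; explaining why a segment missed its target is the actual Tier 2 work.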
A Tier 3 question asks you to comprehensively answer what the relationship between actual and target is, how it has evolved over time and what factors influence it. This usually takes more than a week. Tiers 2 and 3 require a combination of data mining, analysis and data science expertise.
A Tier 4 question asks you to develop a model that predicts a value or outcome (in some instances the very one that you were initially asked to simply report; in our example, “actuals”). This requires the production, testing and deployment of ML models and can take months. This is the domain of data science and, for less trivial builds, of data engineering.
This continuum takes you from simply reporting what is happening, to deployment and scaling of predictive analytics solutions. From knowing to (scaling) mastery. So now that your audit has told you what is out there, what do you do about it?
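One simple way to see what your audit has told you is to log each question with its tier and look at the mix. The sketch below is an assumption about how you might record the results; the `AuditQuestion` structure, the stakeholders and the example questions are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

TIER_NAMES = {1: "knowing", 2: "understanding", 3: "mastering", 4: "scaling mastery"}


@dataclass
class AuditQuestion:
    asked_by: str   # stakeholder interviewed (hypothetical field)
    question: str   # the question as they phrased it
    tier: int       # 1 to 4, assigned by you after the interview


def tier_shares(questions):
    """Return the fraction of audit questions that falls into each tier."""
    if not questions:
        raise ValueError("the audit produced no questions")
    counts = Counter(q.tier for q in questions)
    return {tier: counts[tier] / len(questions) for tier in TIER_NAMES}


# Invented audit results for illustration only.
audit = [
    AuditQuestion("sales", "What is the delta between actual and target?", 1),
    AuditQuestion("finance", "What was revenue last month?", 1),
    AuditQuestion("marketing", "How many leads did we get this week?", 1),
    AuditQuestion("sales", "Why are sales down year on year?", 2),
    AuditQuestion("exec", "Can we predict next quarter's actuals?", 4),
]

for tier, share in tier_shares(audit).items():
    print(f"Tier {tier} ({TIER_NAMES[tier]}): {share:.0%}")
```

A mix that is heavy on Tier 1, as in this invented example, is the signal to prioritise automation first.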

Automate the simple stuff

Your Tier 1 questions are business-as-usual (BAU) questions that are light on context. These are questions where your stakeholders may simply need to know a data point, rather than dig any deeper into what it actually means.
These tasks often fall on analytics teams, though some of them may still remain in the commercial domain (although the two should be in lockstep anyway). Given that a lot of organisations have only just started investing in and building out their analytics and data science teams, you may be surprised by the proportion of questions in your audit that fall into this space. This is especially true if you have joined a nascent data function, or are working in a start-up environment. Your organisation has made the commitment to building data capability and probably has a long backlog of items that it wants the analytics team to tackle, many of which will be in this stream.
Although there may be exuberance in getting these previously backlogged questions answered, and they are relatively easy to address, they themselves pose some risks that you’ll need to mitigate. They include:
The risk of analyst churn: Not many analysts are in the game just to unearth data points; they want to make an impact and use their skills to get to the bottom of things. Additionally, failing to automate your Tier 1 responses can lead to repetition of tasks, due to the BAU nature of these queries.
The risk of not getting to the important stuff: Remember that knowing is at the beginning of the continuum towards mastery. You need to get the easy stuff out of the way so that you can work on the real value-adding tasks.
The risk of expectations not meeting reality: Data resources are not cheap. If you don’t manage expectations and are working on BAU / low-value-add problems in perpetuity, this may have consequences for how much value the organisation is getting (or perceives it is getting) from data.
Fortunately, your remedy for tackling Tier 1 questions is easy. As soon as you have identified these questions, you will be well served to carve out the time to automate the delivery of their responses.
This is usually achieved through the development of self-service dashboards. Invest a couple of months’ time in developing an automated self-serve solution that addresses all the Tier 1 reporting needs that your organisation has, and free up your team’s time to work on higher value-adding work. Even better, if you do it right, this will take care of these questions for the foreseeable future, with the only need to expand arising if the business enters new verticals or acquires new (important) data sources. It will also serve as a lead and a source of supply for the Tier 2 and 3 questions that you will want to spend the majority of your time working on.

Tier 2 & 3: where your bread is buttered

All the fine work that you did in automating Tier 1 responses is done with a view to freeing up time so that you and your team can operate in this space. These are the questions that are going to be valuable to answer. Additionally, they will have a regular and organic demand flow as your organisation begins to use more data and asks you to investigate the trends that surfaced whenever you answered a Tier 1 question. Tier 2 and 3 are fairly similar, differing mostly in the time it takes to answer each and the conclusiveness with which they are answered.
For example, a Tier 2 question may be: “we want to investigate why sales are down year on year”, whereas the Tier 3 version would be “we want to have an understanding of what factors drive sales and how we can sell more product, period”. The former is confined to the specific scenario, whereas the latter is more open-ended and conclusive, and requires you to exhaust all possible lines of enquiry.
Another distinction is that a Tier 2 question may be used to inform strategic actions, whereas Tier 3 is likely to inform strategic direction (or the overarching strategy from which certain strategic actions will cascade). One goes into PowerPoint packs, the other into board papers.
It is important to recognise the value of these two areas, and to manage expectations so that you are spending ~80% of your time working on them (post automation of responses to Tier 1 questions, of course). You should also be proactive in identifying the areas that your organisation may want comprehensively answered (Tier 3) and start carving out time to tackle these questions, as they may take weeks to months to answer.
These are the questions that you will be working on the majority of the time, so take the time to structure your team’s workflow and cadence to do them justice.

Importance at scale

So now that you have automated responses to your simplest questions, as well as developed a nice cadence that has you working on the most important stuff, what is left? The final stage is identifying questions that your organisation wants answered into the future and in perpetuity: the Tier 4s. It is the step that requires you to automate responses to your complex questions. Are you able to generate models or tools that provide a sufficient level of accuracy into the future whilst tackling important organisational questions? In essence, this entails the production and deployment of predictive models, and potentially even full-scale data engineering.
The challenge here is identifying the use cases that a) exist and b) need to be prioritised, given that you will be executing only a few per year (if that). From a prioritisation perspective, it is best to start with the easiest to execute, provided that value will be realised. This will allow you to pick off the low-hanging fruit and build confidence in your approach, and allow you to ask for more resources and funding once you have demonstrated the ability to show returns. This is especially important given that the majority of responses to questions in this tier will require a business case, funding and other governance measures to be successfully executed.
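The prioritisation rule above (easiest first, provided that value will be realised) can be sketched as a filter followed by a sort. This is only an illustration under assumptions: the `UseCase` fields, the value threshold and every estimate below are hypothetical, and in practice both effort and value are rough judgement calls.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    effort_weeks: float    # rough delivery estimate (assumed)
    expected_value: float  # rough value estimate, in any consistent unit (assumed)


def prioritise(use_cases, min_value):
    """Drop use cases below the value threshold, then order the rest by effort."""
    worthwhile = [u for u in use_cases if u.expected_value >= min_value]
    return sorted(worthwhile, key=lambda u: u.effort_weeks)


# Invented candidates for illustration only.
candidates = [
    UseCase("Customer Lifetime Value model", effort_weeks=12, expected_value=500),
    UseCase("Churn prediction", effort_weeks=8, expected_value=300),
    UseCase("Office snack demand forecast", effort_weeks=4, expected_value=50),
]

for use_case in prioritise(candidates, min_value=100):
    print(use_case.name)
```

The cheapest candidate is excluded because it does not clear the value threshold, which is the "provided that value will be realised" half of the rule.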
As you run your question audit, comprehensively identify questions that may fall into this tier (e.g. “would we be able to use data to predict…?”) and begin to prioritise them. More than in any previous tier, you may have to educate your stakeholders on what is possible and what you should be working on, as certain initiatives may:
Unlock other capabilities, such as more sound decision making based on the development of a Customer Lifetime Value model
Provide technology that can be scaled or re-appropriated with a little effort to tackle a different question with some similar features
Require investment in technology
Also be cognizant that this will largely be project work, so the way in which it is delivered will change and will require a different management style compared to your BAU work.
Finally, manage expectations so that you are spending ~20% of your team’s time working on these projects and, depending on your resources, delivering at least 2–3 such projects a year. It is also OK to leave these projects for last, completely off the radar, until all your Tier 1 analytics have been addressed and you have made good headway and developed a cadence for your Tier 2 and 3 work. However, you need to know where these opportunities are and find the time (and the right strategy) to execute on them once you have the opportunity to do so.
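As a rough illustration of the ~80% / ~20% split suggested in this article, the arithmetic looks like this once Tier 1 has been automated. The team size and weekly hours are made-up inputs, and the function name is hypothetical.

```python
def split_capacity(team_hours_per_week, tier4_percent=20):
    """Split weekly team hours between Tier 2/3 work and Tier 4 projects."""
    if not 0 <= tier4_percent <= 100:
        raise ValueError("tier4_percent must be between 0 and 100")
    tier4_hours = team_hours_per_week * tier4_percent / 100
    return {
        "tier_2_and_3": team_hours_per_week - tier4_hours,
        "tier_4": tier4_hours,
    }


# Hypothetical team: five analysts at 40 hours a week each.
print(split_capacity(5 * 40))
```

Treat the percentages as a starting point for the conversation with stakeholders rather than a fixed rule.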

Final thoughts

You may note that both Tier 1 and Tier 4 have an automation component. Whereas in Tier 1 you automate simplicity, in Tier 4 you automate complexity. The remainder of the questions are going to be specific and context-driven, and hence difficult to automate. They are, at times, also going to require interpretation of the results that your Tier 1 and Tier 4 responses produce.
As such, they are also going to be your most reliable and valuable source of work. In some ways, you will be delegating parts of the interpretation and decision making for Tier 1 to your self-serve end users, and for Tier 4 to algorithms (which, depending on the technique, may be black boxes themselves). You, however, will be the key interpreter and decision maker for anything that sits in the Tier 2 and 3 space, so take responsibility and ownership of these questions and ensure they make up the bulk of the time you invest in managing your efforts.
Good luck and happy (questions) auditing!
Written by damjan | Data Artist & Influencer, Human - Machine mediator, Commercial connect-the-dotter, ML Trainer
Published by HackerNoon on 2019/12/02