Consultant 2020: how analytics is changing graduate skillsets
- The new set of standard graduate skills will include being able to manipulate large datasets (beyond the reach of spreadsheets)
- Graduates who can (learn to) communicate the commercial implications of statistical models will be at an advantage over their peers
- We see analytics changing industries around the world and believe that new graduate business entrants will be increasingly asked to show potential to understand data and its manipulation. This post considers the extra skill requirements of future graduates for consulting – but the lessons are equally applicable to junior positions across the business world
Graduates (and millennials in general) get a lot of stick about everything they are expected to be great at, so I am loath to add yet more skill requirements for Generation Z. But as analytics changes how businesses operate, I think there are a few new skills that will make great graduates stand out from the crowd. Not everyone needs to be a deep coding and statistics specialist, and not every company needs cutting-edge systems and tools, but everybody (and every company) would benefit from the basics.
1. Manipulating large datasets
Manipulating a dataset with 200 items is very different from manipulating a dataset with 2 million items. The key difference is that manual intervention is not practical at that scale: you cannot look at each individual item to check that it makes sense, so you have to build algorithms to do it for you.
There are two skills here: firstly, being able to do a few basic manipulations yourself so that you don’t have to wait for support; and secondly, having sensible expectations of what analytics specialists can deliver.
This includes being able to join two datasets together: for example, joining a finance dataset of SKU sales to a merchandising dataset of store counts, so you can understand each SKU’s sales per store. Another simple operation might be to remove all SKUs with zero sales, so you can calculate a more accurate average sales figure per SKU.
There are a number of different tools available for this, including Alteryx, SPSS, R and SQL. They vary in complexity and user-friendliness, but all of them should help you understand the concept of using an algorithmic approach to unlock your data.
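The join-and-filter idea above can be sketched in a few lines of plain Python. The SKU names and figures here are invented purely for illustration; in practice you would do this in one of the tools mentioned, but the logic is the same.

```python
# Illustrative sketch: joining two small "datasets" on SKU and dropping
# zero-sales rows, using plain Python dictionaries. All figures invented.

sales = {                # from finance: units sold per SKU
    "SKU-001": 1200,
    "SKU-002": 0,
    "SKU-003": 450,
}
stores = {               # from merchandising: stores stocking each SKU
    "SKU-001": 300,
    "SKU-002": 150,
    "SKU-003": 90,
}

# Join on SKU, then remove SKUs with zero sales so the figures are not diluted.
sales_per_store = {
    sku: sales[sku] / stores[sku]
    for sku in sales
    if sku in stores and sales[sku] > 0
}

for sku, rate in sorted(sales_per_store.items()):
    print(f"{sku}: {rate:.1f} units per store")
```

The same operation scales from three rows to three million: the algorithm never looks at an individual row by hand.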
Art of the possible
Analytics-driven approaches have a different rhythm of results from a project built on a smaller dataset, and how long something might take on a large dataset can be very counter-intuitive if you are only familiar with spreadsheets. Knowing the basics of large-scale data analysis helps you avoid having the wool pulled over your eyes on timelines, and will help you get the most out of colleagues.
For example, changing the hierarchy you use to summarise your data can be a nightmare in Excel but child’s play in Tableau. Conversely, hiding those ‘funny’ results that are a couple of delete strokes away in Excel could mean several hours of experimenting when working with hundreds of thousands of rows in R.
In general, an analytics approach takes longer to set up but is easier to manipulate later. One advantage of learning to do some basic manipulations yourself is a better understanding of the art of the possible when working with analytics specialists.
2. Understanding statistics
I hope it is not too harsh a generalisation to say that most people’s statistical knowledge covers only averages (though they may have to look up the difference between mean and median), quartiles, and perhaps standard deviation. This is natural: our school teachers didn’t follow us into the real world, and for most things involving a small dataset a simple average (mean) is enough to get a useful understanding of what is going on. However, as datasets get larger there tends to be more variation, and the more variation there is in the data, the less useful simple averages become. Making sense of variation requires some understanding of statistics.
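A tiny illustration of why the mean alone says less as variation grows. The numbers are made up: nine stores selling around 100 units a week plus one flagship selling 2,000.

```python
# Made-up example: one outlier drags the mean far from the typical store.
import statistics

weekly_sales = [95, 98, 100, 101, 103, 97, 99, 102, 100, 2000]

mean = statistics.mean(weekly_sales)      # pulled up by the flagship store
median = statistics.median(weekly_sales)  # robust to the outlier
spread = statistics.stdev(weekly_sales)   # large spread warns the mean is suspect

print(f"mean={mean:.1f}, median={median:.1f}, stdev={spread:.0f}")
```

The mean suggests a typical store sells nearly 290 units a week; the median (100) is a far better description of nine of the ten stores, and the large standard deviation is the clue that something is skewing the average.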
Graduates will increasingly need to understand how likely it is that a result has arisen purely by random variation (statistical confidence), and roughly how to test this. For example, understanding how well a line can be drawn through a group of data points is important for showing whether there is a trend. One measure of this is R² (the coefficient of determination), and people should know (or know to google) that a fit with an R² of 0.95 is better than one with an R² of 0.3.
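To make R² concrete, here is a minimal sketch of fitting a straight line by least squares and computing R² with nothing but the standard library. The x/y data are invented (roughly y = 2x with a little noise).

```python
# Minimal least-squares line fit and R², pure stdlib. Data invented.
import statistics

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]   # roughly y = 2x, with noise

mx, my = statistics.mean(x), statistics.mean(y)

# Least-squares slope and intercept for y = a*x + b
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
a = sxy / sxx
b = my - a * mx

# R²: the share of the variation in y that the fitted line explains
ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
ss_tot = sum((yi - my) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot

print(f"slope={a:.2f}, intercept={b:.2f}, R²={r_squared:.3f}")
```

Here R² comes out close to 1 because the points sit almost exactly on a line; scatter the y values more and R² falls towards 0.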
Grasping this logic unlocks other powerful concepts such as:
- Multivariate regression (used to disentangle multiple effects, such as asking whether sales are impacted more by weather or by price changes)
- Clustering (which is used to group similar things together, for example grouping customers into customer segments)
- It is for the true statisticians to understand exactly how to perform each type of test and why one is better than another, but business graduates will need to be able to interpret the output and ask smart questions
Statistics can be misused to show pretty much anything, but understanding the principles in these core areas will help people avoid being misled. Being told convincingly that store sales correlate with store proximity to beehives, with an R² of 0.15, should raise more questions than it answers.
3. Communicating complexity
Technical excellence is necessary but not sufficient for using analytics effectively.
We are already at a point where graduates are learning basic coding and analytics, but people in leadership positions are likely to have gone through their careers on nothing more than Excel. This means that graduates experimenting with new approaches may be using tools and techniques that their seniors don’t understand, so working out how to communicate what they are doing and why it matters is becoming an even more important skill to practise from the start of their careers.
For more information on how we think analytics is changing the business world see Consulting 3.0: The digital revolution is changing the consulting industry.
Luke Lishman, Consultant