“Big data” has become such a ubiquitous phrase that every business function now feels compelled to outline how it will use big data to improve its operations. That’s also true for Human Resources (HR), the function where most of a company’s money is spent and where, we’d like to believe, the real value lies.
One of the reasons for the special attention being given to big data in HR is that the department is always under pressure to be more analytic, which is justified to some extent. Some wishful thinkers believe that applying big data techniques will somehow rid HR of some of the attributes they don’t like about it, such as the perception that it focuses on “soft” issues and does not detail the return on HR-related investments.
As with most “next big thing” stories in business, big data is really important in some areas, and not so important in others. Taken literally, HR does not have big data, or more precisely, almost never does. Most companies have thousands of employees, not millions, and the observations on those employees are still for the most part annual. At that scale, there is almost no reason for HR to use the special software and tools associated with big data.
For most companies, the challenge in HR is simply to use data at all — the reason being that the data associated with different tasks, such as hiring and performance management, often reside in different databases. Unless we can get the data in those two databases to be compatible, there is no way to ask even the most basic questions, such as which applicant attributes predict who will be a good performer. In short, most companies — and that includes a lot of big ones — don’t need fancy data scientists. They need database managers to clean up the data. And they need simple software — sometimes even Excel spreadsheets can do the analyses that most HR departments need.
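To make the point about incompatible databases concrete, here is a minimal sketch in plain Python of lining up a hiring extract with a performance extract. Everything in it is an assumption made for illustration: the field names (`employee_id`, `emp_id`, `interview_score`, `rating`), the sample rows, and the idea that the only mismatch is stray whitespace and letter case in the ID. Real cleanup is usually messier.

```python
# Hypothetical extracts from two separate HR systems.
# All field names and values are invented for illustration.
hiring_records = [
    {"employee_id": "E001", "interview_score": 4.0},
    {"employee_id": "E002", "interview_score": 3.0},
    {"employee_id": "E003", "interview_score": 5.0},
]
performance_records = [
    {"emp_id": " e001", "rating": 3},
    {"emp_id": "E003 ", "rating": 5},
    {"emp_id": "E004", "rating": 2},
]


def normalize_id(raw_id):
    """Strip whitespace and upper-case an ID so both systems use one format."""
    return raw_id.strip().upper()


def join_on_employee(hiring, performance):
    """Pair each hiring row with its performance rating, if one exists."""
    ratings = {normalize_id(r["emp_id"]): r["rating"] for r in performance}
    joined, unmatched = [], []
    for row in hiring:
        key = normalize_id(row["employee_id"])
        if key in ratings:
            joined.append(
                {
                    "employee_id": key,
                    "interview_score": row["interview_score"],
                    "rating": ratings[key],
                }
            )
        else:
            # No performance record found: flag it for cleanup, don't guess.
            unmatched.append(key)
    return joined, unmatched


joined, unmatched = join_on_employee(hiring_records, performance_records)
print("Matched rows:", joined)
print("Hires with no performance record:", unmatched)
```

The unmatched list is often the more useful output at first, because it shows how much cleanup is needed before any question about predictors of performance can be asked.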
Another major difference in HR analytics is that the questions that really matter have been under investigation longer than most other business topics. What determines a good hire, for example, has been studied in almost the same way since WWI. So the odds that exploratory techniques like machine learning, applied to HR data, will turn up some big insight we didn’t already know are pretty close to zero.
Consider Google’s very prominent efforts over the years to analyze its people data with initiatives such as Project Oxygen, a multi-year research project designed to figure out what makes a good manager, and a much more substantial effort than almost any other company could pull off. Most of the conclusions from that very intensive exercise were ones that research discovered decades ago and which could have been found in textbooks. That doesn’t mean it’s not a worthwhile exercise to test how those standard assumptions of management play out in our own organizations, but expecting to find big and new insights is simply a bad bet.
The very nature of HR data imposes some unique limitations on analyses. Companies operating in the European Union, for example, know that employee data cannot be moved easily and legally across national borders. Multinational companies can’t legally examine employee data across countries at the same time. In the U.S., analyses of employee data that could reveal the possibility of adverse impact on protected groups (e.g., our female employees in this unit are paid less than the males) trigger the need for legal and then management responses that wouldn’t happen in other parts of the business. HR has to be careful not to turn its data over to other departments that don’t understand these limitations.
So, what should HR be doing with data, after we clean up our datasets? Anytime we analyze data, it helps to start with the basics. First, just look at the big picture, with graphs plotting outcomes across the organization and then over time: Where has turnover spiked, and when did it happen? Are there places where there are consistent employee complaints? Second, look at more of this data, more often. For example, the move to pulse surveys (short, very quick, sometimes daily surveys of employees) in place of the ponderous annual morale survey is a good idea. Smart companies like IBM compile data that the employees themselves generate on company-sponsored social media, for example, to monitor morale and identify workplace concerns.
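The "where and when did turnover spike" question needs nothing more than counting. The sketch below assumes a hypothetical list of separations, each with a unit name and a departure date. It tallies raw counts per unit per month; a true turnover rate would also need headcount, which is left out here.

```python
from collections import Counter
from datetime import date

# Hypothetical separation events: (unit, last day worked). Invented data.
separations = [
    ("Sales", date(2023, 1, 15)),
    ("Sales", date(2023, 3, 2)),
    ("Sales", date(2023, 3, 9)),
    ("Sales", date(2023, 3, 28)),
    ("Support", date(2023, 2, 11)),
    ("Support", date(2023, 3, 20)),
    ("Engineering", date(2023, 2, 5)),
]


def monthly_counts(events):
    """Count separations for each (unit, 'YYYY-MM') pair."""
    return Counter((unit, day.strftime("%Y-%m")) for unit, day in events)


counts = monthly_counts(separations)

# The single largest cell answers "where and when did departures spike?"
(peak_unit, peak_month), peak_count = counts.most_common(1)[0]

for (unit, month), n in sorted(counts.items()):
    print(f"{unit:12s} {month}  {n}")
print(f"Largest spike: {peak_unit} in {peak_month} ({peak_count} departures)")
```

The same table is what a spreadsheet pivot would produce, and it is the input for the kind of simple graph described above.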
Finally, HR should be analyzing relationships among the data. Start by asking how your hiring criteria relate to actual performance. This is important not just because hiring is arguably the most important task an organization performs (partly because it happens so often), but also because we are required to use criteria in hiring that do not have adverse impacts on protected groups.
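Both halves of that question can be roughed out in a few lines. The sketch below uses invented applicant data and two simple measures chosen for illustration: a Pearson correlation between a hiring-test score and later performance ratings, and the ratio of selection rates between two groups. Comparing that ratio with 0.8 follows the commonly cited four-fifths rule of thumb from U.S. selection guidelines; it is used here as an assumption, not as a legal test. The correlation covers only people who were hired, so it likely understates the true relationship.

```python
from math import sqrt

# Hypothetical applicants: (group, test score, hired?, later rating or None).
applicants = [
    ("A", 82, True, 4.1),
    ("A", 75, True, 3.6),
    ("A", 60, False, None),
    ("A", 90, True, 4.5),
    ("B", 78, True, 3.9),
    ("B", 65, False, None),
    ("B", 58, False, None),
    ("B", 85, True, 4.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def selection_rate(rows, group):
    """Share of applicants in `group` who were hired."""
    in_group = [r for r in rows if r[0] == group]
    return sum(1 for r in in_group if r[2]) / len(in_group)


# 1. Does the hiring criterion track performance among those hired?
hired = [r for r in applicants if r[2]]
validity = pearson([r[1] for r in hired], [r[3] for r in hired])

# 2. Do the groups get selected at very different rates?
rate_a = selection_rate(applicants, "A")
rate_b = selection_rate(applicants, "B")
impact_ratio = min(rate_a, rate_b) / max(rate_a, rate_b)

print(f"Score vs. rating correlation: {validity:.2f}")
print(f"Selection rates: A={rate_a:.2f}, B={rate_b:.2f}")
print(f"Impact ratio: {impact_ratio:.2f} (below 0.80 warrants a closer look)")
```

With only a handful of rows the numbers mean nothing statistically; the point is that the calculation is simple once hiring and performance data sit in the same place.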
At the end of the day, everything starts with the quality of the data: If we don’t think our performance appraisal scores are good measures of actual performance, for example, then no analyses that try to predict who will be a good employee will be worth doing.