The Problem with Collecting, Processing, and Analyzing More Security Data

Security teams collect an enormous amount of data today.  ESG research indicates that 38% of organizations collect, process, and analyze more than 10 terabytes of data as part of security operations each month.

What types of data?  The research indicates that the biggest sources include firewall logs, log data from other security and networking devices, data generated by antivirus (AV) tools, user activity logs, and application logs, among others.

It’s also worth mentioning that the amount of security data collected continues to grow each year.  In fact, 28% of organizations say they collect, process, and analyze substantially more data today than they did two years ago, while another 49% collect, process, and analyze somewhat more data than two years ago.

Overall, this obsession with security data is a good thing.  Somewhere within a growing haystack of data there exist needles of value.  In theory then, more data equates to more needles.

Unfortunately, more data comes with a lot of baggage as well.  Someone or something must sort through all the data, interpret it, make sense of it, and put it to use.  There’s also a fundamental storage challenge here.  Do I keep all this data, or define some taxonomy of value, keep the valuable data, and throw everything else out?  Do I centralize the data or distribute it?  Do I store the data on my network or in the cloud?  Oh, and how do I manage all this data:  RDBMS?  Elasticsearch?  Hadoop?  SIEM?
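To make the “taxonomy of value” idea concrete, here is a minimal sketch of how a team might triage incoming log records into storage tiers with different retention windows.  The tier names, source categories, and retention periods below are illustrative assumptions for this example, not an industry standard or any particular vendor’s policy.

```python
# Hypothetical retention taxonomy: map each log source type to a
# storage tier and a retention window.  All values are assumptions
# chosen for illustration only.
RETENTION_POLICY = {
    # source_type: (tier, days_to_keep)
    "firewall":      ("hot", 90),    # queried often during investigations
    "ids":           ("hot", 90),
    "netflow":       ("warm", 30),   # bulky; keep a shorter window online
    "av":            ("warm", 30),
    "user_activity": ("cold", 365),  # rarely queried, but kept for audits
    "application":   ("cold", 365),
}

def classify(record: dict) -> tuple:
    """Return (storage_tier, retention_days) for a log record.

    Unknown source types default to the cheapest tier with a short
    window rather than being silently discarded.
    """
    return RETENTION_POLICY.get(record.get("source_type"), ("cold", 30))

if __name__ == "__main__":
    events = [
        {"source_type": "firewall", "msg": "DENY tcp 10.0.0.5:443"},
        {"source_type": "netflow", "msg": "flow record"},
        {"source_type": "badge_reader", "msg": "door 7 opened"},  # unmapped
    ]
    for e in events:
        tier, days = classify(e)
        print(f"{e['source_type']}: tier={tier}, keep {days} days")
```

The design choice worth noting is the default: when a record’s source isn’t in the taxonomy, it is demoted rather than dropped, so new data sources surface in the cheap tier instead of disappearing.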

Let’s face it: security is a big data application.  It’s time for the security industry and cybersecurity professionals to come together, think through these security data problems, and develop some communal solutions.