Previously, we presented five key components of an effective cybersecurity management program: 1) data classification, 2) security control implementation, 3) regular verification of control performance, 4) breach preparedness and planning, and 5) risk acceptance and risk transfer. This article takes a deeper dive into the data classification process and provides an approach you can use to complete your own data classification.
Data classification is the process an organization follows to develop an understanding of its information assets, assign a value to those assets, and determine the effort and cost required to properly secure the most critical of those information assets. Data classification is an important first step in establishing a cybersecurity management program, as it allows an organization to make managerial decisions about resource allocation to secure data from unauthorized access. For purposes of this article, we’ll assume the organization has already completed a risk assessment and understands its regulatory and contractual privacy and confidentiality requirements.
Today’s business systems and the nature of connected enterprises generate a tremendous amount of data. While this data has organizational value, it’s important to recognize that not all data has equal value. Accordingly, not all data needs to be protected the same way. Conducting a data classification allows you to determine the different types of data resident within your organization and provides insight into the protection requirements for each type of data.
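A successful data classification effort includes four essential activities: identifying the data, identifying where the data is stored, classifying the data into categories, and assigning a value to the data.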
It sounds straightforward: identify the data. However, systematically identifying the data within your organization can be quite challenging. This isn’t an activity you can delegate to your database administrator. It needs to be cross-functional, organized around business processes, and driven by process owners. We typically see this completed as a walkthrough of each business process, tracing the data flows with process owners to identify the data. You will need to ask several key questions to identify the types of data maintained by your organization.
Identifying and cataloging this data is the first step in the classification process. It will serve as a basis for all other classification activities.
Moving beyond the initial identification, it is important to next identify all the places where this data is stored electronically. You will want your database administrator or enterprise architect to support and help complete the location identification process. Data loss prevention tools can even be used as part of this step to help scan networks looking for certain types of data.
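To make the scanning idea concrete, here is a minimal sketch of pattern-based discovery in Python. It is not how any particular data loss prevention product works; the pattern names, the regular expressions, and the `scan_text` and `scan_documents` helpers are all assumptions made for illustration, and real tools use far more robust detection.

```python
import re

# Hypothetical patterns for illustration only. Commercial DLP tools add
# validation, context analysis, and many more data types.
PATTERNS = {
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scan_text(text):
    """Return a dict mapping pattern name to the number of matches in text."""
    counts = {}
    for name, pattern in PATTERNS.items():
        hits = pattern.findall(text)
        if hits:
            counts[name] = len(hits)
    return counts


def scan_documents(documents):
    """Scan a dict of location label -> text and keep only locations with hits."""
    findings = {}
    for location, text in documents.items():
        counts = scan_text(text)
        if counts:
            findings[location] = counts
    return findings
```

The output of a scan like this is a list of candidate locations to review with the database administrator or enterprise architect, not a definitive inventory.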
Today’s business systems are so tightly integrated that data is often interfaced or shared across systems. This means data might reside in multiple systems, not just the system where it was originally created or entered. Be sure to think about reporting systems and data warehouses too, since transactional data is often archived in these systems. It’s also important to consider systems like document imaging and photocopiers, where electronic copies of once-physical data now reside.
One last system to think about when identifying where data is stored is your actual backup system. Nearly all organizations retain backup copies of data. It’s not uncommon for organizations to retain multiple copies of system data to support record retention requirements. However, it’s nearly as common for organizations to retain these records in excess of the business requirement. Identifying where data is retained excessively so that it can be properly purged is an important outcome of a data classification process.
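As a rough illustration of how excess retention might be flagged, the sketch below compares the age of each backup against a retention period. The `find_excess_backups` function, the tuple layout, and the retention period passed in are assumptions for this example; the actual period should come from your record retention requirements.

```python
from datetime import date


def find_excess_backups(backups, retention_days, today):
    """Return the names of backups older than the retention period.

    backups: list of (name, created_date) tuples (hypothetical layout).
    retention_days: the business retention requirement, in days.
    today: the date to measure age against.
    """
    excess = []
    for name, created in backups:
        age_days = (today - created).days
        if age_days > retention_days:
            excess.append(name)
    return excess
```

The names returned are candidates for review and purging, subject to confirmation that no legal or contractual hold applies.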
By now, you’ll have a solid understanding of the type of data your organization maintains and where it is stored. The types of data should be readily apparent now and will probably fall into major categories including:
At this point, it’s important for management to determine the classifications that are relevant to the organization and what to protect. The list above is a good start, but there might be more for your organization. The main objective at this point is to have an index to the different types of data within your organization. Resist the urge to get too granular. If you have more than ten types of business data in your classification, it’s a good sign that you’ve gone too far. An overly granular scheme tends to become less manageable over time. Five to eight classification categories is reasonable.
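One way to picture the index described here is a simple catalog grouped by category. The sketch below is an assumption about how such a catalog could be structured, not a prescribed format; the `CatalogEntry` fields and any category names you load into it are hypothetical. The only figure taken from the article is the threshold of ten categories.

```python
from collections import defaultdict
from dataclasses import dataclass

# Threshold from the article: more than ten categories suggests the
# classification has become too granular.
MAX_CATEGORIES = 10


@dataclass
class CatalogEntry:
    data_type: str      # e.g. a type of data found during a walkthrough
    category: str       # the classification category assigned by management
    process_owner: str  # who owns the business process that uses the data
    systems: list       # every system where the data is stored


def build_index(entries):
    """Group catalog entries by classification category."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.category].append(entry)
    return dict(index)


def too_granular(index):
    """Return True when the index has more categories than the threshold."""
    return len(index) > MAX_CATEGORIES
```

Keeping the storage locations on each entry ties the classification step back to the location identification done earlier.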
Assigning a value to the data you’ve classified so far is a critical step as it will allow you to make informed decisions about how much you spend to protect that data. Multiple factors contribute to the overall value of a data set. Organizations need to consider the penalties associated with a loss or breach of data. By understanding the potential hard and soft costs associated with a breach of a given data set, an organization can set realistic expectations for the cost to protect the data set. Consider these examples:
Having completed these steps to classify data, an organization can make decisions about how to manage and secure its data. The data classification results should feed into the overall cybersecurity management program. By understanding where the data resides and the organizational value of that data, an organization can implement cost-effective and efficient cybersecurity controls based on the business risk associated with each type of data it maintains.
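The valuation step can be sketched as simple arithmetic: add the estimated hard and soft costs of a breach for each category, then rank the categories so protection spending can be prioritized. The `rank_by_exposure` function and its input layout are assumptions for illustration, and any figures you pass in are your own estimates rather than benchmarks.

```python
def rank_by_exposure(estimates):
    """Rank data categories by estimated total breach cost, highest first.

    estimates: dict of category -> (hard_cost, soft_cost), where both
    values are the organization's own estimates (hypothetical layout).
    Returns a list of (category, total_cost) tuples.
    """
    totals = {cat: hard + soft for cat, (hard, soft) in estimates.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

A ranking like this is only a starting point for discussion; management still decides how much protection each category warrants.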
For more information on this topic, or to learn how Baker Tilly specialists can help, contact our team.