Data Collection (+ 9-Page PDF)

The data collection methods that you use lay the foundation for the eventual success of your continuous improvement projects.

Simply put, data are the facts of the case. Raw data is then compiled and processed into useful information that gives you more insight into whatever you are trying to learn about. (Note: Strictly speaking, ‘data’ is the plural form and ‘datum’ is the singular.)

Data collection is the process of getting the data from the real world to some form where you can manipulate it to get information. This information is then used to make decisions, the ultimate goal of data collection.

There are many choices you’ll have to make when collecting data. Will you be sampling or doing 100% data collection? Will the data be self-reported, or will it be collected by an external observer? Will you be eyeballing the measurements, or will they require precise measuring devices?

From Our Data Collection Training Module

The choices you make about your data collection plan will be influenced by how you intend to use the data. All data should be collected with a purpose in mind. The planned use of your data will shed light on what you need to gather.

Data is extremely important to continuous improvement. It takes the emotion out of decisions and clears away the fog of misperception. The results of a data collection effort can stun people when those results don’t match what they believe to be true. It is surprising how little people really understand about their environment.

I suspect that there are a few reasons for this. First, people tend to think they know more than they really do. Because they are confident in what they believe to be true, they see no point in spending the time and effort to collect data.

As a result, people routinely skip data collection because it takes time and is hard to do properly. While the actual methods of data collection may be simple (e.g. check sheets), the planning, compilation, and use of this data take time. And even when people do collect it, the data frequently sits unused.

Make sure to take the time to create a data collection plan. It will minimize the impact on production as well as increase the likelihood that the data will be used.

Data Collection Tips

  • Look at the data you need to make a decision. Then determine what data you already have, and figure out where the gaps are. That analysis lays the foundation for your data collection plan.
  • Think about how the data will eventually be used, and whether it is going to be compared externally. If it is, make sure you know how it will be normalized, and whether you will need any additional pieces of information to do that.
  • In general, it is easier to gather qualitative and discrete data than it is to collect continuous data. Continuous data tends to be more valuable, though. It lends itself to more meaningful ‘slicing and dicing’.
  • In some cases, you will want to look at trends. If so, you have to collect data for a long enough period to get a picture of how the data moves over time. Be aware, though, that if your data collection plan is too demanding, your production may suffer. Collecting data has a cost.

Types of Data

Data falls into several main categories. The first distinction is whether it is qualitative or quantitative in nature.

In truth, the distinction is more a matter of choice than of the actual nature of the condition you want to evaluate. Most conditions can be measured quantitatively. Color, for example, has a wavelength that is numerical. Emotion shows up in blood pressure changes, skin temperature, perspiration rate, voice pitch, etc. Even biological sex can be expressed as a number, such as the count of ‘X’ chromosomes a person possesses.

The challenge is that all of these things are difficult, often prohibitively so, to get numbers for. When that is the case, qualitative data is a reasonable compromise between costly, intrusive data collection and no data at all.

Quantitative data (occasionally also called variable data) falls into two main categories. Continuous data can be subdivided infinitely. A person’s height falls into this category. You can record any value that you measure. With a sensitive enough measuring device, a person’s height can be recorded as 71.673452 inches.

From Our Data Collection Module

Discrete data, on the other hand, is still numerical, but can only take on values from a limited set. The number of wheels on a vehicle and the number of bad parts in a lot both offer limited options to choose from. The bad parts data is limited to whole numbers from zero up to the size of the lot. For the vehicle example, the choices can only be 1 (unicycle), 2 (motorcycle or bike), 3 (three-wheeler), and so on up to an 18-wheeler.

Qualitative data has a few more categories than quantitative data. Attribute data is not quantifiable (color, pass/fail, etc.). Again, keep in mind that attribute data, in many cases, comes from something that could be measured numerically. For example, ‘Red’ is really a range of wavelengths along the spectrum of visible light. If we want more flexibility down the road, we could measure the wavelength, but that has an added cost to it.
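To illustrate the point about ‘Red’, here is a minimal Python sketch of how a continuous wavelength reading can be turned into attribute data after the fact. The band edges are approximate, conventional values chosen for illustration, and the names `COLOR_BANDS`, `color_name`, and `is_red` are hypothetical.

```python
# Approximate band edges in nanometers (assumed for illustration only;
# different references draw these boundaries slightly differently).
COLOR_BANDS = [
    (380, 450, "violet"),
    (450, 495, "blue"),
    (495, 570, "green"),
    (570, 590, "yellow"),
    (590, 620, "orange"),
    (620, 750, "red"),
]


def color_name(wavelength_nm):
    """Convert a continuous wavelength reading into a color attribute."""
    for low, high, name in COLOR_BANDS:
        if low <= wavelength_nm < high:
            return name
    return "outside visible range"


def is_red(wavelength_nm):
    """Reduce the same reading to a pass/fail style attribute."""
    return color_name(wavelength_nm) == "red"


# Hypothetical readings from a measuring device.
readings = [652.4, 701.0, 610.3, 645.9]
print([color_name(w) for w in readings])
print([is_red(w) for w in readings])
```

The conversion only works in one direction. If you had recorded just ‘Red’ or ‘Not red’, you could not recover the wavelengths later, which is the flexibility you give up in exchange for cheaper collection.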

  • Don’t over-rely on existing data. If it doesn’t match what you really need, you may be swayed to change the questions you need answered.
  • You will need to use consistent time periods for your data collection so you are comparing apples to apples. This is especially important if you are collecting data from multiple locations.
  • Make sure you close the loop on data collection—your team should be able to understand the value of the data they gathered for you.
  • Select your methods of data collection carefully. If you are not familiar with creating data collection plans, find a mentor. Making mistakes in data collection will waste valuable resources.
  • Decide how long it will take to get relevant data. You generally need a ‘statistically significant’ number of data points for your conclusions to be valid. Many people use a small run and make an erroneous assessment because of it. Essentially, the smaller the sample size, the more likely it is that the data is not really reflective of the whole data set. Look at how you will be chopping the data, too. If you chop a good-sized data set, you may still end up with a sample that is too small.
  • If you want to compare your data to other sets, you will probably have to normalize it. There are some sophisticated statistical meanings for normalization. But for the average manager, it will simply mean adjusting the scale to fit one in common use. For example, you might collect daily numbers and want to compare them to monthly recordings, or the other data set might use a different standard unit than you do. One of the most common normalizations in use is parts per million (‘PPM’) when reporting quality.
  • Be careful about collecting data to use for individual evaluations. It may be warranted in some cases, but that is rather uncommon. In most cases, data collection should be focused on measuring processes.
  • Don’t go overboard in your data collection. It is easy to go into information overload if you try to measure everything. Focus on the important things first.
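Two of the tips above, normalizing to PPM and watching your sample size when you chop the data, can be sketched in a few lines of Python. This is only an illustration: the function names, the site and shift figures, and the minimum group size of 20 are all hypothetical. The threshold in particular is an arbitrary placeholder, not a statistical rule.

```python
from collections import Counter


def defects_ppm(defects, units):
    """Normalize a defect count to defective parts per million (PPM)."""
    if units <= 0:
        raise ValueError("units must be positive")
    if not 0 <= defects <= units:
        raise ValueError("defects must be between 0 and units")
    # Multiply before dividing to limit floating-point rounding.
    return defects * 1_000_000 / units


def undersized_groups(records, key, min_size):
    """Return {group: count} for groups with fewer than min_size records."""
    counts = Counter(record[key] for record in records)
    return {group: n for group, n in counts.items() if n < min_size}


# Hypothetical figures: Site A has far fewer defects in raw numbers,
# but it also produced far fewer units than Site B.
print(defects_ppm(3, 1_200))     # Site A
print(defects_ppm(45, 30_000))   # Site B

# Hypothetical records: 40 data points in total, chopped by shift.
records = [{"shift": "day"}] * 30 + [{"shift": "night"}] * 10
print(undersized_groups(records, key="shift", min_size=20))
```

In this made-up example, Site A comes out at 2,500 PPM and Site B at 1,500 PPM, so the site with the lower raw defect count is actually the worse performer once both are on the same scale. Likewise, 40 data points may be enough overall, but chopping them by shift leaves only 10 for the night shift.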

Data collection can feel like a burden because it is done in addition to the rest of your workload. The purpose, though, is to make the operation run more smoothly. If the data is never used, your frustration is certainly warranted, but I urge you to look at data collection as a sign that your leadership team is taking a proactive step to solving the problems that plague you.

In addition to the benefit of leading to problem resolution, data collection also helps manage your relationship with your boss. One of the challenges leaders and teams face is assessing a situation differently. Data collection helps take the ambiguity out of a situation. Rather than debating knowable facts, the discussion shifts to the heart of the matter. Countless disagreements between management and workers could be avoided if they shared the same understanding of a situation.

Most managers can become markedly better by using more data in their decision making. That’s not to say that experience and ‘gut feel’ have no place in being a leader, but all too often managers make quick decisions with limited or no information.

Sometimes they get it right, but they would be right more often if their decisions were grounded in fact. Weigh the risk and the potential benefit of a decision against the effort to collect a bit of data before taking action. In more cases than not, you’ll see an improvement in the results of your problem solving efforts.

Be careful to keep data collection from straining your relationship with your team. Done right, it can dramatically improve how you get along. Done wrong, it can set you back substantially. The most common problems are overdoing data collection, not using data that the teams work to collect, measuring individuals, and using data as a weapon against the team.

  • Data collection is an underused skill that has a strong causal relationship with successful projects.
  • Data collection requires a solid plan to make sure the data is useable and meets your requirements.
  • Data collection has the potential to greatly improve or to destroy relationships between managers and employees.
  • Data collection is not free. You’ve got to find the balance between gathering useable information and spending too many resources on information that will sit idle.
