A Lot of Data Does Not Always Mean Big Data

Posted by Dan Miklovic on Fri, Oct 23, 2015

Many manufacturing companies believe they are already doing Big Data in plant operations. Even with the compression that data historians provide, they may have collected terabytes of data over decades of operation. Surely, in their minds, this is Big Data. We don’t think so. The belief is especially common in heavy process plants, where data historians may collect and store tens of thousands of process variables, or tags, across an entire enterprise. But having a large quantity of essentially similar data doesn’t mean you have Big Data.

Big Data does imply large quantities of data, but it also implies a high degree of variability and variety in that data. There are numerous Big Data opportunities in a manufacturing plant, but just sorting through and displaying a stream of process measurements isn’t one of them. A data historian, at least by itself, is probably not the best tool for doing Big Data in your manufacturing operations. It takes more to qualify as Big Data. For many manufacturers and asset-intensive companies, asset performance management (APM) is one of the best examples of what Big Data is, how it is used, and how Predictive Analytics applies to it.

Variety of Data Types is an Important Attribute of Big Data

As I noted above, Big Data is more than just a large amount of data. One of its key attributes is variety in the types of data that one is trying to collect and make sense of. A single data type, such as time-series analog process variables, does not qualify as Big Data even when there are thousands of variables, because all of the data is of one type and is readily indexed. What makes modern Big Data processing techniques valuable, and so interesting to business, is that technological progress now allows businesses to examine a myriad of data types, determine the correlations between them, and analyze those linkages. By doing so, they can derive new information that no single data stream contains on its own.

Besides numerical data, the data types found in a Big Data database or data lake include textual information, image data, geographical or geological information, and unstructured information such as that obtained from social media or other collaborative platforms. Big Data is complex, high velocity, and highly variable.

APM is a Powerful Source of Big Data

APM can generate a lot of information that meets the tests for Big Data. APM often deals with data generated in multiple domains. Certainly, time-domain information, such as that obtained from a historian or in real time from the process, qualifies. Frequency-domain information is another component of many APM programs; it is generated by the vibration sensing and monitoring solutions used on rotating machinery. Another data type frequently encountered in APM programs is image data, usually obtained with tools like infrared thermography, which is used to detect overheating or insulation failures. Positional information and associated climatic data are also used in many APM programs, particularly when mobile assets are part of the fleet being maintained or when the assets are linear in nature, as in the transmission and distribution segment of the power industry or in pipelines. In the mining industry, even the underlying geology may be useful in determining the operating conditions that mining equipment may be exposed to.
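
To make the difference between time-domain and frequency-domain data concrete, here is a minimal Python sketch. It is only an illustration, not how any particular vibration monitoring product works: the signal is synthetic, the sample rate and the 50 Hz and 120 Hz components are made-up values, and the function names are hypothetical. It uses a naive discrete Fourier transform from the standard library to turn a list of time samples into a spectrum and pick out the strongest frequency.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) discrete Fourier transform.

    Returns the normalized magnitude of each bin from 0 up to the Nyquist bin.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        magnitudes.append(abs(acc) / n)
    return magnitudes


def dominant_frequency_hz(samples, sample_rate_hz):
    """Frequency of the strongest non-DC bin in the spectrum."""
    magnitudes = dft_magnitudes(samples)
    # Skip bin 0, which only holds the constant (DC) offset.
    peak_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate_hz / len(samples)


# Synthetic "vibration" signal (assumed values): a strong 50 Hz component
# plus a weaker 120 Hz component, sampled at 1000 Hz for 0.2 seconds.
SAMPLE_RATE_HZ = 1000
NUM_SAMPLES = 200
signal = [
    1.0 * math.sin(2 * math.pi * 50 * i / SAMPLE_RATE_HZ)
    + 0.3 * math.sin(2 * math.pi * 120 * i / SAMPLE_RATE_HZ)
    for i in range(NUM_SAMPLES)
]

peak_hz = dominant_frequency_hz(signal, SAMPLE_RATE_HZ)
print(f"Dominant vibration frequency: {peak_hz} Hz")
```

The time-domain list and the resulting spectrum describe the same machine, but they are indexed and analyzed differently, which is part of what makes APM data varied rather than merely plentiful. A production system would use an optimized FFT library rather than this naive loop.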

When you combine all of these data sources and then apply deductive and predictive analytics to the combined data set, you have a real opportunity to take your APM to the next level. Just monitoring a thousand pieces of equipment for actual running time in order to schedule preventative maintenance doesn’t qualify as a Big Data problem. When you use vibration analysis, thermography, process condition data, real-time position information, and textual parsing of operator logs on performance, and you scour the Internet for failure modes associated with similar equipment, you are operating in the realm of Big Data. Of course, doing this for only a few pieces of equipment doesn’t qualify as a Big Data exercise, because another key criterion of Big Data is volume. There has to be a sufficiently large volume of information to qualify as Big Data.
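
The two tests described above, variety and volume, can be sketched as a simple check. This is a rough illustration rather than a formal definition: the thresholds, the `DataSource` class, the `assess` function, and every size figure below are hypothetical values chosen only to show the logic, since no numeric cut-offs are given here.

```python
from dataclasses import dataclass

# Illustrative thresholds only; these are assumptions, not industry standards.
MIN_DISTINCT_KINDS = 3
MIN_TOTAL_GB = 1000.0


@dataclass(frozen=True)
class DataSource:
    name: str
    kind: str       # e.g. "time_series", "frequency_spectrum", "image", "text"
    size_gb: float  # made-up sizes for the examples below


def assess(sources, min_kinds=MIN_DISTINCT_KINDS, min_total_gb=MIN_TOTAL_GB):
    """Check a collection of data sources for both variety and volume."""
    kinds = {s.kind for s in sources}
    total_gb = sum(s.size_gb for s in sources)
    has_variety = len(kinds) >= min_kinds
    has_volume = total_gb >= min_total_gb
    return {
        "distinct_kinds": len(kinds),
        "total_gb": total_gb,
        "has_variety": has_variety,
        "has_volume": has_volume,
        "qualifies": has_variety and has_volume,
    }


# Decades of historian data: lots of volume, but a single data type.
historian_only = [DataSource("plant historian", "time_series", 5000.0)]

# Rich APM data, but collected for only a few pieces of equipment.
pilot_apm = [
    DataSource("historian tags", "time_series", 2.0),
    DataSource("vibration spectra", "frequency_spectrum", 5.0),
    DataSource("thermography", "image", 10.0),
    DataSource("operator logs", "text", 0.1),
]

# The same mix of data types gathered across a whole fleet.
fleet_apm = [
    DataSource("historian tags", "time_series", 800.0),
    DataSource("vibration spectra", "frequency_spectrum", 900.0),
    DataSource("thermography", "image", 1500.0),
    DataSource("asset positions", "geospatial", 50.0),
    DataSource("operator logs", "text", 5.0),
]

for label, sources in [
    ("historian only", historian_only),
    ("pilot APM", pilot_apm),
    ("fleet-wide APM", fleet_apm),
]:
    print(label, assess(sources))
```

Under these assumed thresholds, only the fleet-wide case passes both tests. The historian fails on variety and the pilot fails on volume, which mirrors the two examples in the paragraph above.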

Categories: Big Data, Asset Performance Management (APM)