The #MondayMusings blog series provides executive level insights and analysis for the Industrial Internet of Things (IIoT) and Digital Transformation from the previous week’s briefings, events, and publications @LNSResearch.
The Divide between Data Scientists and Engineers
I have spent the first half of 2016 travelling to many of the leading IT and OT user events around the world, including those hosted by SAP, Oracle, Dassault Systèmes, Siemens, Rockwell Automation, Schneider Electric, GE, and many others.
At all of these events the overarching themes have been around Digital Transformation, IIoT, and IT-OT convergence.
All of these are issues that LNS Research has been working on extensively, and one of particular interest is IT-OT convergence. For many companies, the gap between the groups focused on enterprise applications and those focused on automation systems is large, but at least it is closing. However, as this gap closes and a single view of technology and data is established, another gap is emerging. I refer to it as the Data Scientist Divide.
The Data Scientist Divide
Industrial companies have always had a deep engineering identity. They have also been doing predictive and prescriptive analytics for many years without the help of data scientists; it's just not what engineers called it. Instead, engineers refer to it as simulation and modeling, and it was not based on Big Data. It was based on a robust model incorporating mathematical approximations of the physical laws of the universe. The algorithms driving these models are those many of us who are engineers learned in school: linear programming, advanced process control, statistical process control, fluid dynamics, parametric equations, and more.
Over the past several decades, models have been built to simulate products, production processes, supply chain logistics, demand shaping, and more. To a large extent these models have been very successful in helping industrial companies optimize the value chain, and many ERP, PLM, and SCM vendors have built businesses around them.
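As a hypothetical illustration of this model-based mindset (the process and every parameter below are invented for the example, not drawn from any vendor or client), a first-principles prediction can be as simple as integrating a physical law forward in time:

```python
# Illustrative sketch only: an engineering "model-based" prediction uses a
# physical law (here Newton's law of cooling) rather than historical data.
def simulate_cooling(t_initial, t_ambient, k, dt, steps):
    """Euler integration of dT/dt = -k * (T - T_ambient)."""
    temps = [t_initial]
    for _ in range(steps):
        t = temps[-1]
        temps.append(t + dt * (-k * (t - t_ambient)))
    return temps

# Hypothetical scenario: predict how a 90 °C vessel cools toward 20 °C ambient
profile = simulate_cooling(t_initial=90.0, t_ambient=20.0, k=0.1, dt=1.0, steps=30)
```

No measurements are needed to produce this forecast; its accuracy depends entirely on how well the model and its parameters approximate the real process, which is exactly where "junk in, junk out" concerns about model inputs come from.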
Starting several years ago, data scientists and many Big Data analytics companies began targeting industrial companies as a new vertical industry where they could add value. Most of these companies come from the web world and were built to analyze vast amounts of unstructured web and machine data.
These vendors have defined the concepts behind predictive and prescriptive analytics as we understand them today, and have built the tools data scientists use to analyze Big Data. Although data scientists are now trying to solve many of the same problems engineers have traditionally solved in the industrial space, they are using very different tools and techniques. Data scientists don't rely on a model or approximation of the physical world. Instead, they rely exclusively on data, dirty or not, and the more the better. They also rely on a whole new set of mathematical tools to analyze this data, like machine learning, that is largely unfamiliar to engineers.
With these new approaches, data scientists can often predict future behavior as well as, or better than, engineers with a deep understanding of the underlying processes. Unfortunately, at many companies this mutual lack of understanding, together with the competitive positioning between engineers and data scientists, is the underlying force creating the Data Scientist Divide.
Take, for example, dirty data. In an engineering-centric, model-based view of the world, dirty data = bad results. We have all heard "junk in, junk out" many times in reference to failed attempts to model and predict manufacturing or logistics performance with incomplete or error-filled data sets. Data scientists, however, take a very different approach: they assume data will be dirty and build the ability to deal with it into their algorithms. Over time the system can learn, improve, and make continually better predictions, even with dirty data. Just this small bit of misunderstanding between the two views of the world can make it very difficult for engineers and data scientists to work together and, more importantly, trust the work each is doing.
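To make the contrast concrete, here is a minimal, hypothetical sketch (not from any real deployment) of the data-driven view: a simple least-squares fit recovers an underlying trend from uniformly noisy observations, not by cleaning each data point, but by averaging over the sheer volume of data.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Invented "true" process: y = 2.0 * x, but every observation is noisy
xs = [i * 0.1 for i in range(200)]
ys = [2.0 * x + random.gauss(0, 1.0) for x in xs]

# Ordinary least-squares slope (no intercept, for brevity): despite the
# noise on every point, the volume of data pulls the estimate toward 2.0
slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(f"estimated slope: {slope:.2f}")
```

The estimate lands very close to the true slope of 2.0 even though no single observation is clean, which is the statistical intuition behind algorithms that tolerate dirty data; real machine-learning pipelines are of course far more sophisticated than this toy fit.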
LNS Research believes that those industrial companies, and vendors serving industrial companies, that are best able to close the Data Scientist Divide will be most successful in Digital Transformation initiatives.
Is this a challenge that resonates with you and that you see in your organization?
In future posts we will further explore the Data Scientist Divide and what to do about it. For now, you can read more about how LNS Research believes Big Data analytics should be built into existing Lean and Six Sigma programs.
SplunkLive! in Boston: “Splunk > Needle. Haystack. Found.”
Last week Jason Kasper had the pleasure of attending a user conference held by one of the leading Big Data platform companies targeting the industrial space, a company at the center of trying to close the Data Scientist Divide.
IIoT platforms continue to gain steam, as was evident at this year’s well-attended SplunkLive! event in Boston, MA. Splunk initially targets its solutions at enabling IT organizations to support applications for Operational Intelligence, but as the event progressed it became clear the company has an eye on moving into the realm of Operational Technology (OT).
The users attending this event were clearly engaged and hungry to learn more about using the tools effectively and understanding the language of Splunk to enable improved searches of data; essentially, finding the right needle in a stack of needles to gain the insight and understanding needed to improve IT operations.
Some marquee customers like Partners HealthCare, Dunkin’ Donuts, and Aetna provided their views on implementing and using the solutions to gain new insight into IT operations and increase response rates for requests from the business. What is interesting is that once the business units understood what the Splunk solutions were capable of, there was a huge influx of requests for more!
While Splunk gains momentum with IT, it is also looking to move into the Operational Technology side of the business, as was evident in its demonstrated integration with data sources like Kepware. Another proof point was the use of predictive analytics and machine learning to understand track conditions and car performance. This work with Schmidt Peterson Motorsports at the Indy 500 provided insights in real time so the team could make adjustments proactively instead of reactively.
We are looking forward to seeing how this progresses and resonates with the OT side of the business, especially those working in the Asset Performance Management space, and to hearing more at .conf2016 in September.