Industry 4.0: The Unstructured Data Perspective

You know what? Let’s skip the typical intro to Industry 4.0 where we review the history of the past industrial revolutions—it’s so overdone—and cut to the chase. The goal of every previous industrial revolution has been to optimize and scale, and this one is no different. Arguably the biggest drivers behind Industry 4.0 are the prevalence of the Internet and Artificial Intelligence (AI). We have more data than ever, and we’re more connected than ever (think smartphones, cloud, IIoT…), which in turn drive innovation in AI and machine learning. 

Industry 4.0 (IR4.0) comprises a multitude of technologies, from augmented reality to additive manufacturing to cybersecurity.

Figure: Industry 4.0 components

When you Google “Industry 4.0”, though, you’ll mostly find articles about the Industrial Internet of Things (IIoT). This is probably because IR4.0’s end goal of real-time decision making hinges on connected smart machines and products. The problem is that this leads businesses to believe they need to deploy IIoT first, not realizing they will see limited return on this investment until they’ve put a contextual platform in place.

What’s a contextual platform? Think about it like this: When a sensor signals that a machine may fail, your decision on how to handle it and the speed at which you can make that decision impacts cost and revenue. To make a well-informed decision as quickly as possible, you need details around the machine’s maintenance history, spare part requirements, and the system it’s connected to—all at your fingertips. 
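
To make that concrete, here is a minimal sketch of what "all at your fingertips" could look like in code. Everything in it is an assumption made for illustration: the `AssetContext` schema, the in-memory `PLATFORM` dictionary standing in for a real database, and the sample pump `P-101`.

```python
from dataclasses import dataclass, field


@dataclass
class AssetContext:
    """Context stored for one asset (hypothetical schema)."""
    asset_id: str
    parent_system: str
    maintenance_history: list[str] = field(default_factory=list)
    spare_parts: list[str] = field(default_factory=list)


# Hypothetical in-memory stand-in for the contextual platform's database.
PLATFORM = {
    "P-101": AssetContext(
        asset_id="P-101",
        parent_system="Cooling water loop",
        maintenance_history=["2024-03-02: replaced mechanical seal"],
        spare_parts=["mechanical seal", "bearing kit"],
    ),
}


def contextualize_alert(asset_id: str, message: str) -> str:
    """Attach stored context to a raw sensor alert."""
    ctx = PLATFORM.get(asset_id)
    if ctx is None:
        return f"{asset_id}: {message} (no context on record)"
    last_job = ctx.maintenance_history[-1] if ctx.maintenance_history else "none"
    return (
        f"{asset_id}: {message} | system: {ctx.parent_system} | "
        f"last maintenance: {last_job} | spares: {', '.join(ctx.spare_parts)}"
    )


print(contextualize_alert("P-101", "vibration above threshold"))
```

Without the lookup, the alert is just "vibration above threshold". With it, the same alert arrives alongside the system it affects, the last maintenance job, and the spares to reach for.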

Building and deploying your contextual platform requires dealing with unstructured data, particularly text and image data, and that’s often an area where there’s a major information gap for facility managers designing their IR4.0 implementation strategy. 

Let’s change that.

What is unstructured data?

Unstructured data is data that is not organized in a predefined manner, such as the rows and columns of a database. Common examples include textual information like emails and maintenance reports; image information like photos and engineering drawings; and time series data like heart rate monitoring and temperature readings. This type of information contains many irregularities and ambiguities, so conventional software, like expert systems, struggles to handle it. Deep learning, however, is a very good fit for this kind of high-dimensional, ambiguous data, especially text and images.

Industry 4.0 design principles

Smart facilities are the focal point of IR4.0. They’re “smart” because digital twins allow rapid processing of massive volumes of data. To benefit fully from your smart facility investments, including handling unstructured information, you need to follow four key design principles:

  1. Information transparency
    Cutting across data silos is one of the main goals of not just Industry 4.0, but the broader sphere of digital transformation. Silos impede decision making because they limit your view of the business, discourage collaboration, threaten data quality, and ultimately slow the whole business down. When only maintenance engineers have access to spare parts inventory, and only purchasing knows pricing, how much time gets wasted before the departments can agree and dispatch a purchase order for critical parts?
  2. Interconnectivity
    IIoT cannot exist without interconnectivity, but it involves more than just setting up network infrastructure. You need to build automated data pipelines that analyze unstructured data like maintenance reports (text), photos of equipment damage (image), temperature readings (time series from sensors), and pull them all together in a meaningful way.

    On top of that, your staff need to be able to interact freely with the data, machine learning models, and one another—again, only achievable when you have information transparency.
  3. Technological assistance
    You and your team are inundated with high dimensional data that varies in type, quality, and source. This is where you need machine learning to augment human intelligence and productivity, allowing your team to spend less time on tedious tasks and more on decision making.
  4. Decentralized decisions
    Where it’s possible, enable your physical equipment, powered by machine learning, to perform tasks autonomously based on data. For example, if a temperature sensor picks up that a piece of equipment is starting to overheat, and supporting contextual data from maintenance reports indicates that the machine was recently serviced, the AI monitoring the equipment can decide to temporarily power it down until it has cooled off. This is a nod to the ultimate “dark factory,” where the facility functions without lights because systems are fully automated, and humans only come into the loop when anomalies require higher-level decision making.
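
The overheating example under decentralized decisions can be sketched as a simple rule. This is a plain rule standing in for whatever model would monitor the equipment, not a definitive implementation. The temperature limit, the 30-day "recently serviced" window, and the escalation fallback are all assumptions added for illustration.

```python
from datetime import date, timedelta

OVERHEAT_LIMIT_C = 90.0                      # assumed threshold
RECENT_SERVICE_WINDOW = timedelta(days=30)   # assumed meaning of "recently serviced"


def decide_action(temperature_c: float, last_service: date, today: date) -> str:
    """Combine a sensor reading with maintenance context to pick an action."""
    if temperature_c < OVERHEAT_LIMIT_C:
        return "continue"
    recently_serviced = (today - last_service) <= RECENT_SERVICE_WINDOW
    if recently_serviced:
        # Context suggests the machine is in good shape: cool it down and carry on.
        return "power_down_until_cool"
    # Assumed fallback: without reassuring context, bring a human into the loop.
    return "escalate_to_engineer"


print(decide_action(95.0, date(2024, 5, 1), date(2024, 5, 10)))
```

The sensor reading alone cannot separate the second and third outcomes. The maintenance record is what lets the system act on its own in one case and call for help in the other.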

As we delve deeper into how to properly handle unstructured data to capture the full potential of migrating to IR4.0, it’s important to keep these design principles in mind.

Building your contextual platform: A portfolio of digital twins

A smart facility will be composed of a portfolio of digital twins, where equipment-specific twins correspond to the lower portion of the asset hierarchy, and twins that integrate them correspond to the upper portion. Your facility’s digital twins are central to all Industry 4.0 initiatives. Without them, you wouldn’t know how a sensor alert on failing equipment will affect the rest of the system, robots wouldn’t know which equipment to service, and maintenance engineers wouldn’t be able to use augmented reality to visualize pipes and other infrastructure hidden behind walls.

Digital twins for legacy facilities

If you’re starting with an older facility, much of the legacy data associated with it is unstructured. The goal is to extract this unstructured data and organize it meaningfully into a database, in line with a standard like CFIHOS. This entails building an asset hierarchy, linking P&IDs and datasheets, and generating spare parts lists for maintainable assets. For example, you’ll need to extract instrument tags, equipment IDs, instrument loops, and more from P&IDs, as well as parts tables from general arrangement drawings and equipment datasheets.
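
As a rough illustration of the extraction step, the sketch below pulls tag-like strings out of text that has already been read off a drawing, then groups them by shared number as a crude proxy for instrument loops. The tag format (letters, hyphen, digits, optional suffix, as in `PT-101`), the grouping rule, and the sample text are assumptions. A real P&ID starts as an image, so a regular expression like this would sit downstream of the deep learning models that read the drawing.

```python
import re
from collections import defaultdict

# Assumed tag convention: 1-4 letters, a hyphen, 3-4 digits, optional letter suffix.
TAG_PATTERN = re.compile(r"\b([A-Z]{1,4})-(\d{3,4}[A-Z]?)\b")


def extract_tags(ocr_text: str) -> list[str]:
    """Return the unique tag-like strings found in the text, sorted."""
    return sorted({m.group(0) for m in TAG_PATTERN.finditer(ocr_text)})


def group_by_loop(tags: list[str]) -> dict[str, list[str]]:
    """Group tags that share a number (assumed to indicate the same loop)."""
    loops = defaultdict(list)
    for tag in tags:
        _prefix, number = tag.split("-")
        loops[number].append(tag)
    return dict(loops)


# Hypothetical text recovered from a drawing.
sample = "PT-101 feeds PIC-101, which drives PV-101. See also P-2001A and TT-305."
tags = extract_tags(sample)
print(tags)
print(group_by_loop(tags))
```

Your facility's tagging convention will likely differ, so the pattern is the first thing you would adapt.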

These are terribly tedious tasks for humans, but a perfect fit for machine learning systems. However, because accuracy is extremely important (since this is the foundation for your entire digital facility moving forward), a human still needs to come into the loop to remediate any errors from the machine. No machine learning system can perform at 100% accuracy, so there will definitely be errors—don’t trust anyone who tells you otherwise! To accomplish this seamlessly, it’s a good idea to implement a System of Intelligence (SOI).
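
One common way to bring a human into the loop is to route each machine extraction by its confidence score, so experts only look at the uncertain ones. The sketch below assumes the model reports a score between 0 and 1. The `Extraction` type, the 0.95 threshold, and the sample values are hypothetical, and this is not a description of how any particular SOI works.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    field_name: str
    value: str
    confidence: float  # model score in [0, 1] (assumed to be available)


REVIEW_THRESHOLD = 0.95  # assumed; tune to how costly an undetected error is


def route(extractions, threshold=REVIEW_THRESHOLD):
    """Split extractions into auto-accepted items and items for expert review."""
    accepted, needs_review = [], []
    for item in extractions:
        if item.confidence >= threshold:
            accepted.append(item)
        else:
            needs_review.append(item)
    return accepted, needs_review


batch = [
    Extraction("instrument_tag", "PT-101", 0.99),
    Extraction("equipment_id", "P-2OO1A", 0.62),  # OCR confused 0 with O
]
accepted, needs_review = route(batch)
print(len(accepted), "accepted;", len(needs_review), "sent to an expert")
```

High confidence is not the same as correct, so it is still worth spot-checking a sample of the accepted items.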

Systems of Intelligence are powered by a combination of domain expertise, deep learning algorithms, and expert-in-the-loop machine learning to deliver a seamless experience in a single platform. They are designed to execute a series of workflows spanning several disciplines for large volumes of data, and are operable by subject matter experts who have no knowledge of machine learning or data science. For example, Cenozai is developing an SOI to span several aspects of Facility Maintenance, including building a hierarchical asset register using AI to help facility management teams embark on their Industry 4.0 journey.

Supplement document data with 3D laser scans

It may also be valuable to laser scan parts of your facility. Over the years, your facility may have undergone updates that weren’t documented in your engineering drawings, or revised drawings may have somehow been lost. Laser scans will allow you to fill those gaps. Once you’ve got an up-to-date 3D representation of your facility, you can integrate it with your contextual platform and build robust visualizations of your digital twin portfolio.

Digital twins for new facilities

It’s much easier, of course, to set up digital twins for a new facility. As you’re going through the design and construction process, ensure your vendors are complying with a standard like CFIHOS. They should be keeping all new drawings in a native digital format that’s directly connected to your database. You may need a rule-based system like robotic process automation (RPA) in place to automatically send the new information over. You’re unlikely to need machine learning or an SOI for this if your vendors are following the standards and templates you’ve specified.

Tying back to IR4.0 design principles

Building out your asset hierarchy lays the foundation for your digital twins and the rest of your Industry 4.0 initiatives. Following the design principles, this helps address information transparency because all of this data can now be stored in a centralized database. Departments will be better able to collaborate, decisions can be made faster, and maintenance teams can vastly improve facility uptime. Additionally, you’ve now set the stage for interconnectivity. You can begin to attach sensors to critical equipment because you now have the contextual information, such as how the greater system is connected, and which spares to use in case a sensor indicates a part might fail.

Operationalizing your digital twin

The next step after digitizing and contextualizing your facility is to start putting the digital twin portfolio into action. Let’s consider an analysis to optimize the spend between preventive and corrective maintenance to illustrate this.

Conventionally, you’d do this by crossplotting preventive maintenance (PM) and corrective maintenance (CM) costs for a particular piece of equipment. In the chart below, the red region indicates where a dollar reduction in PM spend may result in CM cost increasing by less than a dollar. Therefore, in this scenario, you’re actually saving money by spending less on PM.

Figure: Chart of PM vs CM cost

This analysis, however, produces many false positives. It’s constrained because data access is limited to these two dimensions. Facility engineers would actually like to add a third dimension: failure data. With this information, you gain insight into which equipment actually benefits from PM. For instance, it’s worth spending extra on PM for an older compressor that has failed three times in the past year and that you can’t replace yet due to budget constraints, but you can save by applying only CM to a more reliable compressor.

The graphic below demonstrates how adding this third dimension enables you to better distinguish true positives (red area - equipment where you should reduce PM spend) from false positives (green area - equipment where you should maintain PM spend).

Figure: 3D chart of failures vs PM vs CM

Adding this third dimension is difficult without a contextual platform, and that’s why your team is constrained to the conventional 2D analysis. With a digital twin in place, you can understand why a piece of equipment is failing, how much it costs to maintain it, and consequently make better decisions on the type of maintenance to apply to it. Do this for every piece of equipment, and you can optimize maintenance costs across your entire facility.
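
The difference between the 2D and 3D analyses can be sketched in a few lines. The 2D rule flags equipment for a PM cut whenever a dollar less PM costs under a dollar more CM. The 3D rule adds a check on failure history. The record fields, the failure cutoff of one per year, and the sample figures are illustrative assumptions, not real data.

```python
from dataclasses import dataclass


@dataclass
class EquipmentRecord:
    asset_id: str
    cm_increase_per_pm_dollar_cut: float  # estimated extra CM cost per $1 of PM removed
    failures_last_year: int


MAX_FAILURES_FOR_PM_CUT = 1  # assumed cutoff for "reliable enough"


def recommend_2d(rec: EquipmentRecord) -> str:
    """Conventional analysis: PM and CM cost only."""
    return "reduce PM" if rec.cm_increase_per_pm_dollar_cut < 1.0 else "maintain PM"


def recommend_3d(rec: EquipmentRecord) -> str:
    """Same rule, but failure history can veto a PM cut."""
    if (recommend_2d(rec) == "reduce PM"
            and rec.failures_last_year <= MAX_FAILURES_FOR_PM_CUT):
        return "reduce PM"
    return "maintain PM"


# Hypothetical compressors: one old and failure-prone, one reliable.
old = EquipmentRecord("C-100", 0.8, 3)
reliable = EquipmentRecord("C-200", 0.6, 0)
for rec in (old, reliable):
    print(rec.asset_id, "| 2D:", recommend_2d(rec), "| 3D:", recommend_3d(rec))
```

Here the 2D rule would cut PM on both compressors, while the 3D rule keeps PM on the failure-prone one. That compressor is the kind of false positive the extra dimension filters out.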

With your portfolio of digital twins in place, collecting data in real time and putting it into context, you can make further use of machine learning to enhance decision making. Instead of spending time gathering and making predictions on limited data, your team can use this technology to explore more complete information, optimizing their time and improving efficiency of the facility as a whole. For some systems, you may even be able to deploy decentralized decision making, again freeing up resources for more critical and complex tasks.

Dealing with the challenges of sensor (IIoT) data

Now that you have dealt with all of your unstructured text and image data, you are ready to capture the full value of incoming sensor information (also unstructured data). 

There are numerous challenges that operators face when analyzing sensor data, but a few can be mitigated by doing cross analysis with all of that text and image information you’ve poured into your digital twin:

  1. Noise generated by sensor errors
    Some of the data recorded by your sensors may be inaccurate. Sensors can malfunction or, if they’re older, simply lose sensitivity and degrade the collected data. For example, a sensor might return a temperature reading that is off by a couple of degrees. Depending on the sensitivity of the equipment, this may be a problem, especially if the equipment is located in a hard-to-reach area. It’s difficult to send a human to check on it, so how do you determine remotely whether it’s the equipment or the sensor that’s malfunctioning? First, check the readings of other sensors on the equipment. Then, pull up the maintenance records to assess your next step.
  2. Uncovering hidden patterns
    What if you find your sensor data is accurate, but there are slight, not immediately understandable fluctuations in the temperature readings? There may be a meaningful pattern there, but it’s challenging for even an expert to pinpoint the cause with temperature data alone. By supplementing it with other sources of information, such as a pressure reading from another sensor, notes from the last few maintenance reports, and the instrument loop logic, you can build a much richer picture for root cause analysis.
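
The first check above, comparing a sensor against the others on the same equipment, might look like the sketch below. It assumes the sensors measure the same quantity at comparable positions, and the tolerance and readings are made up for illustration. A sensor that disagrees with the group median is flagged as suspect. If all sensors agree on an abnormal value, the equipment itself is the more likely culprit.

```python
from statistics import median


def flag_suspect_sensors(readings: dict[str, float], tolerance: float) -> list[str]:
    """Return IDs of sensors whose reading deviates from the group median."""
    if len(readings) < 3:
        return []  # too few peers to tell a bad sensor from a real change
    centre = median(readings.values())
    return [sid for sid, value in readings.items() if abs(value - centre) > tolerance]


# Hypothetical temperature readings (deg C) from three sensors on one machine.
readings = {"TT-1": 71.8, "TT-2": 72.1, "TT-3": 78.4}
print(flag_suspect_sensors(readings, tolerance=3.0))
```

A flagged sensor is where the maintenance records come in: a sensor that is old or overdue for calibration makes a sensor fault the more plausible explanation.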

A solid contextual foundation is essential

The investments you make into your digital twin’s contextual foundation are of utmost importance. By digitizing all of your legacy text and image data and linking it to an asset hierarchy, setting up automated machine learning pipelines to handle new text and image information, and then finally integrating your sensor data, you will draw significant gains from the wealth of unstructured data that many other organizations fail to work with.

You will see benefits such as:

  • Increased productivity because AI handles the tedious tasks so that your experts can spend more time on decision making
  • Greater collaboration among teams and therefore improved business efficiency
  • Higher facility uptime due to improved performance and maintenance planning
  • Full potential ROI on IIoT investments, along with lower risk of degraded data from sensor malfunctions and ageing
  • A more complete view of your facility, enabling you to optimize any inefficiencies

In your journey to Industry 4.0, don’t fall prey to the hype around IIoT. It’s definitely valuable, but it’s far from being the most important component. Instead, examine your facility through the lens of unstructured text and image data: your engineering drawings, SPIR documents, datasheets, and maintenance reports. Only then will you be able to capture the full value of all of your other investments, from sensors to augmented reality to drones and robots.
