Having data has never been the same as knowing what it means.
At our June fireside on the shift to cognitive operations, Deva Nareswara, General Manager at Sampoerna Kayoe, a wood-based manufacturer exporting across Japan and Korea, described the problem with unusual honesty. His plant runs on Excel and Google Sheets, everything in the cloud. It works. Until it doesn’t.
“The data is there. But when the person who built that spreadsheet leaves, the way they read it leaves with them. You go back to zero on the insight.”
The logs exist. The spreadsheets are full. The maintenance history is documented, dated, and backed up to the cloud. And yet the line still goes down without warning, because the one person who could actually read all that data just handed in their resignation.
This is the quietest failure mode in industrial maintenance, and one of the most expensive. The expertise that keeps a plant running rarely lives in a system. It lives in a person. And people leave.
Why Does Maintenance Data Fail to Prevent Breakdowns?
Because most maintenance systems are systems of record, not systems of intelligence. They are very good at telling you what already happened. They were never designed to tell you what happens next.
A spreadsheet, a CMMS, a well-kept logbook, they help you see history, generate reports, and track the work you’ve done. None of that stops the failure that’s quietly developing on line three right now. As Deva put it, you can have excellent documentation, well-trained people, and a clean maintenance record, and the breakdown still arrives.
The gap isn’t data. It isn’t a weak maintenance team. It’s that most organisations have no way to turn their historical data into an early, actionable signal: the kind that tells you a fault is coming while there’s still time to plan around it.
A system can tell you that a task was completed. It can show you when the pump was inspected, when the motor was serviced, or when the last work order was closed. But it does not always tell you whether that activity actually controlled bearing damage, misalignment, overheating, lubrication breakdown, or the specific fault that is now moving toward failure.
That is why many plants look organised on paper but still get surprised on the floor. The records are there. The PM schedule is there. The CMMS history is there. But the early signal has not been translated into diagnosis and action.
And when the diagnosis finally happens, it’s brutal. Deva recalled a single electrical fault that took roughly a shift and a half to trace. Capacity down nearly 18% the whole time.
“The worst downtime isn’t the machine stopping. It’s the war situation, when nobody knows what’s actually wrong yet.”
The repair is rarely the expensive part. Not knowing is.
What Happens When Your Best Technician Leaves?
Every plant has one. The engineer who can walk past a motor, listen for a second, and say that bearing has three days left in it. No sensor told them. No report flagged it. They just know. From twenty years of pattern recognition that exists nowhere but in their head.
That person is your early-warning system. And that’s the problem.
When they retire, rotate to another site, or take a better offer, the plant doesn’t just lose a pair of hands. It loses its institutional memory, the undocumented, unscalable intuition that was quietly preventing failures no system ever caught. The data they generated is still there. The judgment that made it useful is gone.
This is the brain drain that doesn’t show up on a balance sheet. And in a manufacturing sector where skilled people are mobile and demographics are shifting, it’s compounding every year.
The numbers make the risk harder to ignore. The latest manufacturing skills-gap research points to a need for millions of workers through 2033, with a large share driven by retirements. For maintenance-specific roles, U.S. labor data shows more than half a million industrial machinery mechanics, maintenance workers, and millwrights already in the workforce, with tens of thousands of openings expected each year. That is not just a hiring problem. It is a knowledge-continuity problem. Every unfilled role means more pressure on the remaining technicians, more dependence on tribal knowledge, and less time to turn experience into repeatable decisions.
Why Do Spreadsheets and CMMS Store History But Can’t Predict?
Because they were built to answer “what did we do?” — not “what’s about to break, why, and what should we do about it?”
That’s 3 different questions, and the gap between them is where unplanned downtime lives:
-
Detection — something is changing on this asset.
-
Diagnosis — this is the fault, and this is the root cause.
-
Decision — here’s the specific action, and here’s how urgent it is.
A spreadsheet gives you none of these in time to matter. Your best technician gives you all three, but only while they’re still in the building. The goal isn’t to digitise the logbook faster. It’s to make that expertise permanent.
A good example is a refinery cooling water pump reviewed through an RCM process. On paper, the asset looked well managed. It had preventive maintenance, condition-based maintenance, documented CMMS history, and acceptable operating performance. But once the team reviewed the pump against its actual failure modes, they found that much of the work being done was maintenance activity, not failure control.
In fact, the review found that about 73% of work orders were not effectively contributing to controlling failure modes. The team was busy. The system was updated. The history was documented. But the maintenance strategy was not fully connected to the real ways the pump could fail.
That is the same problem many plants face with spreadsheets and CMMS. The records are there, but the intelligence is missing. The question is not “Did we complete the work order?” The better question is “Did this action detect, prevent, or control the failure mode that matters?”
How Does Cognitive Maintenance Capture Expertise that Doesn’t Retire?
This is the real shift. Cognitive Maintenance takes what your most experienced engineer does, listen to the machine, recognise the pattern, name the fault, and turns it into a system capability that runs on every asset, every shift, without depending on any one person being present.
It works by reading the machine directly through its sound, vibration, and thermal signals (the same cues a veteran technician uses) and matching them against the Groundup.ai Asset Library™, a proprietary machine intelligence engine trained on millions of tri-parameter machine health data points spanning sound, vibration, and thermal signals.
That matters for the brain-drain problem specifically: the system doesn’t start from zero when your spreadsheet expert walks out. It arrives already carrying the pattern-recognition of thousands of operations. It detects, it diagnoses the root cause, and it prescribes the intervention. So the insight no longer evaporates when a person resigns.
It is, in effect, the senior engineer who never retires. Watch Groundup.ai’s AI Lead, Darren Toh’s take on this.
Doesn’t This Take Years and Perfect Data?
It’s the most common assumption. And, honestly, the one that stops most factories from ever starting. In our session, most attendees guessed three to six months, or more.
The reality is closer to two to three weeks to a reliable baseline, precisely because the system isn’t learning machine behaviour from scratch. It builds on the Groundup Asset Library™ and sharpens to your specific environment over time.
But Deva offered a caveat worth keeping:
“Don’t start by asking what percentage you’ll save. Start by asking whether your data is mature, your sensors are enough, and your team is disciplined. Garbage in, garbage out.”
He’s right. Cognitive Maintenance is an enabler, not a crystal ball. It doesn’t need to be perfect. But it does need usable signals, enough asset context, and a team disciplined enough to act on what the system shows.
That discipline matters. A predictive alert only creates value when it moves into planning, scheduling, and action. In many maintenance teams, once weekly schedule compliance falls below 70%, alerts risk becoming a wish list instead of a work plan. The stronger sites build the operating rhythm first, so early warnings can become scheduled interventions instead of ignored notifications.
The practical starting point is not to monitor everything at once. Start with the asset where failure-mode control matters most: the machine that affects production, safety, quality, customer commitments, spare-part lead time, or maintenance cost.
Start focused, prove the business case, build trust with the team, then scale.
What Changes When Knowledge Lives in the System?
Everything downstream of the morning briefing.
Deva described his teams walking in on a Monday already knowing the state of the floor. Calmer, more confident, no longer in permanent firefighting mode. Less fear of making the wrong call. More time for actual improvement instead of chasing breakdowns. Lower anxiety, all the way up to his directors.
That is the operational shift. The system does not just store information. It helps move information into action: from machine signal, to asset health, to failure-mode context, to maintenance priority, to planned intervention. The team is no longer asking, “What went wrong?” They are asking, “What needs attention first, why does it matter, and what should we do next?”
And it shows commercially. When his machines run reliably, he can make confident commitments to customers. I.e. Scaling an order from four thousand units to six because he can trust the asset, not hope it holds. Better planning. Fewer surprises. A team that innovates because it isn’t bracing for the next emergency.
Which leads to the line from the session worth ending on:
“The winner is never the biggest or most sophisticated plant. It’s the one that keeps its assets running — reliably, consistently, efficiently.”
In Indonesia’s manufacturing race, and across Southeast Asia, that’s the whole game. And reliability can no longer depend on whether your best engineer happens to still be on the payroll.
The Transformation Isn’t Really About Technology
It’s about making sure the knowledge that keeps your plant running doesn’t walk out the door with the people who hold it. That’s a question of culture, of trusting your data, and of having the courage to act on the insight in front of you.
The technology to do it already exists.
The only real question is who starts using it first.
See what unplanned downtime is actually costing your operation. Your assets, your numbers, with the Groundup.ai ROI Calculator →
Or contact us to book a 30-minute Asset Assessment with our engineers to map your most critical assets and where to start. ⚡️
Frequently Asked Questions
What is the brain drain problem in industrial maintenance? It’s the loss of undocumented expertise when experienced engineers retire, rotate, or resign. Much of a plant’s fault-detection capability lives in individual judgment. Recognising a failing motor by sound, for example: rather than in any system. When that person leaves, the data they generated remains, but the ability to interpret it is gone.
Can AI really replace an experienced maintenance engineer? No, and it isn’t meant to. Cognitive Maintenance captures and scales an engineer’s expertise so it runs on every asset continuously, then points the team toward the assets that genuinely need attention. The engineer’s role shifts from firefighting to improvement.
How long does Cognitive Maintenance take to deploy? A reliable baseline is typically achievable in two to three weeks, because the Groundup.ai Asset Library™ brings pre-training on millions of tri-parameter data points spanning sound, vibration, and thermal signals. By combining advanced signal processing with agentic reasoning, the platform transforms raw telemetry into autonomous operational intelligence.
brings 5,000+ validated anomaly signatures rather than learning machine behaviour from scratch.
Do I need perfect data before starting? You need reasonable data maturity. I.e. Clean inputs, adequate sensors, a disciplined team. Garbage in, garbage out. The most practical time to deploy is once a facility is in commercial operation and its data is reliable.
Is Your Plant Ready to Move from Reactive to Cognitive?
Let us show you how quickly you can secure predictable uptime.
👉 Watch the full webinar video here to future-proof your operations
P.S. If you’re a plant manager or operations leader trying to build the internal case for this transition, the ROI calculator is the fastest way to anchor the conversation in your own numbers.
