Utilities and power grid operators walk a fine line when it comes to providing reliable and resilient power while also protecting valuable assets in the electricity transmission and distribution system.
Power outages are disruptive, and in some instances life-threatening. More frequent outages, often caused by extreme weather events, mean utilities and power grid operators are looking for better ways to detect outages, restore electric service more quickly, and even prevent blackouts altogether.
It can be a daunting task. The global list of major outages has kept growing in recent weeks and months. Thousands of electricity customers were left without power as winter storms moved across much of the U.S. in January. Millions were left in the dark across Puerto Rico on New Year’s Eve because of blackouts there, as the island continues to struggle with upgrading its electricity transmission infrastructure after numerous storms, beginning with Hurricane Maria in 2017.
A major storm left more than 600,000 without power in Washington state in November. Cuba, with a population of about 10 million, sustained a nationwide power outage in October due to the unexpected shutdown of a major power plant—adding to a series of blackouts that have plagued the island in recent months. Ecuador has sustained several power outages in the past year, with the country’s leaders calling the situation an “unprecedented energy crisis.”
Utilities and power grid operators are looking at infrastructure upgrades, along with technology such as outage management systems (OMS), to help mitigate the impact of blackouts.
“Our daily lives depend more than ever on reliable electricity service to our homes and businesses: computers, smartphones, and networking equipment for home-based work, electric vehicle [EV] chargers for transportation, etc.,” said Tom Eyford, global industry specialist at Oracle Energy and Water. “At the same time, electric grids are operating much closer to their limits than ever before as utilities strive to keep electricity affordable while seeing a resurgence in system load growth due to the expanding consumer energy technology as well as larger, centralized loads such as cloud-based computing and artificial intelligence (AI).
“Given the shrinking operating margins, modern grid control capabilities such as advanced distribution management systems [ADMS] are needed to predict where problems are likely to occur on the network, and to recommend and execute pre-emptive actions to mitigate them before they happen,” said Eyford. “All of this requires a steady diet of real-time data from grid devices, sensors, smart meters, weather stations, and more.”
1. Eaton Experience Centers in Houston, Texas, and Pittsburgh, Pennsylvania, provide hands-on training in real world application environments, advancing industry education on the technologies vital to power reliability, safety, and security. Courtesy: Eaton
“Real-time insights into grid conditions are critical for reducing outages caused by extreme weather or managing peaks in electricity demand,” said Jasmin Giroux-Maltais, product manager-feeder automation at Eaton, a company with training centers (Figure 1) that provide hands-on experience across the electricity sector. “We at Eaton are working closely with utilities to identify the event signatures needed to better detect disturbances early and enable rapid response through automated grid services. This capability helps utilities minimize the impact of potential outages by identifying and addressing issues before they escalate, ensuring the reliability and resilience of the electric grid under the most demanding conditions.”
2. With severe weather occurring more often across the U.S. and the world due to the effects of climate change, finding ways to harden power grids and transmission and distribution infrastructure has taken on new urgency. Courtesy: StockCake
Thomas L. Keefe, vice chair, and U.S. Power, Utilities & Renewables Sector leader for Deloitte, said having real-time information available “is very important because the impact of electric disturbances due to weather events [Figure 2] has significantly increased over the past few years, and they mostly hit the distribution grid, where 90% of outages originate. Demand losses have more than doubled between 2014 and 2018, and over the past five years, the disruption of 317 gigawatts of electricity impacted 66 million customers for longer durations [Figure 3].”
3. Research by Deloitte based on data from the U.S. Department of Energy shows how power disruptions are occurring more often, and lasting longer, in recent years. Courtesy: Deloitte
“Real-time information is absolutely crucial for modern grid management, particularly when dealing with extreme weather events or demand surges. The key is having systems that can process and analyze millions of data points simultaneously, enabling operators to identify potential issues before they escalate into outages,” said Nate Walkingshaw, CEO and founder of Torus, a company building energy storage and management products. “Our approach integrates multiple data streams to pinpoint outage locations and impact assessment. Our real-time monitoring gives us incredible visibility and insights into the distributed energy resources [DER] across the network. This comprehensive view allows us to track system performance at both macro and micro levels. We’ve developed sophisticated algorithms that can rapidly assess outage scope and impact, enabling more efficient resource allocation and faster response times.”
Several companies have designed OMS packages, offering a wide range of services. At the core of a modern OMS is a detailed network model of the distribution system. A utility’s geographic information system, or GIS, is often at the heart of this network model, which is used to find outage locations as well as areas where a failure may be imminent, such as equipment issues along the grid that could lead to a blackout.
An OMS also can help organize power restoration efforts and manage resources (such as the number of workers needed) based upon the size of an outage and its likely duration. It also can improve safety performance and help ensure regulatory compliance. A highly integrated system can leverage distribution automation that supports fully automated fault location, isolation, and service restoration (FLISR), which minimizes the impact of an outage. Such a system can provide higher visibility into localized outages and allow for better planning and restoration of larger network outages.
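To make the FLISR idea concrete, here is a minimal Python sketch under simplifying assumptions that are not drawn from any vendor's product: a single radial feeder split into sections by switches, a normally open tie switch at the far end connecting to a neighboring feeder, and a fixed spare capacity on that neighbor. The function name `plan_flisr`, the section loads, and the capacity figure are all hypothetical.

```python
def plan_flisr(section_loads_kw, faulted_index, tie_capacity_kw):
    """Plan isolation and restoration for one fault on a simple radial feeder.

    Sections are ordered from the substation (index 0) out to a normally
    open tie switch at the far end. This is an illustrative model only.
    """
    n = len(section_loads_kw)
    if not 0 <= faulted_index < n:
        raise ValueError("faulted_index is outside the feeder")

    # Isolation: open the switches on both sides of the faulted section.
    # Sections closer to the substation can be re-energized from the source.
    upstream = list(range(faulted_index))
    downstream = list(range(faulted_index + 1, n))

    # Restoration: close the tie switch and pick up healthy downstream
    # sections, starting next to the tie, while spare capacity remains.
    restored_via_tie = []
    remaining_kw = tie_capacity_kw
    for i in reversed(downstream):
        if section_loads_kw[i] > remaining_kw:
            break
        restored_via_tie.append(i)
        remaining_kw -= section_loads_kw[i]

    unserved = [i for i in downstream if i not in restored_via_tie]
    return {
        "isolated": [faulted_index],
        "restored_from_source": upstream,
        "restored_via_tie": sorted(restored_via_tie),
        "unserved": unserved,
    }


# Hypothetical feeder: four sections, fault in the second one.
plan = plan_flisr([400, 300, 500, 200], faulted_index=1, tie_capacity_kw=600)
print(plan)
```

In this example the tie can carry the 200-kW section next to it but not the additional 500-kW section, so one healthy section stays out until crews repair the fault. Real FLISR schemes work on full network models with protection coordination and voltage checks, which this sketch leaves out.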
Safety is improved through increased situational awareness and real-time connectivity between the control room, the field crew, and SCADA (supervisory control and data acquisition), which helps utilities better monitor the network and crew activity for potential safety hazards. The OMS also can provide a record of performance, including during major weather events, and an electronic audit trail helps ensure compliance with regulatory reporting requirements such as the system and customer average interruption frequency and duration indices (SAIFI, SAIDI, and CAIDI).
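The interruption indices mentioned above follow standard definitions: the System Average Interruption Frequency Index (SAIFI) is total customer interruptions divided by customers served, the System Average Interruption Duration Index (SAIDI) is total customer-minutes of interruption divided by customers served, and the Customer Average Interruption Duration Index (CAIDI) is SAIDI divided by SAIFI. The short Python sketch below computes them from an OMS-style event log. The `Interruption` record and the sample figures are hypothetical, and the sketch does not apply the major-event-day exclusions that regulators often require.

```python
from dataclasses import dataclass


@dataclass
class Interruption:
    customers_interrupted: int
    duration_minutes: float


def reliability_indices(events, customers_served):
    """Return (SAIFI, SAIDI, CAIDI) for a list of sustained interruptions."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    total_interrupted = sum(e.customers_interrupted for e in events)
    customer_minutes = sum(
        e.customers_interrupted * e.duration_minutes for e in events
    )
    saifi = total_interrupted / customers_served   # interruptions per customer
    saidi = customer_minutes / customers_served    # minutes per customer
    caidi = saidi / saifi if saifi > 0 else 0.0    # minutes per interruption
    return saifi, saidi, caidi


# Hypothetical year: three outages on a system serving 10,000 customers.
events = [Interruption(1000, 90), Interruption(500, 30), Interruption(2500, 60)]
saifi, saidi, caidi = reliability_indices(events, 10_000)
print(f"SAIFI={saifi:.2f}  SAIDI={saidi:.1f} min  CAIDI={caidi:.2f} min")
```

For these made-up numbers, the average customer sees 0.4 interruptions and 25.5 minutes without power over the year, and an average interruption lasts 63.75 minutes.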
“Oracle offers an outage management system used in many utility control rooms around the world, but because outage management has been and remains such a critical process for both utilities and facilities, there are many mature options in the market to choose from,” said Eyford. “GE Vernova, Hitachi, Hexagon, AspenTech, Schneider, Survalent, Minsait, Trimble, and ETAP are some of the more common platforms, but there are many others as well. In addition, industrial automation systems and microgrid controllers from providers like Schneider Electric, Siemens, Eaton, and GE can act in much the same fashion for the local devices and networks that they supervise, and in many cases, they can provide additional visibility and flexibility to the utility’s central outage management capabilities.”
An OMS also supports the improvement of distribution reliability by providing historical data that can be studied to find common causes, failures, and damages leading to blackouts. Improvement programs can be prioritized with an understanding of the most common modes of failure. Hitachi Energy said its Network Manager OMS is part of the group’s ADMS. It said the system features a “highly integrated system leveraging distribution automation that supports fully automated fault location, isolation, and restoration, minimizing the impact of an outage.” The company said its system “provides higher visibility into localized outages, and allows for better planning and restoration of larger network outages.”
Survalent offers its SurvalentONE OMS, which it calls “a comprehensive outage management solution that empowers utilities to reduce the scale and duration of outages through efficient tracking and management. The solution provides predictive outage analysis that helps operators determine the probable fault location so that field crew can quickly commence restoration activities.”
The company said the SurvalentONE OMS “enables transparency between the control room and field crew, as well as processes for rapid damage assessment, to ensure the right crews and equipment are dispatched at the onset of restoration. SurvalentONE OMS integrates seamlessly with SurvalentONE SCADA and DMS applications to ensure that all applications share the same data. For example, FLISR events are automatically captured in the OMS. This ensures that all operators, dispatchers, customer service reps, and executive team members are equally informed about the status of outages and restoration activities, and all integrated applications are leveraging comprehensive, accurate data about the system. The solution’s enhanced customer communications capabilities include automated text messaging and social media update capabilities to provide customers with up-to-date outage information, including the estimated time of restoration and safety information.”
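One common way to think about predictive outage analysis is upstream tracing: given the locations that have reported a loss of power, walk the network model toward the substation and pick the deepest protective device that feeds all of them. The Python sketch below illustrates that general idea only. It is not Survalent's algorithm, and the device names and the `parent` map are hypothetical. It also assumes a strictly radial (tree-shaped) network.

```python
def upstream_path(node, parent):
    """Devices from `node` up to the source, nearest first."""
    path = []
    while node is not None:
        path.append(node)
        node = parent.get(node)
    return path


def probable_device(reports, parent):
    """Deepest device that is upstream of every reporting location."""
    if not reports:
        return None
    paths = [upstream_path(r, parent) for r in reports]
    common = set(paths[0]).intersection(*map(set, paths[1:]))
    # Walking up from the first report, the first shared device is the deepest.
    for device in paths[0]:
        if device in common:
            return device
    return None


# Hypothetical network model: each device maps to the device feeding it.
parent = {
    "breaker": None,
    "recloser_R1": "breaker",
    "fuse_F1": "recloser_R1",
    "fuse_F2": "recloser_R1",
    "xfmr_T1": "fuse_F1",
    "xfmr_T2": "fuse_F1",
    "xfmr_T3": "fuse_F2",
}

print(probable_device(["xfmr_T1", "xfmr_T2"], parent))  # fuse_F1
print(probable_device(["xfmr_T1", "xfmr_T3"], parent))  # recloser_R1
```

Two reports behind the same fuse point to that fuse, while reports behind different fuses push the prediction up to the shared recloser. Production systems typically add rules on top of this, such as requiring a minimum share of downstream customers to report before escalating.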
4. Automated solutions are part of outage management systems, with features and functionality that would enable systems and assets to be controlled from a central location. Control rooms are becoming more sophisticated through the use of software and digital technologies. Courtesy: StockCake
MCG Energy Solutions said its outage management software “provides features and functionality to manage outage systems tickets across the entire lifecycle, including execution and coordination activities with field operators and [grid operators].” The company said its Versify OMS software and operator logbook software, known as OpLog, “utilize a unique workflow enabled architecture that may be configured to manage complex processes associated with outage management, control room operations [Figure 4], [and] switching within generation, transmission, and distribution businesses.”
The executives who spoke with POWER said grid automation software is critical for utilities and grid operators in support of improved electricity transmission delivery. The software enables connections and protections and, when integrated with smart grid tech such as voltage regulators, capacitors, reclosers, switchgear, sensors, and more, can help improve transmission system reliability and often reduce operating costs.
“Grid automation software automates the fault isolation and restoration process, reducing the need for large field crews to manually handle these tasks. Restoration times are calculated more accurately through real-time data insights, and the utility can deploy crews into the field more strategically by first assessing the severity and scope of the outage,” said Giroux-Maltais. “To put it simply, automated systems improve efficiency by centralizing operations, allowing the utility to deploy the right number of field workers to the right place the first time to minimize service disruption for its members.”
Walkingshaw told POWER that software upgrades “provide three critical advantages for outage management. First, they enable comprehensive real-time monitoring across hybrid systems, processing millions of data points to detect subtle pattern changes that might indicate impending issues. Second, they facilitate intelligent load management through advanced predictive analytics, particularly valuable when managing complex systems combining traditional power sources with newer technologies like flywheel energy storage systems, in our case. Third, they enhance cybersecurity protocols, which has become increasingly crucial as power infrastructure becomes more interconnected. Our experience has shown the value of owning our software stack, giving us total control of the modern software systems that are essential for maintaining grid stability while responding to unplanned events.”
Giroux-Maltais said Eaton is “working with utilities across North America to improve outage detection, management, and response through distribution automation software such as our Feeder Automation Manager—Reliability Module, which is part of our Brightlayer Utilities suite. This solution intelligently detects grid disturbances, isolates faults, and automatically reconfigures the grid to restore service to unaffected areas. The software integrates fault location, isolation, and service restoration capabilities, enabling the utility to minimize the duration and impact of outages across its distribution network.” Giroux-Maltais added, “It is important to flag that FLISR capabilities differ from an outage management system. The latter handles customer communication and crew management during an outage, whereas FLISR capabilities are aimed directly at grid automation functionality.”
Keefe detailed how this technology supports utilities and grid operators as they work to make the electricity delivery system more reliable and resilient.
“Advanced sensing protection and controls build on the foundational advanced metering infrastructure and advanced distribution management systems, which enable maximization of use cases from complementary technologies such as fault location, isolation, and service restoration,” said Keefe. “DER and microgrid integration and utilization can develop as ADMS modules or interconnected systems. Bidirectional charging, grid-forming inverters, and smart grid chips communicating real-time data further enable integration. Finally, utilities are starting to deploy AI for resilience. Use cases include pre-outage and outage prediction and monitoring, and enhanced real-time grid operations.”
Giroux-Maltais said, “By applying distribution automation software solutions and advancing metering devices, utilities are significantly reducing outage frequency and duration, while helping to reduce truck rolls and time for utility crews. That’s because these types of upgrades can automate fault detection, isolation and service restoration, reducing the need for manual investigation and intervention. They also integrate seamlessly with existing legacy systems, allowing utilities to modernize operations without replacing current infrastructure.”
The Eaton executive provided a real-world example. “Carroll Electric Membership Corporation (EMC) in western Georgia utilizes 250 intelligent electronic devices across its distribution network and our Feeder Automation Manager—Reliability Module to automatically isolate damaged sections and restore power to as many people as possible. Since the start of initial deployments in 2014, this distribution automation program has helped the utility continually reduce its System Average Interruption Duration Index [SAIDI] year-over-year.”
Giroux-Maltais said, “In 2022 alone, Carroll EMC calculated that Eaton’s [module] reduced outage time by about 75 minutes. The utility anticipates a 41% reduction in outage duration, including 34% reduction from the three-phase deployment and an additional 7% reduction from the single-phase project as Carroll EMC continues to expand deployment of our FLISR technology.”
Keefe said project resiliency targets often revolve around shorter outage duration and faster restoration. “Unlocking the potential of DER to better manage power outages requires advanced digital technologies to enable active management, and could deliver cost savings to utilities and customers. While controls to manage DER in real time require significant utility investment, DER management includes least-cost mechanisms such as time-of-use rates and demand response for programmable thermostats that have been operating for decades with simple controls,” Keefe said.
Keefe added, “Studies commissioned by the states of California and New York show that managed electrification could lower the cost of distribution upgrades needed through 2035 by more than $30 billion in each of these states. That’s because building efficiency measures and smart devices to manage energy usage and smooth EV charging could reduce capital spending on new substations, transformers, feeders, and other distribution equipment.”
Keefe referenced a Department of Energy program designed to reward companies working on grid upgrades. “Grid Resilience and Innovation Partnership [GRIP] awards provide a window into the needed technical capabilities and software upgrades that can help power plant operators and grid managers better manage power outages. The first round of GRIP provided funding to 24 investor-owned utilities [IOUs]. All but two of the awarded projects cover distribution.”
5. This chart identifies projects designed to upgrade the power grid that are receiving federal funding. Courtesy: Deloitte
Keefe said, “The projects range from infrastructure replacement and hardening to AI deployments. The figure below [Figure 5] shows all IOUs awarded GRIP grid resilience and smart grid grants, with projects categorized according to a distribution investment prioritization pyramid, where each level builds on the preceding one. Basic distribution automation projects among IOU awardees involve sectionalization, or dividing feeders into sections so that utilities can isolate a fault on one feeder to a small area while restoring power to adjoining areas from a second feeder. More advanced applications the technology enables include creating dynamic DER microgrids to minimize the impact of outages. Other basic distribution automation projects include fault locators and reclosers to detect and interrupt faults, and supervisory control and data acquisition systems to remotely monitor and control the grid.”
Roy Fadida, co-founder and Chief Product Officer of enSights, a group that monitors solar energy systems, said, “Software can forecast upcoming outages based on electrical parameters collection and by performing anomaly detection to realize that the current load on the grid might cause outage.” Fadida, like the other executives who spoke with POWER, said having real-time information—particularly about renewable energy’s availability for the grid—is critical.
“By collecting and analyzing data from a wide array of geographically dispersed PV [solar photovoltaic] systems, a renewable management platform can predict potential power fluctuations before they evolve into serious issues. For instance, if the data and the forecasting analysis based on this data indicates a pattern of unexpected output drops across multiple PV sites, the system can anticipate impending grid imbalances or load swings.”
Fadida continued: “Equipped with this foreknowledge, the platform can send timely alerts to the utility’s OMS. These early warnings allow utilities to take proactive measures, such as adjusting dispatch schedules, coordinating energy storage resources, or initiating demand response programs, to stabilize the grid and prevent widespread outages. In essence, integrating PV and storage management insights into OMS operations ensures more effective and preemptive outage mitigation.
“Moreover, storage systems can also provide critical black-start capabilities, enabling portions of the grid to be restarted after a system-wide outage. This makes it easier and faster to bring renewable-powered microgrids and isolated systems back online, improving overall restoration times,” said Fadida. He said his company’s “ecosystem collects data from multiple clean energy sites and calculates nearby behavior to detect anomalies. The goal is to identify if a specific issue [such as an outage] is related to a specific PV site or the larger area. In addition to simply detecting outages, by aggregating distributed storage systems, our ecosystem can help prevent or compensate for various grid events, such as outages.”
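The distinction Fadida draws, between a problem at one PV site and a problem across the larger area, can be illustrated with a small Python sketch. It compares each site's actual output with its expected output and checks how many nearby sites dropped at once. This is an assumption-laden illustration rather than enSights' method: the function name, the 50% drop threshold, and the 60% area threshold are all hypothetical.

```python
def classify_drops(readings, drop_threshold=0.5, area_fraction=0.6):
    """Label a set of nearby PV sites as 'normal', 'site', or 'area'.

    readings maps site name -> (expected_kw, actual_kw). A site counts as
    dropped when its output falls more than `drop_threshold` below the
    expected value. Both thresholds are illustrative assumptions.
    """
    dropped = sorted(
        site
        for site, (expected, actual) in readings.items()
        if expected > 0 and actual < (1 - drop_threshold) * expected
    )
    if not dropped:
        return "normal", dropped
    share = len(dropped) / len(readings)
    return ("area" if share >= area_fraction else "site"), dropped


# One site far below its forecast: likely a local equipment issue.
print(classify_drops({"A": (100, 95), "B": (80, 10), "C": (120, 118)}))
# Most sites down together: more likely a grid-side or area-wide event.
print(classify_drops({"A": (100, 5), "B": (80, 10), "C": (120, 118)}))
```

A real platform would also have to rule out ordinary causes of a shared drop, such as cloud cover, before flagging a grid event.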
The use of smart meters (Figure 6) by many utilities supports advanced automatic meter reading (AMR) systems, which can provide outage detection and restoration capability. These systems can serve as “virtual calls,” identifying customers who have lost power without requiring those customers to alert the utility. Such systems may be integrated with SCADA systems, which can automatically report the operation of monitored circuit breakers, along with other intelligent devices such as SCADA reclosers.
6. Smart meters are being used by more utilities as a way to gather data, with a goal to improve power delivery and efficiency—along with automating collection of outage information. Courtesy: PxHere
A mobile data system also can be integrated with an OMS. This integration enables outage predictions to automatically be sent to crews in the field; the crews also can update the OMS with information such as estimated restoration times without requiring radio communication with the control center. Crews also can transmit work reports about what actions were taken during an outage restoration.
“Improving visibility, situational awareness, and flexibility is critical for managing the modern electric system, especially with the rapid adoption of customer-owned distributed energy resources such as electric vehicles, solar arrays, and local energy storage,” said Eyford. “Although the industry buzz is generally around virtual power plants and flexible interconnections, these technologies can create both operational risks [‘hidden’ load that must be picked up by the grid due to the local generation previously serving it remaining temporarily offline] and safety risks [many more potential generation sources to be aware of and secure]. At the same time, they offer the potential to restore portions of the network that may be isolated from the grid due to damage and may even be able to pre-emptively do so to avoid an outage altogether.”
Eyford noted, though, that “Sensors and communications are now standard on most modern grid equipment, but traditional utility systems were designed well before these technologies were envisioned. Software upgrades are needed to model, monitor, and manage them effectively.”
Eyford told POWER: “There is a great deal of data available to help determine outage locations, both from utility systems as well as from external sources. For larger outages, the utility’s SCADA system and other telemetered devices will immediately inform grid operators that a problem has occurred, but for outages affecting a smaller area, technologies such as advanced metering infrastructure can provide near-real-time notifications that customers have experienced a power outage.
“Where utilities haven’t yet deployed AMI [advanced metering infrastructure], or in cases where communications to those meters is not fully reliable, they are still largely dependent on affected customers reporting their outage themselves, typically by calling the utility’s outage reporting hotline or submitting their information via the utility’s website or smartphone app. They may also receive information directly from 911 dispatchers or emergency service personnel, especially if public safety concerns are present such as a car-hit pole or a downed power line. As artificial intelligence and large-language models such as ChatGPT become more widely available, utilities are increasingly looking to AI to analyze social media data to help identify affected customers and safety issues as well as non-customer locations such as streetlights and traffic signals.”
Matt Smith, who leads the global business and product strategy for the grid management business at Itron, a global group helping utilities develop innovative solutions for their operations, told POWER: “A structured DI [distributed intelligence] system continuously monitors the performance of both hardware and software, enhancing grid reliability by identifying potential weaknesses or components under strain due to increased demand or environmental factors. By leveraging predictive analytics, utilities can anticipate when equipment is likely to fail and schedule timely updates or replacements before failures occur, preventing costly downtime and ensuring grid reliability. The approach not only maintains system stability but also supports long-term resilience as the grid evolves.”
Smith said model predictive control, or MPC, “can impact grid reliability by predicting energy demand and leveraging the use of EV energy storage to prevent potential overloads. Additionally, by integrating advanced weather forecasts, MPC can also prepare the grid for incoming extreme weather events. MPC can also isolate potential faults and reroute power before disasters strike, thus minimizing outages and maintaining grid stability. This combination of demand prediction and weather integration preparation helps ensure that the grid remains resilient during high-stress periods.”
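As a rough illustration of the receding-horizon idea behind MPC, the Python sketch below uses a short demand forecast to decide when aggregated EV storage should charge or discharge so that net load stays under a feeder limit. It is a toy, brute-force version under stated assumptions, not Itron's implementation: one-hour steps, a storage fleet that can only charge, idle, or discharge at a fixed power step, and a perfect forecast. All names and numbers are hypothetical.

```python
from itertools import product


def plan_cost(demands, actions, soc, limit_kw, capacity_kwh, eps=1e-3):
    """Cost of an action sequence, or None if it violates storage limits."""
    cost = 0.0
    for demand, u in zip(demands, actions):  # u > 0 discharges, u < 0 charges
        soc -= u                              # 1-hour steps: kW maps to kWh
        if soc < 0 or soc > capacity_kwh:
            return None
        overload = max(0, demand - u - limit_kw)
        cost += overload ** 2
    return cost - eps * soc                   # mild preference for stored energy


def mpc_dispatch(forecast_kw, limit_kw, step_kw, capacity_kwh, horizon=3, soc=0):
    """Return the net load per hour after receding-horizon storage dispatch."""
    net_loads = []
    for t in range(len(forecast_kw)):
        window = forecast_kw[t:t + horizon]
        best = None
        # Enumerate every charge/idle/discharge sequence over the window.
        for actions in product((-step_kw, 0, step_kw), repeat=len(window)):
            cost = plan_cost(window, actions, soc, limit_kw, capacity_kwh)
            if cost is not None and (best is None or cost < best[0]):
                best = (cost, actions[0])
        u = best[1]        # apply only the first action, then re-plan
        soc -= u
        net_loads.append(forecast_kw[t] - u)
    return net_loads


forecast = [60, 70, 110, 120, 80]   # hypothetical hourly demand, kW
net = mpc_dispatch(forecast, limit_kw=100, step_kw=20, capacity_kwh=40)
print(net)
```

With these numbers the controller charges during the two light hours and discharges through the two peak hours, keeping net load at or below the 100-kW limit even though forecast demand reaches 120 kW. Practical MPC replaces the brute-force search with a proper optimizer and has to cope with forecast error.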
Terry Saunders, Worldwide Utilities and Industry leader at IBM, said: “Model predictive control can optimize asset health and predict failures while extending the useful life of assets through a strategy that prioritizes repairs and replacements. Condition-based predictive maintenance based on health insights from operational data and analytics helps you put your asset data to work. One example of this could be software that helps you understand the status of critical equipment and assets with insights from data and analytics to help make smarter decisions.”
Saunders noted, “Utilities rely on inspections to understand current asset condition and need for maintenance. Many assets are geographically dispersed throughout the service territory, making physical inspections time-consuming and expensive. With aging critical assets, more frequent inspections are warranted … however, there are fewer people available to conduct these inspections with ongoing retirements and a transitioning workforce.”
Saunders said IBM’s Maximo Application Suite, or MAS, “helps support condition monitoring to trigger warnings or immediate actions. Capabilities such as Monitor, Health, and Predict provide an enterprise view of asset health, helping organizations identify anomalies and justified maintenance,” which can help prevent outages and enhance power reliability.
Torus is building technology that Walkingshaw said takes “a multi-layered approach to resource management during outages. Our hybrid energy storage systems, combining flywheel technology with traditional batteries, provide us unique flexibility in responding to outages of varying sizes and durations. For shorter-duration events, our Nova Spin flywheel system can respond within milliseconds to maintain frequency stability, while our chemical battery systems handle longer-duration support needs. Our resource allocation is driven by real-time data analytics that help optimize deployment of both technological and human resources based on the specific characteristics of each event.”
Walkingshaw said his company has developed “an integrated approach to outage management that combines real-time monitoring with predictive analytics. Our outage management system, built in-house, excels at managing distributed energy resources and hybrid storage solutions. This approach has proven especially valuable in coordinating nearly 1 GWh of facility-managed power across 60-plus projects in 2024 alone, while responding to more than 120 unplanned demand response events. The system’s architecture emphasizes cybersecurity and integration with existing SCADA infrastructure, enabling rapid response to potential outages before they cascade through the network.”
Utility executives for years have said customers want a better understanding of how long their power will be out, along with more details about why service has been disrupted. Walkingshaw noted the importance of more accurate forecasting.
“Our restoration time estimates integrate multiple data sources and predictive modeling to provide accurate projections. The calculations factor in real-time system status, historical performance metrics from comparable events, current operating conditions, geographic distribution of affected areas, and available resource capacity,” he said. “For field crew management, we employ an adaptive dispatch system that optimizes deployment by prioritizing critical infrastructure, coordinating between automated and manual interventions, and adjusting resource allocation as events unfold. This approach enables efficient coverage across our infrastructure while maintaining appropriate emergency response capabilities. We continuously refine our estimates based on real-time feedback from field operations and system diagnostics, ensuring our restoration projections remain accurate as conditions change.”
Rob Brook, senior vice president Americas at Neara, a group that works with utilities worldwide on simulations of severe weather events to support grid hardening, said, “We’re asking the power grid to do more in the next 10 years than we have in the last 50. This demand is driven by a number of factors, like an increase in severe and unpredictable weather, multiple accessible technology devices in the hands of every consumer, and power-hungry data centers.
“To meet this unprecedented energy demand and better manage power outages, migration to AI-assisted software platforms can help grid managers identify outage and safety risks such as equipment failures and maintenance blindspots,” said Brook. “Upgrading to AI-enhanced software helps utilities surface bottlenecks in their network that they can’t easily see otherwise, and can offer tailored recommendations to reduce the likelihood of outages and shorten their duration. For example, by simulating various weather-related conditions, Neara’s technology highlights where specific upgrades, like increasing pole height or changing the conductor type, would result in the network’s ability to support fast-growing loads. Today, every business and consumer rely on electricity and they expect clear communication and maintenance from their providers during outages—up-to-date software helps every stakeholder do just that.”
7. Mapping technology that enables data collection from transmission assets is important to help utilities and grid operators take steps to mitigate or avoid power disruptions. Courtesy: Neara
Said Brook, “While real-time information is essential to fuel effective in-the-moment decision-making, the ability to anticipate potential outages and their impact before an actual grid-threatening event also plays a critical role in accelerating recovery times and minimizing impact. Accurate 3D mapping across network environments [Figure 7], including manmade assets like poles, lines, and stakes, as well as natural elements like vegetation and ground topography, is extremely important for establishing the ground truth that can help utilities proactively plan to reduce outages and respond more effectively by monitoring real-time conditions in extreme weather.”
8. Technology that can help utilities and grid operators determine flood risk to their infrastructure is another way to enable proactive measures that can mitigate the impacts of disruptions to an area’s power supply. Courtesy: Neara
Brook said Neara’s technology “enables utilities to simulate weather like storms, wildfires, floods [Figure 8], and winds to see how these events will impact the grid, and provide AI-enhanced improvement recommendations accordingly. Neara helps utilities take a more proactive approach to resiliency, reducing time-spent on processes from structure loading assessment to new network design and vegetation management from years to hours. Risk mitigation is the ultimate goal, and new AI-assisted software programs empower utilities to better understand the vulnerabilities within their networks so they can identify exactly where their networks need attention and the best ways to get ahead of potential issues.”
Neara’s modeling technology takes into account a range of data sources, including LiDAR (light detection and ranging), GIS (geographic information system), imagery, pole libraries, etc., that help paint the most accurate possible picture of utilities’ entire networks, including all assets and surrounding things like vegetation, roads, buildings, and more. “A common delay in power restoration is the reality of poor data quality. Utilities must understand the relative strengths and weaknesses of different data sources, then employ the right mix of these sources to maximize visibility and de-risk networks. Without accurate data, utilities are unable to easily identify the cause of an outage, such as a pole failure, and critically, the precise location of that pole failure,” Brook said.
“By modeling customers’ networks with multiple data sources, we help them automatically correct errors in GIS, which means that they can pinpoint the cause of an outage and the exact location of the offending equipment failure or vegetation encroachment, save time across routine field operations, particularly in instances where a pole marked for maintenance might be 600 yards from where the map indicates, and most importantly, in emergency situations when on borrowed time, they know exactly where to send field teams, and when it’s safe to do so,” he explained.
Giroux-Maltais also noted the importance of gathering data and automating processes, reiterating that automated systems allow “the utility to deploy the right number of field workers to the right place the first time to minimize service disruption for its members.”
—Darrell Proctor is a senior editor for POWER.