How automation is revolutionizing outage management

Automating the mundane diagnostic data collection needed to start analysis can save 15 minutes when troubleshooting an outage, according to Sean Scott (pictured), chief product officer of PagerDuty Inc.

Those initial tasks, such as checking the memory, verifying that the CPU is still looking good, and so on, are ideal work for automation, he explained.

“We can run our automated diagnostics and have all that data pumped directly into the incident,” Scott said. “When the responder engages, it’s all there waiting for them.”
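As a rough illustration of the idea Scott describes, the sketch below runs a few diagnostic probes and stores the results on an incident record before a responder opens it. It is a minimal sketch, not PagerDuty's implementation: the incident is a plain dictionary, and the names `PROBES`, `collect_diagnostics` and `attach_diagnostics` are hypothetical.

```python
import os
import shutil
import time
from typing import Any, Callable, Dict


def load_average() -> float:
    """Return the 1-minute load average (only available on Unix-like systems)."""
    return os.getloadavg()[0]


def disk_free_fraction(path: str = "/") -> float:
    """Return the fraction of disk space still free at `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


# Hypothetical probe registry: probe name -> zero-argument callable.
PROBES: Dict[str, Callable[[], Any]] = {
    "load_average_1m": load_average,
    "disk_free_fraction": disk_free_fraction,
}


def collect_diagnostics(probes: Dict[str, Callable[[], Any]]) -> Dict[str, Dict[str, Any]]:
    """Run every probe; a failing probe is recorded instead of aborting the run."""
    results: Dict[str, Dict[str, Any]] = {}
    for name, probe in probes.items():
        try:
            results[name] = {"ok": True, "value": probe()}
        except Exception as exc:
            results[name] = {"ok": False, "error": str(exc)}
    return results


def attach_diagnostics(incident: Dict[str, Any],
                       probes: Dict[str, Callable[[], Any]] = PROBES) -> Dict[str, Any]:
    """Return a copy of the incident with the diagnostics attached."""
    enriched = dict(incident)
    enriched["diagnostics"] = {
        "collected_at": time.time(),
        "results": collect_diagnostics(probes),
    }
    return enriched
```

Calling `attach_diagnostics({"id": "INC-1"})` returns the same incident with a `diagnostics` entry, so the data is already in place when someone starts investigating. Recording probe failures rather than raising them keeps one broken check from hiding the rest.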

Scott spoke with theCUBE industry analyst Lisa Martin at PagerDuty Summit, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how automation can keep outage responses moving quickly, despite the huge volume of data that has to be reviewed. (* Disclosure below.)

Addressing the cost side of outages

Organizations are thinking in milliseconds, according to Scott. That means an outage of only a few minutes becomes highly noticeable. Responses driven by machine learning algorithms offer several advantages in outage mitigation.

“We’re able to eliminate a lot of the noise,” Scott stated.

Algorithms can cut through the chaos that arises when multiple sources of information and multiple stakeholders all contribute during an event. That machine-driven filtering allows responders to focus.

“We have machine learning algorithms that help you focus your attention and point you to the really relevant work,” keeping everything moving, Scott added.
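The interview does not describe how PagerDuty's algorithms work. As a much simpler, non-machine-learning illustration of what reducing noise can mean, the sketch below collapses repeated alerts from the same service and check into one group when they arrive close together. The alert fields (`service`, `check`, `ts`), the function name `group_alerts` and the five-minute window are all assumptions made for this example.

```python
from typing import Any, Dict, List, Tuple


def group_alerts(alerts: List[Dict[str, Any]],
                 window_seconds: float = 300) -> List[Dict[str, Any]]:
    """Group alerts that share (service, check) and arrive within
    `window_seconds` of the first alert in their group."""
    groups: List[Dict[str, Any]] = []
    # Most recent group for each (service, check) key.
    latest: Dict[Tuple[str, str], Dict[str, Any]] = {}

    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = (alert["service"], alert["check"])
        group = latest.get(key)
        # Start a new group if none exists or the window has elapsed.
        if group is None or alert["ts"] - group["first_ts"] > window_seconds:
            group = {"key": key, "first_ts": alert["ts"], "alerts": []}
            latest[key] = group
            groups.append(group)
        group["alerts"].append(alert)

    return groups
```

With this grouping, a responder sees one entry per burst of related alerts instead of every individual notification. A learned model could replace the fixed key and window, but that is beyond what the interview specifies.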

Scott also revealed how PagerDuty is advancing its business. Orchestrating the tasks that must be completed during a service outage has historically been PagerDuty’s role, but the company now aims to solve some of the cost-side issues too. Probable Origin is one of its newer features.

“When you have an outage, we can actually narrow in where we think the outage is,” he said. Spotting anomalous areas and communicating them to responders, which speeds up the fix, is one example of the company’s automation solutions.

(* Disclosure: TheCUBE is a paid media partner for PagerDuty Summit. Neither PagerDuty Inc., the sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
