The growing complexity and diversity of IT is a challenge for CIOs and their teams. Datacentre demands, cloud infrastructure, containerization, and security and compliance all put more pressure on IT management and operations.
At the same time, it is difficult to recruit and retain experienced staff. This has prompted IT departments and suppliers to look for ways to automate information services and infrastructure management, including for data storage.
One answer is to use machine learning (ML) and artificial intelligence (AI) to take on some of the workload. AIOps – or AI for IT operations – is an industry-promoted tool to deal with complexity, to optimize systems and maximize uptime. It also plays an increasingly important role in storage management.
AIOps has evolved rapidly in recent years due to improvements in ML and AI processing – including in the cloud – but also because IT systems are increasingly being monitored in real time.
But while everyone wants more visibility into their IT operations, the sheer volume of log data generated by modern hardware, both on-premises and in the cloud, can overwhelm IT teams.
AIOps uses that data through analytics engines to predict peak workloads, bottlenecks, capacity limitations and failures, and flag the need for maintenance and upgrades.
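As a rough illustration of the capacity side of that prediction, the sketch below fits a straight line to daily usage figures and extrapolates to the point where an array fills up. It is a minimal example under stated assumptions, not how any supplier's analytics engine works: the function name `days_until_full`, the sample figures and the choice of a simple least-squares trend are all hypothetical, and real tools are likely to use far richer models.

```python
def days_until_full(daily_used_tb, capacity_tb):
    """Estimate days remaining before usage reaches capacity.

    Fits an ordinary least-squares line to one usage sample per day
    and extrapolates it. Returns None if usage is flat or shrinking.
    All inputs are illustrative; no vendor data format is implied.
    """
    n = len(daily_used_tb)
    if n < 2:
        raise ValueError("need at least two daily samples")

    x_mean = (n - 1) / 2
    y_mean = sum(daily_used_tb) / n

    # Least-squares slope: growth in TB per day
    numerator = sum((x - x_mean) * (y - y_mean)
                    for x, y in enumerate(daily_used_tb))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None

    intercept = y_mean - slope * x_mean
    day_full = (capacity_tb - intercept) / slope

    # Count from the most recent sample, never below zero
    return max(0.0, day_full - (n - 1))


# Hypothetical array growing by 2TB a day towards a 100TB limit
remaining = days_until_full([50, 52, 54, 56, 58], capacity_tb=100)
print(f"Estimated days until full: {remaining:.0f}")
```

A linear trend is the simplest possible choice and will miss seasonal peaks or sudden growth, which is where the ML models described in this article are meant to do better.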
For storage, AIOps promises to help companies with resource allocation, to best utilize available capacity and, potentially, move data between storage tiers and/or to the cloud.
AIOps promises to do this faster and perhaps with more accuracy than human system admins. It also allows organizations to scale their digital operations without hiring more staff. It can be integrated with service management, dealing with user tickets, and DCIM (datacentre infrastructure management).
“The way I describe AIOps is that it’s a general technology that sits on top of domain expert operating systems and can correlate data and bring that back as actions to operational teams,” said Roy Illsley, principal analyst at Omdia. “That could be a manual action or it could be an automated action.”
What is (storage) AIOps, and what can it track?
Within IT management, AIOps sets out to harness the data generated by servers, networking equipment and storage arrays – but it aims to be more than a simple monitoring tool. By using AI, organizations can gain insights into system health, but also see how systems can be optimized.
This, according to GigaOm analyst Enrico Signoretti, amounts to putting “storage on autopilot mode”.
As suppliers add sensors to their systems, they can provide more up-to-date status information. They then tie that information to predictive analytics and, eventually, automation.
“They use machine learning, and they feed their machine learning algorithms all this information,” Signoretti said. “So now they have automation that makes better suggestions for what’s happening in the system and what you should do when something happens. All of this improves the usability of the system.”
Storage AIOps monitors common metrics, such as utilization, I/O activity and latency. By adding a level of ML or AI, systems go beyond delivering raw data, and can flag unexpected events to a human analyst, predict when a part might fail or when a system needs more capacity.
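To make "flagging unexpected events" concrete, the following sketch compares each latency reading with the readings just before it and flags any that deviate sharply. It is only an illustration: the function name `flag_latency_anomalies`, the window size, the threshold and the sample values are assumptions, and a rolling z-score is a basic statistical stand-in for the ML models suppliers describe.

```python
from statistics import mean, stdev


def flag_latency_anomalies(samples_ms, window=5, threshold=3.0):
    """Return indices of readings that deviate sharply from recent history.

    Each reading is compared with the mean and standard deviation of
    the `window` readings before it (a rolling z-score). The defaults
    are illustrative, not recommended values.
    """
    anomalies = []
    for i in range(window, len(samples_ms)):
        history = samples_ms[i - window:i]
        mu = mean(history)
        sigma = stdev(history)

        if sigma == 0:
            # Perfectly flat history: treat any change as unexpected
            if samples_ms[i] != mu:
                anomalies.append(i)
            continue

        if abs(samples_ms[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical read latencies in milliseconds, with one spike
latencies = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 9.5, 2.0]
print(flag_latency_anomalies(latencies))
```

In practice, a flagged reading would be passed to a human analyst or correlated with other metrics rather than acted on in isolation.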
Omdia’s Illsley describes it as “domain expertise”, and suppliers in other sectors, such as networking, also have their own AIOps capabilities.
The real power of AIOps for storage comes when combined with data from other parts of the IT system. This allows companies to detect a wider range of problems, such as a network issue that is traced back to a malfunctioning database.
With storage, AI offers the possibility to detect bottlenecks before they become issues, such as when data is moved to a lower-performing storage tier to make better use of capacity. The AI system alerts IT managers and suggests changes, or even makes configuration changes on its own.
AIOps can also allocate workloads, including storage volumes, across different types of infrastructure. This is particularly relevant for companies operating hybrid architectures, where AIOps can manage the transition from on-premises to the cloud, or between cloud tiers. But, to work, such systems need to be integrated with the cloud suppliers’ management application programming interfaces (APIs), and have an understanding of their pricing models.
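The pricing point can be sketched as a simple placement rule: choose the cheapest tier that still meets a volume's latency requirement. Everything here is hypothetical, including the `Tier` class, the tier names, the latency figures and the prices, which are invented for illustration and do not reflect any supplier's catalogue or API. A real system would pull this information from the cloud suppliers' management APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ms: float    # typical read latency (made-up figure)
    price_per_tb: float  # monthly price per TB (made-up figure)


def cheapest_tier(tiers, size_tb, max_latency_ms):
    """Pick the lowest-cost tier that meets the latency requirement.

    Returns None when no tier is fast enough.
    """
    eligible = [t for t in tiers if t.latency_ms <= max_latency_ms]
    if not eligible:
        return None
    return min(eligible, key=lambda t: t.price_per_tb * size_tb)


# Hypothetical hybrid estate: one on-premises tier, two cloud tiers
TIERS = [
    Tier("on-prem-flash", latency_ms=1.0, price_per_tb=40.0),
    Tier("cloud-standard", latency_ms=20.0, price_per_tb=23.0),
    Tier("cloud-archive", latency_ms=5000.0, price_per_tb=2.0),
]

choice = cheapest_tier(TIERS, size_tb=10, max_latency_ms=50)
print(choice.name if choice else "no suitable tier")
```

Real pricing models also include charges such as data egress and retrieval fees, which this sketch ignores.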
Benefits and potential of AIOps
AIOps promises improved system availability, reliability and efficiency.
This comes primarily from reduced maintenance downtime and fewer failures, and also from better allocation of compute and storage resources. This is particularly useful for organizations that have a large number of isolated systems or virtual machines (VMs), or that are beginning to move toward containerization for production.
AIOps becomes more useful as systems become more complex. Replacing dozens of servers with hundreds of VMs already puts heavy loads on IT admins. Moving to potentially thousands of containers may not be possible without automation.
The biggest benefit of AIOps, however, comes from consolidating across a company’s IT estate.
This – for now, at least – is beyond the reach of most storage suppliers’ AI tools, which only work with their own products. The issue is whether they can communicate with competitors’ systems, with compute and networking tools, or with cloud management systems. As Omdia’s Illsley says, AIOps suppliers need to be “neutral Switzerland”.
Storage suppliers will need to convince their competitors, and most or all of their customers, that they will work well with other suppliers’ tools. Alternatively, CIOs can look to hardware-independent monitoring tools, such as Splunk, Datadog or ServiceNow.
And, as GigaOm’s Signoretti points out, storage AIOps suppliers’ cloud capabilities are still limited – they need to build cloud and hybrid capabilities. He expects them to grow, to allow companies to federate storage across suppliers, and by policy. “We’re going to get to that kind of scenario, but it’s going to take time,” he said.
Storage vendors with AIOps capabilities
Dell CloudIQ: Currently covers all Dell EMC storage and PowerEdge servers, as well as the firm’s hyperconverged systems and networking hardware.
HPE InfoSight: Supports ProLiant and Apollo servers and Alletra, Primera and Nimble storage, as well as hyperconverged infrastructure (HCI).
IBM Storage Insights Pro: Covers a wide range of systems including IBM’s own Flash storage, as well as Spectrum Scale and Cloud Object Storage. Some systems from Dell EMC, Hitachi Vantara, NetApp and Pure Storage are also supported.
Infinidat InfiniVerse: InfiniBox and InfiniBox SSA are supported.
NetApp Active IQ: Covers OnTap, E-Series and StorageGRID storage, as well as integration with NetApp Cloud Manager.
Pure Storage Pure1: Pure1 manages all Pure storage (FlashArray, FlashBlade and Portworx). Pure1 Meta provides full-stack analytics capabilities.