Accelerating research and innovation | IBM

In Taiwan, where the pandemic response has been exceptionally effective at limiting outbreaks and death, the National Center for High-performance Computing (NCHC) helps accelerate research and innovation nationwide by providing access to supercomputers and analytics and by facilitating nationwide networks for data sharing and collaboration.

Although NCHC supports research in all disciplines, the urgency of the pandemic inspired it to launch successive “Tech v Virus” programs, which call for universities, research organizations, enterprises and startups to find new ways to fight the spread of the SARS-CoV-2 coronavirus. One high-profile breakthrough so far is a stethoscope that visualizes a patient’s breathing, helping doctors and nurses reduce close contact with potentially infected patients and thereby the risk of transmission. Another is a map of the SARS-CoV-2 genome’s evolution, which helps predict routes of spread.

To support efforts like these, and hundreds of others in all fields, NCHC wants to ensure that research moves as fast as it can. That’s why it continues evolving its Taiwania series of supercomputers, which includes one of the 50 most powerful computers in the world. That’s why it provides AI services — including tools based on IBM Cloud Pak® for Data. And that’s why NCHC recently worked with the IBM Garage™ to implement the IBM Cloud Pak for Watson AIOps solution, applying AI-based automation to maximize resilience and performance.

Taiwan has several major public computing networks that crisscross the country and allow researchers to share information and collaborate. Some of the networks are specialized for academia, some for government and some for industry. But increasingly — especially in response to the COVID-19 pandemic — research initiatives have demanded cross-discipline efforts and cross-network collaboration. Fast information sharing between the public networks is crucial.

So NCHC began a new initiative: building a central network exchange. But bringing the networks together presented a new layer of challenges. The different networks were equipped with a disparate array of monitoring tools, log sources and log formats. The resulting complexity made operations harder to manage and kept NCHC from quickly filtering alarms to detect significant issues and prevent outages. Outages, in turn, would impede data sharing and collaboration across the networks.

To fulfill the purpose of the central exchange — accelerating nationwide research collaboration — NCHC needed a way to cut through the complexity of IT operations management. It turned to AIOps.

As part of its search for a solution, NCHC worked with the IBM Garage to run a proof of concept (POC) based on IBM Cloud Pak for Watson AIOps software.

The goal of the POC was to gauge the real-world impact of the potential solution. NCHC provided operations data and networking log data from real-life scenarios, for example cases in which networking equipment was failing and would cause outages.

The NCHC and IBM teams then used IBM Cloud Pak for Watson AIOps as a central integrator of the network exchange’s diverse array of IT operations tools, producing a holistic view of the entire infrastructure. And by feeding structured and unstructured data into the solution’s AI Manager component, NCHC and the IBM Garage team were able to train AI models to automatically, and proactively, manage problems and incidents.
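To make the integration idea concrete, here is a minimal Python sketch of what bringing disparate log formats into one view can involve: records from two differently formatted sources are mapped onto a common schema, then filtered so that only severe, non-duplicate alarms remain. This is not the IBM Cloud Pak for Watson AIOps API or data model. The `Event` schema, the simplified syslog-style line format, the JSON field names (`host`, `level`, `text`) and the severity mapping are all assumptions made for illustration.

```python
import json
import re
from dataclasses import dataclass


# Hypothetical common schema; not the Cloud Pak for Watson AIOps data model.
@dataclass(frozen=True)
class Event:
    source: str
    device: str
    severity: int  # syslog-style scale: 0 = most severe, 7 = debug
    message: str


# Simplified, assumed line shape: "<PRI>device message"
SYSLOG_RE = re.compile(r"^<(?P<pri>\d+)>(?P<device>\S+) (?P<msg>.*)$")

# Assumed mapping from textual levels to syslog-style severities.
JSON_LEVELS = {"critical": 2, "error": 3, "warning": 4, "info": 6}


def from_syslog(line: str) -> Event:
    """Parse a simplified syslog-style line into the common schema."""
    m = SYSLOG_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized syslog line: {line!r}")
    # In syslog, PRI = facility * 8 + severity, so severity is PRI mod 8.
    return Event("syslog", m["device"], int(m["pri"]) % 8, m["msg"])


def from_json(raw: str) -> Event:
    """Parse a JSON record from a hypothetical monitoring tool."""
    rec = json.loads(raw)
    return Event("json", rec["host"], JSON_LEVELS[rec["level"]], rec["text"])


def significant(events, max_severity=3):
    """Keep events at or above the severity threshold, dropping repeats
    of the same (device, message) pair reported by different tools."""
    seen = set()
    kept = []
    for e in events:
        key = (e.device, e.message)
        if e.severity <= max_severity and key not in seen:
            seen.add(key)
            kept.append(e)
    return kept


events = [
    from_syslog("<187>core-sw1 link down on port 12"),
    from_syslog("<190>core-sw1 config saved"),
    from_json('{"host": "core-sw1", "level": "error", "text": "link down on port 12"}'),
    from_json('{"host": "edge-rt2", "level": "critical", "text": "fan failure"}'),
]
alarms = significant(events)
```

In this sketch the informational message is filtered out and the link failure reported by both tools appears only once. The AI models described above learn such correlations from data rather than relying on a fixed rule like this one.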

The results were excellent: the teams reduced the mean time to detect (MTTD) service-affecting issues by 55%.
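For readers unfamiliar with the metric, MTTD is the average delay between the moment an issue begins and the moment it is detected. The following Python sketch shows one way the figure and a percentage reduction could be computed. The timestamps are invented for illustration and are not NCHC data, and the function names `mttd_minutes` and `percent_reduction` are hypothetical.

```python
from datetime import datetime
from statistics import mean


def mttd_minutes(incidents):
    """Mean time to detect, in minutes, over (started, detected) pairs."""
    return mean(
        (detected - started).total_seconds() / 60
        for started, detected in incidents
    )


def percent_reduction(before, after):
    """Relative improvement of `after` over `before`, as a percentage."""
    return (before - after) / before * 100


# Invented timestamps, for illustration only (not NCHC data).
baseline = [
    (datetime(2021, 1, 5, 9, 0), datetime(2021, 1, 5, 9, 40)),    # 40 min
    (datetime(2021, 1, 9, 14, 0), datetime(2021, 1, 9, 14, 20)),  # 20 min
]
with_aiops = [
    (datetime(2021, 2, 3, 9, 0), datetime(2021, 2, 3, 9, 15)),    # 15 min
    (datetime(2021, 2, 8, 14, 0), datetime(2021, 2, 8, 14, 12)),  # 12 min
]

before = mttd_minutes(baseline)
after = mttd_minutes(with_aiops)
improvement = percent_reduction(before, after)
```

With these made-up values, the baseline MTTD is 30 minutes and the improved MTTD is 13.5 minutes, which corresponds to a 55% reduction.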