Towards Self-Healing Networks: AI-Based Log Analytics and Automated Remediation

Authors

Feng, Chengwei

Supervisor

Madanian, Samaneh

Item type

Thesis

Publisher

Auckland University of Technology

Abstract

Network complexity and scale have grown rapidly with cloud adoption, IoT proliferation, and hybrid infrastructures, outpacing traditional manual network management. Current approaches suffer from slow fault detection, inadequate root cause analysis, heavy reliance on manual intervention, and no closed-loop remediation, all of which reduce network availability and operational efficiency. This research presents an AI-driven self-healing framework designed to improve network reliability through log analytics and automated remediation. The proposed solution integrates three machine learning techniques (unsupervised anomaly detection, transformer-based semantic analysis, and reinforcement learning-driven remediation) into a unified system architecture. The log analytics engine ingests, normalizes, and analyzes heterogeneous log data from multi-vendor network devices, identifying anomalies and providing root cause analysis in real time. The automated remediation component takes a progressive, multi-tiered approach ranging from assistive recommendations to fully autonomous actions, organized as a closed-loop Monitor-Analyze-Plan-Execute-Knowledge (MAPE-K) architecture. In evaluation against traditional methods, the framework achieved a 60% reduction in fault detection latency, a 45% decrease in Mean Time to Remediation (MTTR), and an accuracy rate exceeding 90% in automated corrective actions. Scalability tests confirmed that the system handles large-scale network environments with minimal latency overhead. Human-in-the-loop feedback mechanisms support operational safety and continuous improvement, addressing critical gaps identified in existing approaches.
This research highlights the potential of AI-driven self-healing networks to improve network resilience, operational efficiency, and scalability, and establishes a foundational model for future work on automated network management systems.
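
As a rough illustration of the closed-loop MAPE-K idea and the tiered remediation described in the abstract, the following Python sketch may help. It is not the thesis implementation: the names `Knowledge`, `normalize`, and `run_cycle` are hypothetical, the regex-based normalization is an assumption, and a simple "unseen template" check stands in for the unsupervised anomaly detector. The Execute stage is deliberately left out, so the sketch only returns decisions and never touches a device.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Knowledge:
    """Shared state for the loop (the K in MAPE-K). All fields are illustrative."""
    baseline: Counter = field(default_factory=Counter)  # templates seen in normal operation
    playbook: dict = field(default_factory=dict)         # template -> remediation action name
    auto_approved: set = field(default_factory=set)      # actions an operator has cleared


def normalize(line: str) -> str:
    """Monitor: collapse variable fields so similar lines share one template."""
    line = re.sub(r"\b\d+(\.\d+){3}\b", "<IP>", line)  # dotted-quad addresses
    return re.sub(r"\d+", "<N>", line)                 # any remaining numbers


def run_cycle(lines, k: Knowledge):
    """One Monitor-Analyze-Plan pass; returns (template, action, mode) tuples."""
    templates = [normalize(l) for l in lines]                   # Monitor
    anomalies = [t for t in templates if t not in k.baseline]   # Analyze (novelty check)
    decisions = []
    for t in anomalies:                                         # Plan
        action = k.playbook.get(t)
        if action is None:
            mode = "escalate"    # no known fix: hand over to a human
        elif action in k.auto_approved:
            mode = "execute"     # autonomous tier
        else:
            mode = "recommend"   # assistive tier: suggest, do not act
        decisions.append((t, action, mode))
    return decisions


k = Knowledge()
k.baseline.update(normalize(l) for l in ["Interface Gi0/1 up", "Login from 10.0.0.5"])
k.playbook["Interface Gi<N>/<N> down"] = "bounce_interface"

first = run_cycle(["Interface Gi0/2 down", "Login from 10.0.0.9"], k)
k.auto_approved.add("bounce_interface")  # operator feedback promotes the action
second = run_cycle(["Interface Gi0/2 down", "BGP peer 10.1.1.1 reset"], k)
```

In this sketch the same anomaly is first only recommended and, once an operator adds the action to `auto_approved`, is marked for autonomous execution, which is one simple way to picture human-in-the-loop feedback moving an action between tiers.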
