Type: Theses
Title: Adaptive anomalous behavior identification in large-scale distributed systems
Author: Alvarez Cid-Fuentes, Javier
Issue Date: 2017
School/Discipline: School of Computer Science
Abstract: Distributed systems have become pervasive in current society. From laptops and mobile phones to servers and data centers, most computers communicate and coordinate their activities through some kind of network. Moreover, many economic and commercial activities of today’s society rely on distributed systems. Examples range from widely used large-scale web services such as Google or Facebook to enterprise networks and banking systems. However, as distributed systems become larger, more complex, and more pervasive, the probability of failures or malicious activities also increases, to the point that some system designers consider failures to be the norm rather than the exception. The negative effects of failures in distributed systems range from economic losses to sensitive information leaks. As an example, reports show that the cost of downtime in industry ranges from $100K to $540K per hour on average. These undesired consequences can be avoided with better monitoring tools that can inform system administrators of the presence of anomalies in the system in a timely manner. However, key challenges remain, such as the difficulty in processing large amounts of information, the huge variety of anomalies that can appear, and the difficulty in characterizing these anomalies. This thesis contributes a novel framework for the online detection and identification of anomalies in large-scale distributed systems that addresses these challenges. Our framework periodically collects system performance metrics, and builds a behavior characterization from these metrics in a way that maximizes the distance between normal and anomalous behaviors. Our framework then uses machine learning techniques to detect previously unseen anomalies, and to identify the type of known anomalies with high accuracy, while overcoming key limitations of existing works in the area.
Our framework does not require historical data, can be employed in a plug-and-play manner, adapts to changes in the system behavior, and allows for a flexible deployment that can be tailored to numerous scenarios with different architectures and requirements. In this thesis, we employ our framework in three anomaly detection application domains: distributed systems, large-scale systems, and malicious traffic detection. Extensive experimental studies in these three domains show that our framework is able to detect several types of anomalies with 0.80 Recall on average, and 0.68 mean Precision or 0.082 mean FPR depending on the domain. Moreover, our framework achieves over 0.80 accuracy in the identification of various types of complex anomalous behaviors. These results significantly improve on similar works in the three explored research areas. Most importantly, our approach achieves these detection and identification rates with significant advantages over existing works. Specifically, our framework does not rely on historical anomalous data or on assumptions about the characteristics of the anomalies that can make anomaly detection easier. Moreover, our framework provides a flexible and highly scalable design, and an adaptive method that can incorporate new system information at run time.
Advisor: Szabo, Claudia
Falkner, Katrina Elizabeth
Dissertation Note: Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2017.
Keywords: distributed systems
large-scale systems
anomaly detection
Provenance: This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at:
DOI: 10.4225/55/5b10d6fd92ea1
Appears in Collections: Research Theses

Files in This Item:
File | Description | Size | Format
01front.pdf | | 131.16 kB | Adobe PDF
02whole.pdf | | 2.18 MB | Adobe PDF
Permissions | Library staff access only | 237.3 kB | Adobe PDF
Restricted | Library staff access only | 2.74 MB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.