Module 16 Anomaly Detection and Outlier Analysis
by Patrick Boily, with contributions from Youssouph Cissokho, Soufiane Fadel, and Richard Millson
With the advent of automatic data collection, it is now possible to store and process large troves of data. There are technical issues associated with massive data sets, such as the speed and efficiency of analytical methods, but there are also problems related to the detection of anomalous observations and the analysis of outliers.
Extreme and irregular values behave very differently from the majority of observations. For instance, they can represent criminal attacks, fraud attempts, targeted attacks, or data collection errors. As a result, anomaly detection and outlier analysis play a crucial role in cybersecurity, quality control, etc. The (potentially) heavy human price and technical consequences related to the presence of such observations go a long way towards explaining why the topic has attracted attention in recent years.
In this module, we review a variety of detection methods, paying particular attention to both supervised and unsupervised approaches.