PyNomaly
PyNomaly is a Python 3 implementation of LoOP (Local Outlier Probabilities). LoOP is a local density-based outlier detection method by Kriegel, Kröger, Schubert, and Zimek that provides outlier scores in the range [0,1], which are directly interpretable as the probability of a sample being an outlier.
PyNomaly is a core library of deepchecks, OmniDocBench and pysad.
Overview
The outlier score of each sample is called its Local Outlier Probability. Like the Local Outlier Factor (LOF), it measures the local deviation in density of a given sample with respect to its neighbors, but it provides normalized outlier scores in the range [0,1]. These scores are directly interpretable as the probability of an object being an outlier, so practitioners are free to interpret the results according to the application.
Like LOF, LoOP is local in that the anomaly score depends on how isolated the sample is with respect to its surrounding neighborhood. Locality is given by the k nearest neighbors, whose distances are used to estimate the local density. By comparing the local density of a sample to the local densities of its neighbors, one can identify samples that lie in regions of lower density than their neighbors and that may therefore be outliers according to their Local Outlier Probability.
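The density comparison described above can be sketched in a few lines of standard-library Python. This is a minimal brute-force illustration of the formulation in the LoOP paper, not PyNomaly's implementation: the function name `local_outlier_probabilities` is hypothetical, the parameter names `n_neighbors` and `extent` are chosen for illustration, and the sketch assumes Euclidean distance and no duplicate points.

```python
import math


def local_outlier_probabilities(points, n_neighbors=3, extent=3):
    """Return one LoOP score in [0, 1] per point (hypothetical brute-force sketch)."""
    n = len(points)

    # k nearest neighbors of each point, stored as (distance, index) pairs.
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append(dists[:n_neighbors])

    # Probabilistic set distance: extent times the root-mean-square distance
    # to the neighbors. A larger value means a lower local density.
    pdist = [
        extent * math.sqrt(sum(d * d for d, _ in nb) / len(nb))
        for nb in neighbors
    ]

    # PLOF: a point's own pdist relative to the mean pdist of its neighbors,
    # minus 1. Assumes no duplicate points, so the mean is never zero.
    plof = []
    for i, nb in enumerate(neighbors):
        mean_neighbor_pdist = sum(pdist[j] for _, j in nb) / len(nb)
        plof.append(pdist[i] / mean_neighbor_pdist - 1.0)

    # Normalization factor: extent times the root-mean-square PLOF of all points.
    nplof = extent * math.sqrt(sum(v * v for v in plof) / n)
    if nplof == 0.0:
        return [0.0] * n  # perfectly uniform density: nothing stands out

    # The Gaussian error function maps the normalized PLOF into [0, 1].
    return [max(0.0, math.erf(v / (nplof * math.sqrt(2.0)))) for v in plof]


# A 5x5 grid of evenly spaced samples plus one isolated sample far away.
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
points.append((20.0, 20.0))

scores = local_outlier_probabilities(points)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
print(points[most_outlying], round(scores[most_outlying], 3))
```

The isolated sample sits in a region of much lower density than its neighbors, so it receives the highest score, while the grid samples score near zero. Where to draw the line between inlier and outlier is left to the application.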
The authors' 2009 paper, LoOP: Local Outlier Probabilities, details LoOP's theory, formulation, and application. It is provided by the Institute for Informatics at Ludwig-Maximilians University Munich.
Quick Links
- How It Works -- understand the algorithm
- Getting Started -- installation and first steps
- User Guide -- parameters, performance, streaming, and error handling
- API Reference -- full class and method documentation
- Examples -- worked examples with visualizations
Research
If citing PyNomaly, use the following:
@article{Constantinou2018,
  doi = {10.21105/joss.00845},
  url = {https://doi.org/10.21105/joss.00845},
  year = {2018},
  month = {oct},
  publisher = {The Open Journal},
  volume = {3},
  number = {30},
  pages = {845},
  author = {Valentino Constantinou},
  title = {{PyNomaly}: Anomaly detection using Local Outlier Probabilities ({LoOP}).},
  journal = {Journal of Open Source Software}
}
References
- Breunig M., Kriegel H.-P., Ng R., Sander J. LOF: Identifying Density-based Local Outliers. ACM SIGMOD International Conference on Management of Data (2000).
- Kriegel H.-P., Kröger P., Schubert E., Zimek A. LoOP: Local Outlier Probabilities. 18th ACM Conference on Information and Knowledge Management, CIKM (2009).
- Goldstein M., Uchida S. A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data. PLoS ONE 11(4): e0152173 (2016).
- Hamlet C., Straub J., Russell M., Kerlin S. An incremental and approximate local outlier probability algorithm for intrusion detection and its evaluation. Journal of Cyber Security Technology (2016).
Acknowledgements
- The authors of LoOP (Local Outlier Probabilities)
  - Hans-Peter Kriegel
  - Peer Kröger
  - Erich Schubert
  - Arthur Zimek
- NASA Jet Propulsion Laboratory