Sequential Anomaly Detection in a Batch with Growing Number of Tests: Application to Network Intrusion Detection

Speaker: George Kesidis
Pennsylvania State University
Date: 17/10/2012
Time: 2:00 pm - 3:00 pm
Location: LINCS Meeting Room 40


For high (N)-dimensional feature spaces, we consider detection of an unknown, anomalous class of samples amongst a batch of collected samples (of size T), under the null hypothesis that all samples follow the same probability law. Since the features which will best identify the anomalies are a priori unknown, two common detection strategies are:

1) evaluating atypicality of a sample (its p-value) based on the null distribution defined on the full N-dimensional feature space;

2) considering a (combinatoric) set of low order distributions, e.g., all singletons and all feature pairs, with detections made based on the smallest p-value yielded over all such low order tests.
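As a rough illustration of the second strategy, the sketch below counts the singleton and pairwise tests for N features and shows how the chance of at least one false alarm grows with the number of tests. It assumes independent tests and uses a Bonferroni correction as a stand-in for multiple testing correction; neither assumption comes from the talk, and the function names are hypothetical.

```python
from math import comb


def num_low_order_tests(n_features):
    """Number of singleton plus pairwise tests for n_features features."""
    return n_features + comb(n_features, 2)


def family_wise_false_alarm(alpha, n_tests):
    """Probability that at least one of n_tests null p-values falls below
    alpha, assuming the tests are independent (an illustrative assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_min_p(p_values):
    """Smallest p-value over all tests, Bonferroni-corrected and capped at 1.
    Bonferroni is used here only as a simple, conservative example."""
    return min(1.0, len(p_values) * min(p_values))


if __name__ == "__main__":
    for n in (5, 10, 50):
        m = num_low_order_tests(n)
        print(n, m, round(family_wise_false_alarm(0.01, m), 3))
```

With a per-test level of 0.01, the uncorrected false alarm probability rises quickly as N grows, which is why detections based on the smallest p-value need a correction that accounts for the number of tests.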

The first approach relies on accurate estimation of the joint distribution, while the second may suffer from increased false alarm rates as N and T grow.

Alternatively, inspired by greedy feature selection commonly used in supervised learning, we propose a novel sequential anomaly detection procedure with a growing number of tests. Here, new tests are (greedily) included only when they are needed, i.e., when their use (on currently undetected samples) will yield greater aggregate statistical significance of (multiple testing corrected) detections than obtainable using the existing test cadre. Our approach thus aims to maximize aggregate statistical significance of all detections made up until a finite horizon. Our method is evaluated, along with supervised methods, for a network intrusion domain, detecting Zeus bot command-and-control (i.e., intrusion) packet flows embedded amongst (normal) Web flows. It is shown that judicious feature representation is essential for discriminating Zeus from Web.

This work is in collaboration with D.J. Miller and F. Kocak.


George Kesidis received his M.S. and Ph.D. in EECS from U.C. Berkeley in 1990 and 1992, respectively. He was a professor in the E&CE Dept of the University of Waterloo, Canada, from 1992 to 2000. Since 2000, he has been a professor of CSE and EE at the Pennsylvania State University. His research, including several areas of computer/communication networking and machine learning, has been primarily supported by NSERC of Canada, NSF and Cisco Systems URP. He served as the TPC co-chair of IEEE INFOCOM 2007 among other networking and cyber security conferences. He has also served on the editorial boards of the Computer Networks Journal, ACM TOMACS and IEEE Journal on Communications Surveys and Tutorials. Currently, he is an Intermittent Expert for the National Science Foundation’s Secure and Trustworthy Cyberspace (SaTC) program.

His home page is