Detection of Structural Changes in Data Streams
Callister, R., Lazarescu, M. and Pham, D.S.
We propose new methods for detecting structural
changes in data streams. Significant changes within
data streams, due to their often highly dynamic nature,
are the main cause of performance degradation
in many algorithms. The primary difference from previous
work on change detection in data streams
is our use of an algorithmic process to define the
changes. We focus on RepStream, a powerful graph-based
clustering algorithm, which has been shown to
perform well in a stream clustering context. RepStream,
like many other algorithms, operates according
to parameters which are set by the user. Primarily,
RepStream uses the K value to determine
the degree of connectivity in its K Nearest Neighbour
graph structure. RepStream requires that its K
value be set suitably in order to achieve optimal clustering
performance, which we measure in terms of F-Measure.
Because real-world data streams are dynamic,
with classes appearing, disappearing, moving,
and shifting, the K value must be varied
according to the current state of the stream. However,
this problem is largely unexplored in a data stream
mining context. We first consider this challenge
by addressing the research question of when K needs to
be changed. From a change detection perspective, our
proposed method measures the structural variation of
the underlying data stream using five different statistical
and geometrical features which can be extracted
whilst RepStream performs its clustering. We show
that combining these features into a detection method
gives promising results with regard to early detection of
structural changes in data streams. We use the well-known
KDD Cup 1999 intrusion detection benchmark
dataset and show that our proposed method was able
to identify many of the changes within the stream.
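
To illustrate how a K value can determine the degree of connectivity in a K Nearest Neighbour graph, the following is a minimal sketch and not RepStream itself. The function names `knn_graph` and `reciprocal_components`, the toy points, and the choice to count components over mutual (reciprocal) edges are all assumptions made for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def knn_graph(points: Sequence[Point], k: int) -> Dict[int, List[int]]:
    """Directed K-nearest-neighbour graph: vertex i points to its k closest points."""
    graph: Dict[int, List[int]] = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        graph[i] = others[:k]
    return graph


def reciprocal_components(graph: Dict[int, List[int]]) -> int:
    """Count connected components using only mutual edges (i -> j and j -> i).

    Assumption: mutual edges are an illustrative connectivity criterion,
    not a description of the authors' implementation.
    """
    mutual = {i: {j for j in nbrs if i in graph[j]} for i, nbrs in graph.items()}
    seen = set()
    components = 0
    for start in mutual:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:  # depth-first traversal of one component
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            stack.extend(mutual[v] - seen)
    return components


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.5),
          (10.0, 10.0), (11.0, 10.0), (10.0, 11.5)]

for k in (1, 2, 5):
    print(k, reciprocal_components(knn_graph(points, k)))
# prints:
# 1 4
# 2 2
# 5 1
```

On this toy data, a small K fragments each group, a moderate K recovers the two groups, and a large K merges everything into one component. This is the sense in which a suitable K depends on the current structure of the data.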
Cite as: Callister, R., Lazarescu, M. and Pham, D.S. (2015). Detection of Structural Changes in Data Streams. In Proc. Thirteenth Australasian Data Mining Conference (AusDM 2015) Sydney, Australia. CRPIT, 168. Ong, K.L., Zhao, Y., Stone, M.G. and Islam, M.Z. Eds., ACS. 79-88