forked from YahooArchive/samoa
-
Notifications
You must be signed in to change notification settings - Fork 1
Distributed Stream Clustering
Albert Bifet edited this page Oct 21, 2013
·
3 revisions
The SAMOA Clustering Algorithm is invoked by using the ClusteringEvaluation
task. The clustering task can be executed with default values just by running:
cd src/main/resources
./samoa storm ../../../target/SAMOA-Storm-0.0.1.jar "ClusteringEvaluation"
Parameters:
-
-l
: clusterer to train -
-s
: stream to learn from -
-i
: maximum number of instances to test/train on (-1 = no limit) -
-f
: how many instances between samples of the learning performance -
-n
: evaluation name (default: ClusteringEvaluation_TimeStamp) -
-d
: file to append intermediate csv results to
In terms of the SAMOA API, Clustering Evaluation consists of a source
processor, a clusterer
, and a evaluator
processor. Source
processor sends the instances to the classifier using source
stream. The clusterer sends the clustering results to the evaluator
processor via the result
stream. The source PI
corresponds to the -s
option of Clustering Evaluation, and the clusterer corresponds to the -l
option.