Friday, August 23, 2013

Paper Reading : BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection

BotMiner was presented at USENIX Security '08 by researchers from the College of Computing, Georgia Institute of Technology.
In this paper, the authors aim to detect C&C botnets without a priori knowledge of the botnet. The key idea is to cluster malicious activities and communication traffic separately, and then mine the relation between the two sets of clusters. The authors report a good detection rate in their evaluation.

Introduction

In this section, the authors describe the current state of botnets. Their classification of botnets is listed below.
IRC Botnet, include 
P2P Botnet, including Nugache and Storm/Peacomm

Related Work

Most related work is specific to the IRC and HTTP protocols, which are the protocols most widely used by C&C botnets. This paper proposes a system that detects botnet communication regardless of the protocol used.

Definition and Assumption

The authors define a botnet as "A coordinated group of malware instances that are controlled via C&C channels". That is, a bot cannot avoid communicating with other bots or with a C&C server, and it has to carry out some malicious activities.

Proposed Scheme

The architecture of BotMiner is shown in the figure.
To detect bots in a network, BotMiner builds on two observations: "who is talking to whom" and "who is doing what". The system is therefore divided into a C-plane and an A-plane.
  • The C-plane is responsible for communication.
    The C-plane monitor relies on hardware support (Cisco, Juniper) to log network traffic.
  • The A-plane is responsible for activities.
    Snort with the SCADE (Statistical sCan Anomaly Detection Engine) plugin is used to implement the A-plane monitor.
In C-plane clustering, noise connections and whitelisted connections are filtered out first. Then two-level clustering is performed, from coarse-grained to fine-grained. The following attributes are aggregated:
  1. the number of flows per hour (fph)
  2. the number of packets per flow (ppf)
  3. the average number of bytes per packet (bpp)
  4. the average number of bytes per second (bps)
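To make the four attributes concrete, here is a minimal Python sketch that computes them from a list of flow records. The `Flow` record, its field names, and the choice to group flows by (source, destination, destination port) are my own assumptions for illustration, and collapsing each attribute into a single average is a simplification, not necessarily what the paper does.

```python
from collections import defaultdict, namedtuple

# Hypothetical flow record; field names are assumptions made for this sketch.
# start is in seconds, duration in seconds, nbytes is the total bytes in the flow.
Flow = namedtuple("Flow", "src dst dport start duration packets nbytes")


def _avg(values):
    """Average of an iterable, or 0.0 when it is empty."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def flow_features(flows):
    """Group flows by (src, dst, dport) and summarise fph, ppf, bpp and bps."""
    groups = defaultdict(list)
    for f in flows:
        groups[(f.src, f.dst, f.dport)].append(f)

    features = {}
    for key, fs in groups.items():
        # Count flows in each one-hour bucket; only hours with traffic are counted.
        per_hour = defaultdict(int)
        for f in fs:
            per_hour[int(f.start // 3600)] += 1
        features[key] = {
            "fph": _avg(per_hour.values()),
            "ppf": _avg(f.packets for f in fs),
            "bpp": _avg(f.nbytes / f.packets for f in fs if f.packets > 0),
            "bps": _avg(f.nbytes / f.duration for f in fs if f.duration > 0),
        }
    return features
```

Each group then yields a small feature vector that a clustering algorithm can work on.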
Once these features are computed, X-means, a variant of K-means, is used to cluster the flows in different feature spaces, which implements the coarse-grained and fine-grained clustering. X-means does not need the number of clusters to be specified in advance: it runs multiple rounds of K-means and uses the Bayesian Information Criterion (BIC) to find a suitable number of clusters. A similar approach is used for A-plane clustering.
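The idea of picking the number of clusters with BIC can be sketched in a few lines of Python. Note that this is not the real X-means algorithm (which splits centroids recursively); it is a simpler sweep that runs K-means for each candidate k and keeps the k with the highest BIC. The spherical-Gaussian BIC formula, the farthest-first initialisation, and all function names are my own assumptions for illustration.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then keep adding the
    point that is farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means on a list of tuples; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def bic(points, centroids, labels):
    """BIC under an assumed spherical Gaussian model; higher is better."""
    n, d, k = len(points), len(points[0]), len(centroids)
    sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    variance = max(sse / (d * (n - k)), 1e-12)  # pooled per-dimension variance
    sizes = [labels.count(c) for c in range(k)]
    log_lik = sum(s * math.log(s / n) for s in sizes if s > 0)
    log_lik -= 0.5 * n * d * math.log(2 * math.pi * variance)
    log_lik -= 0.5 * d * (n - k)
    n_params = (k - 1) + k * d + 1  # mixing weights, centroids, shared variance
    return log_lik - 0.5 * n_params * math.log(n)


def choose_k(points, k_max):
    """Run K-means for k = 1..k_max (k_max must be below len(points))."""
    scores = {}
    for k in range(1, k_max + 1):
        centroids, labels = kmeans(points, k)
        scores[k] = bic(points, centroids, labels)
    return max(scores, key=scores.get), scores
```

Calling `choose_k(points, 6)` on a list of feature tuples returns the selected k together with the BIC score of every candidate, so it is easy to see how close the alternatives were.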

Finally, the clusters from the two planes are cross-checked to find intersections, which suggest the existence of a botnet.
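The cross-check step can be illustrated with a small Python sketch that intersects the host sets of communication clusters and activity clusters. This is only a simplified overlap test: the Jaccard score, the `min_shared` threshold, and the function name are my own assumptions, not the scoring function used in the paper.

```python
def cross_check(c_clusters, a_clusters, min_shared=2):
    """Flag hosts that share both a communication cluster and an activity cluster.

    c_clusters and a_clusters map a cluster name to a set of host addresses.
    Returns {host: best Jaccard overlap among the cluster pairs that flagged it}.
    """
    suspects = {}
    for c_hosts in c_clusters.values():
        for a_hosts in a_clusters.values():
            shared = set(c_hosts) & set(a_hosts)
            if len(shared) < min_shared:
                continue  # too few hosts in common to call it coordinated
            overlap = len(shared) / len(set(c_hosts) | set(a_hosts))
            for host in shared:
                suspects[host] = max(suspects.get(host, 0.0), overlap)
    return suspects
```

Requiring at least two shared hosts reflects the "coordinated group" part of the definition: a single host that scans and also talks to some server is not, by itself, evidence of a botnet.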
