Anomaly Models
Here we post our collection of anomaly models to be used with FLAME. We distinguish three classes of anomalies: network scans, spam, and denial of service attacks.
Network scans
We have captured and analyzed the characteristics of five
different network scans: the Nachi scan, an SSH scan, a Radmin
scan, a DCOM/RPC scan, and a Netbios scan. Network scans
are by far the most commonly observed class of anomalous
activity. They typically originate at a single source
IP address and target many different destinations. While the
time of large worm outbreaks has passed, we still occasionally
observe scanning activity related to worms such as Nachi, Blaster,
or Welchia. Network scans may use TCP, UDP, or ICMP as the
transport protocol. Scans are important to network administrators
when they originate from internal hosts, since this might
be a sign of an infection on the affected machine.
Nachi Scan
We found in our traces several instances of an ICMP scan that
can be attributed to the Nachi worm released in 2003. Nachi
uses fixed size 92-byte ICMP echo packets to scan for vulnerable
hosts and is thus easy to recognize. The Nachi scan flow
attributes have the following characteristics.
Transport protocol: is set to ICMP.
Source IP addresses: are set to the address of the scanning
host.
Source port numbers: are set to ICMP code 0 type 8, which
corresponds to echo-request messages.
Destination port numbers: are set to 0 or not set.
Flow sizes: are set to 1 packet and 92 bytes.
Flow durations: are set to 0 msecs, which is the default for
flows that contain 1 packet.
Destination IP addresses: show the periodic pattern depicted in
the Figure below. The pattern resembles a fishbone with two
different cycles.
Every 200 flows we observe a few positive and negative
shifts by 400 IP addresses. Moreover, every 800 flows we observe
periods with shifts of 45 to 110 IP addresses. All remaining flows
have IP address differences in the range [-40:40].
Inter-arrival times: show a periodic behavior that has a
stochastic component. The Figure below shows for different
flows with index i their inter-arrival time ti measured relative
to the previous flow.
After five scans that arrive at the router with an
inter-arrival time of 0 or 1 msec, no flow is observed for 61 msec until
the next scan period starts. Furthermore, we do not see any
flows in reply to the Nachi scans in our traces, probably because
the targeted network filters these packets.
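The burst-and-pause timing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FLAME model itself: the function name and parameters are hypothetical, and we assume that each cycle consists of five flows with an inter-arrival time of 0 or 1 msec (chosen with equal probability) followed by one flow that arrives after the 61 msec pause.

```python
import random

def nachi_inter_arrival_ms(n_flows, burst_len=5, gap_ms=61, seed=0):
    """Return inter-arrival times (msec) for n_flows Nachi-like scan flows.

    Assumption: burst_len flows spaced 0 or 1 msec apart are followed by
    one flow that arrives gap_ms after its predecessor.
    """
    rng = random.Random(seed)
    times = []
    for i in range(n_flows):
        if i % (burst_len + 1) == burst_len:
            times.append(gap_ms)             # first flow of the next scan period
        else:
            times.append(rng.choice((0, 1)))  # flow inside a scan period
    return times
```

Each generated flow would additionally carry the fixed attributes listed above (ICMP, 1 packet, 92 bytes, duration 0 msec).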
SSH scan
Password-probing scans for SSH are quite common in today's
networks. We have extracted two SSH scan instances from the
SWITCH traces. Both instances show a very similar
behavior.
Transport protocol: is set to TCP.
Source IP addresses: are set to the address of the scanning
host.
Destination port numbers: are set to 22.
Destination IP addresses: show a very irregular pattern that
contains many atomic scan periods of 30 to 200 flows that
each cover a range of approximately 400 IP addresses.
In the Figure below we plot the histogram of the difference
in destination IP addresses
between consecutive flows within such an atomic scan period.
After each period a positive or negative shift by 200 to 400
IP addresses occurs.
Source port numbers: are selected randomly from the range
[32,000:61,000].
Flow sizes: Each flow contains between 1 and 4 packets, where
approximately 80% of the flows have a size of 2 packets and
120 bytes.
Inter-arrival times: show a periodic behavior with two
cycles and a stochastic component. The inter-arrival times for 8000
SSH scan flows are plotted in the Figure below.
Every 800th flow has an inter-arrival time of 5 seconds,
every 300th flow has an inter-arrival time between 10 and 50 msec,
while all other flows have inter-arrival times of either 0
or 1 msec.
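The two timing cycles of the SSH scan can be illustrated with the following sketch. The function name is hypothetical, and several details are assumptions rather than measured facts: flows are numbered from 1, the delay of every 300th flow is drawn uniformly from 10 to 50 msec, 0 and 1 msec are equally likely for all other flows, and the longer cycle takes precedence when both cycles coincide.

```python
import random

def ssh_scan_inter_arrival_ms(n_flows, seed=0):
    """Return inter-arrival times (msec) for n_flows SSH-scan-like flows."""
    rng = random.Random(seed)
    times = []
    for k in range(1, n_flows + 1):           # k is the 1-based flow number
        if k % 800 == 0:
            times.append(5000)                # long cycle: 5 seconds
        elif k % 300 == 0:
            times.append(rng.randint(10, 50))  # short cycle: 10 to 50 msec
        else:
            times.append(rng.choice((0, 1)))   # all other flows
    return times
```

For example, `ssh_scan_inter_arrival_ms(8000)` yields a series of the same length as the one plotted in the Figure referenced above.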
Radmin scan
We have observed one instance of a scan on destination port
number 4899. This port is used by the Radmin remote
administration
application. A remotely exploitable vulnerability in the
Radmin server version 2.0 and 2.1 that allows for code
execution
was reported in July 2004. The Radmin scan has the following
characteristics.
Transport protocol: is set to TCP.
Source IP addresses: are set to the address of the scanning
host.
Destination port numbers: are set to 4899.
Destination IP addresses: show a highly irregular pattern.
The difference between consecutive IP addresses of the
majority of flows varies in the range [-40:40] as shown in the
histogram given in the Figure below. In addition we observe positive or
negative shifts that exhibit no particular patterns.
Source port numbers: also show a highly irregular pattern,
but in addition they are limited to the interval
[1,000:5,000]. The distribution of the difference between port numbers of
consecutive flows is very similar to the distribution of IP address
differences.
Flow sizes: 89% of all scan flows contain 2 packets and 96
bytes, while 10% of the scan flows contain 1 packet and 48
bytes.
Flow durations: 2-packet flows have a duration of either
46*64 msec or 47*64 msec.
Inter-arrival times: show a periodical behavior with two
cycles and a stochastic component. The timing behavior of the
Radmin scan anomaly is illustrated in the Figure below that shows the
inter-arrival times for 400 flows.
Every 25th flow is received with a delay of 25 to 30 msec, and
every 4000th flow is received with a delay of 1 to 2 seconds. All
remaining flows have an inter-arrival time of either 0 or 1
msec.
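As an illustration of how the flow size and flow duration attributes of the Radmin scan fit together, the following sketch samples a (packets, bytes, duration) tuple per flow. The function name is hypothetical. We assume that the two reported size classes are drawn with weights 89:10 (the remaining share is not specified above and is ignored here), that both 2-packet durations are equally likely, and that 1-packet flows have the default duration of 0 msec.

```python
import random

def radmin_flow_shapes(n_flows, seed=0):
    """Sample (packets, bytes, duration_ms) tuples for Radmin-like scan flows."""
    rng = random.Random(seed)
    shapes = []
    for _ in range(n_flows):
        # Weights follow the reported 89% / 10% split (assumed renormalized).
        packets = rng.choices((2, 1), weights=(89, 10))[0]
        if packets == 2:
            # 2-packet flows last either 46*64 or 47*64 msec.
            shapes.append((2, 96, rng.choice((46, 47)) * 64))
        else:
            # 1-packet flows get the default duration of 0 msec.
            shapes.append((1, 48, 0))
    return shapes
```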
DCE-RPC scan
Destination port 135 is one of the top-scanned ports as
various vulnerabilities have been reported in the RPC service
running on this port. The well-known Blaster worm also used port 135 for
propagation. DCE-RPC scan flows have the following
characteristics.
Transport protocol: is set to TCP.
Source IP addresses: are set to the address of the scanning
host.
Destination port numbers: are set to 135.
Flow sizes: are set to 3 packets and 144 bytes as port 135
is open on most machines.
Flow durations: are set to either 19*64, 20*64, or 21*64
msec.
Destination IP addresses: do not show any regular patterns.
The difference between successively scanned IP addresses
varies in the range [-256:256]. Their distribution is depicted in
the Figure below. Note that this distribution differs from the ones we
have previously encountered. Additionally, we find random shifts
at irregular times.
Source port numbers: are irregular as well, but in addition
they are limited to the range starting at 1,200 and ending
at 4,800. The range of variation between port numbers of
successive flows is approximately [-200:200]. Again, the distribution
for source port differences resembles the distribution of IP
address differences. Additionally, we have a periodic component that
introduces a positive shift of 250 to 500 source port numbers after
300 to 600 received flows and has the effect that certain
port ranges are skipped.
Inter-arrival times: show a periodic behavior with a stochastic
component.
Every 256th flow has a delay of 2 seconds, while all other
flows have an inter-arrival time of either 0 or 1 msec.
Netbios scan
We found two instances of scans for the netbios service that
runs on UDP port 137 in our traces. Several vulnerabilities for
the netbios service exist.
Transport protocol: is set to UDP.
Source IP addresses: are set to the address of the scanning
host.
Destination port numbers: are set to 137.
Source port numbers: are set to a fixed value larger than
10,000.
Flow sizes: are set to 1 packet and 78 bytes.
Flow durations: are set to 0 msec.
Destination IP addresses: show a periodic behavior. The IP
addresses for 100 to 200 flows are selected sequentially
until a negative shift of 60 to 70 IP addresses occurs. The
sequential target selection behavior within each scan interval of the
Netbios scan is plotted in the Figure below. Most of the time the
scanner simply increases the destination IP address by one. However,
from time to time we observe a positive shift of 2, i.e.,
one IP address is skipped, followed by a negative shift of 1, i.e.,
the missed IP address is scanned, followed by a positive shift
of 2, i.e., the normal scanning continues.
Inter-arrival times: show a periodic behavior. The timing
behavior of the Netbios scan is plotted in the Figure below.
Every 5th scan has a delay of 0 msec, while all other scans
have an inter-arrival time between 60 and 70 msec. Hence, this
Netbios scan is considerably slower than the previously analyzed
scans.
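The target selection of the Netbios scan can be sketched as a sequence of destination IP address differences. The code below is a hedged illustration, not the FLAME model: the function names and the parameter `skip_prob` (how often the skip pattern +2, -1, +2 occurs) are hypothetical, since no frequency is given above, and interval lengths and shift sizes are drawn uniformly from the reported ranges.

```python
import ipaddress
import random

def netbios_dst_ip_diffs(n_flows, skip_prob=0.05, seed=0):
    """Return differences between consecutive destination IP addresses."""
    rng = random.Random(seed)
    diffs = []
    pending = []                          # queued steps of the current pattern
    until_shift = rng.randint(100, 200)   # flows left in this scan interval
    while len(diffs) < n_flows:
        if until_shift == 0:
            diffs.append(-rng.randint(60, 70))   # negative shift ends interval
            until_shift = rng.randint(100, 200)
            continue
        if not pending:
            # Either a plain +1 step or the skip pattern +2, -1, +2.
            pending = [2, -1, 2] if rng.random() < skip_prob else [1]
        diffs.append(pending.pop(0))
        until_shift -= 1
    return diffs

def apply_diffs(start, diffs):
    """Turn a start address and a list of differences into addresses."""
    addr = ipaddress.IPv4Address(start)
    out = [addr]
    for d in diffs:
        addr = addr + d
        out.append(addr)
    return out
```

For instance, `apply_diffs("10.0.0.0", [1, 2, -1])` visits 10.0.0.0, 10.0.0.1, 10.0.0.3, and then the skipped address 10.0.0.2.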
Spam
We did not find any anomalies related to e-mail spam, such as
massive spam campaigns caused by botnets, in the three analyzed
weeks of data. Instead, we detected several instances of Windows
Messenger pop-up spam. We call them variant A and variant B.
Windows Messenger pop-up spam targets UDP destination ports
1026 and 1027.
Popup Spam Variant A
Transport protocol: is set to UDP.
Source IP addresses: are set to the address
of the host that is sending the spam.
Destination port numbers: are set to 1026 or 1027.
Flow sizes: are set to 1 packet and 925 bytes.
Flow durations: are set to 0 msec.
Inter-arrival times: show a periodical behavior with two
cycles and a stochastic component.
Approximately every 200th flow has a delay of 64 msec, and
every 550th flow has a delay of 250 msec. The remaining flows have
an inter-arrival time of either 0 or 1 msec.
Destination IP addresses: show no regular patterns. The
difference distribution of variant A is shown in the Figure
below. IP address difference values vary in the range
[-200:200].
Source port numbers: Variant A selects the source port
sequentially from the range [32,000:61,000]. The difference between
source ports of consecutive flows varies in the range
[-1,000:1,000] and resembles the distribution of IP address differences.
Additionally we observe a periodical component: After 550 flows a
positive shift of 2,000 source port numbers occurs.
Popup Spam Variant B
In the following, we report only the attributes that differ
from popup spam variant A.
Destination IP addresses: Variant B selects destination IP
addresses more or less randomly from blocks of 3000 IP
addresses according to the difference distribution given in
the Figure below. In
this distribution the spikes at multiples of 256 IP
addresses are interesting. However, no regular pattern involving multiples
of 256 IP addresses is visible. After approximately 300 flows
the next block of IP addresses is used.
Source port numbers: Variant B uses a different mechanism
for sequential port selection. It randomly chooses a source
port number to start with. After 550 flows have been sent
with the
same source port, it increases the port number by 1 to 4
ports.
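The source port mechanism of variant B can be sketched as follows. The function name is hypothetical, and the range from which the start port is drawn is an assumption, since it is not reported above; the increment is drawn uniformly from 1 to 4.

```python
import random

def popup_b_source_ports(n_flows, flows_per_port=550, start=None, seed=0):
    """Return the source port of each flow for a variant-B-like sender."""
    rng = random.Random(seed)
    # Assumed start range; the text only says the start port is random.
    port = rng.randint(1024, 60000) if start is None else start
    ports = []
    for i in range(n_flows):
        if i > 0 and i % flows_per_port == 0:
            port += rng.randint(1, 4)   # move on after 550 flows
        ports.append(port)
    return ports
```

With `popup_b_source_ports(1200, start=40000)`, the first 550 flows use port 40000 and only three distinct ports appear in total.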
Denial of Service
The third large group of anomalies that we have found consists of
denial of service (DoS) attacks. DoS attacks have been extensively
studied in previous work. Mirkovic et al. provide
a taxonomy of DDoS attacks and defense mechanisms. We
complement this work by providing a detailed analysis of the network
behavior for different types of denial of service attacks such as UDP
bandwidth flood or TCP SYN flood.
UDP Bandwidth Flood Variant A
We have found three instances of two one-to-one UDP
bandwidth flood variants. Again, we call them variant A and variant B.
In the following we report the characteristics for variant
A.
Transport protocol: is set to UDP.
Source IP addresses: are set to the address of the attacking
host.
Destination IP addresses: are set to the address of the
victim host.
Flow sizes: are set to 1 packet and 540 bytes for variant
A.
Source port numbers: are selected uniformly from the range
[x:x+19] where x is randomly chosen.
Destination port numbers: are selected sequentially between
20 and 1024. The destination port number is increased by 1
with every flow that is sent.
Inter-arrival times: show a periodical behavior. The flow
inter-arrival time distribution of variant A is depicted in
the Figure below.
Every 40th flow has a delay of 60 or 120 msec, while the
remaining flows have shorter inter-arrival times of 0 or 1
msec.
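Putting the attributes of variant A together, a flow generator could look like the sketch below. It is an illustration under assumptions, not the FLAME implementation: the function name and dictionary keys are hypothetical, the base port x is drawn from an assumed range, the destination port is assumed to wrap back to 20 after reaching 1024, and the two delay values of each class are assumed to be equally likely.

```python
import random

def udp_flood_a_flows(n_flows, seed=0):
    """Generate flow records for a UDP bandwidth flood resembling variant A."""
    rng = random.Random(seed)
    x = rng.randint(1024, 65535 - 19)     # assumed range for the base port x
    n_dst_ports = 1024 - 20 + 1           # destination ports 20..1024
    flows = []
    for k in range(1, n_flows + 1):       # k is the 1-based flow number
        periodic = k % 40 == 0            # every 40th flow is delayed
        flows.append({
            "proto": "UDP",
            "src_port": rng.randint(x, x + 19),
            "dst_port": 20 + (k - 1) % n_dst_ports,  # sequential, wraps (assumed)
            "packets": 1,
            "bytes": 540,
            "iat_ms": rng.choice((60, 120)) if periodic else rng.choice((0, 1)),
        })
    return flows
```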
UDP Bandwidth Flood Variant B
Again, we report only differences to variant A of this
attack.
Flow sizes: are set to 1 packet and 1028 bytes for variant
B.
Source port numbers: are selected randomly from the interval
[1:6,000].
Destination port numbers: are selected from the range
[1,000:5,000]. The distribution of port number differences
between consecutive flows is shown in the Figure below. The positive
and negative difference values of 200 to 300 port numbers
stem from the fact that two processes with smaller port
differences run in parallel.
Inter-arrival times: show a periodical behavior.
Every 75th flow is received with a delay of 60 msec, while
all other flows have inter-arrival times of 0 or 1
msec.
TCP Flood Variant A
We have observed two instances of one-to-one TCP floods on
destination port 80. Both attacks target the same web
server.
Transport protocol: is set to TCP.
Source IP addresses: are set to the address of the attacking
host.
Destination IP addresses: are set to the address of the
victim host.
Destination port numbers: are set to 80.
Flow sizes: are set to 3 packets and 128 bytes.
Flow durations: are set to either 11*64 or 12*64
msec.
Source port numbers: are selected from the interval
[1,000:3,000] and the difference between consecutive flows
shows the regular but rather complex pattern depicted in the Figure
below.
Inter-arrival times: Every 10th flow of TCPFlood-A has a
delay of either 60 or 120 msec, while all other flows are sent
with an inter-arrival time of either 0 or 1 msec.
TCP Flood Variant B
Flow sizes: are set to 1 packet (26.4%) or 2 packets
(73.6%).
Flow durations: 2-packet flows have lengths between 2*64
msec and 15*64 msec. 1-packet flows have a length of 0
msec.
Source port numbers: show no particular patterns and are
selected from the interval [49,000:65,400]. The difference
between consecutive flows has the distribution shown in the Figure
below.
Inter-arrival times: Every 10th flow of TCPFlood-B has a
delay between 21 and 35 msec, while the remaining flows are
sent with inter-arrival times less than or equal to 1 msec.
TCP Backscatter
We found 11 instances of TCP backscatter in the SWITCH
traces. Backscatter flows are replies of a DoS victim that
has been flooded with packets carrying spoofed source IP addresses.
The replies of the victim are then routed towards the owner of
the spoofed address space.
Transport protocol: is set to TCP.
Source IP addresses: are set to the victim of the DoS
attack.
Flow sizes: are set to 1 packet and 44 or 46
bytes.
Destination port numbers: are selected randomly from the
interval [1,000:2,000] according to the distribution given
in the Figure below.
Destination IP addresses: show no regular pattern. The difference
in destination IP addresses between consecutive flows varies in
the range [-600:600].
Inter-arrival times: show a periodical behavior with three
cycles.
Approximately every 720th flow has an inter-arrival time of
1 msec, every 3000th flow has a delay of 60 msec, and every
8000th flow has a delay of 380 msec. The remaining flows
have inter-arrival times of 0 msec.
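The three nested timing cycles of the backscatter anomaly can be expressed as a small lookup over (period, delay) pairs. The sketch below is illustrative only: the function name is hypothetical, the approximate period of 720 flows is treated as exact, flows are numbered from 1, and the longest matching cycle is assumed to win when several cycles coincide.

```python
def backscatter_inter_arrival_ms(n_flows):
    """Return inter-arrival times (msec) for backscatter-like flows."""
    # (period in flows, delay in msec), longest cycle first.
    cycles = [(8000, 380), (3000, 60), (720, 1)]
    times = []
    for k in range(1, n_flows + 1):       # k is the 1-based flow number
        delay = 0                         # default for all remaining flows
        for period, cycle_delay in cycles:
            if k % period == 0:
                delay = cycle_delay
                break
        times.append(delay)
    return times
```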