n-grams found in normal computer programs. This approach is extremely powerful, but assumes the presence of a known instance of a virus and a controlled environment to monitor.

The former limitation is partially addressed by the Honeycomb system of Kreibich and Crowcroft [17]. Honeycomb is a host-based intrusion detection system that automatically generates signatures by looking for longest common subsequences among sets of strings found in message exchanges.
This basic procedure is similar to our own, but there are also important structural and algorithmic differences between our two approaches, the most important of which is scale. Honeycomb is designed for a host-based context with orders of magnitude less processing required. To put this in context, our Earlybird system currently processes more traffic in one second than the prototype Honeycomb observed in 24 hours. However, one clear advantage offered by the host context is its natural imperviousness to network evasion techniques [30]. We discuss this issue further in Section 7.

Finally, over the last two years of Earlybird’s development [34, 35, 37], the clearest parallels can be drawn to Kim and Karp’s contemporaneously-developed Autograph system [16]. Like Earlybird, Autograph also uses network-level data to infer worm signatures and both systems employ Rabin fingerprints to index counters of content substrings and use white-lists to set aside well-known false positives. However, there are several important differences as well. First, Autograph relies on a prefiltering step that identifies flows with suspicious scanning activity (particularly the number of unsuccessful TCP connection attempts) before calculating content prevalence. By contrast, Earlybird measures the prevalence of all content entering the network and only then considers the addressing activity. This difference means that Autograph cannot detect large classes of worms that Earlybird can – including almost all e-mail borne worms, such as MyDoom, UDP-based worms such as Slammer, spoofed source worms, or worms carried via IM or P2P clients. Second, Autograph has extensive support for distributed deployments – involving active cooperation between multiple sensors. By contrast, Earlybird has focused almost entirely on the algorithmics required to support a robust and scalable wire-speed implementation in a single sensor and only supports distribution through a centralized aggregator.
Third, Earlybird is an on-line system that has been in near-production use for eight months and handles over 200 megabits of live traffic, while, as described, Autograph is an off-line system that has only been evaluated using traces. Finally, there are many differences in the details of the algorithms used (e.g., Autograph breaks content into non-overlapping variable-length chunks while Earlybird manages overlapping fixed-length content strings over each byte offset) although it is not currently clear what the impact of these differences is.

2.3 Containment

Containment refers to the mechanism used to slow or stop the spread of an active worm. There are three containment mechanisms in use today: host quarantine, string-matching and connection throttling. Host quarantine is simply the act of preventing an infected host from communicating with other hosts – typically implemented via IP-level access control lists on routers or firewalls. String-matching containment – typified by signature-based network intrusion prevention systems (NIPS) – matches network traffic against particular strings, or signatures, of known worms and can then drop associated packets. To enable high-bandwidth deployments, several hardware vendors are now producing high-speed string matching and regular expression checking chips for worm and virus filtering. Lockwood et al. describe an FPGA-based research prototype programmed for this application [19]. Finally, a different strategy, proposed by Twycross and Williamson [43], is to proactively limit the rate of all outgoing connections made by a machine and thereby slow – but not stop – the spread of any worm. Their approach was proposed in a host context, but there is no reason such connection throttling cannot be applied at the network level as well.
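The connection-throttling idea can be illustrated with a small sketch. This is a toy model, not the mechanism published in [43]: the class name `ConnectionThrottle`, the working-set size, and the one-release-per-interval policy are all assumptions chosen for illustration. Connections to recently contacted hosts pass immediately, while connections to new hosts are queued and released at a fixed rate, so a scanning worm is slowed but not stopped.

```python
from collections import deque


class ConnectionThrottle:
    """Toy limiter for connections to hosts outside a small working set.

    All parameters are illustrative assumptions, not values from the text.
    """

    def __init__(self, working_set_size=4, release_interval=1):
        self.recent = deque(maxlen=working_set_size)  # recently contacted hosts
        self.pending = deque()                        # hosts waiting for release
        self.release_interval = release_interval
        self.next_release = 0

    def request(self, host):
        """Return True if a connection to `host` may proceed immediately."""
        if host in self.recent:
            return True
        if host not in self.pending:
            self.pending.append(host)  # delay the connection, do not drop it
        return False

    def tick(self, now):
        """Release at most one queued host per interval; return it or None."""
        if self.pending and now >= self.next_release:
            host = self.pending.popleft()
            self.recent.append(host)
            self.next_release = now + self.release_interval
            return host
        return None


# A scanner asks for 100 new hosts at once; only one is released per tick.
throttle = ConnectionThrottle()
targets = [f"10.0.0.{i}" for i in range(100)]
for target in targets:
    throttle.request(target)
released = [throttle.tick(now) for now in range(10)]
```

Normal traffic, which mostly revisits a few hosts, rarely hits the queue in this model, whereas a scanner's backlog grows without bound. That growing backlog is what makes the slowdown effective.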
In this paper, we assume the availability of string-matching containment (perhaps in concert with throttling) and our Earlybird prototype generates signatures for a Snort in-line intrusion detection system – blocking all packets containing discovered worm signatures.

3 Defining Worm Behavior

Network worms, due to their distinct purpose, tend to behave quite differently from the popular client-server and peer-to-peer applications deployed on today’s networks. In this section we explore these key behaviors in more detail and how they can be exploited to detect and characterize network worms.

3.1 Content invariance

In all existing worms of which we are aware, some or all of the worm program is invariant across every copy. Typically, the entire worm program is identical across every host it infects. However, some worms make use of limited polymorphism – by encrypting each worm instance independently and/or randomizing filler text. In these cases, much of the worm body is variable, but key portions are still invariant (e.g., the decryption routine). For the purposes of this paper, we assume that a worm has some amount of invariant content or has relatively few variants. We discuss violations of this assumption in Section 7.
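To make the role of invariant content concrete, the following sketch hashes every overlapping fixed-length window of a payload and counts how many payloads each hash appears in. It is only an illustration under stated assumptions: a simple polynomial rolling hash stands in for Rabin fingerprints, and the names `window_hashes` and `prevalence`, the constants `BASE` and `MOD`, and the window width are hypothetical rather than taken from Earlybird.

```python
from collections import Counter

BASE = 257             # assumed hash base (illustrative)
MOD = (1 << 61) - 1    # assumed large prime modulus (illustrative)


def window_hashes(payload: bytes, width: int):
    """Yield a rolling hash for each overlapping `width`-byte window."""
    if len(payload) < width:
        return
    top = pow(BASE, width - 1, MOD)  # weight of the byte leaving the window
    h = 0
    for b in payload[:width]:
        h = (h * BASE + b) % MOD
    yield h
    for i in range(width, len(payload)):
        # Drop the oldest byte, shift, and append the newest byte.
        h = ((h - payload[i - width] * top) * BASE + payload[i]) % MOD
        yield h


def prevalence(payloads, width):
    """Count the number of payloads in which each window hash occurs."""
    counts = Counter()
    for payload in payloads:
        counts.update(set(window_hashes(payload, width)))
    return counts


# Three "worm copies" with different filler but the same invariant stub.
stub = b"\x90\x90DECRYPT-STUB\xeb\xfe"
copies = [b"aaaa" + stub + b"xyz", b"filler-1234" + stub, stub + b"other tail"]
counts = prevalence(copies, width=8)
repeated = [h for h, n in counts.items() if n == len(copies)]
```

Because the stub occupies a different offset in each copy, only a scheme that examines every byte offset is guaranteed to hash it identically each time. In this example, the windows lying wholly inside the stub are the ones that reach the maximum count.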