Transport Layer Identification of P2P Traffic T. Karagiannis, A. Broido, M. Faloutsos, K. Claffy
Outline • Introduction • Related work • Payload analysis & Limitations • Non-payload identification • Experiments & Evaluation • P2P traffic trends • Conclusions
Characteristics of P2P Traffic • Traffic volume grows rapidly • Frequent upgrades & emergence of new protocols • Traffic is disguised to circumvent firewalls & legal issues – Non-standard, proprietary protocols (poorly documented) – Operate on arbitrary port numbers – Support payload encryption
Identification Methodology • Examining packet payload – Signature-based methodology – Limitations • Identifying at transport layer – Based on flow patterns & P2P behaviors – Advantages
Contributions • Develop a methodology for P2P traffic profiling by identifying flow patterns and behavior characteristics • Evaluate its effectiveness by comparing with payload analysis • Demonstrate the growth of P2P traffic by analyzing backbone traces
Previous Work • Detailed characterization of a small subset of P2P protocols & networks • Properties of topology, bandwidth, caching & availability, etc. • Signature-based traffic identification • Traffic estimation of P2P applications with fixed ports
Payload Analysis
Table 2: Strings at the beginning of the payload of P2P protocols. The prefix "0x" denotes a hex string.

P2P Protocol     String                     Trans. prot.   Def. ports
eDonkey2000      0xe319010000               TCP/UDP        4661-4665
                 0xc53f010000
Fasttrack        "Get hash"                 TCP            1214
                 0x270000002980             UDP
BitTorrent       "0x13Bit"                  TCP            6881-6889
Gnutella         "GNUT", "GIV"              TCP            6346-6347
                 "GND"                      UDP
MP2P             GO!!, MD5, SIZ0x20         TCP            41170 UDP
Direct Connect   "$MyN", "$Dir"             TCP            411-412
                 "$SR"                      UDP
Ares             "get hash:", "Get sha1:"   TCP
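A signature check of the kind Table 2 implies can be sketched as a prefix match on the captured payload bytes. This is a minimal sketch, not the authors' implementation: the `SIGNATURES` list and the `match_signature` function are hypothetical names, only a subset of the table is included, and treating each entry as a literal byte prefix (with "0x13Bit" read as the byte 0x13 followed by "Bit") is an assumption.

```python
# Hypothetical subset of Table 2: (protocol, allowed transports, payload prefix).
# Assumption: every table entry is a literal prefix of the packet payload.
SIGNATURES = [
    ("eDonkey2000", {"TCP", "UDP"}, bytes.fromhex("e319010000")),
    ("eDonkey2000", {"TCP", "UDP"}, bytes.fromhex("c53f010000")),
    ("BitTorrent", {"TCP"}, b"\x13Bit"),
    ("Gnutella", {"TCP"}, b"GNUT"),
    ("Gnutella", {"TCP"}, b"GIV"),
    ("Gnutella", {"UDP"}, b"GND"),
    ("Direct Connect", {"TCP"}, b"$MyN"),
    ("Direct Connect", {"TCP"}, b"$Dir"),
]


def match_signature(payload: bytes, transport: str):
    """Return the protocol whose signature prefixes the payload, else None."""
    for protocol, transports, prefix in SIGNATURES:
        if transport in transports and payload.startswith(prefix):
            return protocol
    return None


if __name__ == "__main__":
    # The BitTorrent handshake begins with byte 0x13 and "BitTorrent protocol".
    print(match_signature(b"\x13BitTorrent protocol", "TCP"))
    print(match_signature(b"GET / HTTP/1.1", "TCP"))
```

Because only the first bytes of each payload are compared, the check still works on traces that keep just a short payload prefix, as long as the prefix is at least as long as the signature.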
Payload Analysis • M1: Flag a flow as P2P if its src or dst port number matches one of the well-known P2P port numbers. • M2: Flag a flow as P2P if the 16-byte payload of any packet matches the signatures; otherwise flag it as non-P2P. – A loose lower bound on P2P volume • M3: Hash the {src, dst} IP pair of a flow flagged as P2P into a table. Flag the flows containing an IP address in the table as “possible P2P” even if no payload matches.
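The three methods can be sketched as three per-flow flags computed over a batch of flows. This is an illustrative sketch under stated assumptions: the `Flow` record, the `payload_hit` field (standing in for the result of the signature match), the port set, and the two-pass batch design are all hypothetical, and the IP table for M3 is assumed to be seeded only by M2 hits.

```python
from dataclasses import dataclass

# Assumed port set, taken from the "Def. ports" column of Table 2 (subset).
WELL_KNOWN_PORTS = (
    set(range(4661, 4666)) | {1214} | set(range(6881, 6890)) | {6346, 6347, 411, 412}
)


@dataclass(frozen=True)
class Flow:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    payload_hit: bool  # hypothetical: True if any packet matched a signature


def classify(flows):
    """Return three parallel lists of booleans, one per method (M1, M2, M3)."""
    # M1: either port is a well-known P2P port.
    m1 = [f.src_port in WELL_KNOWN_PORTS or f.dst_port in WELL_KNOWN_PORTS for f in flows]
    # M2: payload signature match (computed elsewhere, carried on the flow).
    m2 = [f.payload_hit for f in flows]
    # M3: record both endpoints of every M2 flow, then flag any flow
    # that touches a recorded IP as "possible P2P".
    p2p_ips = set()
    for f in flows:
        if f.payload_hit:
            p2p_ips.update((f.src_ip, f.dst_ip))
    m3 = [f.src_ip in p2p_ips or f.dst_ip in p2p_ips for f in flows]
    return m1, m2, m3


if __name__ == "__main__":
    flows = [
        Flow("10.0.0.1", "10.0.0.2", 5000, 6881, True),
        Flow("10.0.0.1", "10.0.0.3", 5001, 80, False),
    ]
    print(classify(flows))
```

In this sketch every M2 flow is also an M3 flow, so M3 can only widen the estimate that M2 gives; a streaming variant that fills the table as packets arrive would flag fewer early flows.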
Limitations • Captured payload size – Only first 16 bytes of payload – Only 4 bytes in older traces • HTTP requests • Encryption • Other P2P protocols • Unidirectional traces
Non-payload Identification • Two main heuristics: – {src, dst} IP pairs that use both TCP and UDP to transfer data – The behavior of peers by studying connection characteristics of {IP, port} pairs
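The first heuristic can be sketched as grouping flows by IP pair and keeping the pairs seen with both transports. This is a minimal sketch, not the authors' code: the function name `tcp_udp_pairs` and the input tuple layout are hypothetical, and treating {src, dst} as an unordered pair is an assumption. Ordinary services such as DNS also use both TCP and UDP, so a practical version would likely need an exclusion step that is not shown here.

```python
from collections import defaultdict


def tcp_udp_pairs(flows):
    """Return the IP pairs observed with both TCP and UDP flows.

    flows: iterable of (src_ip, dst_ip, transport), transport in {"TCP", "UDP"}.
    Assumption: direction is ignored, so (a, b) and (b, a) are the same pair.
    """
    seen = defaultdict(set)
    for src, dst, transport in flows:
        pair = tuple(sorted((src, dst)))
        seen[pair].add(transport)
    return {pair for pair, transports in seen.items() if {"TCP", "UDP"} <= transports}


if __name__ == "__main__":
    flows = [
        ("10.0.0.1", "10.0.0.2", "TCP"),
        ("10.0.0.2", "10.0.0.1", "UDP"),
        ("10.0.0.1", "10.0.0.3", "TCP"),
    ]
    print(tcp_udp_pairs(flows))
```

Nothing in this check reads packet payload, which is what lets it run on encrypted traffic or on traces that keep headers only.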