I3: Maximizing Packet Capture Performance - Wireshark


I3: Maximizing Packet Capture Performance
Andrew Brown

Agenda
- Why do captures drop packets, how can you tell?
- Software considerations
- Hardware considerations
- Potential hardware improvements
- Test configurations/parameters
- Performance results

Sharkfest 2014

What is a drop?
- Failure to capture a packet that is part of the traffic in which you're interested
- Dropped packets tend to be the most important
- Capture filter will not necessarily help

Why do drops occur?
- Applications don't know that their data is being captured
- Result: Only one chance to capture a packet
- What can go wrong? Let's look at the life of a packet

Internal packet flow
- Path of a packet from NIC to application (Linux)
- Switch output queue drops
- Interface drops
- Kernel drops

Identifying drops
- Software reports drops
- L4 indicators (TCP ACKed lost segment)
- L7 indicators (app-level sequence numbers revealed by dissector)
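One way to see software-reported drops on Linux is the per-interface receive drop counter in /proc/net/dev. The sketch below is only an illustration and is not from the talk: the function name and the sample text are hypothetical, and this counter is separate from the "dropped by kernel" figure that tcpdump prints when it exits.

```python
from pathlib import Path


def parse_rx_drops(proc_net_dev_text):
    """Return {interface: receive drop count} from /proc/net/dev contents."""
    drops = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) < 4:
            continue
        # Receive columns: bytes packets errs drop fifo frame compressed multicast
        drops[name.strip()] = int(fields[3])
    return drops


# Hypothetical sample used when /proc/net/dev is not available (non-Linux hosts).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    0   37    0     0          0         0    20000     150    0    0    0     0       0          0
"""

if __name__ == "__main__":
    path = Path("/proc/net/dev")
    text = path.read_text() if path.exists() else SAMPLE
    for iface, dropped in parse_rx_drops(text).items():
        print(f"{iface}: {dropped} receive drops")
```

Reading the counter before and after a capture and comparing the two values gives a rough indication of whether the interface dropped packets during the run.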

When is (and isn't) it necessary to take steps to maximize capture performance?
- Not typically necessary when capturing traffic of 1G end device
- More commonly necessary when capturing uplink traffic from a TAP or SPAN port
- Some sort of action is almost always necessary at 10G
- Methods described aren't always necessary
- Methods focus on free solutions

Software considerations – Windows
- Quit unnecessary programs
- Avoid Wireshark for capturing
  - Saves to TEMP
  - Additional processing for packet statistics
  - Uses CPU
  - Uses memory over time, can lead to out of memory errors

Software considerations – Windows (continued)
- Alternative? Dumpcap
  - Command-line utility
  - Called by Wireshark/Tshark for capture
  - Provides greater control
  - Dumpcapui for CLIphobic
- "At the limits" example
  - Dumpcap captured 100% of packets sent
  - Wireshark captured 68% of packets sent

Software considerations – Windows (continued)
- Windows dumpcap buffer tuning
  - Large buffers are generally good, but
  - Increased bandwidth has a tipping point
    - Write to disk slows significantly
    - Larger buffers make it worse
    - Made buffer selection for testing difficult
    - Best option seemed to be 50MB
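As a rough illustration of the buffer setting discussed above, the sketch below builds a dumpcap command line with a 50 MiB capture buffer. The helper name, interface value and output file name are hypothetical; the `-i`, `-B` and `-w` options are standard dumpcap flags, and `-B` takes its value in MiB, so the slide's "50MB" is treated here as 50 MiB.

```python
import shlex


def dumpcap_command(interface, out_file, buffer_mib=50):
    """Build an argv list for dumpcap with an explicit capture buffer size."""
    if buffer_mib <= 0:
        raise ValueError("buffer_mib must be positive")
    return [
        "dumpcap",
        "-i", str(interface),   # interface name or index (see `dumpcap -D`)
        "-B", str(buffer_mib),  # capture buffer size in MiB
        "-w", out_file,         # write packets straight to this file
    ]


if __name__ == "__main__":
    # Print the command instead of running it; capturing usually needs
    # elevated privileges and a real interface.
    print(shlex.join(dumpcap_command("1", "capture.pcapng")))
```

The list form can be passed to `subprocess.run` unchanged once the interface and file name are adapted to the capture host.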

Software considerations – Windows (continued)
- Dumpcap "slow count" example
  - Sending 844,600 packets @ .4Gb
  - Packets take 1.48 seconds to send
  - 20MB buffer takes 2.5 seconds to write
  - 512MB buffer takes 46 seconds to write
  - Neither setting captured all packets
  - Not cosmetic (break out and file is truncated)
  - Issue disappears at lower bandwidth

Software considerations – Windows (continued)
- Video of normal count

Software considerations – Windows (continued)
- Video of "slow count"

Software considerations – Windows (continued)
- Disable protocols on interface (TAP/SPAN)
  - Pure TAP/SPAN capture
  - Only for TAP/SPAN
  - Prevents OS from attempting to interpret packets
  - Tested performance with destination MAC set to broadcast address
  - Result: Captured 100% with protocols disabled, only 40% when enabled
  - Eliminate performance impact immediately after link up
- [Screenshot callout: Uncheck everything below]

Software considerations – Linux
- Quit unnecessary programs
- Use tcpdump with 512MB buffer
- Ensure libpcap 1.0.0 or later (check with tcpdump -h)
- Watch value of -s flag
- No option to disable protocols like Windows
- Static (or no) IP for dedicated capture interface
- Use XFS with RAID and coordinate stripe sizes
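The tcpdump settings above could be assembled as in the sketch below. The helper name, interface and file name are hypothetical. tcpdump's `-B` option takes KiB, so 512MB is converted on the assumption that the slide means 512 MiB; `-s 0` asks for the default (full-size) snapshot length, which matters because some older tcpdump builds defaulted to a much smaller snaplen and truncated packets.

```python
import shlex


def tcpdump_command(interface, out_file, buffer_mb=512, snaplen=0):
    """Build an argv list for tcpdump with a large capture buffer."""
    if buffer_mb <= 0 or snaplen < 0:
        raise ValueError("buffer_mb must be positive and snaplen non-negative")
    buffer_kib = buffer_mb * 1024  # tcpdump -B is expressed in KiB
    return [
        "tcpdump",
        "-i", interface,
        "-B", str(buffer_kib),
        "-s", str(snaplen),  # 0 selects the default full-packet snaplen
        "-w", out_file,
    ]


if __name__ == "__main__":
    # Printed rather than executed; running a capture needs privileges.
    print(shlex.join(tcpdump_command("eth1", "capture.pcap")))
```

If only headers are needed, a smaller explicit snaplen reduces the write load, at the cost of losing payload bytes.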

Software considerations – Linux (continued)
- Access to development resources? Look at PF_RING
  - Module/NIC driver combination
  - Improves capture performance
  - Included tcpdump wasn't better than stock
  - We use the API and it works
  - Different performance tiers, some are free

Software considerations – Linux (continued)
- PF_RING
  - Kernel module/NIC driver combination
  - Improves capture performance via various methods
  - Included tcpdump wasn't better than stock
  - We use the API and it works
  - Different performance tiers, some are free

Hardware considerations – Storage
- 1Gb line rate traffic generates 123-133MB in one second
- WD Black 7.2K RPM: 171MB/s
- WD Raptor 10K RPM: 200MB/s
- If 10Gb is 10X 1Gb (do the math)
- SSD: 500MB/s
- RAM disk is another option
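The "do the math" step can be sketched as follows, using only the figures quoted on the slide. The assumptions are not from the talk: 10Gb is taken as exactly ten times the 1Gb figure, the drive numbers are treated as sustained sequential write rates, and RAID 0 is assumed to scale linearly, which real arrays rarely achieve.

```python
import math

# Figures quoted on the slide (MB written per second).
LINE_RATE_1G_MB_S = (123, 133)
DRIVES_MB_S = {
    "WD Black 7.2K RPM": 171,
    "WD Raptor 10K RPM": 200,
    "SSD": 500,
}


def drives_needed(required_mb_s, drive_mb_s):
    """Minimum number of striped drives, assuming ideal linear RAID 0 scaling."""
    return math.ceil(required_mb_s / drive_mb_s)


# Worst case at 10Gb, assuming it is exactly 10x the 1Gb upper figure.
required_10g = LINE_RATE_1G_MB_S[1] * 10

for name, rate in DRIVES_MB_S.items():
    count = drives_needed(required_10g, rate)
    print(f"{name}: {count} drive(s) for {required_10g} MB/s")
```

Under these assumptions a single drive of any listed type keeps up with 1Gb, while 10Gb needs several drives striped together or a RAM disk.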

Hardware considerations - CPU
- Three considerations
  - Number of cores
  - Clock speed
  - Performance per clock (PPC)
- Clock speed * PPC = Per-core performance
- Multicore is good, but per-core performance is better than many cores

Hardware considerations - NIC
- Intel (regular NIC)
  - Drivers more actively maintained
  - Best PF_RING support
  - 10G NIC doesn't help with 1G capture (1G and 10G NICs had the same max bandwidth at or below 1G)
- Avoid USB NICs
  - USB 2.0 is too slow (480Mb/s)
  - USB 3.0 didn't perform well

Benchmark methodology
- Tested limits of capture configurations at 1G and 10G
- For each configuration, increase bandwidth until it fails
- Failure is defined as not capturing all packets
- Highest performing solutions formed basis for recommendations
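The ramp-until-failure procedure above might look like the sketch below. It is not the author's test harness: `run_trial` is a hypothetical callback that sends traffic at a given rate and returns the sent and captured packet counts, and `fake_trial` is a stand-in model so that the example runs without any hardware.

```python
def max_lossless_rate(run_trial, rates_gbps):
    """Return the highest rate at which every packet was captured, or None."""
    best = None
    for rate in sorted(rates_gbps):
        sent, captured = run_trial(rate)
        if captured < sent:
            break  # failure: not all packets were captured
        best = rate
    return best


def fake_trial(rate_gbps, capacity_gbps=3.0, packets=1_000_000):
    """Hypothetical capture setup that starts dropping above capacity_gbps."""
    if rate_gbps <= capacity_gbps:
        return packets, packets
    return packets, int(packets * capacity_gbps / rate_gbps)


if __name__ == "__main__":
    print(max_lossless_rate(fake_trial, [1, 2, 3, 4, 5]))
```

Stopping at the first failing rate mirrors the slide's definition; a real run would likely repeat each rate several times, since a single clean trial does not prove the configuration is lossless.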

Obvious question: Traffic profile?
- If not testing for a specific use case, what is the appropriate traffic with which to test?
- What mix of TCP/UDP?
- What duration, frequency, severity of bursts?
- What mix of small/large packets?

(My) Answer: Many copies of a single packet with tests at various packet sizes
- Takes Receive Side Scaling out of the picture
- Removes buffering from the equation
- Tends to be pessimistic

Test configuration
- Unicast UDP packet used for (almost) all tests
- Packet sizes of 64, 128, 256, 512, 1024, 1500 bytes
- Additional CPU overhead for every packet
  - One second at 1Gb is 82K 1500 byte packets
  - One second at 1Gb is 1.49M 64 byte packets
- Number of packets tailored to generate a 1.5GB capture file
- Careful to eliminate disk as a bottleneck
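The packet-rate figures above can be reproduced with a short calculation. This sketch assumes the listed sizes are full Ethernet frame sizes and that each frame also occupies 20 bytes on the wire for the preamble, start-of-frame delimiter and inter-frame gap; the talk does not state this, but the results line up with the 82K and 1.49M figures.

```python
WIRE_OVERHEAD_BYTES = 20  # 8 bytes preamble/SFD + 12 bytes inter-frame gap


def packets_per_second(link_bps, frame_bytes):
    """Maximum frames per second on a link at line rate."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


if __name__ == "__main__":
    for size in (64, 128, 256, 512, 1024, 1500):
        pps = packets_per_second(1_000_000_000, size)
        print(f"{size:>5} bytes: {pps:,.0f} packets/s at 1Gb")
```

The roughly 18x difference between the smallest and largest size is why small packets stress the per-packet CPU path far more than the disk.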

Improving performance: The ideal
- Ideal capture laptop
  - Fast CPU
  - Fast storage (SSD RAID)
  - Dedicated Intel NIC
  - 10G capability
- Perfect except for one issue: it doesn't exist

Improving performance: Thunderbolt
- PCIe via a cable (developed by Intel)
- Allows use of desktop cards on a laptop
- Expensive
- Not very widespread (mostly Apple computers)
- Other laptop limitations are still a problem

Improving performance: Laptop alternative
- What level of performance is possible from (relatively) portable commodity hardware?
- Packet toaster
  - Used for all capture testing
  - Intel i5 4570 desktop CPU (3.6GHz quad-core)
  - Up to 16GB RAM for RAM disk
  - Up to 4 SSD in RAID 0
  - Cost 800 with 8GB RAM, 2 SSDs
- Concept: Run without monitor, manage via laptop

Packet Toaster port layout
- Intel 1G NIC
- Additional 1G NIC for management (SSH/RDP)
- 802.11n for capture (Linux) or management
- PCIe slot for 10G

Solarflare
- Low-latency NIC with stack bypass
- Why include it?
  - Price competitive with other commodity 10G NICs
  - Works as a regular NIC under Linux, Windows, Mac etc.
  - Works at 1G also
  - SolarCapture app for high-performance Linux capture
- Hardware/software capture solution
- Tested with Packet Toaster and MacBook Pro (via Thunderbolt)

The difference a week makes
- At the time of testing, SolarCapture was a free download
- Less than a week ago, Solarflare changed licensing tiers; free SolarCapture is no longer available
- Pricing is reasonable (in my opinion), but
  - "Reasonable" is relative
  - This breaks my original concept of free software
- Debated removing results but couldn't (impacted other results and no time to re-test)

Performance Results: Configurations
- Wireshark under Windows 7 (SSD)
- Dumpcap under Windows 7 (SSD)
- Dumpcap under Linux (SSD)
- TCPDump under Linux (SSD)
- SolarCapture under Linux on MacBook Pro via Thunderbolt (RAM)
- SolarCapture under Linux (SSD)
- SolarCapture under Linux (RAM)

Performance Results: Wireshark vs. Dumpcap (Win 7)
[Chart. Y axis: Gb/s (0 to 10). X axis: Packet Size (bytes): 64, 128, 256, 512, 1024, 1500. Series: Wireshark Windows 7, Dumpcap Windows 7.]

Performance Results: Dumpcap (Win7) - Dumpcap (Linux) – TCPDump (Linux)
[Chart. Y axis: Gb/s (0 to 10). X axis: Packet Size (bytes): 64, 128, 256, 512, 1024, 1500. Series: Dumpcap Windows 7, Dumpcap Linux, TCPDump Linux.]

Performance Results: Dumpcap (Win7) - Dumpcap (Linux) – TCPDump (Linux)
[Chart. Y axis: Gb/s (0 to 10). X axis: Packet Size (bytes): 64, 128, 256, 512, 1024, 1500. Series: TCPDump Linux, Solarcap Linux SSD, Solarcap Linux RAM.]

Performance Results: TCPDump (Linux) – SolarCapture (SSD) – SolarCapture (RAM)
[Chart. Y axis: Gb/s (0 to 10). X axis: Packet Size (bytes): 64, 128, 256, 512, 1024, 1500. Series: TCPDump Linux, Solarcap Linux SSD, Solarcap Linux RAM.]

Performance Results: Dumpcap (Win7) - Dumpcap (Linux) – TCPDump (Linux)
[Chart. Y axis: Gb/s (0 to 10). X axis: Packet Size (bytes): 64, 128, 256, 512, 1024, 1500. Series: Solarcap Thunderbolt RAM, Solarcap Linux RAM.]

Performance Results: By Packet Size
[Chart. Y axis: Gb/s (0 to 10). X axis: Packet Size (bytes): 64, 128, 256, 512, 1024, 1500. Series: Wireshark Windows 7, Dumpcap Windows 7, Dumpcap Linux, TCPDump Linux, Solarcap Thunderbolt RAM, Solarcap Linux SSD, Solarcap Linux RAM.]

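Performance Results: By Configuration
[Chart. Y axis: Gb/s (0 to 10). X axis: capture configuration (Wireshark Windows 7, Dumpcap Windows 7, ..., Solarcap Thunderbolt RAM, Solarcap Linux SSD, Solarcap Linux RAM). Series: packet size in bytes (64, ..., 1024, 1500).]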

Acknowledgements
- BATS Global Markets
- Guy Harris
  - Core developer: libpcap, tcpdump and Wireshark

Appendix - Links
- http://www.intel.com (Intel NICs)
- http://www.ntop.org (PF_RING)
- http://www.solarflare.com (SolarCapture)
- http://www.tcpdump.org (TCPdump/Libpcap)
- http://www.wireshark.org (Wireshark/Dumpcap)
- http://www.macsales.com (Thunderbolt enclosure)

Appendix – Packet Toaster Specs
- CPU: Intel i5 4570 (3.6GHz quad-core)
- Motherboard: Gigabyte Z87N-WIFI
- RAM: 8GB DDR3
- Storage
  - Samsung 840 Evo (Operating System)
  - 2 x Sandisk Extreme in RAID 0 (Capture destination)

Questions
