Supermicro All-Flash NVMe Solution For Ceph Storage Cluster


White Paper
Supermicro All-Flash NVMe Solution for Ceph Storage Cluster
High Performance Ceph Reference Architecture Based on Ultra SuperServer and Micron 9300 MAX NVMe SSDs

Table of Contents
- Powering Ceph Storage Cluster with Supermicro All-Flash NVMe Storage Solutions
- Supermicro Ceph OSD Ready All-Flash NVMe Reference Architecture
  - Planning Consideration
  - Supermicro NVMe Reference Architecture Design
  - Configuration Consideration
  - Ultra All-Flash NVMe OSD Ready System
  - Micron 9300 MAX NVMe SSDs
  - Ceph Storage Cluster - Software
- Performance Testing
  - Establishing Baseline Performance
  - Testing Methodology
- 4K Random Workload FIO Test Results
  - 4K Random Write Workload Analysis
  - 4K Random Read Workload Analysis
  - 4K Random 70 Read/30 Write Workload Analysis
- 4M Object Workloads Test Result
  - 4MB Object Write Workload Analysis
  - 4MB Object Read Workload Analysis
- Summary

Super Micro Computer, Inc. | 980 Rock Avenue, San Jose, CA 95131 USA | www.supermicro.com | September 2019

Powering Ceph Storage Cluster with Supermicro All-Flash NVMe Storage Solutions

NVMe, an interface specification for accessing non-volatile storage media via the PCI Express (PCI-E) bus, is able to provide up to 70% lower latency and up to six times the throughput/IOPS when compared with standard SATA drives. Supermicro has a wide range of products supporting NVMe SSDs, from 1U rackmount to 4U SuperStorage servers, to 7U 8-socket mission-critical servers, and to 8U high-density SuperBlade server solutions.

In this white paper, we investigate the performance characteristics of a Ceph cluster provisioned on all-flash NVMe based Ceph storage nodes, based on configuration and performance analysis done by Micron Technology, Inc. Results published with permission. Results are based on testing with the Supermicro SuperServer 1029U-TN10RT. Other models and configurations may produce different results.

This type of deployment choice is optimized for I/O intensive applications requiring the highest performance for either block or object storage. For different deployment scenarios and optimization goals, please refer to the table below:

Goals | Recommendation | Optimizations
Cost Effectiveness, High Storage Capacity & Density | Supermicro 45/60/90-bay SuperStorage Systems | High-capacity SATA HDDs/SSDs for maximum storage density
Accelerated Capacity & Density | 45/60/90-bay SuperStorage Systems with NVMe Journal | By utilizing a few NVMe SSDs as the Ceph journal, the responsiveness of a Ceph cluster can be greatly improved while still having the capacity benefits
High Performance/IOPS Intensive | All-Flash NVMe Server | Achieving the highest performance with all NVMe SSDs

(Pictured: 1U Ultra 10 NVMe, 1U Ultra 20 NVMe, and 2U SuperStorage 48 NVMe systems.)

The table below provides some references for where Ceph Block or Object Storage is best suited for different types of workloads.

Ceph Block Storage
- Workloads:
  1. Storage for running VM Disk Volumes
  2. Deploy Elastic Block Storage with On-Premise Cloud
  3. Primary storage for MySQL & other SQL database apps
  4. Storage for Skype, SharePoint and other business collaboration applications
  5. Storage for IT management apps
  6. Dev/Test Systems
- Characteristics:
  1. Higher I/O
  2. Random R/W
  3. High Change Content

Ceph Object Storage
- Workloads:
  1. Image/Video/Audio Repository Services
  2. VM Disk Volume Snapshots
  3. Backup & Archive
  4. ISO Image Store & Repository Service
  5. Deploy Amazon S3-like object storage services with On-Premise Cloud
  6. Deploy Dropbox-like services within the Enterprise
- Characteristics:
  1. Low Cost, Scale-out Storage
  2. Sequential, Larger R/W
  3. Lower IOPS
  4. Fully API accessible

(Pictured: 4U SuperStorage 45-bay with 6 NVMe, 7U 8-Socket with 16 U.2 NVMe AICs, and 8U SuperBlade with 10x 4-socket blades, up to 14 NVMe per blade.)
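To make the two access models above concrete, the short sketch below uses Ceph's Python bindings: librados (the rados module) for native object I/O and librbd (the rbd module) for block volumes. It is an illustration only, not part of this reference architecture; the pool names ('objects', 'rbd'), object name, and image name are placeholders.

```python
# Illustrative sketch contrasting object and block access with Ceph's
# Python bindings. Pool, object, and image names are placeholders.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Object storage: write/read a named object directly in a RADOS pool.
obj_ioctx = cluster.open_ioctx('objects')
obj_ioctx.write_full('backup-2019-09-01.tar', b'...archive bytes...')
print(obj_ioctx.read('backup-2019-09-01.tar', length=16))
obj_ioctx.close()

# Block storage: create an RBD image and do random-offset I/O, as a VM would.
blk_ioctx = cluster.open_ioctx('rbd')
rbd.RBD().create(blk_ioctx, 'vm-disk-0', 10 * 1024**3)   # 10 GiB volume
image = rbd.Image(blk_ioctx, 'vm-disk-0')
try:
    image.write(b'\x00' * 4096, 8192)   # 4 KiB write at byte offset 8192
    data = image.read(8192, 4096)       # read the same block back
finally:
    image.close()
blk_ioctx.close()
cluster.shutdown()
```

In practice, object workloads usually arrive through the RADOS Gateway's S3/Swift APIs and block workloads through QEMU/librbd or the kernel RBD client; the bindings are used here only to show the difference in access granularity.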

Supermicro Ceph OSD Ready All-Flash NVMe Reference Architecture

Planning Consideration

- Number of Ceph Storage Nodes: At least three storage nodes must be present in a Ceph cluster to become eligible for Red Hat technical support. Ten storage nodes are the recommended scale for an enterprise Ceph cluster. Four storage nodes represent a valid building block to use for scaling up to larger deployments. This RA uses four storage nodes.
- Number of Ceph Monitor Nodes: At least three monitor nodes should be configured on separate hardware. These nodes do not require high-performance CPUs, but they do benefit from having SSDs to store the monitor map data. One monitor node is the minimum; three or more monitor nodes are typically used in production deployments.
- Replication Factor: NVMe SSDs have high reliability, with a high MTBF and a low bit error rate. 2x replication is recommended in production when deploying OSDs on NVMe, versus the 3x replication common with legacy storage (a quick usable-capacity comparison follows the configuration considerations below).

Supermicro NVMe Reference Architecture Design

- Data Switches: 2x Supermicro SSE-C3632SR, 32x 100GbE ports
- Monitor Nodes: 3x Supermicro SYS-5019P series
- Storage Nodes: 4x Supermicro SYS-1029U-TN10RT

Configuration Consideration

- CPU Sizing: Ceph OSD processes can consume large amounts of CPU while doing small block operations. Consequently, a higher CPU core count generally results in higher performance for I/O-intensive workloads. For throughput-intensive workloads characterized by large sequential I/O, Ceph performance is more likely to be bound by the maximum network bandwidth of the cluster.
- Ceph Configuration Tuning: Tuning Ceph for NVMe devices can be complex. The ceph.conf settings used in this RA are optimized for small block random performance.
- Networking: A 25 GbE network is required to leverage the maximum block performance benefits of an NVMe-based Ceph cluster. For throughput-intensive workloads, 50 GbE to 100 GbE is recommended.
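As a quick illustration of the replication trade-off noted above, the back-of-the-envelope calculation below compares usable capacity for this RA's four storage nodes under 2x and 3x replication. It ignores BlueStore overhead and the cluster's full ratios, so real-world usable capacity will be somewhat lower.

```python
# Usable-capacity sketch for 4 nodes x 10 x 12.8 TB Micron 9300 MAX drives.
nodes = 4
drives_per_node = 10
drive_tb = 12.8

raw_tb = nodes * drives_per_node * drive_tb        # 512 TB of raw flash
for replicas in (2, 3):
    usable_tb = raw_tb / replicas
    print(f"{replicas}x replication: {usable_tb:6.1f} TB usable of {raw_tb:.0f} TB raw")

# 2x replication:  256.0 TB usable of 512 TB raw
# 3x replication:  170.7 TB usable of 512 TB raw
```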

Ultra All-Flash NVMe OSD Ready System

Supermicro's Ultra SuperServer family includes 10- and 20-drive all-flash NVMe systems in a 1U form factor, and 24- and 48-drive all-flash NVMe systems in a 2U form factor, architected to offer unrivaled performance, flexibility, scalability and serviceability, and therefore ideally suited to demanding enterprise-sized data processing workloads.

(Pictured: 1U 10 NVMe, 1U 20 NVMe, 2U 24 NVMe, and 2U 48 NVMe systems.)

For this reference architecture, we utilized four highly flexible 10-drive systems as the Ceph storage nodes, with the hardware configurations shown in the table below.

Table 1. Hardware Configurations
Component | Recommendation
Server Model | Supermicro Ultra SuperServer 1029U-TN10RT
Processor | 2x Intel Xeon Platinum 8168; 24 cores, 48 threads, up to 3.7 GHz
Memory | 12x Micron 32 GB DDR4-2666 DIMMs, 384 GB total per node
NVMe Storage | 10x Micron 9300 MAX NVMe 12.8 TB SSDs
SATA Storage | Micron 5100 SATA SSD
Networking | 2x Mellanox ConnectX-5 100 GbE dual-port
Power Supplies | Redundant 1000W Titanium Level digital power supplies

(System diagram callouts: 2 PCI-E 3.0 x16 slots, redundant power supplies, 2 SATA DOM ports, 24 DDR4 DIMM slots, dual Intel Xeon Scalable processors up to 205W, 8 system fans, 10 hot-swap NVMe/SATA3 drive bays.)
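The per-node resources in Table 1 add up quickly at cluster scale. The small sketch below totals CPU and memory across the four storage nodes; these figures come up again later, when the 4KB results prove to be CPU-bound and when the 15TB test dataset is sized at roughly ten times the cluster's DRAM.

```python
# Cluster-wide CPU and memory totals for the four 1029U-TN10RT storage nodes.
nodes = 4
sockets_per_node = 2
cores_per_socket = 24          # Intel Xeon Platinum 8168
dram_gb_per_node = 12 * 32     # 12x 32 GB DDR4-2666

physical_cores = nodes * sockets_per_node * cores_per_socket
hw_threads = physical_cores * 2
dram_tb = nodes * dram_gb_per_node / 1000

print(f"{physical_cores} physical cores / {hw_threads} hardware threads")  # 192 / 384
print(f"{dram_tb:.3f} TB DRAM cluster-wide")                               # 1.536 TB
```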

Micron 9300 MAX NVMe SSDs

The Micron 9300 series of NVMe SSDs is Micron's flagship performance family with the third-generation NVMe SSD controller. The 9300 family has the right capacity for demanding workloads, with capacities from 3.2TB to 15.36TB in mixed-use and read-intensive designs.

Table 2. Micron 9300 MAX NVMe Specifications
Model | Micron 9300 MAX
Interface | PCI-E 3.0 x4
Form Factor | U.2 SFF
Capacity | 12.8 TB
Sequential Read | 3.5 GB/s
Sequential Write | 3.5 GB/s
Random Read | 850,000 IOPS
Random Write | 310,000 IOPS
MTTF | 2M device hours
Endurance | 144.8 PB

Note: GB/s measured using 128K transfers, IOPS measured using 4K transfers. All data is steady state. Complete MTTF details are available in Micron's product datasheet.

Ceph Storage Cluster - Software

Red Hat Ceph Storage 3.2: Red Hat collaborates with the global open source Ceph community to develop new Ceph features, then packages changes into predictable, stable, enterprise-quality releases. Red Hat Ceph Storage 3.2 is based on the Ceph community 'Luminous' version 12.2.1, to which Red Hat was a leading code contributor.

As a self-healing, self-managing, unified storage platform with no single point of failure, Red Hat Ceph Storage decouples software from hardware to support block, object, and file storage on standard servers, HDDs and SSDs, significantly lowering the cost of storing enterprise data. Red Hat Ceph Storage is tightly integrated with OpenStack services, including Nova, Cinder, Manila, Glance, Keystone, and Swift, and it offers user-driven storage lifecycle management.

Table 3. Ceph Storage and Monitor Nodes: Software
Operating System | Red Hat Enterprise Linux 7.6
Storage | Red Hat Ceph Storage 3.2 (Luminous 12.2.1)
NIC Driver | Mellanox OFED Driver 4.5-1.0.0

Table 4. Ceph Load Generation Nodes: Software
Operating System | Red Hat Enterprise Linux 7.6
Storage | Red Hat Ceph Storage 3.2 (Luminous 12.2.1)
Benchmark | FIO 3.1 with librbd enabled
NIC Driver | Mellanox OFED Driver 4.5-1.0.0
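With the software stack above installed, a quick way to confirm that a load-generation node can reach the cluster is the python-rados bindings that ship with Ceph. This is a generic sketch rather than a step from the RA, and it assumes the default ceph.conf and client.admin keyring paths.

```python
# Minimal cluster reachability/capacity check with python-rados.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                      conf=dict(keyring='/etc/ceph/ceph.client.admin.keyring'))
cluster.connect()

stats = cluster.get_cluster_stats()            # sizes reported in kB
print(f"Raw capacity: {stats['kb'] / 1e9:.1f} TB")
print(f"Used:         {stats['kb_used'] / 1e9:.1f} TB")
print(f"Objects:      {stats['num_objects']}")
print("Pools:", cluster.list_pools())

cluster.shutdown()
```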

Performance Testing

Establishing Baseline Performance

Local NVMe performance was measured for both 4KB blocks and 4MB objects for reference.

4KB Block: Each storage node was tested using FIO across all 10 9300 MAX 12.8TB NVMe SSDs. 4KB random writes were measured with 8 jobs at a queue depth of 10. 4KB random reads were measured with 50 jobs at a queue depth of 10.

Table 5. 4KB Random Workloads: FIO on 10x 9300 MAX NVMe SSDs
Storage Node | Write IOPS | Write Avg. Latency (ms) | Read IOPS | Read Avg. Latency (ms)
Node 1 | 6.47M | 0.12 | 5.95M | 0.13
Node 2 | 6.47M | 0.11 | 5.93M | 0.13
Node 3 | 6.47M | 0.12 | 5.95M | 0.13
Node 4 | 6.47M | 0.12 | 5.93M | 0.13

4MB Object: 4MB writes were measured with 8 jobs at a queue depth of 8. 4MB reads were measured with 8 jobs at a queue depth of 8.

Table 6. 4MB Workloads: FIO on 10x 9300 MAX NVMe SSDs
Storage Node | Write Throughput | Write Avg. Latency (ms) | Read Throughput | Read Avg. Latency (ms)
Node 1 | 32.7 GB/s | 0.31 | 32.95 GB/s | 0.31
Node 2 | 32.66 GB/s | 0.31 | 32.95 GB/s | 0.31
Node 3 | 32.69 GB/s | 0.31 | 32.95 GB/s | 0.31
Node 4 | 32.67 GB/s | 0.31 | 32.95 GB/s | 0.31

Testing Methodology

4KB Random Workloads: FIO RBD. 4KB random workloads were tested using the FIO synthetic IO generation tool and the Ceph RADOS Block Device (RBD) driver. 100 RBD images were created at the start of testing. When testing on a 2x replicated pool, the RBD images were 75GB each (7.5TB of data); on a 2x replicated pool, that equals 15TB of total data stored. When testing on a 3x replicated pool, the RBD images were 50GB each (5TB of data); on a 3x pool, that also equals 15TB of total data stored. The four storage nodes have a combined total of 1.5TB of DRAM, which is 10% of the dataset size.

4MB Object Workloads: RADOS Bench. RADOS Bench is a built-in tool for measuring object performance. It represents the best-case object performance scenario of data coming directly to Ceph from a RADOS Gateway node. Object workload tests were run for 10 minutes, three times each. Linux file system caches were cleared between each test. The results reported are the averages across all test runs.

4K Random Workload FIO Test Results

4KB random workloads were tested using the FIO synthetic IO generation tool and the Ceph RADOS Block Device (RBD) driver.

4K Random Write Workload Analysis

4KB random writes reached a maximum of 477K IOPS at 100 clients. Average latency ramped up linearly with the number of clients, reaching a maximum of 6.71ms at 100 clients. The figures show that IOPS increased rapidly, then flattened out at 50-60 clients. At this point, Ceph is CPU-limited. 99.99% tail latency increased linearly up to 70 clients, then spiked upward, going from 59ms at 70 clients to 98.45ms at 80 clients.

Table 7. 4KB Random Write Results
FIO Clients | 4KB Random Write IOPS | Avg. Latency (ms) | 99% Latency (ms) | 99.99% Latency (ms) | Avg. CPU Util.
10 Clients | 294,714 | 1.08 | 1.48 | 22.97 | 58.8%
20 Clients | 369,092 | 1.73 | 2.60 | 34.75 | 73.6%
30 Clients | 405,353 | 2.36 | 4.09 | 40.03 | 79.6%
40 Clients | 426,876 | 3.00 | 6.15 | 44.84 | 82.8%
50 Clients | 441,391 | 3.62 | 8.40 | 50.31 | 84.7%
60 Clients | 451,308 | 4.25 | 10.61 | 55.00 | 86.1%
70 Clients | 458,809 | 4.88 | 12.63 | 59.38 | 86.8%
80 Clients | 464,905 | 5.51 | 14.46 | 98.45 | 87.6%
90 Clients | 471,696 | 6.11 | 16.21 | 93.93 | 88.3%
100 Clients | 477,029 | 6.71 | 17.98 | 70.40 | 89.3%
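The white paper does not state the per-client queue depth used for this client-scaling sweep, but it can be inferred from Table 7 with Little's Law (I/Os in flight = IOPS x average latency). The sketch below applies it to a few rows; roughly 3,200 I/Os in flight at 100 clients points to about 32 per client, and the flat IOPS beyond 50-60 clients with steadily rising latency is the classic signature of the CPU bottleneck described above.

```python
# Little's Law check on Table 7: I/Os in flight = IOPS x average latency.
rows = [
    # (FIO clients, IOPS, average latency in ms)
    (10, 294_714, 1.08),
    (50, 441_391, 3.62),
    (100, 477_029, 6.71),
]
for clients, iops, avg_ms in rows:
    in_flight = iops * avg_ms / 1000.0
    print(f"{clients:3d} clients: ~{in_flight:5.0f} I/Os in flight "
          f"(~{in_flight / clients:.0f} per client)")
# 10 clients: ~318, 50 clients: ~1598, 100 clients: ~3201 --
# i.e. roughly queue depth 32 per FIO client throughout the sweep.
```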

4K Random Read Workload Analysis

4KB random reads scaled from 302K IOPS up to 2 million IOPS. Ceph approached maximum CPU utilization at a queue depth of 16. Average latency increased with queue depth, from 0.33ms at queue depth 1 to 1.52ms at queue depth 32. Tail latency spiked from 35.4ms at queue depth 16 to 240ms at queue depth 32, a result of CPU saturation above queue depth 16.

Table 8. 4KB Random Read Results
Queue Depth | 4KB Random Read IOPS | Avg. Latency (ms) | 99% Latency (ms) | 99.99% Latency (ms) | Avg. CPU Util.
QD 1 | 302,598 | 0.33 | 2.00 | 3.30 | 14.1%
QD 2 | 1,044,447 | 0.38 | 2.52 | 6.92 | 49.9%
QD 8 | 1,499,703 | 0.53 | 3.96 | 17.10 | 75.3%
QD 16 | 1,898,738 | 0.84 | 8.87 | 35.41 | 89.3%
QD 32 | 2,099,444 | 1.52 | 20.80 | 240.86 | 93.2%
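A compact way to see the saturation knee in Table 8 is the ratio of 99.99% tail latency to average latency: it stays modest until the OSD CPUs saturate, then jumps sharply at queue depth 32. A quick calculation from the table values:

```python
# Tail-to-average latency ratio from Table 8 (4KB random reads).
table8 = {
    # queue depth: (average latency ms, 99.99% latency ms)
    1:  (0.33, 3.30),
    8:  (0.53, 17.10),
    16: (0.84, 35.41),
    32: (1.52, 240.86),
}
for qd, (avg_ms, tail_ms) in table8.items():
    print(f"QD {qd:2d}: 99.99% / avg = {tail_ms / avg_ms:5.1f}x")
# QD 1: 10.0x, QD 8: 32.3x, QD 16: 42.2x, QD 32: 158.5x -- the jump at QD 32
# reflects request queueing once the OSD CPUs are saturated.
```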

4K Random 70 Read/30 Write Workload Analysis

70% read/30% write testing scaled from 207K IOPS at a queue depth of 1 to 904K IOPS at a queue depth of 32. Read and write average latencies are graphed separately, with maximum average read latency at 2.11ms and maximum average write latency at 6.85ms. Tail latency spiked dramatically above queue depth 16, as the 70/30 R/W test hit CPU saturation. Above queue depth 16, there was a small increase in IOPS and a large increase in latency.

Table 9. 4KB 70/30 Random Read/Write Results
Queue Depth | R/W IOPS | Avg. Write Latency (ms) | Avg. Read Latency (ms) | 99.99% Write Latency (ms) | 99.99% Read Latency (ms) | Avg. CPU Util.
QD 1 | 207,703 | 0.72 | 0.37 | 19.83 | 12.45 | 19.38%
QD 2 | 556,369 | 1.23 | 0.49 | 42.86 | 22.91 | 61.00%
QD 8 | 702,907 | 2.12 | 0.71 | 60.13 | 27.77 | 77.92%
QD 16 | 831,611 | 3.77 | 1.13 | 72.67 | 36.56 | 88.86%
QD 32 | 903,866 | 6.85 | 2.11 | 261.93 | 257.25 | 92.42%
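Because this RA uses a 2x replicated pool, every client write in the 70/30 mix becomes two backend writes across the OSDs. The rough breakdown below ignores BlueStore metadata and WAL traffic (so it understates the true drive load), but it still shows that the 40 NVMe SSDs are far from their baseline limits in Table 5; the 92% CPU utilization, not the drives, is the bottleneck.

```python
# Approximate backend I/O behind the 70/30 result at QD 32 (Table 9),
# assuming the 2x replicated pool and ignoring BlueStore metadata/WAL writes.
client_iops = 903_866
read_ratio, write_ratio = 0.70, 0.30
replicas = 2

client_reads = client_iops * read_ratio        # ~633K reads/s, served once
client_writes = client_iops * write_ratio      # ~271K writes/s from clients
backend_writes = client_writes * replicas      # each write lands on 2 OSDs

print(f"Client reads:   {client_reads:,.0f} IOPS")
print(f"Client writes:  {client_writes:,.0f} IOPS")
print(f"Backend writes: {backend_writes:,.0f} IOPS across 40 NVMe SSDs")
# Table 5 shows ~6.47M random write IOPS per 10-drive node at the device
# level, so the drives themselves have ample headroom.
```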

4M Object Workloads Test Result

Object workloads were tested using RADOS Bench, a built-in Ceph benchmarking tool. These results represent the best-case scenario of Ceph accepting data from RADOS Gateway nodes. The configuration of RADOS Gateway is out of scope for this RA.

4MB Object Write Workload Analysis

Writes were measured by using a constant 16 threads in RADOS Bench and scaling up the number of clients writing to Ceph concurrently. Reads were measured by first writing out 7.5TB of object data with 10 clients, then reading back that data on all 10 clients while scaling up the number of threads used.

Table 10. 4MB Object Write Results
Clients @ 16 Threads | Write Throughput | Average Latency (ms)
2 Clients | 12.71 GB/s | 11.22
4 Clients | 20.12 GB/s | 13.34
6 Clients | 23.06 GB/s | 17.46
8 Clients | 24.23 GB/s | 22.15
10 Clients | 24.95 GB/s | 26.90
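RADOS Bench keeps a configurable number of object operations in flight from each client (16 threads per client in these tests). The fragment below is a deliberately simplified, single-threaded analogue using python-rados, shown only to illustrate the 4MB write pattern RADOS Bench generates; the pool name and runtime are placeholders, and actual measurements should use the rados bench tool itself.

```python
# Simplified stand-in for RADOS Bench: write 4 MiB objects for a fixed time
# and report throughput. Single-threaded, so absolute numbers will be far
# below what 10 clients x 16 threads achieve in the tables above.
import time
import rados

POOL = 'benchpool'                     # placeholder pool name
OBJ_SIZE = 4 * 1024 * 1024
RUNTIME_S = 30
payload = b'\xab' * OBJ_SIZE

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx(POOL)

count, start = 0, time.time()
while time.time() - start < RUNTIME_S:
    ioctx.write_full(f'bench-obj-{count}', payload)
    count += 1
elapsed = time.time() - start

print(f"{count} objects in {elapsed:.1f}s: "
      f"{count * OBJ_SIZE / elapsed / 1e9:.2f} GB/s")
ioctx.close()
cluster.shutdown()
```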

4MB Object Read Workload Analysis

4MB object reads reached their maximum of 46.9 GB/s, at an average latency of 12.75ms, with 16 threads per client. CPU utilization was low for this test and never exceeded a 50% average.

Table 11. 4MB Object Read Results
10 Clients @ Varied Threads | Read Throughput | Average Latency (ms)
4 Threads | 38.74 GB/s | 8.38
8 Threads | 46.51 GB/s | 11.99
16 Threads | 46.86 GB/s | 12.75
32 Threads | 46.65 GB/s | 12.77
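With average CPU utilization under 50%, one plausible reading is that the object read plateau sits closer to the network than to the drives. The white paper does not detail the public/cluster network split, so the arithmetic below is only a rough cross-check of where Table 11's peak throughput lands per storage node and per client.

```python
# Distribute Table 11's peak object read throughput across nodes and clients.
peak_read_gbs = 46.86            # GB/s at 10 clients x 16 threads
storage_nodes = 4
clients = 10

per_node = peak_read_gbs / storage_nodes
per_client = peak_read_gbs / clients

print(f"Per storage node: {per_node:.1f} GB/s (~{per_node * 8:.0f} Gb/s)")
print(f"Per client:       {per_client:.1f} GB/s (~{per_client * 8:.0f} Gb/s)")
# ~11.7 GB/s (~94 Gb/s) served per storage node and ~4.7 GB/s per client,
# i.e. each storage node pushes close to a single 100 GbE link's worth of
# reads while its local drives (Table 6: ~33 GB/s) remain lightly loaded.
```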

Summary

NVMe SSD technologies can significantly improve Ceph performance and responsiveness, especially for Ceph block storage deployments where massive amounts of small-block random access are required.

Supermicro has the most optimized NVMe-enabled server and storage systems in the industry across different form factors and price points. Please visit www.supermicro.com/nvme to learn more.

About Super Micro Computer, Inc.

Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and Embedded Systems worldwide. Supermicro is committed to protecting the environment through its "We Keep IT Green" initiative and provides customers with the most energy-efficient, environmentally friendly solutions available on the market.

Learn more at www.supermicro.com

About Micron Technology, Inc.

We are an industry leader in innovative memory and storage solutions. Through our global brands — Micron, Crucial and Ballistix — our broad portfolio of high-performance memory and storage technologies, including DRAM, NAND, NOR Flash and 3D XPoint memory, is transforming how the world uses information to enrich life. Backed by 40 years of technology leadership, our memory and storage solutions enable disruptive trends, including artificial intelligence, machine learning, and autonomous vehicles, in key market segments like cloud, data center, networking, mobile and automotive. Our common stock is traded on the NASDAQ under the MU symbol. To learn more about Micron Technology, Inc., visit www.micron.com.

No part of this document covered by copyright may be reproduced in any form or by any means — graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system — without prior written permission of the copyright owner. Configuration and performance analysis done by Micron Technology, Inc. Results published with permission.

Supermicro, the Supermicro logo, Building Block Solutions, We Keep IT Green, SuperServer, Twin, BigTwin, TwinPro, TwinPro², SuperDoctor are trademarks and/or registered trademarks of Super Micro Computer, Inc. Intel, the Intel logo, and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. All other brand names and trademarks are the property of their respective owners.

Copyright Super Micro Computer, Inc. All rights reserved. Printed in USA. Please Recycle. Ceph-Ultra 190912 Rev2

