Centralized Vs. Distributed - SNIA


Centralized vs. Distributed
A Great Storage Debate
Live Webcast
September 11, 2018
10:00 am PT

Today’s Presenters
J Metz, SNIA Board of Directors, Cisco
John Kim, SNIA ESF Chair, Mellanox
Alex McDonald, SNIA ESF Vice Chair, NetApp
© 2018 Storage Networking Industry Association. All Rights Reserved.

SNIA-At-A-Glance

SNIA Legal Notice
The material contained in this presentation is copyrighted by the SNIA unless otherwise noted.
Member companies and individual members may use this material in presentations and literature under the following conditions:
  Any slide or slides used must be reproduced in their entirety without modification
  The SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations.
This presentation is a project of the SNIA.
Neither the author nor the presenter is an attorney and nothing in this presentation is intended to be, or should be construed as legal advice or an opinion of counsel. If you need legal advice or a legal opinion please contact your attorney.
The information presented herein represents the author's personal opinion and current understanding of the relevant issues involved. The author, the presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.

Today’s Agenda
The Rules of the Debate
The “Whats and Hows”
Centralized Storage
Distributed Storage
The Trade-Off Debate

The Rules of The Debate
No hitting below the belt
Spoiler Alert: There is no “winner”
This is all about the “sweet spot”
Participants:
  Define the technologies
  How they work
  Discuss the trade-offs

Storage Has One Job!
One main job:
  “Give me back the correct bit I asked you to hold for me.”
Everything we do in storage (including storage networking) is based around completing that job safely, securely, reliably, and without error
You had one job!
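The "one job" above can be illustrated with a minimal sketch: keep a checksum next to each stored value and verify it on every read, so that a damaged bit is reported instead of being returned. The class name `CheckedStore` and its methods are hypothetical, invented for this illustration; real storage systems use their own integrity mechanisms.

```python
import hashlib


class CheckedStore:
    """Toy in-memory store (hypothetical) that keeps a SHA-256 digest beside each value."""

    def __init__(self):
        self._blobs = {}  # key -> (data, hex digest recorded at write time)

    def put(self, key: str, data: bytes) -> None:
        # Record the digest at write time so later reads can be verified.
        self._blobs[key] = (bytes(data), hashlib.sha256(data).hexdigest())

    def get(self, key: str) -> bytes:
        data, digest = self._blobs[key]
        # Refuse to hand back data that no longer matches what was stored.
        if hashlib.sha256(data).hexdigest() != digest:
            raise IOError(f"checksum mismatch for {key!r}")
        return data

    def corrupt(self, key: str) -> None:
        """Flip one bit of non-empty data to simulate silent corruption (demo only)."""
        data, digest = self._blobs[key]
        self._blobs[key] = (bytes([data[0] ^ 0x01]) + data[1:], digest)
```

A read after `corrupt()` raises an error rather than returning the wrong bit, which is the behaviour the slide asks of any storage system.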

All Storage
Needs to:
  Protect data
  Keep data secure
  Stay within regulatory compliance
  Be manageable
  Be backed up!
May need to:
  Be scalable
  Be sharable
  Be very fast

Criteria for Choosing
Items to consider in choice of storage
  Access: what protocols can I use?
  Performance: will my applications & hence users be happy?
  Availability: can I tolerate periods without access?
  Capacity: how big do I need?
  Protection: how do I ensure my data’s integrity?
  Durability: how long do I need to store my data?
  Security & Privacy: will sensitive data be OK here?
  Cost: is it cheaper than the alternatives?
Let’s discuss some of these

John Kim
CENTRALIZED STORAGE

Definitions
Direct Attached Storage (DAS)
  Storage directly attached to just one server
Storage Area Network (SAN)
  Centralized block storage system connected to multiple hosts using networks such as Fibre Channel, iSCSI, NVMe-oF, or InfiniBand
Network Attached Storage (NAS)
  Centralized or distributed file storage connected to multiple hosts using file protocols, usually using Ethernet networking
Hyperconverged Infrastructure (HCI)
  Set of servers each with compute and storage resources, often sharing those resources with each other

First There was Local Storage
Local storage for each server
  Inside the server or directly attached to one server (DAS)
Easy to buy, set up, and consume
  Server vendor/integrator can install
  All operating systems/hypervisors can use
  No special drivers or networking required
But inefficient and difficult to manage at scale
  Issues with backup, failover, utilization, sharing

Then Centralized Storage
Consolidate storage into centralized systems
  Each supports multiple servers
  Connect via PCIe, SAS, SAN, NAS or Object
Easier to share and protect data
  Higher utilization
  Easier backup, recovery, failover, sharing

Comparing Storage Models
Internal vs. DAS vs. SAN/NAS
[Diagram: Internal Storage (HDD/SSD inside the server); DAS (server with an HBA using SAS, SATA, FC, or IB, attached to a JBOD / JBOF); SAN / NAS, Centralized Storage (several servers connected through a switch to a storage array)]

Comparing ... ed?
Internal   Inside server            SAS/SATA/PCIe            None      No
DAS        Attached to 1 server     SAS/SATA/PCIe, FC, IB    None      No
SAN        Centralized array        FC, Ethernet, IB         Array     Rarely
NAS        Centralized array(s)     Ethernet                 File      Sometimes
Object     Multiple arrays          Ethernet                 Object    Yes
HCI        In each server, or ...   Ethernet                 depends   Usually

Alex McDonald
DISTRIBUTED STORAGE

Distributed Storage: a definition
Difficult to precisely define
  Data stored on many systems which behave as a single entity
  Geographically or regionally dispersed rather than local to a datacenter
  Accessed over LAN or WAN, commonly Ethernet
  Cloudy-ish; often implemented on shared resources
Well, I give up
  Not centralized or hyperconverged (HCI)
  Scales out (horizontally) rather than up (vertically)

Access to & Performance of Distributed Storage
Network connectivity & performance criteria
  Bandwidth & Latency
    “Bandwidth problems can be cured with money. Latency problems are harder because the speed of light is fixed - you can’t bribe God.”
  Compute location
    Low bandwidth & poor latency tolerable if the compute is next to the data, and we only need to send/receive small amounts
Flash technologies? SSD? NVMe?
  Yes; this isn’t just about cheap spinning disk any more
Protocols; tend to be application driven
  Object type storage (S3, CDMI, Swift)
  LAN/WAN protocols (SMB, NFS)
  Block (iSCSI)
Rule of thumb
  The less “cloudy” or “WANny” the access, the less likely the application will tolerate high latency and/or low bandwidth
[Diagram labels: LATENCY, BANDWIDTH]
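The quote about the speed of light can be made concrete with some rough arithmetic. The sketch below estimates the lowest possible round-trip time over a given distance and shows why extra bandwidth does not help small transfers. It assumes light travels through fiber at about two thirds of its vacuum speed and ignores switching, queuing, and protocol overhead; the function names are made up for this illustration.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3               # assumption: light in fiber moves at roughly 2/3 of c


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


def transfer_time_s(size_bytes: int, bandwidth_bits_per_s: float,
                    distance_km: float, round_trips: int = 1) -> float:
    """Propagation delay for the given number of round trips plus time on the wire."""
    latency_s = round_trips * min_round_trip_ms(distance_km) / 1000
    wire_s = size_bytes * 8 / bandwidth_bits_per_s
    return latency_s + wire_s


# A 4 KiB read over 1000 km: doubling the bandwidth barely changes the total,
# because almost all of the time is propagation delay.
slow = transfer_time_s(4096, 10e9, 1000)
fast = transfer_time_s(4096, 20e9, 1000)
print(f"10 Gb/s: {slow * 1000:.3f} ms, 20 Gb/s: {fast * 1000:.3f} ms")
```

Money buys the second term (bandwidth) but not the first (distance), which is the point of the quote.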

Data Security & Privacy
Security vs. Privacy
  Security is making sure only the right people/systems have access to the data
  Privacy ensures that the data isn’t misused
  Privacy is explored further u/
Security measures
  Identification & authentication systems
    e.g. Kerberos & NFS, LDAP & SMB
  End-to-end encryption (including devices)
  Storing data in the right place & knowing how the data is managed
    Replicas, mirroring, cloud brokering, backups can all be in different places and differently secured

Scaling out rather than up
Capacity can be seen as infinite
  “It’s just a matter of cost”
More capacity tends to exacerbate these issues:
  More cold data
  Higher bandwidth, especially to distributed storage
  Harder to avoid putting compute with the data
  Increased data amnesia
  Harder systems management problems

Protection & Durability
Distributed storage uses a variety of techniques
  Standard RAID technologies
  Mirroring & replication
    2 or 3 location copies
  Erasure coding
For a detailed Q&A on these techniques and an on-demand introductory webcast -no-onespride-was-hurt/
Or CAP
  Consistent, Available, Partitioned; pick 2
[Diagram: CAP triangle with corners A, P, and C, placing data stores by data model (Relational, Key Value, Tabular, Document); names shown include RDBMSs, MySQL, PostGres, Oracle, Aster, Cassandra, SimpleDB, CouchDB, Riak, MongoDB, BerkeleyDB, Terrastore]
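As a rough illustration of the techniques listed above, the sketch below shows single XOR parity, the simplest form of erasure coding (the idea behind RAID-style parity), and compares raw-capacity overhead between keeping full copies and keeping parity fragments. It is a teaching sketch only: it tolerates the loss of one block, whereas production erasure coding commonly uses codes such as Reed-Solomon that tolerate several. All function names are invented for this example.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing block from the survivors plus the parity block."""
    return xor_parity(list(surviving_blocks) + [parity])


def raw_per_usable(data_fragments: int, redundancy_fragments: int) -> float:
    """Raw capacity consumed per unit of usable capacity."""
    return (data_fragments + redundancy_fragments) / data_fragments


blocks = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(blocks)

# Lose the middle block, then rebuild it from the other two and the parity.
recovered = rebuild_missing([blocks[0], blocks[2]], parity)
print(recovered == blocks[1])

# Three full copies (1 original + 2 extra) versus 4 data fragments + 1 parity fragment.
print(raw_per_usable(1, 2), raw_per_usable(4, 1))
```

The trade-off is visible in the last line: replication costs more raw capacity but needs no reconstruction work, while parity-based schemes save capacity at the price of a rebuild when something is lost.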

Future of Distributed Storage
Distributed storage offers new & interesting solutions
New database technologies
  NoSQL, key/value, tabular, document
On-disk compute
  Key/value stores directly on the drive
  Processing on the drive
    Data classification, analysis, automated metadata
Brought together by “consolidating” applications
  IoT (Internet of Things)
  Big data generators
  Data at the edge

Status Check - Midway Summary
Centralized
  More efficient storage utilization
  Simpler storage management
Distributed
  Scales out, not up
  Latency a secondary consideration

Bring It On!
So what are the trade-offs?

Is Data Locality Really Important?
Centralized Storage
  Need servers and storage in same data center
    WAN links too much latency
  Install storage near users (i.e. ROBO, cloud)
  Object and file can support remote access
    But then usually set up as distributed storage

Is Data Locality Really Important?
Distributed Storage
  At scale, data locality hard to achieve
    Data has mass & inertia
    Easiest to process where it’s born, centralize the summaries
    Partial compute at the edge
  New technologies prevent extreme centralization
    IoT, blockchain & distributed ledgers, datatypes like video & image, etc.

How to Scale Centralized?
Performance scaling
  Array performance limits
  Network limits
  May require locality
Capacity Scaling
  Adding more arrays
  Management burden
[Diagram: a remote server reaching the array over WAN/Cloud (WAN bandwidth and/or latency); local servers connected through a switch (SAN / LAN performance limit) to a storage array (array limits); SAN / NAS (Centralized Storage)]

How to Scale Distributed?
Just add more!
Limits of scaling may constrain the solution
  Economics: cost, bandwidth, latency
  Legal: data placement & security
  Technical: bandwidth, latency
Application plays a part
  Not all distributed systems can scale out to infinity
  CAP limitations ensure that

Shared Resources
Centralized
  Arrays not shared
    Network & admins sometimes shared
    Might share management tools
  Different arrays for different workloads
    More flexibility in features
    Extra management headaches

Shared Resources
Distributed
  Data location is a moveable feast
    Backups, mirroring, sharding
  Recovery scenarios can be complex
    Who & what is impacted by failure & restores?
  Fully understand security & privacy
    Authentication & authorization
    Safe Harbor & GDPR important here
  Impacts on performance & capability
    “Noisy neighbors”

Installation, Configuration, Management
Centralized Storage
  Complex to deploy, manage
    Need reliable network
    Might need special drivers
    Array/network mgmt. skills
    Security
  Challenges at large scale
    Managing many arrays
    Balancing capacity & workloads
    May be difficult to automate
[Diagram labels, partly garbled in extraction: drivers, SAN, storage array, array expertise]

Installation, Configuration, Management
Distributed Storage
A range of tools
  Installation & sizing tools
    Capacity, performance, application usage, user usage, chargeback & showback
  OpenStack, Docker, Kubernetes
    Offer management consoles & dashboards
  Software defined configurations
    Compute, network & storage virtualization on one pane of glass
  New DevOps tools “understand” applications
    Ansible, Chef, Puppet
Issues:
  Data amnesia; forgetting what was put where is a big issue
  Data migration from system to system can be a challenge
  Data can suffer from “container lock in”
  Many dashboards are product specific & can be incompatible with each other
  Too much choice in DevOps tools?

What’s the Cost/Economic Profile?
Centralized
  Usually custom (bespoke) hardware
    Dedicated storage platforms
    Often uses dedicated network
    Less likely to be SDS or cloud
  More likely to be Cap/Ex
    Op/Ex model available through leasing, cloud

What’s the Cost/Economic Profile?
Distributed
  Cap/Ex or Op/Ex? - “It’s the economy, stupid!”
    Cost is a big factor
  Consider a longer term cost profile
    Largely due to scale
    Future unknown, but historical cost per byte has fallen pretty consistently
  For applications to be of value, their cost components have to be manageable and smaller than the benefits
  Pressure of
    Systems management costs
    New application models (like container, serverless)

Backup and Data Protection
Centralized
  Easy to back up, fast restores
    A big reason to go centralized
  Usually includes RAID, snapshots, clones
  Replication and remote backup options
    To local system, remote system, or the cloud
Snapshots: point-in-time copies of your data
Storage clones: start identical, change over time
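The difference between the two definitions above can be sketched with a toy block volume: a snapshot is a frozen, read-only view of the block map at one moment, while a clone starts from the same blocks and is then written to independently. The `Volume` class is hypothetical and copies only the block map, not the block contents; real arrays implement snapshots and clones in their own ways.

```python
import types


class Volume:
    """Toy block volume (hypothetical): maps block numbers to immutable byte strings."""

    def __init__(self, blocks=None):
        self._blocks = dict(blocks or {})

    def write(self, block_no: int, data: bytes) -> None:
        self._blocks[block_no] = data

    def read(self, block_no: int):
        return self._blocks.get(block_no)

    def snapshot(self):
        """Point-in-time copy: a read-only view of the block map as it is right now."""
        return types.MappingProxyType(dict(self._blocks))

    def clone(self) -> "Volume":
        """Writable copy: starts identical, then diverges as either side is written."""
        return Volume(self._blocks)


vol = Volume()
vol.write(0, b"v1")
snap = vol.snapshot()   # frozen at this moment
twin = vol.clone()      # identical for now

vol.write(0, b"v2")     # the original moves on
twin.write(1, b"new")   # the clone changes independently

print(snap[0], vol.read(0), twin.read(0), twin.read(1))
```

Restoring from the snapshot means reading back the old block map; the clone is a separate volume with its own history.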

Backup and Data Protection
Distributed
  Backup can be harder
    Backup implies a complete redundant copy
    Remember CAP & eventually consistent
  Durability
    Not all data needs to be durable
    But when it must be, avoiding “bit rot” & “device obsolescence” requires data to be moved
    Long term data retention especially an issue
Register for: “The 100 Year Archive Survey Results”
October 10, 2018
https://www.brighttalk.com/webcast/663/335255

Debate Summary
Centralized makes each array the center of attention
  Each array handles backup, security, management
  At scale, requires lots of attention, management
Distributed spreads performance and capacity across multiple systems
  Easy scalability, often lower costs
  Security and backup can be more complex
Both ways have advantages

More Webcasts
Other Great Storage Debates
  FCoE vs. iSCSI vs. ...
  Fibre Channel vs. ...
  File vs. Block vs. Object
  RoCE vs. iWARP
“Everything You Wanted To Know About Storage But Were Too Proud To Ask” ...bcasts-topics

After This Webcast
Please rate this webcast and provide us with feedback
This webcast and a PDF of the slides will be posted to the SNIA Ethernet Storage Forum (ESF) website and available on-demand at www.snia.org/forums/esf/knowledge/webcasts
A full Q&A from this webcast, including answers to questions we couldn't get to today, will be posted to the SNIA-ESF blog: sniaesfblog.org
Follow us on Twitter @SNIAESF

Thank You!
