WHAT’S COMING IN CEPH OCTOPUS

WHAT’S COMING IN CEPH OCTOPUS
Douglas Fuller
SC19 Ceph BoF
2019.11.19

CEPH IS A UNIFIED STORAGE SYSTEM
- OBJECT: RGW (S3 and Swift object storage)
- BLOCK: RBD (virtual block device)
- FILE: CEPHFS (distributed network file system)
- LIBRADOS: low-level storage API
- RADOS: reliable, elastic, distributed storage layer with replication and erasure coding

ORCHESTRATOR API
- ceph-mgr: orchestrator API
  - Create, destroy, start, stop daemons (ceph-mon, ceph-mgr, ceph-osd, ceph-mds, ...)
  - Blink disk lights
  - Expose provisioning functions to CLI, GUI (dashboard)
  - mgr API to interface with the deployment tool (Rook, or ssh backend: "run this command on that host")
- Replace ceph-deploy with the ssh backend
  - Bootstrap: create mon and mgr on the local host
  - "Day 2" operations to provision the rest of the cluster
  - Provision containers exclusively
  - Automated upgrades
  - Pave the way for a cleanup of docs.ceph.com
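As a rough illustration of the new workflow, here is a minimal shell sketch of bootstrapping and "Day 2" provisioning with the ssh/cephadm backend; the monitor IP, hostnames, and target version are placeholders.

    # Bootstrap a minimal cluster (one mon + one mgr) on the local host.
    cephadm bootstrap --mon-ip 192.168.0.10

    # "Day 2": add more hosts and let the orchestrator provision daemons on them.
    ceph orch host add host2
    ceph orch host add host3

    # Inspect devices across the cluster and turn them into OSDs.
    ceph orch device ls
    ceph orch apply osd --all-available-devices

    # List managed daemons and drive an automated, containerized upgrade.
    ceph orch ps
    ceph orch upgrade start --ceph-version 15.2.0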

DASHBOARD
- Integration with orchestrator: initial parts done; adding OSDs next
- Sidebar displaying notifications and progress bars
- CephFS management features: snapshots and quotas
- RGW multisite management
- Password improvements: strength indicator; enforce a change after first login

MISC
- Improvements to progress bars (ceph -s output)
  - More RADOS events/processes supported (mark out, rebalancing)
  - Time estimates
- Health alert muting
  - TTL on mutes
  - Auto-unmute when an alert changes or increases in severity
- PG autoscaler on by default; balancer on by default
  - Hands-off defaults
- 'ceph tell' and 'ceph daemon' unification
  - Same expanded command set via either interface (over the wire or the local unix socket)
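A hedged sketch of the operator-facing side of these changes; the health code, pool name, and daemon id below are placeholders.

    # Mute a health alert for four hours; mutes expire (TTL) and auto-unmute
    # if the alert changes or grows in severity.
    ceph health mute OSD_DOWN 4h
    ceph health unmute OSD_DOWN

    # Autoscaler and balancer are on by default in Octopus; shown explicitly here.
    ceph osd pool set mypool pg_autoscale_mode on
    ceph balancer on

    # The same expanded command set is available over the wire ('tell') and via
    # the local admin socket ('daemon').
    ceph tell osd.0 config show
    ceph daemon osd.0 config show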

TELEMETRY AND CRASH REPORTS
- Opt-in
  - Dashboard nag to enable?
  - Require re-opt-in if the telemetry content is expanded
  - Explicitly acknowledge the data sharing license
- Telemetry channels
  - basic: cluster size, version, etc.
  - ident: contact info (off by default)
  - crash: anonymized crash metadata
  - device: device health (SMART) data
- Backend tools to summarize, query, and browse telemetry data
  - Initial focus on crash reports
  - Identify crash signatures by stack trace (or other key properties)
  - Correlate crashes with Ceph version or other properties
- Improved device failure prediction model
  - Predict error rate instead of binary failed/not-failed or life expectancy
  - Evaluating the value of some vendor-specific data
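The commands below sketch how the opt-in, the channels, and crash metadata look from the CLI; the crash id is a placeholder.

    # Preview exactly what would be reported before opting in.
    ceph telemetry show

    # Opt in, explicitly acknowledging the data sharing license.
    ceph telemetry on --license sharing-1-0

    # Enable or disable individual channels (ident is off by default).
    ceph config set mgr mgr/telemetry/channel_ident false
    ceph config set mgr mgr/telemetry/channel_crash true

    # Anonymized crash metadata is gathered locally by the crash module.
    ceph crash ls
    ceph crash info <crash-id>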

RADOS QoS PROGRESS
- Partially implemented dmclock-based quality of service in OSD/librados
- Blocked because a deep queue in BlueStore obscures scheduling decisions
- Goal: understand, tune, and (hopefully) autotune the BlueStore queue depth
  - Depends on device type (HDD, SSD, hybrid) and workload (IO size, type)
- Current status:
  - No luck yet with autotuning, but we have a semi-repeatable process to manually calibrate to a particular device
  - Pivot to rebasing the dmclock patches and evaluating their effectiveness for background vs. client and client vs. client traffic
- [Diagram: client IO enters the OSD priority/QoS queue, then BlueStore's ordered transaction queue; replication traffic fans out to the replicas]
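One set of knobs relevant to this kind of tuning is BlueStore's transaction throttle; a hedged sketch follows, where the option names are existing BlueStore settings but the values are illustrative placeholders that would need to be calibrated per device.

    # Shrink BlueStore's internal throttle so its queue stays shallow and
    # scheduling decisions stay in the OSD priority/QoS queue
    # (values are placeholders; calibrate per device type and workload).
    ceph config set osd bluestore_throttle_bytes 16777216
    ceph config set osd bluestore_throttle_deferred_bytes 33554432

    # Confirm what a running OSD actually picked up.
    ceph config show osd.0 | grep bluestore_throttle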

CEPHFS ASYNC CREATE, UNLINK
- Each CephFS metadata operation is a round trip to the MDS
  - untar, rm -r, etc. tend to be dominated by client/MDS network latency
- CephFS already aggressively leases/delegates state (capabilities) to clients
- Allow async creates: the Linux client can return immediately and queue the operation with the MDS asynchronously; same for unlink
  - tar xf, rm -r, etc. become much faster!
- Except it's complex: must preserve the current ordering between requests, locks in the MDS, and client capabilities
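On the client side this is largely transparent, but on sufficiently new Linux kernels the async behavior is gated behind a mount option; the option name (nowsync) and the paths below are assumptions for illustration only.

    # Assumed mount option enabling async directory operations on recent kernels.
    mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret,nowsync

    # Metadata-heavy workloads like these are what the optimization targets.
    time tar xf linux.tar -C /mnt/cephfs/src
    time rm -rf /mnt/cephfs/src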

PROJECT CRIMSON
- Why
  - Not just about how many IOPS we do; more about IOPS per CPU core
  - Current Ceph is based on a traditional multi-threaded programming model
  - Context switching is too expensive when storage is almost as fast as memory
  - New hardware devices are coming: DIMM form-factor persistent memory, ZNS (zone-based) SSDs
- What
  - Rewrite the IO path using Seastar
    - Preallocate cores, one thread per core
    - Explicitly shard all data structures and work over cores
    - No locks and no blocking; message passing between cores; polling for IO
  - DPDK, SPDK: kernel bypass for network and storage IO

CEPHFS MULTI-SITE REPLICATION
- Scheduling of snapshots and snapshot pruning
- Automate periodic snapshot sync to a remote cluster
  - Arbitrary source tree, destination in the remote cluster
  - Sync snapshots via rsync
  - May support non-CephFS targets
- Discussing more sophisticated models
  - Bidirectional, loosely/eventually consistent sync
  - Simple conflict resolution behavior?
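The mechanism being automated here already exists in pieces; a minimal manual sketch, with placeholder paths and hostnames:

    # Take a point-in-time snapshot of the source tree (CephFS .snap directory).
    mkdir /mnt/cephfs/projects/.snap/backup-1

    # Sync the snapshot contents to the remote cluster (or any rsync target).
    rsync -a /mnt/cephfs/projects/.snap/backup-1/ remote:/mnt/cephfs-dr/projects/

    # Prune the snapshot once it has been replicated.
    rmdir /mnt/cephfs/projects/.snap/backup-1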

RBD SNAPSHOT-BASED MIRRORING
- Today: RBD mirroring provides async replication to another cluster
- New: snapshot-based mirroring mode
  - Point-in-time ("crash") consistency; perfect for disaster recovery
  - Managed on a per-pool or per-image basis (just like CephFS)
  - Same rbd-mirror daemon, same overall infrastructure/architecture
  - Will work with kernel RBD (RBD mirroring today requires librbd, rbd-nbd, or similar)
- rbd-nbd runner improvements to drive multiple images from one instance
- Vastly simplified setup procedure: one command on each cluster, plus a copy-pasted string blob
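A hedged sketch of what the simplified setup looks like from the CLI; pool, image, and site names are placeholders.

    # Enable per-image mirroring on the pool, then snapshot-based mirroring
    # for a specific image.
    rbd mirror pool enable mypool image
    rbd mirror image enable mypool/myimage snapshot

    # Simplified peering: create a bootstrap token on one cluster and import
    # it on the other (the "copy-paste string blob").
    rbd mirror pool peer bootstrap create --site-name site-a mypool > token
    rbd mirror pool peer bootstrap import --site-name site-b mypool token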

NEW WITH CEPH-CSI AND ROOK
- Much investment in ceph-csi
  - RWO and RWX support via RBD and/or CephFS
  - Snapshots, clones, and so on
- Rook 1.1
  - Turn-key ceph-csi by default
  - Dynamic bucket provisioning (ObjectBucketClaim)
  - External cluster mode
  - Run mons or OSDs on top of other PVs
  - Upgrade improvements: wait for healthy between steps, pod disruption budgets
  - Improved configuration
- Rook: RBD mirroring
  - Manage RBD mirroring via CRDs
  - Investment in better rbd-nbd support to provide RBD mirroring in Kubernetes
  - New, simpler snapshot-based mirroring
- Rook: RGW multisite
  - Federation of multiple clusters into a single namespace
  - Site-granularity replication
- Rook: CephFS mirroring
  - Eventually.
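As one concrete example, dynamic bucket provisioning is driven by an ObjectBucketClaim resource; this is a hedged sketch in which the claim name, bucket prefix, and storage class name are assumptions.

    # Request an object bucket from Rook's RGW via an ObjectBucketClaim.
    kubectl apply -f - <<EOF
    apiVersion: objectbucket.io/v1alpha1
    kind: ObjectBucketClaim
    metadata:
      name: my-bucket-claim
    spec:
      generateBucketName: my-bucket
      storageClassName: rook-ceph-bucket
    EOF

    # Rook creates the bucket and exposes its endpoint and credentials via a
    # ConfigMap and Secret named after the claim.
    kubectl get configmap my-bucket-claim -o yaml
    kubectl get secret my-bucket-claim -o yaml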

SAMBA CEPHFS
- Expose inode 'birth time'
- Expose snapshot creation time (birth time)
- Protect snapshots from deletion
- Supplementary group handling

PROJECT ZIPPER
- Internal abstraction layer for buckets: a bucket "VFS"
- Traditional RADOS backend
  - Index buckets in RADOS; stripe object data over RADOS objects
- Pass-through to an external store
  - "Stateless" pass-through of bucket "foo" to an external (e.g., S3) bucket
  - Auth credential translation
  - API translation (e.g., Azure Blob Storage backend)
- Layering
  - Compose a bucket from multiple layers

FOR MORE INFORMATION
- https://ceph.io/
- Twitter: @ceph
- Docs: http://docs.ceph.com/
- Mailing lists: http://lists.ceph.io/
  - ceph-announce@ceph.io: announcements
  - ceph-users@ceph.io: user discussion
  - dev@ceph.io: developer discussion
- IRC: irc.oftc.net (#ceph, #ceph-devel)
- GitHub: https://github.com/ceph/
- YouTube 'Ceph' channel

CEPHALOCON SEOUL 2020 (https://ceph.io/cephalocon/seoul-2020)
- March 4-5
- Developer Summit: March 3
- CFP open until December 6
