Accelerating NoSQL

2y ago
18 Views
2 Downloads
924.48 KB
43 Pages
Last View : 1m ago
Last Download : 3m ago
Upload by : Azalea Piercy
Transcription

Accelerating NoSQLRunning Voldemort on HailDBSunny GleasonMarch 11, 2011

whoami Sunny Gleason, human passion: distributed systems engineering previous.Ning : custom social networksAmazon.com : infra & web services now.building cloud infrastructure

whereami twitter : twitter.com/sunnygleason github : github.com/sunnygleason linkedin : linkedin.com/in/sunnygleason

what’s in this presentation? NoSQL Roundup Voldemort who? HailDB wha? Results & Next Steps Special Bonus Material

NoSQL “Not Only” SQL What’s the point? Proponent: “reaching next level of scale” Cynic: “cloud is hype, ops nightmare”

what does it gain? Higher performance, scalability, availability More robust fault-tolerance Simplified systems design Easier operations

what does it lose? Reduced / simplified programming model No ad-hoc queries, no joins, no txns Not ACID: Atomicity / Consistency /Isolation / Durability Operations / management is still evolving Challenging to quantify health of system Fewer domain experts

NoSQL MapKV Stores(volatile)KV RiakCouchDB,MongoDBCassandra,BigTable,HBaseNeo4J

NoSQL MapKV Stores(volatile)KV umnStoreGraphStoreDynamo,Voldemort,Riak

motivation database on 1 box : ok database with master/slave replication : ok database on cluster : tricky database on SAN : time bomb

performanceComplexity Sharding FusionIO SSDMySQLVoldemort ClusterMemcached1K10K100KAggregate Operations / Sec1M

dynamo case study Amazon : high read throughput, alwaysaccessible writes Shopping cart application ‘Glitches’ ok, duplicate or missing item Data loss or unavailability is unacceptable Solution: K-V schema plus smart routing &data placement

key-value storage Essentially, a gigantic hash table Typically assign byte[] values to byte[] keys Plus versioning mixed in to handle failuresand conflicts Yes, you *can* do range partitioning; inpractice, avoid it because of hot spots

k-v: durable vs. volatile RAM is ridiculous speed (ns), not durable Disk is persistent and slow (3-7ms) RAID eases the pain a bit (4-8x throughput) SSD is providing good promise (100-300us) FusionIO is redefining the space (30-100us)

dynamo clones Voldemort : from LinkedIn, dynamoimplementation in Java (default: BDB-JE) Riak : from Basho, dynamo implementationin Erlang (default: embedded InnoDB)

Voldemort Developed at LinkedIn Scalable Key-Value Storage Based on Amazon Dynamo model High Read Throughput Always Writable

Voldemort features Consistent Hashing Quorum settings : R, W, N Auto-sharding & rebalancing Pluggable storage engines

Consistent Hashing* Arrange keys around ring* Compute token in ringusing hash function* Determine nodes responsiblefor token using live set

R/W/N N : maximum number of nodes to queryfor an operation R : read quorum W : write quorum Can adjust ‘quorum’ to balance throughputand fault-tolerance

setting up Voldemort 1Step 1: Download the codeDownload either a recent stable release or, for those who like to live moredangerously, the up-to-the-minute build from the build server.Step 2: Start single node cluster bin/voldemort-server.sh config/single node cluster /tmp/voldemort.log &Step 3: Start commandline test client and do some operations bin/voldemort-shell.sh test tcp://localhost:6666Established connection to test via tcp://localhost:6666 put "hello" "world" get "hello"version(0:1): "world" delete "hello" get "hello"null exitk k thx bye.

setting up Voldemort 2 For a cluster, use cloud startup scripts Works with Amazon EC2 See sting-Infrastructure

Voldemort client libraries Java, Scala, Clojure Ruby Python C

storage engines BDB-JE (Oracle Sleepycat, the original) Krati (LinkedIn, pretty new) HailDB (new!) MySQL (old / dated)

BDB-JE Log-Structured B-Tree Fast Storage When Mostly Cached Configured without fsync() by default -writes are batched and flushed periodically

Krati Fast Hash-Oriented Storage Uses memory-mapped files for speed Configured without fsync() by default -writes are batched and flushed periodically

HailDB Fork of MySQL InnoDB plugin(contributors : Oracle, Google, Facebook,Percona) Higher stability for large data sets Fast crash recovery External from Java heap (ease GC pain) apt-get install haildb (from launchpad PPA) Use “flush-once-per-second” mode

HailDB, Java & VoldemortVoldemort ClientVoldemort Nodev-storage-innog414-haildbJNAHailDB(log, buffer pool,tablespace)Voldemort NodeVoldemort Node

HailDB & Java g414-haildb : where the magic happens uses JNA: Java Native Access dynamic binding to libhaildb shared library auto-generated from .h file (w/ JNAerator) Pointer classes & other shenanigans

HailDB schemakey VARBINARY(200)version VARBINARY(200)value BLOBPRIMARY KEY( key, version)

implementation gotchas InnoDB API-level usage is unclear Synchronization & locking is unclear Therefore. I learned to love reading C Error handling is *nasty* Installation a bit of a pain

experimental setup OS X: 8-Core Xeon, 32GB RAM, 200GBOWC SSD Faban Benchmark : PUT 64-byte key, 1024byte value Scenarios:1, 2, 4, 8 threads 512M Java Heap

Perf: BDB Put

Perf: Krati Put

Perf: HailDB Put

future work Improve Packaging / Installation Schema refinements & perf enhancements Online backup/export with XtraBackup JNI Bindings

schema refinements Build upon Nokia work on fast k-v schema 8-byte ‘long’ key hash vs. full key bytes Smart use of secondary indexes Native representation of vector clocks Delayed / soft deletion Expect 40-50% performance boost

InnoDB tuning Skinny columns, skinny rows! (esp. Primary Key) Varchar enum ‘bad’, int or smallint ‘good’ fixed-width rows allows in-place updates Use covering indexes strategically More data per page means faster index scans,more efficient buffer pool utilization You only get so many trx’s on given CPU/RAMconfiguration - benchmark this!

refined schemaid BIGINT (auto increment)key hash BIGINTkey VARBINARY(200)version VARBINARY(200)value BLOBPRIMARY KEY( id)KEY( key hash)

online backup hot backup of data to other machine /destination test Percona Xtrabackup with HailDB next step: backup/export to Hadoop/HDFS(similar to Cloudera Sqoop tool)

JNI bindings JNI can get 2-5x perf boost vs. JNA . at the expense of nasty code Will go for schema optimizations andInnoDB tuning tips *first*

resources github.com/voldemort/voldemortfreenode #voldemort on/g414-haildb jna.dev.java.net

more resources Amazon Dynamo Faban / XFaban HailDB Drizzle PBXT

Thank You!

Riak Memcached, Redis Column Store Cassandra, BigTable, HBase Graph Store Document Store CouchDB, MongoDB Neo4J. NoSQL Map NoSQL Key-Value Store KV Stores (durable) KV Stores (vo

Related Documents:

towards NoSQL databases is the high cost of legacy RDBMS vendors versus NoSQL software. In general, NoSQL software is a fraction of what vendors such as IBM and Oracle charge for their databases. What Constitutes an Enterprise NoSQL Solution? What should a technology leader or decision-maker look for in a NoSQL offering that defines it as truly

Chapter 2: NoSQL Tutorial: Learn NoSQL Features, Types, What is, Advantages What is NoSQL? NoSQL is a non-relational DMS, that does not require a fixed schema, avoids joins, and is easy to scale. NoSQL database is used for distributed data stores with humongous data storage needs. No

1. SQL Interface to RDB and NoSQL Database. To access both RDB and NoSQL databases, we provide a general SQL interface. It consists of a SQL query parser and Apache Phoenix to connect HBase as a NoSQL database to a SQL translator and a MySQL JDBC driver to an RDB connector. The application does not need to change the queries or manage NoSQL .

Oracle NoSQL Database Hands on Workshop Lab Exercise 1 - Start Oracle NoSQL Database instance and access data from Formatter classes In this exercise, you will start an Oracle NoSQL Database instance that has movie data preloaded. KVLite will be used as the Oracle NoSQL Database Instance. A very brief introduction to KVLite follows:

NoSQL database. A NoSQL database can be used to solve new problems that require: Scalability - A NoSQL database can scale horizontally to the scale required by big data. Applications can run in parallel on a cloud-based cluster comprising of dozens, hundreds, or even thousands of commodity servers. The NoSQL scale-out architecture

MongoDB! Riak! Couchbase! Voldemort! Neo4J Titan!for!HBase! DOCUMENT COLUMN GEO GRAPH SEARCH OBJECT . NOSQL PROJECTS WITH BACKING . NOSQL COMPANIES MONGODB!!! CASSANDRA!! RIAK!! COUCH*! NEO4J ELASTICSEARCH! NOSQL INVESTMENTS 73 39 26

family of NoSQL storage systems (e.g. column, document stores). Most notably is the Yahoo! Cloud Serving Bench-mark (YCSB), which has become the de-facto standard for cross-family comparison of NoSQL databases [8]. In the remainder of this section we compare benchmark initiatives per NoSQL category, starting with key-value benchmarks and YCSB.

1. A paradigm shift from the traditional data model. SQL databases enforce a strict schema, whereas NoSQL databases has a week notion of schema. At the core all NoSQL databases are key/value systems, the difference is whether the database understands the value or not. Different type of NoSQL databases have different properties. We'll see four major