• Have any questions?
  • info.zbook.org@gmail.com

Phase Reconciliation For Contended In-Memory Transactions

1m ago
2.09 MB
43 Pages
Last View : 10d ago
Last Download : n/a
Upload by : Konnor Frawley

Phase Reconciliation forContended In-MemoryTransactionsNeha Narula, Cody Cutler, Eddie Kohler, Robert MorrisMIT CSAIL and Harvard1

IncrTxn(k Key) {INCR(k, 1)}LikePageTxn(page Key, user Key) {INCR(page, 1)liked pages : GET(user)PUT(user, liked pages page)}FriendTxn(u1 Key, u2 Key) {PUT(friend:u1:u2, 1)PUT(friend:u2:u1, 1)}2

IncrTxn(k Key) {INCR(k, 1)}LikePageTxn(page Key, user Key) {INCR(page, 1)liked pages : GET(user)PUT(user, liked pages page)}FriendTxn(u1 Key, u2 Key) {PUT(friend:u1:u2, 1)PUT(friend:u2:u1, 1)}3

ProblemApplications experience write contention onpopular data4


Concurrency Control Enforces SerialExecutioncore 0core 1core 2INCR(x,1)INCR(x,1)INCR(x,1)timeTransactions on the same recordsexecute one at a time6

Throughput (txns/sec)Throughput on a ContentiousTransactional WorkloadOCC010203040cores506070807

Throughput (txns/sec)Throughput on a ContentiousTransactional WorkloadDoppelOCC010203040cores506070808

INCR on the Same Records CanExecute in Parallelcore 0core 1core 2INCR(x0,1)INCR(x1,1)per-core slicesof record x1x is split acrosscores1INCR(x2,1)1time Transactions on the same record can proceed inparallel on per-core slices and be reconciled later This is correct because INCR commutes9

Databases Must Support GeneralPurpose TransactionsIncrTxn(k Key) {IncrPutTxn(k1 Key, k2 Key, v Value) {INCR(k, 1)Must happen INCR(k1, 1)atomically}PUT(k2, v)}PutMaxTxn(k1 Key, k2 Key) {v1 : GET(k1)v2 : GET(k2)if v1 v2:Must happenatomicallyPUT(k1, v2)else:PUT(k2, v1)return v1,v2Returns a value}10

ChallengeFast, general-purpose serializable transactionexecution with per-core slices for contendedrecords11

Phase Reconciliation Database automaticallydetects contention tosplit a record amongcores Database cyclesthrough phases: split,reconciliation, , an in-memory transactional database12

ContributionsPhase reconciliation– Splittable operations– Efficient detection and response to contentionon individual records– Reordering of split transactions and reads toreduce conflict– Fast reconciliation of split values13

Outline1.2.3.4.Phase reconciliationOperationsDetecting contentionPerformance evaluation14

Split Phasesplit phasecore 0INCR(x,1)core 0INCR(x0,1)core 1INCR(x,1) PUT(y,2)core 1INCR(x1,1) PUT(y,2)core 2INCR(x,1) PUT(z,1)core 2core 3INCR(x,1) PUT(y,2)core 3INCR(x2,1) PUT(z,1)INCR(x3,1) PUT(y,2) The split phase transforms operations on contendedrecords (x) into operations on per-core slices (x0, x1, x2, x3)15

split phasecore 0INCR(x0,1)core 1INCR(x1,1) PUT(y,2)core 2core 3INCR(x2,1) PUT(z,1)INCR(x3,1) PUT(y,2) Transactions can operate on split and non-split records Rest of the records use OCC (y, z) OCC ensures serializability for the non-split parts of thetransaction16

split phasecore 0INCR(x0,1)core 1INCR(x1,1) PUT(y,2)core 2core 3INCR(x2,1) 3,1) PUT(y,2) Split records have assigned operations for a given split phase Cannot correctly process a read of x in the current state Stash transaction to execute after reconciliation17

split phasecore 0INCR(x0,1)core 1INCR(x1,1) PUT(y,2)core 2INCR(x2,1) PUT(z,1)core 3INCR(x1,1)INCR(x1,1)INCR(x2,1)INCR(x3,1) PUT(y,2)GET(x) All threads hear they should reconcile their per-core state Stop processing per-core writes18

reconciliation phasecore 0core 1joined phasex x x0x x x1core 2x x x2core 3x x x3GET(x) Reconcile state to global store Wait until all threads have finished reconciliation Resume stashed read transactions in joined phase19

reconciliation phasecore 0core 1core 2core 3joined phasex x x0GET(x)x x x1x x x2x x x3 Reconcile state to global store Wait until all threads have finished reconciliation Resume stashed read transactions in joined phase20

joined phasecore 0GET(x)core 1core 2INCR(x)INCR(x, 1)GET(x)GET(x)core 3 Process new transactions in joined phase using OCC No split data21

Batching Amortizes the Cost ofReconciliationjoinedsplit phasecore 0INCR(x0,1)core 1INCR(x1,1) INCR(y,2)core 2core 3INCR(x2,1) 1) INCR(z,1)INCR(x3,1) INCR(y,2)GET(x)GET(x)GET(x) Wait to accumulate stashed transactions, batch for joined phase Amortize the cost of reconciliation over many transactions Reads would have conflicted; now they do not22

Phase Reconciliation Summary Many contentious writes happen in parallelin split phases Reads and any other incompatibleoperations happen correctly in joinedphases23

Outline1.2.3.4.Phase reconciliationOperationsDetecting contentionPerformance evaluation24

Operation ModelDevelopers write transactions as stored procedures whichare composed of operations on keys and values:Traditional key/valueoperationsOperations on numericvalues which modify theexisting valueOrdered PUT and insertto an ordered listvalue GET(k)void PUT(k,v)void INCR(k,n)void MAX(k,n)void MULT(k,n)Not splittableSplittablevoid OPUT(k,v,o)void TOPK INSERT(k,v,o)25

MAX Can Be Efficiently Reconciledcore 0core 1core 552721x 55 Each core keeps one piece of state xi O(#cores) time to reconcile x Result is compatible with any order26

What Operations Does Doppel Split?Properties of operations that Doppel can split:– Commutative– Can be efficiently reconciled– Single key– Have no return valueHowever:– Only one operation per record per split phase27

Outline1.2.3.4.Phase reconciliationOperationsDetecting contentionPerformance evaluation28

Which Records Does Doppel Split? Database starts out with no split data Count conflicts on records– Make key split if #conflicts conflictThreshold Count stashes on records in the split phase– Move key back to non-split if #stashes too high29

Outline1.2.3.4.Phase reconciliationOperationsDetecting contentionPerformance evaluation30

Experimental Setup andImplementation All experiments run on an 80 core Intel serverrunning 64 bit Linux 3.12 with 256GB of RAM Doppel implemented as a multithreaded Goserver; one worker thread per core Transactions are procedures written in Go All data fits in memory; don’t measure RPC All graphs measure throughput in transactions/sec31

Performance Evaluation How much does Doppel improvethroughput on contentious write-onlyworkloads? What kinds of read/write workloads benefit? Does Doppel improve throughput for arealistic application: RUBiS?32

Doppel Executes ConflictingWorkloads in ParallelThroughput (millions txns/sec)35302520151050DoppelOCC2PL20 cores, 1M 16 byte keys, transaction: INCR(x,1) all on same key33

Doppel Outperforms OCC Even WithLow Contention35MThroughput (txns/sec)30M5% of writes tocontended key25M20MDoppelOCC15M10M5M0M0204060% of transactions with hot key8020 cores, 1M 16 byte keys, transaction: INCR(x,1) on different keys10034

Contentious Workloads Scale Well100MDoppelOCCThroughput (txns/sec)90M80MCommunication ofphase changing70M60M50M40M30M20M10M0M01020304050number of cores601M 16 byte keys, transaction: INCR(x,1) all writing same key708035

LIKE Benchmark Users liking pages on a social network 2 tables: users, pages Two transactions:– Increment page’s like count, insert user like of page– Read a page’s like count, read user’s last like 1M users, 1M pages, Zipfian distribution of pagepopularityDoppel splits the page-like-counts for popular pagesBut those counts are also read more often36

Throughput (millions txns/sec)9Benefits Even When There AreReads and Writes to the SamePopular Keys876543210DoppelOCC20 cores, transactions: 50% LIKE read, 50% LIKE write37

Doppel Outperforms OCC For AWide Range of Read/Write Mixes18MThroughput (txns/sec)16M14MMore12M10Mstashed read traDoppel does not splitany data and performsthe same as OCC!nsactions8M6M4MDoppelOCC2M0M0204060% of transactions that read20 cores, transactions: LIKE read, LIKE write8010038

RUBiS Auction application modeled after eBay– Users bid on auctions, comment, list new items, search 1M users and 33K auctions 7 tables, 17 transactions 85% read only transactions (RUBiS bidding mix) Two workloads:– Uniform distribution of bids– Skewed distribution of bids; a few auctions are verypopular39

StoreBid TransactionStoreBidTxn(bidder, amount, item) {INCR(NumBidsKey(item),1)MAX(MaxBidKey(item), amount)OPUT(MaxBidderKey(item), bidder, amount)PUT(NewBidKey(), Bid{bidder, amount, item})}All commutative operations onpotentially conflicting auction metadataInserting new bids is not likely to conflict40

Doppel Improves Throughput on anApplication BenchmarkThroughput (millions txns/sec)121083.2xthroughputimprovement8% StoreBidTransactions6DoppelOCC420UniformSkewed80 cores, 1M users 33K auctions, RUBiS bidding mix41

Related Work Commutativity in distributed systems and concurrencycontrol––––[Weihl ’88]CRDTs [Shapiro ’11]RedBlue consistency [Li ’12]Walter [Lloyd ’12] Optimistic concurrency control– [Kung ’81]– Silo [Tu ’13] Split counters in multicore OSes42

ConclusionDoppel: Achieves parallel performance when many transactionsconflict by combining per-core data and concurrencycontrol Performs comparably to OCC on uniform or read-heavyworkloads while improving performance significantly onskewed workloads.http://pdos.csail.mit.edu/doppel43

Wait until all threads have finished reconciliation Resume stashed read transactions in joined phase 19 core 0 core 1 cor