NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL

2y ago
14 Views
2 Downloads
984.10 KB
93 Pages
Last View : 1m ago
Last Download : 3m ago
Upload by : Roy Essex
Transcription

NON-BLOCKING DATA STRUCTURESAND TRANSACTIONAL MEMORYTim Harris, 28 November 2014

Lecture 8 Problems with locks Atomic blocks and composition Hardware transactional memory Software transactional memory

Transactional MemoryCompanion slides forThe Art of Multiprocessor Programmingby Maurice Herlihy & Nir Shavit

Our Vision for the FutureIn this course, we covered &.Best practices &New and clever ideas &And common-sense observations.Art of MultiprocessorProgramming4

Our Vision for the FutureIn this course, we covered &.Nevertheless &Best practices &Concurrent programming is still too hard &New and clever ideas &Here we explore why this is &.And common-sense observations.And what we can do about it.Art of MultiprocessorProgramming5

LockingArt of MultiprocessorProgramming6

Coarse-Grained LockingEasily made correct &But not scalable.Art of MultiprocessorProgramming7

Fine-Grained LockingCan be tricky &Art of MultiprocessorProgramming8

Locks are not RobustIf a thread holdinga lock is delayed &No one else canmake progressArt of MultiprocessorProgramming9

Locking Relies on Conventions Relation betweenActual comment– Locks and objectsfrom Linux Kernel(hat tip: Bradley Kuszmaul)– Exists only in programmer’s mind/** When a locked buffer is visible to the I/O layer* BH Launder is set. This means before unlocking* we must clear BH Launder, mb() on alpha and then* clear BH Lock, so no reader can see BH Launder set* on an unlocked buffer and then risk to deadlock.*/Art of MultiprocessorProgramming

Simple Problems are hardenq(x)double-ended queueenq(y)No interference ifends “far apart”Interference OK ifqueue is smallClean solution ispublishable result:[Michael & Scott PODC 97]Art of MultiprocessorProgramming11

Locks Not ComposableTransfer item from onequeue to anotherMust be atomic :No duplicate or missing itemsArt of MultiprocessorProgramming12

Locks Not ComposableLock sourceUnlock source& targetLock targetArt of MultiprocessorProgramming13

Locks Not ComposableLock sourceMethods cannot bjects must exposelocking protocols to clientsLock targetClients must devise andfollow protocolsAbstraction broken!Art of MultiprocessorProgramming14

Monitor Wait and SignalEmptybufferzzzYes!If buffer is empty,wait for item to show upArt of MultiprocessorProgramming15

Wait and Signal do not Composeemptyemptyzzz&Wait for either?Art of MultiprocessorProgramming16

The Transactional Manifesto Current practice inadequate– to meet the multicore challenge Research Agenda– Replace locking with a transactional API– Design languages or libraries– Implement efficient run-time systemsArt of MultiprocessorProgramming17

TransactionsBlock of code &.Atomic: appears to happeninstantaneouslySerializable: all appear tohappen in one-at-a-timeCommit:ordertakes effect(atomically)Abort: has no effect(typically restarted)Art of MultiprocessorProgramming18

Atomic Blocksatomic {x.remove(3);y.add(3);}atomic {y null;}Art of MultiprocessorProgramming19

Atomic Blocksatomic {x.remove(3);y.add(3);}No data raceatomic {y null;}Art of MultiprocessorProgramming20

A Double-Ended Queuepublic void LeftEnq(item x) {Qnode q new Qnode(x);q.left left;left.right q;left q;}Write sequential CodeArt of MultiprocessorProgramming21

A Double-Ended Queuepublic void LeftEnq(item x)atomic {Qnode q new Qnode(x);q.left left;left.right q;left q;}}Art of MultiprocessorProgramming22

A Double-Ended Queuepublic void LeftEnq(item x) {atomic {Qnode q new Qnode(x);q.left left;left.right q;left q;}}Enclose in atomic blockArt of MultiprocessorProgramming23

Warning Not always this simple– Conditional waits– Enhanced concurrency– Complex patterns But often it is&Art of MultiprocessorProgramming24

Composition?Art of MultiprocessorProgramming25

Composition?public void Transfer(Queue T q1, q2){atomic {T x q1.deq();Trivial or what?q2.enq(x);}}Art of MultiprocessorProgramming26

Conditional Waitingpublic T LeftDeq() {atomic {if (left null)retry; }}Roll back transactionand restart whensomething changesArt of MultiprocessorProgramming27

Composable Conditional Waitingatomic {x q1.deq();} orElse {x q2.deq();}Run 1st ndmethod. If it retries %Run 2 method. If it retries %Entire statement retriesArt of MultiprocessorProgramming28

Hardware TransactionalMemory Exploit Cache coherence Already almost does it– Invalidation– Consistency checking Speculative execution– Branch prediction optimistic synch!Art of MultiprocessorProgramming29

HW Transactional MemoryreadactiveTcachesInterconnectmemoryArt of MultiprocessorProgramming30

Transactional MemoryactivereadactiveTTcachesmemoryArt of MultiprocessorProgramming31

Transactional MemoryactivecommittedactiveTTcachesmemoryArt of MultiprocessorProgramming32

Transactional MemorywritecommittedactiveTD cachesmemoryArt of MultiprocessorProgramming33

Rewindaborted activewriteactiveTTD cachesmemoryArt of MultiprocessorProgramming34

Transaction Commit At commit point– If no cache conflicts, we win. Mark transactional entries– Read-only: valid– Modified: dirty (eventually written back) That’s all, folks!– Except for a few details &Art of MultiprocessorProgramming35

Not all Skittles and Beer Limits to– Transactional cache size– Scheduling quantum Transaction cannot commit if it is– Too big– Too slow– Actual limits platform-dependentArt of MultiprocessorProgramming36

HTM Strengths &Weaknesses Ideal for lock-free data structures

HTM Strengths &Weaknesses Ideal for lock-free data structures Practical proposals have limits on– Transaction size and length– Bounded HW resources– Guarantees vs best-effort

HTM Strengths &Weaknesses Ideal for lock-free data structures Practical proposals have limits on– Transaction size and length– Bounded HW resources– Guarantees vs best-effort On fail– Diagnostics essential– Try again in software?

CompositionLocks don’t compose, transactions do.Composition necessary for Software Engineering.But practical HTM doesn’t really supportcomposition!Why we need STM

Transactional Consistency Memory Transactions are collections ofreads and writes executed atomically They should maintain consistency– External: with respect to the interleavingsof other transactions (linearizability).– Internal: the transaction itself shouldoperate on a consistent state.

External ConsistencyInvariant x 2y84 X42 YTransaction A:Write xWrite yTransaction B:Read xRead yCompute z 1/(x-y) 1/2ApplicationMemory

A Simple Lock-Based STM STMs come in different forms– Lock-based– Lock-free Here : a simple lock-based STM Lets start by Guaranteeing ExternalConsistencyArt of MultiprocessorProgramming43

Synchronization Transaction keeps– Read set: locations & values read– Write set: locations & values to be written Deferred update– Changes installed at commit Lazy conflict detection– Conflicts detected at commitArt of MultiprocessorProgramming44

STM: Transactional LockingMapV#ApplicationMemoryV#Array ofversion #s &locksV#Art of MultiprocessorProgramming45

Reading an ObjectMemLocksV#V#Add version numbers& values to read setV#V#V#Art of MultiprocessorProgramming46

To Write an ObjectMemLocksV#V#Add version numbers &new values to write setV#V#V#Art of MultiprocessorProgramming47

To CommitMemLocksV#XV#V# 1V#Acquire write locksCheck version numbersunchangedInstall new valuesIncrement version numbersUnlock.V#YV# 1V#Art of MultiprocessorProgramming48

Encounter Order Locking (Undo Log)MemLocksV#V#XXV# 1 0010V# 1V#V#V#V#YY0000V#V# 1V#V# 1 0010V#V#00V#V#V#0001.2.3.4.5.6.To Read: load lock locationCheck unlocked add to Read-SetTo Write: lock location, store valueAdd old value to undo-setValidate read-set v#’s unchangedRelease each lock with v# 1Quick read of values freshlywritten by the reading transaction

Commit Time Locking (Write Buff)MemXXYYLocksV#V#V#000V# 1V#V#V# 100001V#V#00V#V# 1V#V#V# 1V# 1001000V#V#00V#V#V#V#00001.2.3.4.5.6.7.To Read: load lock locationLocation in write-set? (Bloom Filter)Check unlocked add to Read-SetTo Write: add value to write setAcquire LocksValidate read/write v#’s unchangedRelease each lock with v# 1Hold locks for very short duration

COM vs. ENC High LoadRed-Black Tree 20%20%Update 60% Lookup(c) SmallDeleteRed-Black L:Enc:POmutexspinlockmcslockstm fraserstm 0002000000ENC1000000MCS002468threads10121416

COM vs. ENC Low LoadLargeDeleteRed-Black TreeRed-Black Tree (b)5%5%5%/5%/90%Update 90% mutexspinlockmcslockstm fraserstm 0600000040000002000000MCS002468threads10121416

Problem: InternalInconsistency A Zombie is an active transaction destined toabort. If Zombies see inconsistent states bad thingscan happen

Internal Consistency48 x24 yInvariant: x 2yTransaction A: reads x 4Transaction B: writes8 to x, 16 to y, aborts A )Transaction A: (zombie)reads y 4computes 1/(x-y)Divide by zero FAIL!Art of MultiprocessorProgramming54

Solution: The Global Clock(The TL2 Algorithm) Have one shared global clock Incremented by (small subset of) writingtransactions Read by all transactions Used to validate that state worked on isalways consistentArt of MultiprocessorProgramming55

Read-Only TransactionsMemLocksCopy version clock to localread version clock1232561910010017Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming56

Read-Only TransactionsMemLocks1232Copy version clock to localread version clockRead lock, version #, andmemory561910010017Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming57

Read-Only TransactionsMemLocks1232561917Copy version clock to localreadversionversion#,clockRead lock,andmemory, check version # lessOn Commit:than read clockcheck unlocked &version #s less thanlocal read100 clock100Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming58

Read-Only TransactionsMemLocksCopy version clock to localread version clockRead lock, version #, and32memoryWe have taken a snapshotwithoutOnCommit:keepingan explicitreadunlockedset!56check&version #s less than19local read100 clock1001217Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming59

Example Execution: Read 05050V#00100Shared Version Clock1. RV Shared Version Clock2. On Read: read lock, read mem,read lock: check unlocked,unchanged, and v# RV3. Commit.Reads form a snapshot of memory.No read set!RV

Ordinary (Writing) TransactionsMemLocksCopy version clock to localread version clock1232561910010017Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming61

Ordinary TransactionsMemLocks123256Copy version clock to localread version clockOn read/write, check:Unlocked & version # RVAdd to R/W set1910010017Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming62

On CommitMemLocksAcquire write locks1232561910010017Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming63

On CommitMemLocks12Acquire write locksIncrement Version Clock32561910010010117Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming64

On CommitMemLocks1232Acquire write locksIncrement Version ClockCheck version numbers RV561910010010117Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming65

On CommitMemLocks12x32Acquire write locksIncrement Version ClockCheck version numbers RVUpdate memory5619100100101y17Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming66

On CommitMemLocks12x3210156Acquire write locksIncrement Version ClockCheck version numbers RVUpdate memoryUpdate write version #s19100100101y10117Shared VersionClockPrivate ReadVersion (RV)Art of MultiprocessorProgramming67

Example: Writing 00Shared Version Clock1. RV Shared Version Clock2. On Read/Write: checkunlocked and v# RV thenadd to Read/Write-Set3. Acquire Locks4. WV F&I(VClock)5. Validate each v# RV6. Release locks with v# WVRVReads Inc Writes serializable

TM Design Issues Implementationchoices Language designissues Semantic issuesArt of MultiprocessorProgramming69

Granularity Object– managed languages, Java, C#, &– Easy to control interactions betweentransactional & non-trans threads Word– C, C , &– Hard to control interactions betweentransactional & non-trans threadsArt of MultiprocessorProgramming70

Direct/Deferred Update Deferred– modify private copies & install on commit– Commit requires work– Consistency easier Direct– Modify in place, roll back on abort– Makes commit efficient– Consistency harderArt of MultiprocessorProgramming71

Conflict Detection Eager– Detect before conflict arises– “Contention manager” module resolves Lazy– Detect on commit/abort Mixed– Eager write/write, lazy read/write &Art of MultiprocessorProgramming72

Conflict Detection Eager detection may abort transactionsthat could have committed. Lazy detection discards morecomputation.Art of MultiprocessorProgramming73

Contention Management &Scheduling How to resolveconflicts? Who moves forwardand who rolls back? Lots of empiricalwork but formalwork in infancyArt of MultiprocessorProgramming74

Contention Manager Strategies Exponential backoff Priority to– Oldest?– Most work?– Non-waiting? None Dominates But needed anywayJudgment of SolomonArt of MultiprocessorProgramming75

I/O & System Calls? Some I/O revocable– Provide transactionsafe libraries– Undoable filesystem/DB calls Some not– Opening cashdrawer– Firing missileArt of MultiprocessorProgramming76

I/O & System Calls One solution: make transactionirrevocable– If transaction tries I/O, switch to irrevocablemode. There can be only one &– Requires serial execution No explicit aborts– In irrevocable transactionsArt of MultiprocessorProgramming77

Exceptionsint i 0;try {atomic {i ;node new Node();}} catch (Exception e) {print(i);}Art of MultiprocessorProgramming78

ExceptionsThrows OutOfMemoryException!int i 0;try {atomic {i ;node new Node();}} catch (Exception e) {print(i);}Art of MultiprocessorProgramming79

ExceptionsThrows OutOfMemoryException!int i 0;try {atomic {i ;node new Node();}} catch (Exception e) {print(i);}Art of MultiprocessorProgrammingWhat isprinted?80

Unhandled Exceptions Aborts transaction– Preserves invariants– Safer Commits transaction– Like locking semantics– What if exception object refers to valuesmodified in transaction?Art of MultiprocessorProgramming81

Nested Transactionsatomic void foo() {bar();}atomic void bar() { }Art of MultiprocessorProgramming82

Nested Transactions Needed for modularity– Who knew that cosine() contained atransaction? Flat nesting– If child aborts, so does parent First-class nesting– If child aborts, partial rollback of child onlyArt of MultiprocessorProgramming83

Hatin’ on TMSTM is too inefficient

Hatin’ on TMRequires radical change in programming style

Hatin’ on TMErlang-style shared nothing only true path to salvation

Hatin’ on TMThere is nothing wrong with what we do today.

Gartner Hype CycleYou are hereHat tip: Jeremy Kemp

I, for one, Welcome our newMulticore Overlords & Multicore forces us to rethinkalmost everythingArt of MultiprocessorProgramming89

I, for one, Welcome our newMulticore Overlords & Multicore forces us to rethinkalmost everything Standard approaches toocomplexArt of MultiprocessorProgramming90

I, for one, Welcome our newMulticore Overlords & Multicore forces us to rethinkalmost everything Standard approaches won’tscale Transactions might make lifesimpler&Art of MultiprocessorProgramming91

I, for one, Welcome our newMulticore Overlords & Multicore forces us to rethink almosteverything Standard approaches won’t scale Transactions might & Multicore programmingPlenty more to do&Maybe you will save us&Art of MultiprocessorProgramming92

Thanks ! תודה Art of MultiprocessorProgramming93

Art of Multiprocessor Programming 5599 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy version clock to local read version clock Read lock, version #, and On Commit:memory check unlocked & version #s less than local read clock100 We have taken a snapshot without keeping an explicit read set!

Related Documents:

6.1 Blocking Credit Card Numbers 34 6.2 Blocking Names 35. 3 6.3 Blocking Domain Names 36 6.4 Blocking IP and Class C Addresses 37 6.5 Setting Maximum Purchase Limit 37 6.6 Setting Auto Lockout and Duplicate 38 6.7 Setting a Country Profile 38 . . Virtual Terminal. . , the , Virtual Terminal. processing. .

COMPARA TIVE COST OF THE DIFFERENT BLOCKING AGENTS Table 3 compares the relative cost of the different blocking agents per 100 ml of a 10% solution. Bovine serum albumin and normal goat serum are the two most expensive blocking agents costing US 7·97/100 ml and US 6·22/100 ml respectively. Skimmed milk and both fresh and crude egg

26. 2-Point Stance ( WR & DE ) 27. 3-Point Stance 28. 4-Point Stance Section 3: OFFENSIVE SKILLS 29. Under Center Snaps 30. Direct Snaps 31. Bird Dog 32. Sled Blocking 33. Board Drill 34. Double Teams 35. Pulling Progression 36. Crab Blocking 37. Smart Blocking 38. Wedge Progression 39. Splatter Blocking 40. Get a Grip 41. End Run 42. Gauntlet 43.

BD Blocking distance E Empty calibration F Full calibration 2.4 Measuring range 2.4.1 Blocking distance, Nozzle mounting Install the Prosonic M at a height so that the blocking distance BD is not undershot, even at maximum fill level. Use a pipe no zzle if you cannot maintain the blocking distance in any other way.

etc.) invoke callbacks Because thread never blocks, it can fully use a core by starting operations and handling completion callbacks This is more than non-blocking operations, which just says it won't block: non-blocking by itself has to use spin loops Example of non-blocking: waitpid with WNOHANG

Non-primitive data structures . are more complicated data structures and are derived from primitive data structures. They emphasize on grouping same or different data items with relationship between each data item. Arrays, lists and files come under this category. Figure 1.1 shows the classification of data structures.

Keywords: NMBAs, (NMJ), Non-depolarizing blocking agents, Depolarizing blocking agents, Tubocurarine. _ Introduction A neuromuscular junction (NMJ) is the synapse or junction of the axon terminal of a motoneuron with the motor end plate, th

Apr 16, 2009 · 1 Data Structures and Algorithms 3 1.1 A Philosophy of Data Structures 4 1.1.1 The Need for Data Structures 4 1.1.2 Costs and Benefits 6 1.2 Abstract Data Types and Data Structures 8 1.3 Design Patterns 12 1.3.1 Flyweight 13 1.3.2 Visitor 14 1.3.3 Composite 15 1.3.4 Strategy 16 1.4 Problems, Algorith