Chapter 6: Memory - University of Houston-Downtown


Objectives (CS 2401 Computer Organization & Assembly)

- Master the concepts of hierarchical memory organization.
- Understand how each level of memory contributes to system performance, and how that performance is measured.
- Master the concepts behind cache memory, virtual memory, memory segmentation, paging, and address translation.

Introduction

Memory lies at the heart of the stored-program computer (the von Neumann model). In previous chapters, we studied the ways in which memory is accessed by various ISAs. In this chapter, we focus on memory organization, or memory hierarchy systems. A clear understanding of these ideas is essential for the analysis of system performance.

6.2 Types of Memory

There are two kinds of main memory: random access memory (RAM) and read-only memory (ROM). There are two types of RAM: dynamic RAM (DRAM) and static RAM (SRAM).

DRAM consists of capacitors that slowly leak their charge over time, so it must be refreshed every few milliseconds to prevent data loss. DRAM is "cheap" memory owing to its simple design.

6.2 Dynamic RAM

- FPM RAM (fast page mode RAM), about 30 MHz: allows faster access to data in the same row or page. It works by eliminating the need for a row address if the data is located in the row previously accessed.
- EDO RAM (enhanced data-out RAM), about 66 MHz: can start fetching the next block of memory at the same time that it sends the previous block to the CPU.
- BEDO RAM (burst enhanced data-out RAM): can process four memory addresses in one burst, but can only stay synchronized with the CPU clock for short periods and can't keep up with processors whose buses run faster than 66 MHz.
- SDRAM (synchronous dynamic RAM), about 100 MHz: synchronizes itself with the CPU's bus, so it can run at much higher clock speeds than conventional memory. It is capable of running at 133 MHz, about three times faster than conventional FPM RAM and about twice as fast as EDO DRAM and BEDO DRAM.
- DDR RAM (double data rate SDRAM), about 200 MHz: a type of SDRAM that supports data transfers on both edges of each clock cycle (the rising and falling edges), effectively doubling the memory chip's data throughput. DDR SDRAM also consumes less power.

6.2 Static RAM

SRAM consists of circuits similar to the D flip-flop. SRAM is very fast memory, and it doesn't need to be refreshed like DRAM does. It is used to build cache memory.

ROM also does not need to be refreshed. In fact, it needs very little charge to retain its contents. ROM is used to store permanent or semi-permanent data that persists even while the system is turned off.

SRAM (static RAM) is faster and more expensive than DRAM:

- speeds between 8 and 12 ns
- synchronous or asynchronous
- does not require a refresh operation
- the primary RAM chip for special high-speed memory called level-1 cache memory
- PBSRAM (pipeline burst SRAM) collects and sends multiple requests for memory as a single pipelined request.

6.3 The Memory Hierarchy

Generally speaking, faster memory is more expensive than slower memory. To provide the best performance at the lowest cost, memory is organized in a hierarchical fashion. Small, fast storage elements are kept in the CPU; larger, slower main memory is accessed through the data bus. Larger, (almost) permanent storage in the form of disk and tape drives is still further from the CPU.

Storage Hierarchy

This storage organization can be thought of as a pyramid:

- On-line storage (primary storage): storage that is actively accessible by the computer without human interaction, e.g., a hard drive.
- Near-line storage (secondary storage): storage that is accessible by the computer with human interaction, e.g., a floppy disk or CD-ROM.
- Off-line storage (archival storage): used as a backup, e.g., magnetic tapes.

To access a particular piece of data, the CPU first sends a request to its nearest memory, usually cache. If the data is not in cache, then main memory is queried. If the data is not in main memory, then the request goes to disk. Once the data is located, the data and a number of its nearby data elements are fetched into cache memory.
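As a rough illustration of this lookup order, here is a small Python sketch of our own (not from the slides); the access times and block contents are made up, and real hardware performs all of this transparently:

    # A toy three-level lookup: cache -> main memory -> disk.
    cache = {}                                    # smallest, fastest level
    main_memory = {0: "block0", 1: "block1"}
    disk = {n: f"block{n}" for n in range(1024)}  # largest, slowest level

    def access(block_number):
        """Return (data, time_ns) for a block, promoting it into cache."""
        if block_number in cache:                 # hit at the nearest level
            return cache[block_number], 10
        if block_number in main_memory:           # miss in cache, hit in RAM
            data, time_ns = main_memory[block_number], 10 + 200
        else:                                     # miss everywhere but disk
            data, time_ns = disk[block_number], 10 + 200 + 10_000_000
            main_memory[block_number] = data
        cache[block_number] = data                # fetch the block into cache
        return data, time_ns

    print(access(1))   # first access: found in main memory
    print(access(1))   # second access: cache hit, much faster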

This leads us to some definitions.

- A hit is when data is found at a given memory level.
- A miss is when it is not found.
- The hit rate is the percentage of time data is found at a given memory level.
- The miss rate is the percentage of time it is not: miss rate = 1 - hit rate.
- The hit time is the time required to access data at a given memory level.
- The miss penalty is the time required to process a miss, including the time it takes to replace a block of memory plus the time it takes to deliver the data to the processor.

6.3.1 Locality of Reference

An entire block of data is copied after a hit because the principle of locality tells us that once a byte is accessed, it is likely that a nearby data element will be needed soon. There are three forms of locality:

- Temporal locality: recently accessed data elements tend to be accessed again.
- Spatial locality: accesses tend to cluster (arrays or loops).
- Sequential locality: instructions tend to be accessed sequentially.

6.4 Cache Memory

The purpose of cache memory is to speed up accesses by storing recently used data closer to the CPU, instead of storing it in main memory. Although cache is much smaller than main memory, its access time is a fraction of that of main memory.

Unlike main memory, which is accessed by address, cache is typically accessed by content; hence, it is often called content addressable memory. Because of this, a single large cache memory isn't always desirable: it takes longer to search.

The "content" that is addressed in content addressable cache memory is a subset of the bits of a main memory address called a field. The fields into which a memory address is divided provide a many-to-one mapping between the larger main memory and the smaller cache memory. Many blocks of main memory map to a single block of cache. A tag field in the cache block distinguishes one cached memory block from another.
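To see the many-to-one mapping concretely, the short sketch below (our own illustration, using the mod-N placement formalized in the next subsection) groups a toy 16-block main memory into 4 cache blocks; the stored tag is what tells the sharers apart:

    N = 4  # illustrative number of cache blocks

    # Group main memory blocks by the cache block they map to (block mod N).
    mapping = {}
    for mem_block in range(16):        # a toy main memory of 16 blocks
        mapping.setdefault(mem_block % N, []).append(mem_block)

    for cache_block, mem_blocks in mapping.items():
        print(f"cache block {cache_block} <- main memory blocks {mem_blocks}")
    # Each cache block is shared by 4 memory blocks; the tag stored with the
    # cached block records which of the 4 is currently resident.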

6.4.1 Cache Mapping Schemes: Direct Mapping

The simplest cache mapping scheme is direct mapped cache. In a direct mapped cache consisting of N blocks of cache, block X of main memory maps to cache block Y = X mod N. Thus, if we have 10 blocks of cache, block 7 of cache may hold blocks 7, 17, 27, 37, ... of main memory.

Once a block of memory is copied into its slot in cache, a valid bit is set for the cache block to let the system know that the block contains valid data. (What could happen without having a valid bit?)

As a schematic example of what cache looks like: Block 0 contains multiple words from main memory, identified with the tag 00000000. Block 1 contains words identified with the tag 11110101. The other two blocks are not valid.

The size of each field into which a memory address is divided depends on the size of the cache. Suppose our memory consists of 2^14 words, the cache has 16 = 2^4 blocks, and each block holds 8 words. Thus memory is divided into 2^14 / 2^3 = 2^11 blocks. For our field sizes, we need 4 bits for the block and 3 bits for the word, and the tag is whatever is left over: 14 - 4 - 3 = 7 bits.
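The field widths follow directly from base-2 logarithms of the cache dimensions. A small Python sketch (the function name is ours) reproduces the 7/4/3 split above:

    from math import log2

    def direct_mapped_fields(address_bits, cache_blocks, words_per_block):
        """Return (tag, block, word) field widths for a direct-mapped cache."""
        word_bits = int(log2(words_per_block))   # selects a word within a block
        block_bits = int(log2(cache_blocks))     # selects the cache block
        tag_bits = address_bits - block_bits - word_bits  # whatever is left over
        return tag_bits, block_bits, word_bits

    # The example above: 2^14 words of memory, 16 cache blocks, 8 words per block.
    print(direct_mapped_fields(14, 16, 8))   # -> (7, 4, 3)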

6.4.1 Direct Mapping Scheme (continued)

As a small example, suppose a system uses direct mapping with 16 words of main memory divided into 8 blocks, and a cache of 4 blocks. Main memory consists of 2^4 words and the cache has 4 = 2^2 blocks, so each block holds 2 words. For our field sizes, we need 2 bits for the block, 1 bit for the word, and 1 bit for the tag.

    Main memory block              Maps to cache block
    Block 0 (addresses 0, 1)       Block 0
    Block 1 (addresses 2, 3)       Block 1
    Block 2 (addresses 4, 5)       Block 2
    Block 3 (addresses 6, 7)       Block 3
    Block 4 (addresses 8, 9)       Block 0
    Block 5 (addresses 10, 11)     Block 1
    Block 6 (addresses 12, 13)     Block 2
    Block 7 (addresses 14, 15)     Block 3

Returning to the earlier configuration (14-bit addresses, 16 cache blocks, 8 words per block), suppose a program generates the address 1AA. In 14-bit binary, this number is 00000110101010. The first 7 bits of this address go in the tag field, the next 4 bits go in the block field, and the final 3 bits indicate the word within the block.

If the program subsequently generates the address 1AB, it will find the data it is looking for in block 0101, word 011. However, if the program instead generates the address 3AB, the block loaded for address 1AA will be evicted from the cache and replaced by the block associated with the 3AB reference.
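A sketch that decodes these three references under the 7/4/3 split (the helper is ours; the hex values are from the slides):

    def split_address(addr, tag_bits=7, block_bits=4, word_bits=3):
        """Split a 14-bit address into (tag, block, word) fields."""
        word = addr & ((1 << word_bits) - 1)
        block = (addr >> word_bits) & ((1 << block_bits) - 1)
        tag = addr >> (word_bits + block_bits)
        return tag, block, word

    for addr in (0x1AA, 0x1AB, 0x3AB):
        tag, block, word = split_address(addr)
        print(f"{addr:03X}: tag={tag:07b} block={block:04b} word={word:03b}")
    # 1AA and 1AB share tag 0000011 and block 0101, so 1AB hits after 1AA.
    # 3AB maps to the same block 0101 but carries tag 0000111, so loading it
    # evicts the block that 1AA brought in.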

Example

Suppose a computer uses 15-bit main memory addresses and has 64 blocks of cache, where each block contains 8 words.

a) How many blocks of main memory are there?
b) What is the format of a memory address as seen by the cache, i.e., what are the sizes of the tag, block, and word fields?
c) To which cache block will the memory reference 1028 map?

Problem 1 on Page 320

Suppose a computer using direct mapped cache has 2^20 words of main memory and a cache of 32 blocks, where each cache block contains 16 words.

a) How many blocks of main memory are there?
b) What is the format of a memory address as seen by the cache, i.e., what are the sizes of the tag, block, and word fields?
c) To which cache block will the memory reference 0DB63 (hexadecimal) map?
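Both exercises can be worked mechanically. The Python helper below is our own (not from the slides), and it assumes the reference 1028 belongs to the Example and 0DB63 to Problem 1, which is how the slides appear to pair them; check the printed answers against the text.

    def direct_mapped(address_bits, cache_blocks, words_per_block, reference):
        """Work parts (a)-(c) of a direct-mapped cache exercise."""
        word_bits = words_per_block.bit_length() - 1    # log2 for powers of two
        block_bits = cache_blocks.bit_length() - 1
        tag_bits = address_bits - block_bits - word_bits
        memory_blocks = 2 ** (address_bits - word_bits)              # part (a)
        cache_block = (reference // words_per_block) % cache_blocks  # part (c)
        return memory_blocks, (tag_bits, block_bits, word_bits), cache_block

    # Example: 15-bit addresses, 64 cache blocks, 8 words/block, reference 1028.
    print(direct_mapped(15, 64, 8, 1028))      # -> (4096, (6, 6, 3), 0)
    # Problem 1: 2^20 words, 32 cache blocks, 16 words/block, reference 0x0DB63.
    print(direct_mapped(20, 32, 16, 0x0DB63))  # -> (65536, (11, 5, 4), 22)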

6.4.1 Direct Mapping Scheme: Thrashing

Suppose a program generates a series of memory references such as 1AB, 3AB, 1AB, 3AB, ... The cache will continually evict and replace blocks, and the theoretical advantage offered by the cache is lost in this extreme case. This is the main disadvantage of direct mapped cache. Other cache mapping schemes are designed to prevent this kind of thrashing.

6.4.1 Fully Associative Cache

Instead of placing memory blocks in specific cache locations based on memory address, we could allow a block to go anywhere in cache. In this way, cache would have to fill up before any blocks are evicted. This is how fully associative cache works. A memory address is partitioned into only two fields: the tag and the word.

Suppose, as before, we have 14-bit memory addresses and a cache with 16 blocks, each block of size 8. The field format of a memory reference is then an 11-bit tag followed by a 3-bit word field; there is no block field.

When the cache is searched, all tags are searched in parallel to retrieve the data quickly. This requires special, costly hardware.

You will recall that direct mapped cache evicts a block whenever another memory reference needs that block. With fully associative cache, there is no such mapping, so we must devise an algorithm to determine which block to evict from the cache. The block that is evicted is the victim block. There are a number of ways to pick a victim; we will discuss them later.

Problem 3 on Page 320

Suppose a computer using fully associative cache has 2^16 words of main memory and a cache of 64 blocks, where each cache block contains 32 words.

a) How many blocks of main memory are there?
b) What is the format of a memory address as seen by the cache, i.e., what are the sizes of the tag and word fields?
c) To which cache block will the memory reference F8C9 map?
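For Problem 3, a sketch of the fully associative address split (the function is ours; note that part (c) has no single numeric answer, since a fully associative block may be placed anywhere):

    def fully_associative_fields(address_bits, words_per_block):
        """Return (tag, word) field widths for a fully associative cache."""
        word_bits = words_per_block.bit_length() - 1
        return address_bits - word_bits, word_bits

    # Problem 3: 2^16 words of memory, 64 blocks of 32 words each.
    print(fully_associative_fields(16, 32))    # -> (11, 5)
    print(2 ** 16 // 32)                       # part (a): 2048 main memory blocks
    # Part (c): reference 0xF8C9 carries tag 0xF8C9 >> 5, and the block may be
    # placed in ANY of the 64 cache blocks.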

6.4.1 Set Associative Cache

Set associative cache combines the ideas of direct mapped cache and fully associative cache. An N-way set associative cache mapping is like direct mapped cache in that a memory reference maps to a particular location in cache. Unlike direct mapped cache, however, the reference maps to a set of several cache blocks, similar to the way in which fully associative cache works: instead of mapping anywhere in the entire cache, a memory reference can map only to the subset of cache slots in its set.

The number of cache blocks per set varies according to the overall system design. In a 2-way set associative cache, for example, each set contains two different memory blocks.

In set associative cache mapping, a memory reference is divided into three fields: tag, set, and word. As with direct mapped cache, the word field chooses the word within the cache block, and the tag field uniquely identifies the memory address. The set field determines the set to which the memory block maps.

Problem 7 on Page 321

Suppose we have a main memory of 2^14 bytes, mapped to a 2-way set associative cache having 16 blocks, where each block contains 8 words. Since this is a 2-way cache, each set consists of 2 blocks, so there are 16 / 2 = 8 sets. Thus we need 3 bits for the set and 3 bits for the word, giving 8 leftover bits for the tag.
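A sketch of the set associative split (the helper is ours; it reproduces the 8/3/3 split of Problem 7):

    def set_associative_fields(address_bits, cache_blocks, ways, words_per_block):
        """Return (tag, set, word) widths for an N-way set associative cache."""
        sets = cache_blocks // ways                 # blocks are grouped into sets
        word_bits = words_per_block.bit_length() - 1
        set_bits = sets.bit_length() - 1
        return address_bits - set_bits - word_bits, set_bits, word_bits

    # Problem 7: 14-bit addresses, 16 blocks, 2-way, 8 words per block.
    print(set_associative_fields(14, 16, 2, 8))    # -> (8, 3, 3)

    # A block's set is chosen like a direct-mapped index, but over sets:
    block_number = 0x1AA >> 3                      # strip the 3-bit word field
    print(block_number % 8)                        # set index for reference 0x1AA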

6.4.2 Replacement Policies

With fully associative and set associative cache, a replacement policy is invoked when it becomes necessary to evict a block from cache. An optimal replacement policy would be able to look into the future to see which blocks won't be needed for the longest period of time. Although it is impossible to implement an optimal replacement algorithm, it is instructive to use it as a benchmark for assessing the efficiency of any other scheme we come up with.

The replacement policy that we choose depends upon the locality that we are trying to optimize; usually, we are interested in temporal locality. A least recently used (LRU) algorithm keeps track of the last time that a block was accessed and evicts the block that has been unused for the longest period of time. The disadvantage of this approach is its complexity: LRU has to maintain an access history for each block, which ultimately slows down the cache.

First-in, first-out (FIFO) is a popular cache replacement policy. In FIFO, the block that has been in the cache the longest is replaced, regardless of when it was last used.

A random replacement policy does what its name implies: it picks a block at random and replaces it with a new block. Random replacement can certainly evict a block that will be needed often or needed soon, but it never thrashes.

6.4.3 Effective Access Time and Hit Ratio

The performance of hierarchical memory is measured by its effective access time (EAT). EAT is a weighted average that takes into account the hit ratio and the relative access times of successive levels of memory. The EAT for a two-level memory is given by:

    EAT = H * Access_C + (1 - H) * Access_MM

where H is the cache hit rate, and Access_C and Access_MM are the access times for cache and main memory, respectively.
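In Python the formula is a one-liner; the sketch below (our own helper, with illustrative numbers distinct from the worked example that follows) shows it applied to a hypothetical system:

    def effective_access_time(hit_rate, cache_ns, main_memory_ns):
        """Two-level EAT = H * Access_C + (1 - H) * Access_MM."""
        return hit_rate * cache_ns + (1.0 - hit_rate) * main_memory_ns

    # E.g., a 95% hit rate with a 5 ns cache and a 100 ns main memory:
    print(effective_access_time(0.95, 5, 100))   # -> 9.75 (ns)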

For example, consider a system with a main memory access time of 200 ns supported by a cache having a 10 ns access time and a hit rate of 99%. The EAT is:

    0.99(10 ns) + 0.01(200 ns) = 9.9 ns + 2 ns = 11.9 ns

This equation for determining the effective access time can be extended to any number of memory levels, as we will see in later sections.

Problem 9 on Page 321

6.4.5 Cache Write Policies

Cache replacement policies must also take into account dirty blocks, those blocks that have been updated while they were in the cache. Dirty blocks must be written back to memory; a write policy determines how this is done. There are two types of write policies: write-through and write-back.

Write-through updates cache and main memory simultaneously on every write. Write-back (also called copyback) updates memory only when the block is selected for replacement.

The disadvantage of write-through is that memory must be updated with each cache write, which slows down the access time on updates. This slowdown is usually negligible, because the majority of accesses tend to be reads, not writes.

The advantage of write-back is that memory traffic is minimized; its disadvantage is that memory does not always agree with the value in cache, which causes problems in systems with many concurrent users.
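To see the traffic difference between the two policies, here is a toy, self-contained sketch (our own illustration, not from the slides): a single-block cache that counts memory writes under each policy.

    class OneBlockCache:
        """A one-block cache illustrating write-through vs. write-back."""
        def __init__(self, policy):
            self.policy = policy          # "write-through" or "write-back"
            self.tag, self.data, self.dirty = None, None, False
            self.memory_writes = 0        # traffic to main memory

        def write(self, tag, data):
            if self.tag != tag:           # miss: evict the current block first
                if self.policy == "write-back" and self.dirty:
                    self.memory_writes += 1   # flush the dirty victim to memory
                self.tag, self.dirty = tag, False
            self.data = data
            if self.policy == "write-through":
                self.memory_writes += 1   # every write also goes to memory
            else:
                self.dirty = True         # defer the memory update

    for policy in ("write-through", "write-back"):
        c = OneBlockCache(policy)
        for _ in range(100):
            c.write(tag=7, data=42)       # 100 writes to the same block
        print(policy, "memory writes:", c.memory_writes)
    # write-through: 100 memory writes; write-back: 0 until the block is evicted.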

6.5 Virtual Memory

Cache memory enhances performance by providing faster memory access speed. Virtual memory enhances performance by providing greater memory capacity without the expense of adding main memory. Instead, a portion of a disk drive serves as an extension of main memory. If a system uses paging, virtual memory partitions main memory into individually managed page frames that are written (or paged) to disk when they are not immediately needed.

6.5.1 Paging

A physical address is the actual address in physical memory. Programs create virtual addresses that are mapped to physical addresses by the memory manager. Page faults occur when a logical address requires that a page be brought in from disk. Memory fragmentation occurs when the paging process results in the creation of small, unusable clusters of memory addresses.

Key terms:

- Virtual address: the logical or program address that the process uses. The CPU generates addresses in terms of virtual addresses.
- Physical address: the real address in physical memory.
- Mapping: the mechanism by which virtual addresses are translated into physical ones.
- Page frames: the equal-size blocks into which main memory is divided.
- Pages: the blocks into which virtual memory is divided, each equal in size to a page frame.
- Paging: the process of copying a virtual page from disk to a page frame in main memory.
- Fragmentation: memory that becomes unusable.
- Page fault: an event that occurs when a requested page is not in main memory and must be copied into memory from disk.

Main memory and virtual memory are divided into equal-sized pages. The entire address space required by a process need not be in memory at once; some parts can be on disk while others are in main memory. Further, the pages allocated to a process do not need to be stored contiguously, either on disk or in memory. In this way, only the needed pages are in memory at any time; the unnecessary pages remain in slower disk storage.
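To tie the terms together, here is a minimal Python sketch of address translation under paging. It is entirely illustrative: the page size, the page table contents, and the use of an exception to model a page fault are all our own choices.

    PAGE_SIZE = 4096                 # illustrative page size (2^12 bytes)

    # A toy page table: virtual page number -> physical frame number.
    # A value of None models a page that currently lives on disk.
    page_table = {0: 3, 1: None, 2: 7}

    def translate(virtual_address):
        """Translate a virtual address to a physical one, or raise a page fault."""
        page = virtual_address // PAGE_SIZE     # virtual page number
        offset = virtual_address % PAGE_SIZE    # unchanged by translation
        frame = page_table.get(page)
        if frame is None:
            raise RuntimeError(f"page fault: page {page} must be copied from disk")
        return frame * PAGE_SIZE + offset

    print(hex(translate(0x2010)))    # page 2 -> frame 7: prints 0x7010
    translate(0x1000)                # page 1 is on disk: raises a page fault

In a real system the memory manager would service the fault by paging the block in from disk and updating the page table, rather than raising an error.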
