Decoupling Cores, Kernels, and Operating Systems


Gerd Zellweger, Simon Gerber, Kornilios Kourtis, and Timothy Roscoe, ETH Zürich

chnical-sessions/presentation/zellweger

This paper is included in the Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation. October 6–8, 2014, Broomfield, CO. ISBN 978-1-931971-16-4.

Open access to the Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation is sponsored by USENIX.

Gerd Zellweger, Simon Gerber, Kornilios Kourtis, Timothy Roscoe
Systems Group, Department of Computer Science, ETH Zurich

Abstract

We present Barrelfish/DC, an extension to the Barrelfish OS which decouples physical cores from a native OS kernel, and furthermore the kernel itself from the rest of the OS and application state. In Barrelfish/DC, native kernel code on any core can be quickly replaced, kernel state moved between cores, and cores added and removed from the system transparently to applications and OS processes, which continue to execute.

Barrelfish/DC is a multikernel with two novel ideas: the use of boot drivers to abstract cores as regular devices, and a partitioned capability system for memory management which externalizes core-local kernel state.

We show by performance measurements of real applications and device drivers that the approach is practical enough to be used for a number of purposes, such as online kernel upgrades, and temporarily delivering hard real-time performance by executing a process under a specialized, single-application kernel.

1 Introduction

The hardware landscape is increasingly dynamic. Future machines will contain large numbers of heterogeneous cores which will be powered on and off individually in response to workload changes. Cores themselves will have porous boundaries: some may be dynamically fused or split to provide more energy-efficient computation. Existing OS designs like Linux and Windows assume a static number of homogeneous cores, with recent extensions to allow core hotplugging.

We present Barrelfish/DC, an OS design based on the principle that all cores are fully dynamic. Barrelfish/DC is based on the Barrelfish research OS [5] and exploits the “multikernel” architecture to separate the OS state for each core. We show that Barrelfish/DC can handle dynamic cores more flexibly and with far less overhead than Linux, and also that the approach brings additional benefits in functionality.

A key challenge with dynamic cores is safely disposing of per-core OS state when removing a core from the system: this process takes time and can dominate the hardware latency of powering the core down, reducing any benefit in energy consumption. Barrelfish/DC addresses this challenge by externalizing all the per-core OS and application state of a system into objects called OSnodes, which can be executed lazily on another core. While this general idea has been proposed before (notably, it is used in Chameleon [37] to clean up interrupt state), Barrelfish/DC takes the concept much further in completely decoupling the OSnode from the kernel, and this in turn from the physical core.

While transparent to applications, this new design choice implies additional benefits not seen in prior systems: Barrelfish/DC can completely replace the OS kernel code running on any single core or subset of cores in the system at runtime, without disruption to any other OS or application code, including that running on the core. Kernels can be upgraded or bugs fixed without downtime, or replaced temporarily, for example to enable detailed instrumentation, to change a scheduling algorithm, or to provide a different kind of service such as performance-isolated, hard real-time processing for a bounded period.

Furthermore, per-core OS state can be moved between slow, low-power cores and fast, energy-hungry cores. Multiple cores’ state can be temporarily aggregated onto a single core to further trade-off performance and power, or to dedicate an entire package to running a single job for a limited period. Parts of Barrelfish/DC can be moved onto and off cores optimized for particular workloads. Cores can be fused [26] transparently, and SMT threads [29, 34] or cores sharing functional units [12] can be selectively used for application threads or OS accelerators.

Barrelfish/DC relies on several innovations which form the main contributions of this paper. Barrelfish/DC treats a CPU core as being a special case of a peripheral device, and introduces the concept of a boot driver, which can start, stop, and restart a core while running elsewhere. We

use a partitioned capability system for memory management which allows us to completely externalize all OS state for a core. This in turn permits a kernel to be essentially stateless, and easily replaced while Barrelfish/DC continues to run. We factor the OS into per-core kernels¹ and OSnodes, and a Kernel Control Block provides a kernel-readable handle on the total state of an OSnode.

¹ Barrelfish uses the term CPU driver to refer to the kernel-mode code running on a core. In this paper, we use the term “kernel” instead, to avoid confusion with boot driver.

In the next section, we lay out the recent trends in hardware design and software requirements that motivate the ideas in Barrelfish/DC. Following this, in Section 3 we discuss in more detail the background to our work, and related systems and techniques. In Section 4 we present the design of Barrelfish/DC, in particular the key ideas mentioned above. In Section 5 we show by means of microbenchmarks and real applications (a web server and the PostgreSQL database) that the new functionality of Barrelfish/DC incurs negligible overhead, as well as demonstrating how Barrelfish/DC can provide worst-case execution time guarantees for applications by temporarily isolating cores. Finally, we discuss Barrelfish/DC limitations and future work in Section 6, and conclude in Section 7.

2 Motivation and Background

Barrelfish/DC fully decouples cores from kernels (supervisory programs running in kernel mode), and moreover both of them from the per-core state of the OS as a whole and its associated applications (threads, address spaces, communication channels, etc.). This goes considerably beyond the core hotplug or dynamic core support in today’s OSes. Figure 1 shows the range of primitive kernel operations that Barrelfish/DC supports transparently to applications and without downtime as the system executes:

• A kernel on a core can be rebooted or replaced.
• The per-core OS state can be moved between cores.
• Multiple per-core OS components can be relocated to temporarily “share” a core.

In this section we argue why such functionality will become important in the future, based on recent trends in hardware and software.

2.1 Hardware

It is by now commonplace to remark that core counts, both on a single chip and in a complete system, are increasing, with a corresponding increase in the complexity of the memory system – non-uniform memory access and multiple levels of cache sharing. Systems software, and in particular the OS, must tackle the complex problem of scheduling both OS tasks and those of applications across a number of processors based on memory locality.

At the same time, cores themselves are becoming non-uniform: Asymmetric multicore processors (AMP) [31] mix cores of different microarchitectures (and therefore performance and energy characteristics) on a single processor. A key motivation for this is power reduction for embedded systems like smartphones: under high CPU load, complex, high-performance cores can complete tasks more quickly, resulting in power reduction in other areas of the system. Under light CPU load, however, it is more efficient to run tasks on simple, low-power cores. While migration between cores can be transparent to the OS (as is possible with, e.g., ARM’s “big.LITTLE” AMP architecture) a better solution is for the OS to manage a heterogeneous collection of cores itself, powering individual cores on and off reactively.

Alternatively, Intel’s Turbo Boost feature, which increases the frequency and voltage of a core when others on the same die are sufficiently idle to keep the chip within its thermal envelope, is arguably a dynamic form of AMP [15].

At the same time, hotplug of processors, once the province of specialized machines like the Tandem NonStop systems [6], is becoming more mainstream. More radical proposals for reconfiguring physical processors include Core Fusion [26], whereby multiple independent cores can be morphed into a larger CPU, pooling caches and functional units to improve the performance of sequential programs.

Ultimately, the age of “dark silicon” [21] may well lead to increased core counts, but with a hard limit on the number that may be powered on at any given time. Performance advances and energy savings subsequently will have to derive from specialized hardware for particular workloads or operations [47].

The implications for a future OS are that it must manage a dynamic set of physical cores, and be able to adjust to changes in the number, configuration, and microarchitecture of cores available at runtime, while maintaining a stable execution environment for applications.

2.2 Software

Alongside hardware trends, there is increasing interest in modifying, upgrading, patching, or replacing OS kernels at runtime. Baumann et al. [9] implement dynamic kernel updates in K42, leveraging the object-oriented design of the OS, and later extend this to interface changes using object adapters and lazy update [7]. More recently, Ksplice [3] allows binary patching of Linux kernels without reboot, and works by comparing generated object code and replacing entire functions. Dynamic instrumentation

systems like DTrace [13] provide mechanisms that modify the kernel at run-time to analyze program behavior.

All these systems show that the key challenges in updating an OS online are to maintain critical invariants across the update and to do so with minimal interruption of service (the system should pause, if at all, for a minimal period). This is particularly hard in a multiprocessor kernel with shared state.

In this paper, we argue for addressing all these challenges in a single framework for core and kernel management in the OS, although the structure of Unix-like operating systems presents a barrier to such a unified framework. The rest of this paper describes the unified approach we adopted in Barrelfish/DC.

Figure 1: Supported operations of a decoupled OS. Update: The entire kernel, dispatching OSnode α, is replaced at runtime. Move: OSnode α, containing all per-core state including applications, is migrated to another core and kernel. Park: OSnode α is moved to a new core and kernel that temporarily dispatches two OSnodes. Unpark: OSnode α is transferred back to its previous core.

3 Related work

Our work combines several directions in OS design and implementation: core hotplugging, kernel update and replacement, and multikernel architectures.

3.1 CPU Hotplug

Most modern OS designs today support some form of core hotplug. Since the overriding motivation is reliability, unplugging or plugging a core is considered a rare event and the OS optimizes the common case where the cores are not being hotplugged. For example, Linux CPU hotplug uses the stop_machine() kernel call, which halts application execution on all online CPUs for typically hundreds of milliseconds [23], overhead that increases further when the system is under CPU load [25]. We show further evidence of this cost in Section 5.1 where we compare Linux's CPU hotplug with Barrelfish/DC's core update operations.

Recognizing that processors will be configured much more frequently in the future for reasons of energy usage and performance optimization, Chameleon [37] identifies several bottlenecks in the existing Linux implementation due to global locks, and argues that current OSes are ill-equipped for processor sets that can be reconfigured at runtime. Chameleon extends Linux to provide support for changing the set of processors efficiently at runtime, and a scheduling framework for exploiting this new functionality. Chameleon can perform processor reconfiguration up to 100,000 times faster than Linux 2.6.

Barrelfish/DC is inspired in part by this work, but adopts a very different approach. Where Chameleon targets a single, monolithic shared kernel, Barrelfish/DC adopts a multikernel model and uses the ability to reboot individual kernels one by one to support CPU reconfiguration.

The abstractions provided are accordingly different: Chameleon abstracts hardware processors behind processor proxies and execution objects, in part to handle the problem of per-core state (primarily interrupt handlers) on an offline or de-configured processor. In contrast, Barrelfish/DC abstracts the per-core state (typically much larger in a shared-nothing multikernel than in a shared-memory monolithic kernel) behind OSnode and kernel control block abstractions.
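To make the OSnode abstraction and the four operations of Figure 1 concrete, the following is a minimal toy model in Python. It is only a sketch under stated assumptions: the class and function names (`OSnode`, `Core`, `update`, `move`, `park`, `unpark`) are hypothetical and are not Barrelfish/DC interfaces, a kernel is reduced to a label, and nothing here captures capabilities, the kernel control block, or boot drivers. The only point it illustrates is that, once all per-core state lives in the OSnode, each operation becomes a change in which kernel or core dispatches that OSnode.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OSnode:
    """Hypothetical stand-in for all per-core OS and application state."""
    name: str
    apps: List[str] = field(default_factory=list)


@dataclass
class Core:
    """A physical core running one kernel, modeled as stateless (just a label)."""
    core_id: int
    kernel: str
    osnodes: List[OSnode] = field(default_factory=list)


def update(core: Core, new_kernel: str) -> None:
    """Update: replace the kernel; the OSnodes it dispatches are untouched."""
    core.kernel = new_kernel


def move(node: OSnode, src: Core, dst: Core) -> None:
    """Move: migrate an OSnode to a core that dispatches nothing else."""
    if dst.osnodes:
        raise ValueError("move expects an idle destination; use park instead")
    src.osnodes.remove(node)
    dst.osnodes.append(node)


def park(node: OSnode, src: Core, host: Core) -> None:
    """Park: the host core temporarily dispatches the node next to its own."""
    src.osnodes.remove(node)
    host.osnodes.append(node)


def unpark(node: OSnode, host: Core, home: Core) -> None:
    """Unpark: return a parked OSnode to its previous core."""
    host.osnodes.remove(node)
    home.osnodes.append(node)
```

In this model, parking OSnode α from core 0 onto core 1 leaves core 0 with an empty `osnodes` list, which corresponds to the state in which the paper's design could power the core down, while the applications recorded in `alpha.apps` remain reachable through the host core.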

In a very different approach, Kozuch et al. [30] show how commodity OS hibernation and hotplug facilities can be used to migrate a complete OS between different machines (with different hardware configurations) without virtualization.

Hypervisors are typically capable of simulating hotplugging of CPUs within a virtual machine. Barrelfish/DC can be deployed as a guest OS to manage a variable set of virtual CPUs allocated by the hypervisor. Indeed, Barrelfish/DC addresses a long-standing issue in virtualization: it is hard to fully virtualize the microarchitecture of a processor when VMs might migrate between asymmetric cores or between phy
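For readers unfamiliar with the Linux CPU hotplug facility that Section 3.1 uses as its baseline, the sketch below shows how it is usually driven from user space: writing `0` or `1` to the per-CPU `online` file under `/sys/devices/system/cpu`. The helper names are illustrative and not part of any library. Writing to the real sysfs files requires root privileges, and on many systems `cpu0` has no `online` file because it cannot be taken offline. The `root` parameter is an assumption added here so the helpers can be pointed at a scratch directory.

```python
from pathlib import Path

# Standard Linux sysfs location for per-CPU control files.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def online_file(cpu: int, root: Path = SYSFS_CPU_ROOT) -> Path:
    """Return the path of the 'online' control file for the given CPU."""
    return root / f"cpu{cpu}" / "online"


def set_cpu_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> None:
    """Request that the kernel bring a CPU online (True) or offline (False)."""
    online_file(cpu, root).write_text("1" if online else "0")


def is_cpu_online(cpu: int, root: Path = SYSFS_CPU_ROOT) -> bool:
    """Read back whether the kernel reports the CPU as online."""
    return online_file(cpu, root).read_text().strip() == "1"
```

Timing a call such as `set_cpu_online(1, False)` followed by `set_cpu_online(1, True)` on a loaded machine is one simple way to observe the kind of hotplug latency the paper refers to, although the figures quoted in the text come from the cited measurements, not from this sketch.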
