Android Platform Optimizations SNPS 20111027


Android Platform Optimizations
ELC-Europe, Prague, October 2011
Ruud Derwig
Synopsys 2011

Helping Design the Chips Inside
Mobile Multimedia, Medical, Digital Home, Automotive, Data Center & Networking, Industrial, Computing & Peripherals, Military / Aerospace, Other

Agenda
Market & value drivers
What to optimize?
How to optimize?
Results & conclusion

Android Markets
Smartphones
Tablets
TV
STB / multimedia
Others / new

Key Value Drivers & System Architecture Choices
Power consumption
    optimize performance / mW
Product cost
    optimize performance / area
    optimize development efficiency
Hardware – Software trade-offs
– Maximum flexibility & developer efficiency: “virtual everything”
    PC model, multi-GHz SMP processor-centric designs
– Minimal power & optimal performance: “dedicated hardware”
    dedicated, fixed-function device
– Sweet spot: “heterogeneous, HW accelerated multi-core”
    mix of CPU, DSP, and dedicated HW

Agenda
Market & value drivers
What to optimize?
How to optimize?
Results & conclusion
[Figure: SoC block diagram. Recoverable labels: DDR PHY and controller, SATA and Ethernet controllers, SATA PHY, XAUI PHY, I2C, GPIO, UART, datapath, audio codec, video front end, ARC audio processor, ARC video processor, AMBA 3 AXI & AMBA 2.0 AHB, AMBA APB, ADCs, DACs, logic libraries, MIPI DigRF, CSI and DSI controllers, SD/MMC]

Linux Kernel & Library Optimizations
Important, but not Android specific
Optimization options
– Optimize hotspots
    compiler
    handwritten assembler
– CPU hardware optimizations
    MMU
    special instructions

Dalvik Virtual Machine
“Java” * virtual machine
– Register-based architecture (Java VMs are stack machines). Dalvik registers are typically stored in memory (on the stack, like local variables in C).
– Own bytecode
Three virtual machine implementations
– Portable: completely C-based, in fact one large switch{} statement with a case x: for every Dalvik opcode.
– Fast (a.k.a. MTERP): assembly-coded handlers for every Dalvik opcode, aligned on 64-byte addresses so that the address of the handler can be calculated directly from the opcode, saving a lookup.
– JIT: just-in-time compiler. It starts as the fast/mterp interpreter, but identifies ‘hot’ traces and passes these to the compiler thread.
* Dalvik is a clean-room implementation of Java for copyright reasons. The syntax is similar.
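To make the "one large switch" idea of the portable interpreter concrete, here is a minimal sketch in C. It is not Dalvik code: the opcodes (OP_ADD, OP_SUB, OP_HALT), the Insn layout, and the helper run_example are hypothetical and chosen only for illustration. As on the slide, the virtual registers are an ordinary array in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-operand opcodes; real Dalvik opcodes and encodings differ. */
enum { OP_ADD = 0, OP_SUB = 1, OP_HALT = 2 };

#define NUM_VREGS 16

typedef struct {
    uint8_t op;      /* opcode */
    uint8_t a, b, c; /* virtual register indices: a = b <op> c */
} Insn;

/* Portable-style interpreter: a single switch with one case per opcode.
 * The virtual registers live in memory (the vregs array). */
void interpret(const Insn *code, int32_t *vregs)
{
    for (size_t pc = 0;; pc++) {
        const Insn *i = &code[pc];
        assert(i->a < NUM_VREGS && i->b < NUM_VREGS && i->c < NUM_VREGS);
        switch (i->op) {
        case OP_ADD:
            vregs[i->a] = vregs[i->b] + vregs[i->c];
            break;
        case OP_SUB:
            vregs[i->a] = vregs[i->b] - vregs[i->c];
            break;
        case OP_HALT:
        default:
            return; /* stop on halt or on an unknown opcode */
        }
    }
}

/* Illustrative helper: computes (x + y) - z with a three-instruction program. */
int32_t run_example(int32_t x, int32_t y, int32_t z)
{
    int32_t v[NUM_VREGS] = { 0 };
    const Insn prog[] = {
        { OP_ADD, 0, 1, 2 },  /* v0 = v1 + v2 */
        { OP_SUB, 0, 0, 3 },  /* v0 = v0 - v3 */
        { OP_HALT, 0, 0, 0 },
    };
    v[1] = x;
    v[2] = y;
    v[3] = z;
    interpret(prog, v);
    return v[0];
}
```

Every bytecode goes through the same switch, so each one pays for an opcode fetch plus a dispatch branch. That dispatch cost is what the mterp and JIT variants try to reduce.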

Android Media Player Architecture
[Figure: layered stack. Java: Media Player App, Media Player App Framework. Linux user space: Media Player Service, AudioFlinger, other audio driver. Linux kernel space: ALSA kernel driver.]

Android Media Player Architecture
Google’s player of choice is the Stagefright multi-format A/V player, newly developed for Android
Simple fixed graph: selects demuxer and decoder based on file extension
Alternatives exist
– GStreamer based
– proprietary / legacy
[Figure: same layered stack as the previous slide, with the Media Player Service expanded into players (Stagefright, MIDI Player, Vorbis Player) and the Stagefright graph: Reader (read from source) -> Demuxer (parse container format) -> Video Decoding -> Video Rendering (SurfaceFlinger), and Audio Decoding -> Audio Rendering (AudioFlinger)]

Audio Optimization Option: off-load audio processing to DSP
[Figure: on the GP-CPU, Reader (read from source) -> Demuxer (parse container format) -> Audio Decoding -> Audio Rendering (AudioFlinger); Audio Decoding is connected through a Control API to the audio DSP (ARC Sound Processor), which runs Audio Decoding and Post Processing (SonicFocus)]

Android Graphics - Architecture
2D
– Canvas/Skia
– OpenVG
3D
– OpenGL-ES 1.x
– OpenGL-ES 2
Renderscript
– Expose native GPU/SMP to (portable) applications
– C99 -> LLVM intermediate bitcode -> machine code
[Figure: Application on top of Canvas, Skia, RenderScript, OpenGL, Surface]

Android Graphics
[Figure: recoverable labels: SurfaceFlinger, OpenGL, GPU, PixelFlinger, 2D blitter]

Graphics Optimization Options
Graphics drawing/rendering
– Software/assembler optimization
    Skia, PixelFlinger
– Hardware acceleration
    GPU (OpenGL-ES 2)
    2D accelerator (OpenVG compatible or other)
    Memory architecture, caching
– Renderscript
Surface Composition
– Scaling, colorspace conversion
    Custom instructions
    GPU
    Dedicated hardware acceleration (bitblit)

Agenda
Market & value drivers
What to optimize?
How to optimize?
Results & conclusion

Optimized DesignWare ARC Android
Full port of the Android Froyo/Gingerbread release to the ARC processor architecture and build environment
Including NDK and SDK to support Android application building/porting
Google/OHA Compatibility Test Suite tested
[Figure: optimized components in the stack: ARC Sound Audio DSP (work in progress), ARC Optimized pixelflinger, ARC Optimized V8 JavaScript engine, ARC Optimized Dalvik VM (work in progress), ARC Optimized bionic C library, ARC Linux kernel and Drivers]

Differences between VM Implementations

Portable
    switch (opcode) {
    case add: a = b + c; break;
    case sub: a = b - c; break;
    ...
    }

    ld r0, [b]
    ld r1, [c]
    add r0, r0, r1
    st r0, [a]
    ld r0, [next opcode]
    (pipeline stall)
    ld.as r1, [jump table, r0]
    (pipeline stall)
    j [r1]

MTerp
    ld r0, [b]
    ld r1, [c]
    add r0, r0, r1
    st r0, [a]
    ld r0, [next opcode]
    asl r0, r0, 6
    add r0, r13, r0
    j [r0]

JIT
    add r20, r20, r21
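The difference between the two dispatch sequences comes down to how the handler address is obtained. The sketch below restates it in C under stated assumptions: handlers are spaced 64 bytes apart (the shift by 6 in the slide's asl instruction), handler_base plays the role of r13, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HANDLER_SHIFT 6u /* handlers are 1 << 6 = 64 bytes apart */

/* mterp-style dispatch: the handler address is computed from the opcode
 * with a shift and an add, so no memory access is needed. */
uintptr_t mterp_handler_address(uintptr_t handler_base, uint8_t opcode)
{
    return handler_base + ((uintptr_t)opcode << HANDLER_SHIFT);
}

/* Switch/jump-table dispatch: the handler address is loaded from a table,
 * which costs an extra memory access before the jump can be taken. */
uintptr_t table_handler_address(const uintptr_t *jump_table, uint8_t opcode)
{
    assert(jump_table != NULL);
    return jump_table[opcode];
}
```

With a base of 0x1000, opcode 3 maps to 0x1000 + 3 * 64 = 0x10C0. The price of the computed form is that every handler must fit in, or branch out of, its 64-byte slot.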

[Figure: reused from Google I/O presentation]

Register- and Stack-based VMs
Example: a = b + c

Java (bytecode)
    iload b
    iload c
    iadd
    istore a
Java (executed as)
    ld r0, [b]
    push r0
    ld r0, [c]
    push r0
    pop r0
    pop r1
    add r0, r0, r1
    push r0
    pop r0
    st r0, [a]

Dalvik (bytecode)
    add-int a, b, c
Dalvik (executed as)
    ld r0, [b]
    ld r1, [c]
    add r0, r0, r1
    st r0, [a]

Dalvik for ARC (bytecode)
    add-int a, b, c
Dalvik for ARC (executed as)
    add r20, r21, r22
    Registers are only saved/restored when changing stack frames or when moving to the interpreter.
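The bytecode counts in this example can be reproduced with two toy interpreters. This is a sketch only: the opcode names, instruction layouts, and the example_stack / example_register helpers are hypothetical and do not mirror the real Java or Dalvik encodings. Each interpreter returns how many bytecodes it dispatched for a = b + c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stack-machine opcodes (Java-like). */
enum { S_ILOAD, S_IADD, S_ISTORE, S_END };
typedef struct { uint8_t op; uint8_t var; } StackInsn;

/* Hypothetical register-machine opcodes (Dalvik-like). */
enum { R_ADD_INT, R_END };
typedef struct { uint8_t op, a, b, c; } RegInsn;

typedef struct { int32_t a; int dispatched; } Result;

/* Stack VM: operands travel through an explicit operand stack. */
int run_stack(const StackInsn *code, int32_t *vars)
{
    int32_t stack[8];
    int sp = 0, dispatched = 0;
    for (;; code++) {
        switch (code->op) {
        case S_ILOAD:  assert(sp < 8);  stack[sp++] = vars[code->var]; break;
        case S_IADD:   assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; break;
        case S_ISTORE: assert(sp >= 1); vars[code->var] = stack[--sp]; break;
        default:       return dispatched; /* S_END or unknown opcode */
        }
        dispatched++;
    }
}

/* Register VM: one three-operand instruction names its operands directly. */
int run_register(const RegInsn *code, int32_t *vars)
{
    int dispatched = 0;
    for (;; code++) {
        if (code->op != R_ADD_INT)
            return dispatched;
        vars[code->a] = vars[code->b] + vars[code->c];
        dispatched++;
    }
}

/* vars[0] = a, vars[1] = b, vars[2] = c in both helpers. */
Result example_stack(int32_t b, int32_t c)
{
    int32_t vars[3] = { 0, b, c };
    const StackInsn prog[] = {
        { S_ILOAD, 1 }, { S_ILOAD, 2 }, { S_IADD, 0 }, { S_ISTORE, 0 }, { S_END, 0 },
    };
    Result r;
    r.dispatched = run_stack(prog, vars);
    r.a = vars[0];
    return r;
}

Result example_register(int32_t b, int32_t c)
{
    int32_t vars[3] = { 0, b, c };
    const RegInsn prog[] = { { R_ADD_INT, 0, 1, 2 }, { R_END, 0, 0, 0 } };
    Result r;
    r.dispatched = run_register(prog, vars);
    r.a = vars[0];
    return r;
}
```

Both helpers compute the same sum, but the stack version dispatches four bytecodes and the register version dispatches one, matching the Java and Dalvik bytecode listings in the example.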

Audio Processing on DSP
[Figure: on the GP-CPU, Reader (read from source) -> Demuxer (parse container format) -> Audio Decoding -> Audio Rendering (AudioFlinger); Audio Decoding is connected through a Control API to the audio DSP (ARC Sound Processor), which runs Audio Decoding and Post Processing (SonicFocus)]
Audio decoding and post-processing off-loaded to ARC Sound Processor
Special host Audio Decoder implementation that takes care of off-loading while keeping the standard host decoder interfaces, so integration is seamless
Post-processing control through Renderer on host (special Renderer or Renderer plug-in component)

MSF Media Streaming Framework
ARC DSP optimized, lightweight streaming framework
MQX: real-time operating system
RPC/IPC: Remote Procedure Call / Inter Processor Communication
[Figure: Android host processor (Android (Linux) OS): Media Player App, Media Player App Framework, StageFright Player (Control API) with Reader, Demuxer, Stagefright Decoder Wrapper, Renderer, post-processing “plug-in”, remote-d Control API, remote-d MSF. ARC audio processor(s) - AS2xx (MQX OS): MSF framework with Decoder and SonicFocus behind a Control API. The two sides exchange audio data and control over RPC/IPC.]

Android & Audio APIs
Stagefright supports 2 types of interfaces
– OpenMax-IL: for re-use of OMX components
– Stagefright codec interface: for native Stagefright codecs
AudioFlinger uses dedicated interfaces
– a standard implementation using “ALSA” exists
– developments ongoing (?) to support OpenSL-ES, a Khronos standard (like OMX)
SNPS API choice not yet made
– OMX-IL pro: open standard
– OMX-IL con: efficiency, complexity: standard by committee
– Stagefright pro: efficient integration with Stagefright
– Stagefright con: not an open standard, no deep tunneling
[Figure: Stagefright with three OMX2MSF components, each connected over IPC to a codec]

Alternative: GStreamer
GStreamer Android Player
– see e.g. ELC-E 2010 presentation
“The goal of the project is to both allow hardware makers to standardize on GStreamer across their software platforms, but also to make the advanced functionality of GStreamer available on the Android platform, like video editing, DLNA Support and Video conferencing.”

GStreamer DSP Off-loading with “Deep Tunneling”
GStreamer-MSF integration makes heterogeneous multi-core SW development transparent to the user
Instantiation of GStreamer element -> instantiation of module on one of the ARC cores
Creation of link -> local connection or core-crossing connection between modules
[Figure: dataflow graph instantiated on Audio Subsystem. GStreamer elements (G) on the host CPU with src and sink; modules (M) on ARC core #0 and ARC core #1; connections: Host-ARC streaming via FIFO driver, ARC-ARC streaming via FIFO, local streaming within a core]

GStreamer Deep Tunneling

ARC HW Extensions
ARC EIA (Extension Interface Automation):
– supports user defined custom instructions
– accelerates typical Dalvik (Java VM) and pixelflinger (2D GUI) instruction sequences
Prefetcher:
– Eliminates pipeline stalls in high latency memory environments
– L2 not required in this case
Configurable depending on application
[Figure: ARC750D with extension instructions and prefetch unit on the AXI bus, display driver to VGA / HDMI / DSI, DDR memory (including frame buffer); chart of CPU area, excluding memories: CPU, FPU, prefetcher, extra instructions]

Leveraging the ARC EIA Capabilities
Example: Colour Space Conversion
8 operations are required for a conversion from ABGR8888 to RGB565. These can be combined into one single EIA instruction.

Original sequence (operation as annotated on the slide, operands as shown)
    ld.ab        r1, [r4, 0x4]
    mask         r2, r1, 0xf8
    shift left   r2, r2, 8
    mask         r3, r1, 0xfc00
    shift right  r11, r3, 5
    bitwise or   r2, r2, r11
    mask         r3, r1, 0xf80000
    shift right  r11, r3, 19
    bitwise or   r2, r2, r11
    stw.ab       r2, [r5, 0x2]

With EIA instruction
    ld.ab        r1, [r4, 0x4]
    upk8         r2, r1, r6
    stw.ab       r2, [r5, 0x2]

[Figure: bit-field diagram showing bits 7..3 of the first channel, bits 7..2 of the second, and bits 7..3 of the third being masked, shifted, and OR-ed into a 16-bit RGB565 value]
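For reference, the eight mask, shift, and OR operations of the original sequence can be written in C as follows. The masks and shift counts are the operands shown on the slide; the function names abgr8888_to_rgb565 and convert_line are hypothetical. This is a plain software version, not the EIA instruction itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one 32-bit ABGR8888 pixel (R in the low byte, A in the high byte)
 * to 16-bit RGB565, using the same masks and shifts as the slide. */
uint16_t abgr8888_to_rgb565(uint32_t p)
{
    uint32_t r = (p & 0x000000f8u) << 8;  /* R bits 7..3   -> bits 15..11 */
    uint32_t g = (p & 0x0000fc00u) >> 5;  /* G bits 15..10 -> bits 10..5  */
    uint32_t b = (p & 0x00f80000u) >> 19; /* B bits 23..19 -> bits 4..0   */
    return (uint16_t)(r | g | b);
}

/* Convert n pixels, reading 4 bytes and writing 2 bytes per pixel, as the
 * post-incrementing load/store pair (ld.ab / stw.ab) does in the listing. */
void convert_line(const uint32_t *src, uint16_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = abgr8888_to_rgb565(src[i]);
}
```

Alpha is discarded and the low bits of each channel are truncated, so the conversion is lossy. A custom instruction collapses the three mask/shift pairs and two ORs into one operation per pixel, which is where the cycle savings in an inner loop like convert_line would come from.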

Agenda
Market & value drivers
What to optimize?
How to optimize?
Results & conclusion

Optimizing Dalvik VM

Optimizing Dalvik VM
Dalvik JIT Optimization: Relative Performance Compared to Interpreter Performance
CoreMark: x 4.9 and going
CaffeineMark: x 11.3 and going
[Chart residue, values as extracted: 20%; Without L2 cache; 1,94,9/MHz3790/mW1435/MHz/mm2]
Measurements are done on a 50 MHz FPGA
Results are without performance gains from hardware extensions

Optimizing Hardware: Custom Instructions & Prefetching
[Figure: execution time (ARC cycles, 0 to 70000) versus memory latency (0, 1, 2, 4, 8, 16, 32, 64, 128 cycles) for four configurations: original code without prefetching; code with Android extensions without prefetching; original code with prefetching; code with Android extensions with prefetching]

Linux kernel ARC HW optimizations

Conclusions
There are more markets for Android than high-end smartphones
There are more optimizations possible than relying on Moore’s law for GHz multi-cores
Optimize performance / mW & performance / area
Sweet spot: “heterogeneous, HW accelerated multi-core”
– Mix of CPU, DSP, and dedicated HW
– Highly optimized platform infrastructure SW hides heterogeneous complexities
‘Simple’ ARC processor with SW optimized Dalvik VM performs equal to or better than others, thanks to careful SW optimizations and the use of simple HW acceleration
– Custom instructions tailored for specific tasks
– Prefetcher instead of a general-purpose 2nd-level cache
– DSP more efficient in audio processing than CPU

Fast Forward to Predictable Success
