

Transitioning from ARM v7 to ARM v8: All the basics you must know.

Mario Smarduch
Senior Virtualization Architect
Open Source Group
Samsung Research America (Silicon Valley)
m.smarduch@samsung.com

2014 SAMSUNG Electronics Co.

Registers - ARMv7 (no VFP/SIMD)

ARMv7, ARM instructions, 32-bit.

Register file:
- r0-r3 (a1-a4)
- r4 (v1), r5 (v2), r6 (v3), r7 (v4), r8 (v5), r9 (v6), r10 (v7), r11 (v8): variable regs, platform reg.; subroutine must preserve v1-v8
- r12 (ip): inter-procedure-call scratch reg, linker scratch reg; practically used as a regular register
- r13 (sp): stack pointer
- r14 (lr): link reg, bl saves return addr
- r15 (pc): program counter
- CPSR
- Banked: SP_irq, LR_irq, SPSR_irq, SP_svc, LR_svc, SPSR_svc

Procedure call:

    a( . )
        push {r6, r7, r8, lr}
        sub  sp, #48
        add  r7, sp, #8          ; frame pointer
        ; mov or push stack arguments
        mov  r3, arg
        str  r3, [r7, #4]
        .
        bl   a
        ; adjust sp
        mov  sp, r7
        pop  {r6, r7, r8, pc}

    b( . )
        .

- 'pc' is an accessible register; lr is popped into it
- pop/push is an alias of 'ldm/stm'
- Register set: 12 usable regs, 32 bit
- More cache activity, affects power

Exception handling:

    vector_irq:
        ; use IRQ stack temporarily
        ; LR = user RA, SPSR = user CPSR
        - save LR, SPSR; r0 = ptr to irq stack
        ; change modes to SVC mode
        - set spsr mode to SVC mode
        ; movs in this context changes modes
        movs pc, lr

    irq_usr:    ; svc mode
        - get LR, SPSR from irq stack, save to SVC stack
        - get user lr, sp (use { }^), save to SVC stack
        - dispatch IRQ handler
        ; restore user context
        - restore r0-r12, {sp, lr}^
        - restore user cpsr from spsr saved in irq mode
        - restore lr from the one saved in irq mode
        movs pc, lr

- In irq mode most handlers step off into SVC
- Irq mode can't take nested interrupts: LR would be wiped out
- Except HYP mode: ELR_hyp

CPSR layout:

    31 N | 30 Z | 29 C | 28 V | 27 Q | 26:25 IT[1:0] | 24 J | 23:20 RESERVED | 19:16 GE[3:0] | 15:10 IT[7:2] | 9 E | 8 A | 7 I | 6 F | 5 T | 4:0 M[4:0]

- N, Z, C, V: arith/cond flags (APSR)
- IT: if-then, Thumb only
- J, T: instruction mode ARM, Thumb, Jazelle, Jazelle-RCT
- E: endianness BE/LE; setend le/be, then do a load
- A: enable/disable async faults (memory errors)
- I, F: irq, fiq mask bits
- Access: mrs r0, cpsr, or msr cpsr_c, r0

Modes (M[4:0]):
- Usr 10000
- Fiq 10001
- Irq 10010
- Svc 10011
- Mon 10110
- Abt 10111
- Hyp 11010 (hyp mode)
- Und 11011
- Sys 11111
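To make the CPSR layout and mode encodings from this slide concrete, here is a minimal Python sketch that splits a 32-bit CPSR value into a few of its fields. The function and table names (`decode_cpsr`, `CPSR_MODES`) are hypothetical and chosen only for illustration; the bit positions are the standard ARMv7 CPSR positions.

```python
# Mode encodings as listed on the slide (CPSR M[4:0]).
CPSR_MODES = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10110: "mon",
    0b10111: "abt",
    0b11010: "hyp",
    0b11011: "und",
    0b11111: "sys",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into flags, mask bits and mode name."""
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "I": (cpsr >> 7) & 1,  # IRQ mask bit
        "F": (cpsr >> 6) & 1,  # FIQ mask bit
        "T": (cpsr >> 5) & 1,  # Thumb state bit
        # Encodings not in the table are reported as "reserved".
        "mode": CPSR_MODES.get(cpsr & 0x1F, "reserved"),
    }
```

For example, `decode_cpsr(0x600000D3)` reports Z and C set, IRQ and FIQ masked, and mode "svc".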

Registers - ARMv8 (no VFP/SIMD)

Register file (64-bit): x0, x1, x2, ... x28, x29 (fp), x30 (lr); 32-bit views w0, w1, w2, ...; plus ELR and PState.

Procedure call:

    a( . )
        ; push LR, FP
        stp  x29, x30, [sp, #-48]!   ; pre-index
        ; set FP to current frame
        mov  x29, sp                 ; frame pointer
        ; pass arguments
        mov  w0, arg
        .
        bl   b
        ldp  x29, x30, [sp], #48     ; post-index
        ; return using LR
        ret

    b( . )
        .

Figure: a() stack frame and b() stack frame, each holding x30 (lr) and x29 (fp).

- Registers are 64 bit; the 32-bit variant is 'wx'
- 'pc' is no longer part of the register file
- store/load pair is used for stack ops (16 bytes)
- FP is used to keep track of the stack frame (x29)
- 'sp' can be the stack pointer or the zero reg 'wzr', depending on context
- Exception LR (ELR) per each priv. level
- Register set: way more usable regs, fewer cache references, lowers power

PState: NZCV, DAIF, SPSel, CurrentEL
- State accessible in aarch64 at run-time
- Equivalent of CPSR in ARMv7
- NZCV: arith/cond flags
- DAIF: Debug, Async, Irq, Fiq masks
- SPSel: rare use
- CurrentEL: User, Kernel, HYP, Secure
- PState fields are more like registers; access a field directly, e.g. set the I mask bit to disable IRQ: msr DAIFSet, #2
- BUT PState is really also ELR_ELx, SP_ELx, SPSR_ELx, SPSR_abt, SPSR_fiq, SPSR_und, SPSR_irq (later on in exception model), VFP/SIMD, Debug

Exception handling (exception taken at EL0, EL0 to EL1; uses sp_el0, sp_el1, elr_el1, spsr_el1):

    ; look up cause; unlike ARMv7, cause/vector are not 1:1
    mrs  x1, esr_el1
    ; determine reason
    b.eq dabort
    b.eq pabort
    .
    ; restore SPSR, LR from spsr_el1, elr_el1
    eret

- Makes sense after the exception model
- 'modes' collapsed to exception levels
- Each exception level has banked sp, elr, spsr, as well as an exception syndrome register
- Save state, process exception, restore SPSR, LR, ERET
- No hidden meanings for the 's' bit, i.e. movs pc, .
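The `msr DAIFSet, #2` example on this slide can be modelled with a few lines of Python. This is only a sketch of the bit arithmetic, not of the instruction's privilege checks; the helper names are hypothetical. It assumes the standard AArch64 layout, where D, A, I, F sit at bits 9, 8, 7, 6 of the DAIF view and the 4-bit immediate selects them in that order. `DAIFClr` is the matching clear form of the same instruction.

```python
# F is bit 6, I is bit 7, A is bit 8, D is bit 9 in the DAIF view of PSTATE.
DAIF_SHIFT = 6


def daif_set(daif, imm4):
    """Model of `msr DAIFSet, #imm4`: set the mask bits selected by imm4."""
    return daif | ((imm4 & 0xF) << DAIF_SHIFT)


def daif_clr(daif, imm4):
    """Model of `msr DAIFClr, #imm4`: clear the mask bits selected by imm4."""
    return daif & ~((imm4 & 0xF) << DAIF_SHIFT)


def irq_masked(daif):
    """True when the I bit (bit 7) is set, i.e. IRQs are masked."""
    return bool(daif & (1 << 7))
```

With these helpers, `daif_set(0, 2)` sets only the I bit, which is why `#2` masks IRQs and leaves FIQ, SError and debug masks unchanged.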

Exception Model

v8 simplified the exception model vs v7.

ARMv7: State, Privilege, Security level are confusing
- Several modes: usr, irq, fiq, svc, und, sys (also hyp, mon)
  o Each had its own stack and banked registers, and was only briefly used
  o Also instruction state (J, T); ARMv8 only arm64
- Privilege: scattered over various states; usr is 0, system is there to run privileged threads
- Security level (TrustZone)
  o Secure apps by default have higher privilege

ARMv8 (aarch64 Exception Model)
- Simplified to Exception levels; higher values mean higher privilege
- No modes; an SP, LR, SPSR per each exception level
- PState CurrentEL gives the Exception Level

Exception levels and AArch32 modes:
- EL0: usr (PL0)
- EL1: svc/abt/und/fiq/irq/sys (PL1)
- EL2 (EL2h, EL2t): Hyp (PL2)
- EL3 (EL3h, EL3t): mon/svc/abt/und/fiq/irq/sys (PL3)

Exception Model - v7/v8 vectors

With the v8 exception model the vectors change.

V7 vectors: entry per mode. Non-secure exception table (tables differ depending on HYP, Monitor, Secure, Non-Secure):

    Offset  Exception        State saved on entry
    0x00    Reset
    0x04    Undef instr.     Exc. addr -> LR_und, SP_und -> SP, CPSR -> SPSR_und; user SP & LR via {SP, LR}^
    0x08    call (SVC)       Exc. addr -> LR_svc, SP_svc -> SP, CPSR -> SPSR_svc; user SP & LR via {SP, LR}^
    0x0C    prefetch abort   Exc. addr -> LR_abt, SP_abt -> SP, CPSR -> SPSR_abt; user SP & LR via {SP, LR}^
    0x10    data abort       Exc. addr -> LR_abt, SP_abt -> SP, CPSR -> SPSR_abt; user SP & LR via {SP, LR}^
    0x14    -
    0x18    IRQ              Exc. addr -> LR_irq, SP_irq -> SP, CPSR -> SPSR_irq; user SP & LR via {SP, LR}^
    0x1C    FIQ

V8 vectors: exception entry has several possibilities
- Executing AArch32 at a lower level, taken in a higher level in AArch64 (i.e. virtualization, secure mode)
- Executing AArch64 at a lower level, taken in a higher level in AArch64 (i.e. virtualization, secure mode)
- Exception taken to the same level
- Advantage: quicker dispatch and handling, table per each level

Vector table groups: execution was in lower aarch32 priv. level; execution was in lower aarch64 priv. level; execution was in same mode, taken on SP_ELx.
Entry types in each group: Synchronous, IRQ/vIRQ, FIQ/vFIQ, SError/vSError.
Example offsets: 0x600 Synchronous, 0x680 IRQ/vIRQ.
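The two offsets that appear on this slide (0x600 and 0x680) follow from the regular layout of the AArch64 vector table: four groups of four entries, each entry 0x80 bytes apart. The sketch below assumes that standard layout from the ARMv8 architecture; the group and entry names, and the function `vector_offset`, are labels invented here for illustration.

```python
# Order of the four groups in the table; each group spans 0x200 bytes.
SOURCES = [
    "current_el_sp0",    # same level, running on SP_EL0
    "current_el_spx",    # same level, running on SP_ELx
    "lower_el_aarch64",  # taken from a lower level that was in AArch64
    "lower_el_aarch32",  # taken from a lower level that was in AArch32
]

# Order of the four entry types inside a group; each entry spans 0x80 bytes.
KINDS = ["sync", "irq", "fiq", "serror"]


def vector_offset(source, kind):
    """Offset of a vector entry from the table base (VBAR_ELx)."""
    return SOURCES.index(source) * 0x200 + KINDS.index(kind) * 0x80
```

For instance, `vector_offset("lower_el_aarch32", "irq")` evaluates to 0x680, matching the IRQ/vIRQ entry shown on the slide.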

Exceptions - accessing previous state

Additional PState:
- Into EL1 (from EL0): ELR_EL1, SPSR_EL1, SPSR_{abt, fiq, irq, und}; SP_EL1 is used in EL1 mode
- Into EL2: ELR_EL2, SPSR_EL2, SP_EL2
- Also secure mode

Two versions of SPSR: one for aarch32, the other for aarch64.

Aarch32:

    31 N | 30 Z | 29 C | 28 V | 27 Q | 26:25 IT[1:0] | 24 J | 23:20 RESERVED | 19:16 GE[3:0] | 15:10 IT[7:2] | 9 E | 8 A | 7 I | 6 F | 5 T | 4 = 1 | 3:0 M[3:0]

- T, J tell you what instruction set was running, how to restore the guest, and what registers are valid

Aarch64 (resembles run-time PState):

    31 N | 30 Z | 29 C | 28 V | 21 SS | 20 IL | 9 D | 8 A | 7 I | 6 F | 4 = 0 | 3:0 M[3:0]

M[3:0]:
- bits 3:2 hold the EL; bit 1 is 0; bit 0 selects SP_ELx or SP_EL0 (what mode you came from)
- 0x0: EL0, running on SP_EL0
- 0x5: EL1 running
- 0x9: EL2 running
- There is also EL3 for secure mode
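The M[3:0] encoding described on this slide can be checked with a short decoder. This is a minimal sketch: `decode_spsr_mode` is a hypothetical helper name, and the only assumption beyond the slide is that bit 4 distinguishes an AArch32 origin (1) from an AArch64 origin (0), as the two layouts above indicate.

```python
def decode_spsr_mode(spsr):
    """Decode where an exception came from, using the low bits of SPSR_ELx."""
    m = spsr & 0xF
    if (spsr >> 4) & 1:
        # Bit 4 set: the interrupted code was AArch32; M[3:0] is the v7 mode.
        return {"state": "aarch32", "mode_bits": m}
    return {
        "state": "aarch64",
        "el": (m >> 2) & 0x3,                    # bits 3:2 hold the EL
        "sp": "SP_ELx" if m & 1 else "SP_EL0",   # bit 0 selects the stack pointer
    }
```

Decoding the three values listed on the slide gives EL0 on SP_EL0 for 0x0, EL1 on SP_ELx for 0x5, and EL2 on SP_ELx for 0x9.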

Exceptions - accessing aarch32 state

- If you trap from aarch32 into aarch64 you need the aarch32 register state
- Bit size changes only through exceptions

Figure: AArch32 registers (r9_usr, r10_usr, r11_usr, r12_usr, ... r14_usr, r14_fiq) as seen through AArch64 registers (x16, x17, x18, x19, x20, ...) by an AArch64 hypervisor or a 64-bit OS.

Figure: software stack with Application (32-bit or 64-bit), Operating System (32-bit or 64-bit) at EL1, Hypervisor (64-bit); going down to lower levels the bit width decreases or remains the same.

Instruction Set

In addition to ARM, ARMv7 supports:
- Thumb2: mix of thumb/arm instructions, compact code
- Jazelle DBX (direct byte execution)
  - Some or no Java byte codes implemented, needs a custom JVM
  - JVM uses BXJ to enter Jazelle; software assists for unimplemented byte codes
- Jazelle RCT (run time compilation target)
  - Optimized for JIT/JVM; instruction set aids the language (i.e. array boundary checks)
  - After byte codes are compiled, ENTERX/LEAVEX enters/leaves ThumbEE

ARMv8 does not support these modes, but you can run in ARMv7 mode on an ARMv8 cpu.

Contrasting Instructions
- ARMv8: new clean instruction set
- Predication removed, i.e. moveq r1, r2
- movs on exceptions: not applicable in ARMv8
- Removal of coprocessors, for example dccmvac: v7 mcr p15,0,Rt,c7,c10,1; v8 dc cvac
- GIC (general interrupt controller): registers for CPU, Dist., Redistr.; not memory mapped
- Stack: armv8 ldp/stp, min. 16 bytes, 2 regs per push/pop, 16 byte aligned
- strex/ldrex: lockless semaphores, local/global monitor; has been around in v7
- No bus locking swp

MMU - ARMv7

ARMv7 Rev C introduced LPAE, the precursor to ARMv8.
- 1st level pages: 32-bit input VA (2^32), output up to 40 bit PA range
- For 2nd stage (virtualization, some more later) the input range is 2^40
- For an app developer not much difference, except you may support larger DBs for example
- You can still run the old VMSA table format, i.e. supersections
- For applications the usual 3GB/1GB split
- TTBR0, TTBR1; TTBCR controls size of tables, inner/outer page table cacheability
- TTBR1 for kernel, TTBR0 for user (context switched)

Figure: LPAE table walk.
- 1st stage or only stage: 32-bit input; 2nd stage: up to 512GB
- Top table: size 4KB, covers 512 GB (2^39), each entry 64 bits wide and mapping 1GB, entries 0..511
- Next table: size 512*8 = 4KB, 64 bit wide entries 0..511, 2MB each
- Last table: size 512*8 = 4KB, 64 bit wide entries 0..511, 4K pages
- With no HYP mode you can address 4GB of phys memory

Few additions to sections/pages & tables:
- APTable: permissions for next level page tables
- PXNTable, PXN: prevent priv. execution from non-priv memory
- XNTable, XN: prevent execution, period
- Few others for Secure/Non-Secure settings
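The table geometry on this slide (512 entries of 8 bytes per 4KB table, 1GB, 2MB and 4KB granules) determines how a stage-1 virtual address is split into table indices. The sketch below assumes the 32-bit stage-1 input case with 4KB pages; in that case only 2 address bits remain for the first level, so it has 4 entries rather than 512. The function name `lpae_indices` is made up for this example.

```python
ENTRIES_PER_TABLE = 512
ENTRY_BYTES = 8
TABLE_BYTES = ENTRIES_PER_TABLE * ENTRY_BYTES  # 512 * 8 = 4096 (4KB)


def lpae_indices(va):
    """Split a 32-bit stage-1 VA into LPAE table indices (4KB pages)."""
    if not 0 <= va < (1 << 32):
        raise ValueError("stage-1 input address must fit in 32 bits")
    return {
        "l1": (va >> 30) & 0x3,     # each 1st-level entry maps 1GB
        "l2": (va >> 21) & 0x1FF,   # each 2nd-level entry maps 2MB
        "l3": (va >> 12) & 0x1FF,   # each 3rd-level entry maps a 4KB page
        "offset": va & 0xFFF,       # byte offset inside the page
    }
```

As a usage example, the kernel-side address 0xC0201234 (above the usual 3GB/1GB split) lands in 1st-level entry 3, 2nd-level entry 1, 3rd-level entry 1, at page offset 0x234.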

ARMv8 MMU

Address map (top to bottom):

    0xFFFF FFFF FFFF FFFF   2^64 - 1
        TTBR1 - OS; region size 0xFFFF FFFF FFFF
    0xFFFF 0000 0000 0000   2^64 - 2^48; transition point controlled by TCR_EL1.T1SZ
        Not mapped, causes access fault; region size 0xFFFE 0000 0000 0000
    0x0000 FFFF FFFF FFFF   2^48 - 1; transition point controlled by TCR_EL1.T0SZ
        TTBR0 - Application; region size 0xFFFF FFFF FFFF
    0

- Input VA range 49 bits; upper bit sign extended
- Output PA impl. dependent (48 bits)
- 2 page table types: 4KB and 64KB pages
- 4 & 3 level tables for 4KB and 64KB page table sizes
- 64KB format: fewer TLB faults
- Both formats supported for 1st and 2nd stage tables
- Tool chains take care of expanding addresses
- Page table formats are a lot like LPAE
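The address map on this slide can be expressed as a simple check on the upper bits of a 64-bit virtual address. The sketch below assumes the 48-bit configuration shown (T0SZ = T1SZ = 16); `va_region` and its return labels are hypothetical names used only for illustration.

```python
def va_region(va, va_bits=48):
    """Classify a 64-bit VA as translated via TTBR0, TTBR1, or unmapped."""
    if not 0 <= va < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    top = va >> va_bits                   # bits [63:va_bits]
    all_ones = (1 << (64 - va_bits)) - 1
    if top == 0:
        return "ttbr0"                    # application range, upper bits all 0
    if top == all_ones:
        return "ttbr1"                    # OS range, upper bits all 1
    return "fault"                        # hole between the two ranges
```

The boundary addresses from the slide behave as expected: 0x0000FFFFFFFFFFFF is the last TTBR0 address, 0xFFFF000000000000 is the first TTBR1 address, and anything in between faults.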

Virtualization - big picture w/QEMU

- Big picture: with virtualization, besides another priv. level (2), you add 2nd stage mmu tables, virtual interrupts & timers, emulated/virtio devices, pass-through devices
- HYP or PL2 privilege level & its mmu, config regs, and KVM
- ARMv8 adds complexity: handle v7/v8 guests; v7 guests have several instruction sets & 32 bit
- Set exit reason: HCR reg.; decode exit reason: HSR register (armv8 EL2 regs)
- World switch is sw controlled: save/restore as much or as little as you need
- All CPUs start in Secure Mode (PL3); some config can only be done in this mode

Figure: QEMU/KVM block diagram.

QEMU (EL0): Machine Model, CPU Model, Device Model (backend), IRQ Routing, sysbus, Machine Map, Memory (mmap), uart, rtc, pci-bus interface.
- IO thread, IO-Loop:
  - Walk through BE io handlers, schedule for polling
  - Lock 'big' qemu mutex lock
  - From FD determine BE device
  - Call its read routine
  - Unlock qemu mutex lock
- vCPU thread, vCPU-Loop:
  - Can vCPU run? i.e. monitor intfc
  - Unlock qemu mutex lock
  - ioctl(cpu, KVM_RUN)
  - Lock qemu mutex lock
  - MMIO access: dispatch to machine model handlers

Host, KVM (EL1): /dev/ttyXYZ serial console; KVM vCPU run loop; vGICD; arch timer and high-res timer (CNTV_CVAL, TVAL, CTL), inject timer; gic_handle_irq(): GICC_IAR, deliver, for guest and for host (timer, QEMU); host TTBRs, 1st stage; HCR.IMO = 0 for HOST.

HYP/EL2: identity mapped HYP code; HVBAR, HTTBR; VTTBR 2nd stage; List Regs, GICH per vcpu.
- vmenter: save host, restore guest
- vmexit: save guest, restore host; vmexit on mmio goes to QEMU

Guest (guest EL0 and kernel): guest TTBRs, 1st stage, guest pages; uart driver, rtc driver, MMIO transport; I/O vIRQ; gic_handle_irq(): GICC_IAR, deliver; HCR.IMO = 1 for GUEST.

MMU/Interrupts with Virtualization

With virtualization things get complex; thanks to hw extensions you get near native.

Figure: address translation.
- Linux guest: virtual address space EL0, EL1; OS EL1 uses TTBR1_EL1, App EL0 uses TTBR0_EL1 (region in between not mapped); 1st stage tables translate to IPA ("fake" physical: DRAM, device); 2nd stage tables (VTTBR_EL2) translate IPA to HPA ("real" physical: DRAM, physical device, device pass-through)
- HOST/KVM: virtual address space EL0, EL1; OS EL1 uses TTBR1_EL1, App EL0 uses TTBR0_EL1 (region in between not mapped); 1st and only stage tables translate to physical memory
- Hyp EL2: TTBR0_EL2, 1st and only stage tables

Figure: interrupt delivery.
- Physical int raised: GICD, then GICC interrupts the CPU
- Emulated device int raised: QEMU, vGICD
- Inject interrupt to guest via List Reg. (phys or emulated); List Regs., GICV
- Dispatch interrupt in guest

ARMv8 more than just processor - interconnect

AMBA5 CHI (coherent hub interface)
- Goes beyond AXI4/3; handles big servers
- Message based, clock scales to cluster; terabit fabric; flow control with buffers
- Several coherency options: snoop filter, directory
- Should support PCI config cycles (not pre-defined address ranges)
- Bearing on cache coherency, TLB management, barriers

Figure: SoC block diagram. Interrupt controller; four clusters of up to 4 Cortex-A57 cores, each cluster with an L2 cache; GPU with L2 cache; AMBA5 CHI with snoop filter/directory; L3 cache (up to 16 Mbytes); RAM; I/O MMU with USB, PCIe, GigE; APB with RAM, Flash, hw clock.

- Cache line invalidate (dci x0): on a miss, broadcast snoop to clusters
- Tags updated on response to snoop requests
- Barriers isb, dsb, dmb are driven to all clusters; wait for completion of side effects, memory retire

Caches/TLBs

Similar between v7 and v8.
- Instruction: PIVPT; started with ARMv7, continued in ARMv8
  - Eliminates aliases, but read-only so not a threat
- Data: PIPT
  - MESI, also MOESI; 'owned' state, cache to cache on M -> O
  - ARM: many cacheline owners
- TLB types: Guests, Host, HYP/EL2, Secure/non-secure
  - TLB tags: NS bit, VMID, ASID, HYP bit
  - TLB hit based on execution context, no flush on VM switch
  - Additional TLB instr. for HYP/EL2 (i.e. TLBIMVAH)
  - TLBs for 1st/2nd stage
- Caches: no flush on VM switch
- Virtualization requires additional handling
  - Override invalidate by set/way (DC ISW) to clean/invalidate (HCR)
  - For 2nd stage RO mapping, change DC IMVAC to DC CIMVAC
  - Prevents data loss

Figure: cache tag state (states 0 to 3).
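The point that a TLB hit depends on execution context (NS bit, VMID, ASID, HYP bit), so no flush is needed on a VM switch, can be illustrated with a toy model. This is only a conceptual sketch: real TLBs also handle global entries, multiple page sizes and replacement, none of which are modelled here, and all names (`TlbTag`, `tlb_lookup`) are invented for the example.

```python
from collections import namedtuple

# Context tag stored with each entry, using the fields named on the slide.
TlbTag = namedtuple("TlbTag", "ns vmid asid hyp vpage")


def tlb_lookup(tlb, ns, vmid, asid, hyp, va, page_shift=12):
    """Return the cached physical page for va, or None on a TLB miss.

    tlb is a dict mapping TlbTag -> physical page number. A lookup only
    hits when every context field matches, so entries from different
    guests can coexist without being flushed.
    """
    return tlb.get(TlbTag(ns, vmid, asid, hyp, va >> page_shift))
```

Two guests can map the same virtual page to different physical pages: an entry tagged with VMID 1 is simply invisible to a lookup made with VMID 2.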

Caches Shareability, Coherence

- Both v7 and v8 have these concepts; v7 since AXIv4
- Inner/Outer shareability: inner shareable is the cluster, outer shareable is all clusters
  - Boundaries are really SoC dependent
  - OS spanning multiple clusters: inner and outer are the same & shareable
- Cache coherence and cache op. attributes (WB, WT) are managed by page tables
- Barrier operations must specify scope: 'ish', 'osh'
- PoU or PoC
  - Point of Unification: mem location visible to all CPU instr/data caches, TLB walks
  - Point of Coherence: all agents see the same values, i.e. I/O
  - Suffix 'u', 'c' used for cache operations

Running ARMv8

Reference: https://github.com/mjsmar/running-arm64-howto.git

Summary
- On Ubuntu 14.04 install gcc-aarch64-linux-gnu; see the reference, which includes the guest build too
- Contains config files, host/guest rootfs
- Download the Foundation Model (the host platform); you must register with ARM
- Get a minimum FS, i.e. otfs.tar.gz
- Set up the NFS environment; change to root
  - mkdir /srv/armv8rootfs; cd /srv/armv8rootfs; tar xzf path/linarm-image*gz
  - For example use host IP 192.168.0.115
  - Add to /etc/exports: /srv/ 192.168.0.115/255.255.255.0(rw,sync,no_root_squash,no_subtree_check,insecure)
- Build the host kernel; in the config select VIRTIO_BLK, SCSI_BLK, VIRTIO_NET, HVC_DRIVER, VIRTIO_CONSOLE, VIRTIO, VIRTIO_MMIO; enable NFS boot
  - Download a recent kernel (3.17): make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig
  - Followed by 'menuconfig' to select options (or use the prepared config file), then 'Image'
- Get the boot wrapper: clone inas/boot-wrapper-aarch64.git
  - Mini boot loader that the Foundation Model runs first; work in the boot wrapper directory
  - Create links: dtc -> kernel dtc directory; foundation-v8.dts -> arch/arm64/boot/dts/foundat-v8.dts; Image -> arch/arm64/boot/Image; rtsm_ve-motherboard.dts; rtsm_ve-aemv8a.dts; skeleton.dtsi
  - Kernel command line to boot from NFS: make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- BOOTARGS="root=/dev/nfs nfsroot=192.168.0.115:/srv/armv8rootfs/ rw ip=255.0:kvmfm:eth0 console=ttyAMA0" FDT_SRC=foundation-v8.dts
  - You should now have linux-system.axf
- Create a tap device with tunctl; create a bridge and add tapX to the bridge; set the brX IP to 192.168.0.115
- Restart NFS services: sudo service nfs-kernel-server restart
- showmount -e 192.168.0.115 should show the nfs mount point; add Foundation_v8 to your PATH
- Run: Foundation_v8 --image path/linux-system.axf --network=bridged --network-bridge=tap0
- You should be able to ssh now
- See the reference for how to build kvmtool and run a guest in the Foundation Model; a qemu build may be added later

Q&A

Thank you.

Mario Smarduch
Senior Virtualization Architect
Open Source Group
Samsung Research America (Silicon Valley)
m.smarduch@samsung.com
