Verifying Oracle’s SPARC Processors With ACL2

2y ago
98 Views
2 Downloads
8.40 MB
21 Pages
Last View : 6d ago
Last Download : 3m ago
Upload by : Farrah Jaffe
Transcription

Verifying Oracle’s SPARC processors with ACL2Greg GrohoskiVP, Hardware DevelopmentOracle MicroelectronicsMay 23, 2017Copyright 2017, Oracle and/or its affiliates. All rights reserved.

Outline SPARC refresher Processor verification challenges ACL2 experience Q&ACopyright 2017 Oracle and/or its affiliates. All rights reserved. 2

SPARC @ Oracle201020112013201320132015T3T4T5M5M6M7S716 x 2nd Gencores6MB L2 Cache1.7 GHz8 x 3rd GenCores4MB L3 Cache3.0 GHz16 x 3rd GenCores8MB L3 Cache3.6 GHz6 x 3rd GenCores48MB L3Cache3.6 GHz12 x 3rd GenCores48MB L3Cache3.6 GHz32 x 4th GenCores64MB L3 Cache4.1 GHzDAX18 x 4th GenCores16MB L3 cache4.2 GHzDAX1Copyright 2017 Oracle and/or its affiliates. All rights reserved. 2016Next upM83

SPARC M7 World’s fastest microprocessor– Highest throughput of any commercial processor– 20 commercial benchmark 32.html 32, 8-threaded 2-issue OOO cores, 4.1 GHz, 64MB partitioned L3 cache, 160 GB/smemory BW Most secure– Silicon Secured Memory (SSM)– Widest cipher acceleration (DES/3DES, AES, Camellia, MD5, all SHA variants, public-keygeneration including ECC) – always-on encryption Numerous software-in-silicon features– Database acceleration (DAX)Copyright 2017 Oracle and/or its affiliates. All rights reserved. 4

CORECORECLUSTER CLUSTERCORECORECLUSTER CLUSTERACCELERATORSMEMORY CONTROLCORECORECLUSTER CLUSTERL3 & ON-CHIP NETWORKCOHERENCE, SMP & I/OINTERCONNECTACCELERATORSMEMORY CONTROLCORECORECLUSTER CLUSTERCOHERENCE, SMP & I/OINTERCONNECTM7 Processor Overview§ 32 SPARC 8-threaded C4 cores @4.1 GHz§ 8 partitions each with a core clusterand 8MB of L3 § 8 Data Analytics Accelerators forquery acceleration and messaging§ 8 DDR4 memory schedulersproviding 160 GB/s§ A coherency subsystem for 1 to 32socket scaling§ A coherency and IO interconnect thatprovides 336GB/s of peak BW§ Extensive SW-in-Silicon feature setCopyright 2017 Oracle and/or its affiliates. All rights reserved. 5

Silicon Secured Memory SPARC M7 protects data inmemory Hidden “color” bits added topointers (key), and content (lock) Pointer color (key) must matchcontent color or program isabortedPointer “B”GOPointer “R”GO Set on memory allocation, changedon memory free Protects against access off end ofstructure, bugs, stale pointeraccess, and malicious attacksPointer “Y”ApplicationsM7 ProcessorCopyright 2017 Oracle and/or its affiliates. All rights reserved. Memory6

SQL in Silicon: Data Analytics Accelerator (DAX)Local SRAMOn-ChipNetworkM7 DAX (1 of 8 Units)DataInputQueuesResultDecompress Unpack/ PredicateAlignEval compressUnpack/ PredicateResultAlignEvalFormat/EncodeUnpack/ PredicateResultEvalAlignFormat/EncodeResultUnpack/ aOutputQueuesOn-ChipNetwork Stream Processor for data parallel operations DSP-like pipe for efficient filtering operations, typical of first phase of any query Cache sparing design with more complex processing in general purpose coresCopyright 2017 Oracle and/or its affiliates. All rights reserved. 7

Processor development challenges Increasing complexityConstrained development cost, scheduleLong, lengthening fab timesHigh mask costsProcessor verification tools, techniques have plateauedèIncreased risk of late-cycle bugsèGeometric increase in bug cost vs. timeCopyright 2017 Oracle and/or its affiliates. All rights reserved. 8

Example: processor core bugs vs. time beyond first TOCopyright 2017 Oracle and/or its affiliates. All rights reserved. 9

What to do. Simplify, leverage (of course.) Use more simulation cycles, resources– Not helping much – 2x00 2x,000 control states Take more development time, tape-outs– Not feasible Reconfigurable logic?– Microcode (for some classes of bugs.) Embrace formal verification techniques– Had been using model checking, but.what about theorem proving?Copyright 2017 Oracle and/or its affiliates. All rights reserved. 10

Summer 2013 Improving divide, square root performance for new core via new algorithm– Floating-point divider (32b and 64b)– Floating-point square root (32b and 64b)– Integer divider (signed and unsigned, 32b and 64b) How to verify?– Stand-alone testbench – impossible to cover all operand values– Mathematical proof – needed expertise Asked for help from Oracle Labs Labs contacted UT Austin– ACL2Copyright 2017 Oracle and/or its affiliates. All rights reserved. 11

Case study Floating-point division and square root verification known to be difficult– Project was estimated to be a 2 year effort We had less than 1 year to RTL freeze to find any errors Team started internally from scratch Vision for project: Apply additional formal verification techniques to increase confidence in correctnessof our designs– Improve pre- and post-silicon verificationCopyright 2017 Oracle and/or its affiliates. All rights reserved. 12

Summer 2014 Wrote 1st draft of ACL2 spec for IEEE 754 Standard on Floating-Point Arithmetic Integer division variants Verified Floating-point implementations wrt significand Integer division implementations Found improvements for SPARC core Reduced look-up tables by 60% Improvements based on careful error analysis Simplified square-root implementationCopyright 2017 Oracle and/or its affiliates. All rights reserved. 13

Summer 2015 Validated ACL2 spec of IEEE 754 Standard against 8M vectors Verified all 4 floating-point implementations wrt Sign, exponent, and significand All exceptions Found more improvements for future SPARC core– Reduced latency further for fdivs, fdivd fsqrts, fsqrtd idiv Improvements were all proven correct first and then implemented No extensive 24x7 testing neededCopyright 2017 Oracle and/or its affiliates. All rights reserved. 14

Summer 2016 Applied ACL2 to other areas Temporal properties Proved absence of starvation for certain instructions Looked at verification of complex crypto instructions Verified Montgomery algorithms implementations wrt their math specification Saved 25% latency in one instruction Generated test vectors to exercise special cases for these instructions Verified and improved fault-tolerant cache-coherency protocol (like CCIX)– “Bake off” between ACL2 (TP) and Murphi/PReach (MC)Copyright 2017 Oracle and/or its affiliates. All rights reserved. 15

Summer 2017 Found more improvements for integer division for future SPARC core Further, significant latency reduction for integer division Again, optimization was first verified, then implemented Verified another function using Cellular Automata Shift Registers Applied ACL2 to prove absence of starvation for certain instructions– Another “bake off” between ACL2 (TP) and SVA model checkingCopyright 2017 Oracle and/or its affiliates. All rights reserved. 16

Future directions Increase use of theorem-proving where possible– State machines?– Help validate inter-module protocols Improve tools– Scale (to better handle inter-module behavior)– Automate extraction from Verilog to ACL2 Need to simplify/abstract expertise needed to use ACL2 .Copyright 2017 Oracle and/or its affiliates. All rights reserved. 17

Contributions to the Community We thank the ACL2 community for all their contributions A healthy ACL2 community is vital Oracle funds PhDs at UT Austin Oracle supports the ACL2 workshop Oracle contributes improvements to ACL2 libraries Oracle contracts with FHICopyright 2017 Oracle and/or its affiliates. All rights reserved. 18

Contributors Oracle– David Rager– Jo Ebergen– Dmitry Nadezhin– Austin Lee– Andrew Brock– and many others. UT Austin– Cuong Chau– Ben Selfridge– Keshav Kini– Warren Hunt– Matt KaufmannCopyright 2017 Oracle and/or its affiliates. All rights reserved. 19

Q&ACopyright 2017 Oracle and/or its affiliates. All rights reserved. 20

Copyright 2017 Oracle and/or its affiliates. All rights reserved. 21

SPARC @ Oracle 16 x 2nd Gen cores 6MB L2 Cache 1.7 GHz 8 x 3 rd Gen Cores 4MB L3 Cache 3.0 GHz 16 x 3rd Gen Cores 8MB L3 Cache 3.6 GHz 12 x 3rd Gen 48MB L3 Cache 3.6 GHz 6 x 3 Gen Cores 48MB L3 Cache 3.6 GHz T3 T4 T5 M5 M6 S7 32 x 4th Gen Cores 64MB L3 Cache 4.1 GHz DAX1 M7 8 x 4th Gen Co

Related Documents:

Table 2 provides a comparison between the SPARC T5, SPARC T4, SPARC T3 processors. The SPARC T5 leverages many of the elements from the SPARC T4 processor. TABLE 2. SPARC T5, SPARC T4, AND SPARC T3 PROCESSOR FEATURE COMPARISON FEATURE SPARC T5 PROCESSOR SPARC T4 PROCESSOR SPARC T3 PROCESS

SPARC T3-4 ActiveAug 2012 SPARC T4-1 Mar 2016 Active SPARC T4-1B Sep 2014 Active SPARC T4-2 Dec 2014 Active SPARC T4-4 Dec 2014 Active SPARC T5-1 Aug 2016 Active SPARC T5-2 Aug 2017 Active SPARC T5-4 Aug 2017 Active SPARC T5-8 Aug 2017 Active SPARC T7-1 Aug 2020 Active SPARC

Name SPARC T3-1 Server SPARC T3-2 Server Sun SPARC Enterprise T5120 Server Sun SPARC Enterprise T5220 Server Processor/CPUs 16 -core 1.65 SPARC T3 processors; one processor per system, max. 128 threads core 1.65 SPARC T3 processors; two processors per system, max. 256 threads 4 core 1.2 GHz, 8 core

SPARC @ Oracle 5 Processors in 4 Years 2010 2011 2012 SPARC T3 § 16 S2 cores § 4MB L3 § 40 nm technology § 1.65 GHz SPARC T4 § 8 S3 Cores § 4MB L3 § 40nm Technology § 3.0 GHz SPARC T5 § 16 S3 Cores § 8MB L3 § 28nm Technology § 3.6 GHz 2012 2013 3 SPARC M5 § 6 S3 Cores § 48MB L3 § 28nm T

Maximizing Service Uptime with Oracle's SPARC T3-1, SPARC T3-2, SPARC T3-4, and SPARC T3-1B Servers horizontally scaled architecture. Although techniques that distribute network-centric applications across multiple systems take some pressure off individual server availability, continuous operation of each system remains important.

M7-8 Server Safety and Compliance Guide to review the important safety and compliance information. 2. Refer to the SPARC M8 and SPARC M7 Servers Security Guide to understand the security issues relevant to . Oracle VM Server for SPARC is a virtualization feature that provides the ability to create logical discrete groupings called logical .

Oracle e-Commerce Gateway, Oracle Business Intelligence System, Oracle Financial Analyzer, Oracle Reports, Oracle Strategic Enterprise Management, Oracle Financials, Oracle Internet Procurement, Oracle Supply Chain, Oracle Call Center, Oracle e-Commerce, Oracle Integration Products & Technologies, Oracle Marketing, Oracle Service,

Oracle VM Server for SPARC Logical Domains (now Oracle VM Server for SPARC) is a virtualization technol-ogy that creates SPARC virtual machines, also called domains. This new style of hypervisor permits operation of virtual machines with less overhead than traditional designs by changing the way guests access physical CPU, memory, and I/O .