GPU-Accelerated 2D And Web Rendering - NVIDIA


GPU-Accelerated 2D and Web Rendering
Mark Kilgard

Talk Details
Location: West Hall Meeting Room 503, Los Angeles Convention Center
Date: Wednesday, August 8, 2012
Time: 2:40 PM – 3:40 PM
Speaker: Mark Kilgard (Principal Software Engineer, NVIDIA)
Abstract: The future of GPU-based visual computing integrates the web, resolution-independent 2D graphics, and 3D to maximize interactivity and quality while minimizing consumed power. See what NVIDIA is doing today to accelerate resolution-independent 2D graphics for web content. This presentation explains NVIDIA's unique "stencil, then cover" approach to accelerating path rendering with OpenGL and demonstrates the wide variety of web content that can be accelerated with this approach.
Topic Areas: GPU Accelerated Internet; Digital Content Creation & Film; Visualization
Level: Intermediate

Mark Kilgard
- Principal System Software Engineer
- OpenGL driver and API evolution
- Cg (“C for graphics”) shading language
- GPU-accelerated path rendering
- OpenGL Utility Toolkit (GLUT) implementer
- Author of OpenGL for the X Window System
- Co-author of Cg Tutorial

GPUs are good at a lot of stuff

Games: Battlefield 3, EA

Data visualization

Product design: Catia

Physics simulation: CUDA N-Body

Interactive ray tracing: OptiX

Training

Molecular modeling: NCSA

Impressive stuff

What about advancing 2D graphics?

Can GPUs render & improve the immersive web?

What is path rendering?
- A rendering approach
  - Resolution-independent two-dimensional graphics
  - Occlusion & transparency depend on rendering order (the so-called “Painter’s Algorithm”)
  - Basic primitive is a path to be filled or stroked
  - Path is a sequence of path commands
  - Commands are moveto, lineto, curveto, arcto, closepath, etc.
- Standards
  - Content: PostScript, PDF, TrueType fonts, Flash, Scalable Vector Graphics (SVG), HTML5 Canvas, Silverlight, Office drawings
  - APIs: Apple Quartz 2D, Khronos OpenVG, Microsoft Direct2D, Cairo, Skia, Qt::QPainter, Anti-grain Graphics

Seminal Path Rendering Paper
- John Warnock & Douglas Wyatt, Xerox PARC
- Presented SIGGRAPH 1982
- Warnock founded Adobe months later
[Photo: John Warnock, Adobe founder]

Path Rendering Standards
[Diagram of path rendering standards and products. Legible labels: Mac OS X 2D API, Open XML Paper (XPS), HTML 5, Khronos API, Adobe Illustrator, Inkscape, Open Source]

Live Demo
- Classic PostScript content
- Complex text rendering
- Flash content
- Yesterday’s New York Times rendered from its resolution-independent form

Last Year’s SIGGRAPH Results in Real-time
- Ron Maharik, Mikhail Bessmeltsev, Alla Sheffer, Ariel Shamir and Nathan Carr
- SIGGRAPH 2011, July 2011
- “Girl with Words in Her Hair” scene
  - 591 paths
  - 338,507 commands
  - 1,244,474 coordinates

3D Rendering vs. Path Rendering

Characteristic | GPU 3D rendering | Path rendering
Dimensionality | Projective 3D | 2D, typically affine
Pixel mapping | Resolution independent | Resolution independent
Occlusion | Depth buffering | Painter’s algorithm
Rendering primitives | Points, lines, triangles | Paths
Primitive constituents | Vertices | Control points
Constituents per primitive | 1, 2, or 3 respectively | Unbounded
Topology of filled primitives | Always convex | Can be concave, self-intersecting, and have holes
Degree of primitives | 1st order (linear) | Up to 3rd order (cubic)
Rendering modes | Filled, wire-frame | Filling, stroking
Line properties | Width, stipple pattern | Width, dash pattern, capping, join style
Color processing | Programmable shading | Painting, filter effects
Text rendering | No direct support (2nd class support) | Omni-present (1st class support)
Raster operations | Blending | Brushes, blend modes, compositing
Color model | RGB or sRGB | RGB, sRGB, CMYK, or grayscale
Clipping operations | Clip planes, scissoring, stenciling | Clipping to an arbitrary clip path
Coverage determination | Per-color sample | Sub-color sample

CPU vs. GPU at Rendering Tasks
[Charts: two panels, “Pipelined 3D Interactive Rendering” and “Path Rendering”, each plotting GPU and CPU on a percentage axis by year, 1998 to 2012]
- Goal of NV path rendering is to make path rendering a GPU task
- Render all interactive pixels, whether 3D or 2D or web content, with the GPU

What is NV path rendering?
- OpenGL extension to GPU-accelerate path rendering
- Uses “stencil, then cover” (StC) approach
  - Create a path object
  - Step 1: “Stencil” the path object into the stencil buffer
    - GPU provides fast stenciling of filled or stroked paths
  - Step 2: “Cover” the path object and stencil test against its coverage stenciled by the prior step
    - Application can configure arbitrary shading during the step
  - More details later
- Supports the union of functionality of all major path rendering standards
  - Includes all stroking embellishments
  - Includes first-class text and font support
  - Allows functionality to mix with traditional 3D and programmable shading

NV path rendering Compared to Alternatives
Configuration: GPU: GeForce 480 GTX (GF100); CPU: Core i7 950 @ 3.07 GHz
[Charts: frames per second (Y axis up to 2,000) vs. window resolution in pixels, starting at 100x100. One chart shows NV path rendering with the Release 300 driver at 16x, 8x, 4x, 2x, and 1x. The other shows alternative APIs rendering the same content: Cairo, Qt, Skia Bitmap, Skia Ganesh FBO (16x), Skia Ganesh Aliased (1x), Direct2D GPU, Direct2D WARP. Callout: “Alternative approaches are all much [...]”]

Detail on Alternatives
Configuration: GPU: GeForce 480 GTX (GF100); CPU: Core i7 950 @ 3.07 GHz
Same results, changed Y axis
[Charts: frames per second vs. window resolution in pixels for the alternative APIs rendering the same content (Cairo, Qt, Skia Bitmap, Skia Ganesh FBO (16x), Skia Ganesh Aliased (1x), Direct2D GPU, Direct2D WARP), shown once with the Y axis up to 2,000 and once rescaled to 250. Callout: “Fast, but [...]”]

Release 300 GeForce GTX 480 Speedups over Alternatives
Across a range of scenes
[Charts: ratio of NV path rendering (16x) frame rate to that of each alternative (including Skia Ganesh, Direct2D GPU, and Direct2D WARP), plotted against window resolution up to 1100x1100, one chart per scene. Y axis is logarithmic and shows how many TIMES faster NV path rendering is than each alternative. Legible scene labels include Welsh dragon and Celtic round dogs.]

GeForce 650 (Kepler) Results
[Charts: the same per-scene speedup ratios of NV path rendering (16x) over the alternatives (including Skia Ganesh and Direct2D), measured on GeForce 650 and plotted against window resolution up to 1100x1100. Legible scene labels include American Samoa, Welsh dragon, and Celtic round dogs.]

Tiger Scene on GeForce 650
Absolute Frames/Second on GeForce 650
[Chart: frames per second (Y axis up to 500) vs. window resolution up to 1100x1100. Series: NV path rendering (16x), Cairo, Qt, Skia Bitmap, Skia Ganesh FBO, Skia Ganesh 1x (aliased), Direct2D GPU, Direct2D WARP. Callout: NVpr “peaks” at 1,800 FPS at 100x100. A second callout beginning “poor” is cut off.]

NV path rendering is more than just matching CPU vector graphics
- 3D and vector graphics mix
- Superior quality (figure labels: GPU, CPU, Competitors)
- 2D in perspective is free
- Arbitrary programmable shader on paths, for example bump mapping

Partial Solutions Not Enough
- Path rendering has 30 years of heritage and history
- Can’t do a 90% solution and expect software to change
  - Trying to “mix” CPU and GPU methods doesn’t work
  - Expensive to move software, so it needs to be an unambiguous win
- Must surpass CPU approaches on all fronts
  - Performance
  - Quality
  - Functionality
  - Conformance to standards
  - More power efficient
  - Enable new applications
- Inspiration: Perceptive Pixel
[Photo: John Warnock, Adobe founder]

Path Filling and Stroking
[Figure: just filling; just stroking; filling + stroke = intended content]

Dashing Content Examples
- Frosting on cake is dashed elliptical arcs with round end caps for a “beaded” look; flowers are also dashing
- Same cake missing dashed stroking details
- Artist made windows with dashed line segment
- Technical diagrams and charts often employ dashing
- Dashing character outlines for quilted look
- All content shown is fully GPU rendered

Excellent Geometric Fidelity for Stroking
- Correct stroking is hard
  - Lots of CPU implementations approximate stroking
- GPU-accelerated stroking avoids such short-cuts
  - GPU has FLOPS to compute true stroke point containment
[Figure: stroking with tight end-point curve, rendered by GPU-accelerated, OpenVG reference, Cairo, and Qt]

The Approach: “Stencil, then Cover” (StC)
- Map the path rendering task from a sequential algorithm to a pipelined and massively parallel task
- Break path rendering into two steps
  - First, “stencil” the path’s coverage into stencil buffer
  - Second, conservatively “cover” path
    - Test against path coverage determined in the 1st step
    - Shade the path
    - And reset the stencil value to render next path
[Diagram: Step 1: Stencil, then Step 2: Cover, repeat]

[Diagram: OpenGL pipeline with three front ends fed by the application. Path pipeline: path specification, transform path, fill/stroke stenciling, fill/stroke covering. Vertex pipeline: vertex assembly, vertex operations, transform feedback, primitive assembly, primitive operations. Pixel pipeline: pixel assembly (unpack), pixel operations. All feed fragment operations, raster operations, framebuffer, and display.]

Key Operations for Rendering Path Objects
- Stencil operation
  - only updates stencil buffer
  - glStencilFillPathNV, glStencilStrokePathNV
- Cover operation
  - glCoverFillPathNV, glCoverStrokePathNV
  - renders hull polygons guaranteed to “cover” region updated by corresponding stencil
- Two-step rendering paradigm
  - stencil, then cover (StC)
- Application controls cover stenciling and shading operations
  - Gives application considerable control
- No vertex, tessellation, or geometry shaders active during steps
  - Why? Paths have control points & rasterized regions, not vertices, triangles

Path Rendering Example (1 of 3)
- Let’s draw a green concave 5-point star
  [Figures: the star with even-odd fill style and with non-zero fill style]
- Path specification by string of a star

    GLuint pathObj = 42;
    const char *pathString = "M100,180 L40,10 L190,120 L10,120 L160,10 z";
    glPathStringNV(pathObj, GL_PATH_FORMAT_SVG_NV,
                   (GLsizei)strlen(pathString), pathString);

- Alternative: path specification by data

    /* 6 commands: one move, four lines, one close */
    static const GLubyte pathCommands[6] = {
        GL_MOVE_TO_NV, GL_LINE_TO_NV, GL_LINE_TO_NV, GL_LINE_TO_NV, GL_LINE_TO_NV,
        GL_CLOSE_PATH_NV };
    /* 5 vertices x 2 components = 10 coordinates */
    static const GLshort pathVertices[5][2] =
        { {100,180}, {40,10}, {190,120}, {10,120}, {160,10} };
    /* argument order: path, numCommands, commands, numCoords, coordType, coords */
    glPathCommandsNV(pathObj, 6, pathCommands, 10, GL_SHORT, pathVertices);

Path Rendering Example (2 of 3)
- Initialization
  - Clear the stencil buffer to zero and the color buffer to [...]

    glStencilMask(~0);
    glClear(GL_COLOR_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);

- Specify the Path's Transform

    glMatrixLoadIdentityEXT(GL_PROJECTION);
    glMatrixOrthoEXT(GL_MODELVIEW, 0,200, 0,200, -1,1); // uses DSA!

- Nothing really specific to path rendering here
- DSA = OpenGL’s Direct State Access extension (EXT_direct_state_access)

Path Rendering Example (3 of 3)
- Render star with non-zero fill style
  - Stencil path

    glStencilFillPathNV(pathObj, GL_COUNT_UP_NV, 0x1F);

  - Cover path

    glEnable(GL_STENCIL_TEST);
    glStencilFunc(GL_NOTEQUAL, 0, 0x1F);
    glStencilOp(GL_KEEP, GL_KEEP, GL_ZERO);
    glColor3f(0,1,0); // green
    glCoverFillPathNV(pathObj, GL_BOUNDING_BOX_NV);

- Alternative: for even-odd fill style
  - Just program glStencilFunc differently

    glStencilFunc(GL_NOTEQUAL, 0, 0x1); // alternative mask

[Figures: the star with non-zero fill style and with even-odd fill style]

“Stencil, then Cover” Path Fill Stenciling
- Specify a path
- Specify arbitrary path transformation
  - Projective (4x4) allowed
  - Depth values can be generated for depth testing
- Sample accessibility determined
  - Accessibility can be limited by any or all of: scissor test, depth test, stencil test, view frustum, user-defined clip planes, sample mask, stipple pattern, and window ownership
- Winding number w.r.t. the transformed path is computed
  - Added to stencil value of accessible samples
[Diagram: stencil fill path command. Per-path fill region operations: path object, projective transform, clipping & scissoring, sample accessibility (window, depth & stencil tests), path winding number computation, stencil update (+, -, or invert), stencil buffer]

“Stencil, then Cover” Path Fill Covering
- Specify a path
- Specify arbitrary path transformation
  - Projective (4x4) allowed
  - Depth values can be generated for depth testing
- Sample accessibility determined
  - Accessibility can be limited by any or all of: scissor test, depth test, stencil test, view frustum, user-defined clip planes, sample mask, stipple pattern, and window ownership
- Conservative covering geometry uses stencil to “cover” filled path
  - Determined by prior stencil step
[Diagram: cover fill path command. Per-path fill region operations: path object, projective transform, clipping & scissoring, sample accessibility (window, depth & stencil tests). Per-sample and per-fragment operations: stencil update (typically zero), programmable path shading, stencil buffer]

Adding Stroking to the Star
- After the filling, add a stroked “rim” to the star
- Set some stroking parameters (one-time):

    glPathParameterfNV(pathObj, GL_PATH_STROKE_WIDTH_NV, 10.5);
    glPathParameteriNV(pathObj, GL_PATH_JOIN_STYLE_NV, GL_ROUND_NV);

- Stroke the star
  - Stencil path

    glStencilStrokePathNV(pathObj, 0x3, 0xF); // stroked samples marked 3

  - Cover path

    glEnable(GL_STENCIL_TEST);
    glStencilFunc(GL_EQUAL, 3, 0xF); // update if sample marked 3
    glStencilOp(GL_KEEP, GL_KEEP, GL_ZERO);
    glColor3f(1,1,0); // yellow
    glCoverStrokePathNV(pathObj, GL_BOUNDING_BOX_NV);

[Figures: the stroked star with non-zero fill style and with even-odd fill style]

“Stencil, then Cover” Path Stroke Stenciling
- Specify a path
- Specify arbitrary path transformation
  - Projective (4x4) allowed
  - Depth values can be generated for depth testing
- Sample accessibility determined
  - Accessibility can be limited by any or all of: scissor test, depth test, stencil test, view frustum, user-defined clip planes, sample mask, stipple pattern, and window ownership
- Point containment w.r.t. the stroked path is determined
  - Replace stencil value of [...]
[Diagram: stencil stroke path command. Path object, clipping & scissoring, sample accessibility (window, depth & stencil tests), per-sample operations, stroke point containment, stencil update (replace), stencil buffer]

“Stencil, then Cover” Path Stroke Covering
- Specify a path
- Specify arbitrary path transformation
  - Projective (4x4) allowed
  - Depth values can be generated for depth testing
- Sample accessibility determined
  - Accessibility can be limited by any or all of: scissor test, depth test, stencil test, view frustum, user-defined clip planes, sample mask, stipple pattern, and window ownership
- Conservative covering geometry uses stencil to “cover” stroked path
  - Determined by prior stencil step
[Diagram: cover stroke path command. Per-path fill region operations: path object, projective transform, clipping & scissoring, sample accessibility (window, depth & stencil tests). Per-sample and per-fragment operations: stencil update (typically zero), programmable path shading, stencil buffer]

First-class, Resolution-independent Font Support
- Fonts are a standard, first-class part of all path rendering systems
  - Foreign to 3D graphics systems such as OpenGL and Direct3D, but natural for path rendering
  - Because letter forms in fonts have outlines defined with paths
    - TrueType, PostScript, and OpenType fonts all use outlines to specify glyphs
- NV path rendering makes font support easy
  - Can specify a range of path objects with
    - A specified font
    - Sequence or range of Unicode character points
- No requirement for applications to use a font API to load glyphs
  - You can also load glyphs “manually” from your own glyph outlines
- Functionality provides OS portability and meets needs of applications with mundane font requirements

Handling Common Path Rendering Functionality: Filtering
- GPUs are highly efficient at image filtering
  - Fast texture mapping
  - Mipmapping
  - Anisotropic filtering
  - Wrap modes
- CPUs aren't really
[Figure: renderings labeled GPU, Qt, and Cairo; callout: Moiré artifacts]

Handling Uncommon Path Rendering Functionality: Projection
- Projection “just works”
  - Because GPU does everything with perspective-correct interpolation

Projective Path Rendering Support Compared
[Comparison figure. Legible labels: GPU: flawless, correct, correct; Skia: yes, but ported; remaining entries: unsupported, unsupported, unsupported]

Path Geometric Queries
- glIsPointInFillPathNV
  - determine if object-space (x,y) position is inside or outside path, given a winding number mask
- glIsPointInStrokePathNV
  - determine if object-space (x,y) position is inside the stroke of a path
  - accounts for dash pattern, joins, and caps
- glGetPathLengthNV
  - returns approximation of geometric length of a given sub-range of path segments
- glPointAlongPathNV
  - returns the object-space (x,y) position and 2D tangent vector at a given offset into a specified path object
  - Useful for “text follows a path”
- Queries are modeled after OpenVG queries

Accessible Samples of a Transformed Path
- When stenciled or covered, a path is transformed by OpenGL’s current modelview-projection matrix
  - Allows for arbitrary 4x4 projective transform
  - Means (x,y,0,1) object-space coordinate can be transformed to have depth
- Fill or stroke stenciling affects “accessible” samples
- A sample is not accessible if any of these apply to the sample
  - clipped by user-defined or view frustum clip planes
  - discarded by the polygon stipple, if enabled
  - discarded by the pixel ownership test
  - discarded by the scissor test, if enabled
  - discarded by the depth test, if enabled
    - displaced by the polygon offset from glPathStencilDepthOffsetNV
  - discarded by the (implicitly enabled) stencil test
    - specified by glPathStencilFuncNV
    - where the read mask is the bitwise AND of the glPathStencilFuncNV read mask and the bit-inve[...]
