Programming the GPU: High-Level Shading Languages


Tutorial 5: Programming Graphics Hardware
Programming the GPU: High-Level Shading Languages
Randy Fernando
Developer Technology Group

Talk Overview
- The Evolution of GPU Programming Languages
- GPU Programming Languages and the Graphics Pipeline
- Syntax
- Examples
- HLSL FX framework

The Evolution of GPU Programming Languages
- C (AT&T, 1970s)
- IRIS GL (SGI, 1982)
- RenderMan (Pixar, 1988)
- C++ (AT&T, 1983)
- OpenGL (ARB, 1992)
- Java (Sun, 1994)
- Reality Lab (RenderMorphics, 1994)
- PixelFlow Shading Language (UNC, 1998)
- Direct3D (Microsoft, 1995)
- HLSL (Microsoft, 2002)
- Cg (NVIDIA, RB, 2003)

NVIDIA’s Position on GPU Shading Languages
- Bottom line: please take advantage of all the transistors we pack into our GPUs!
- Use whatever language you like
- We will support you
  - Working with Microsoft on HLSL compiler
  - NVIDIA compiler team working on Cg compiler
  - NVIDIA compiler team working on GLSL compiler
- If you find bugs, send them to us and we’ll get them fixed

The Need for Programmability

Game (publisher)                      Hardware            Triangles/sec   Pixel ops/sec   Transistors   Year
Virtua Fighter (SEGA Corporation)     NV1                 50K             1M              1M            1995
Dead or Alive 3 (Tecmo Corporation)   Xbox (NV2A)         100M            1G              20M           2001
Dawn (NVIDIA Corporation)             GeForce FX (NV30)   200M            2G              120M          2003

The Need for Programmability

Game (publisher)                      Hardware            Color     Resolution    Texture filtering   Year
Virtua Fighter (SEGA Corporation)     NV1                 16-bit    640 x 480     Nearest             1995
Dead or Alive 3 (Tecmo Corporation)   Xbox (NV2A)         32-bit    640 x 480     Trilinear           2001
Dawn (NVIDIA Corporation)             GeForce FX (NV30)   128-bit   1024 x 768    8:1 Aniso           2003

Where We Are Now
- 222M transistors
- 660M tris/second
- 64 Gflops
- 128-bit color
- 1600 x 1200
- 16:1 aniso filtering

The Motivation for High-Level Shading Languages
- Graphics hardware has become increasingly powerful
- Programming powerful hardware with assembly code is hard
- GeForce FX and GeForce 6 Series GPUs support programs that are thousands of assembly instructions long
- Programmers need the benefits of a high-level language:
  - Easier programming
  - Easier code reuse
  - Easier debugging

Assembly
    ... R0, c[11].xyzx, c[11].xyzx;
    ... R0, R0.x;
    ... R0, R0.x, c[11].xyzx;
    ... R1, c[3];
    ... R1, R1.x, c[0].xyzx;
    ... R2, R1.xyzx, R1.xyzx;
    ... R2, R2.x;
    ... R1, R2.x, R1.xyzx;
    ... R2, R0.xyzx, R1.xyzx;
    ... R3, R2.xyzx, R2.xyzx;
    ... R3, R3.x;
    ... R2, R3.x, R2.xyzx;
    ... R2, R1.xyzx, R2.xyzx;
    ... R2, c[3].z, R2.x;
    ... R2.z, c[3].y;
    ... R2.w, c[3].y;
    ... R2, R2;

High-Level Language
    float3 cSpecular = pow(max(0, dot(Nf, H)), phongExp).xxx;
    float3 cPlastic = Cd * (cAmbient + cDiffuse) + Cs * cSpecular;

GPU Programming Languages and the Graphics Pipeline

The Graphics Pipeline
- Vertex Program: executed once per vertex
- Fragment Program: executed once per fragment

Shaders and the Graphics Pipeline
- HLSL / Cg / GLSL programs run in the vertex shader and the fragment shader
- Data flow: Application -> Vertex data -> Vertex Shader -> Interpolated values -> Fragment Shader -> Fragments -> Frame Buffer
- In the future, other parts of the graphics pipeline may become programmable through high-level languages.

Compilation

Application and API Layers
- 3D Application
- 3D Graphics API: Direct3D, OpenGL
- Shading Language: HLSL, Cg, GLSL
- GPU

Using GPU Programming Languages
- Use 3D API calls to specify vertex and fragment shaders
- Enable vertex and fragment shaders
- Load/enable textures as usual
- Draw geometry as usual
- Set blend state as usual
- Vertex shader will execute for each vertex
- Fragment shader will execute for each fragment

Compilation Targets
- Code can be compiled for specific hardware
  - Optimizes performance
  - Takes advantage of extra hardware functionality
  - May limit language constructs for less capable hardware
- Examples of compilation targets:
  - vs_1_1, vs_2_0, vs_3_0
  - ps_1_1, ps_2_0, ps_2_x, ps_2_a, ps_3_0
- vs_3_0 and ps_3_0 are the most capable profiles, supported only by GeForce 6 Series GPUs

Shader Creation
- Shaders are created (from scratch, from a common repository, authoring tools, or modified from other shaders)
- These shaders are used for modeling in Digital Content Creation (DCC) applications or rendering in other applications
- A shading language compiler compiles the shaders to a variety of target platforms, including APIs, OSes, and GPUs

Language Syntax

Let’s Pick a Language
- HLSL, Cg, and GLSL have much in common
- But all are different (HLSL and Cg are much more similar to each other than they are to GLSL)
- Let’s focus on just one language (HLSL) to illustrate the key concepts of shading language syntax
- General References:
  - HLSL: DirectX Documentation (http://www.msdn.com/DirectX)
  - Cg: The Cg ...
  - GLSL: The OpenGL Shading Language (http://www.opengl.org)

Data Types

float     32-bit IEEE floating point
half      16-bit IEEE-like floating point
bool      Boolean
sampler   Handle to a texture sampler
struct    Structure as in C/C++

No pointers yet.

Array / Vector / Matrix Declarations
- Native support for vectors (up to length 4) and matrices (up to size 4x4):
    float4 mycolor;
    float3x3 mymatrix;
- Declare more general arrays exactly as in C:
    float lightpower[8];
- But, arrays are first-class types, not pointers
    float v[4] != float4 v
- Implementations may subset array capabilities to match HW restrictions
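As an illustration of these declarations in use, here is a minimal sketch. The function name `applyLights` and the global `colorMatrix` are hypothetical and do not come from the slides; `lightpower` reuses the array declared above.

```hlsl
// Hypothetical globals: assumed to be set by the application.
float lightpower[8];      // general array, declared as in C
float3x3 colorMatrix;     // native matrix type

// Hypothetical helper: sums the array and applies the matrix to a color.
float4 applyLights(float4 mycolor)
{
    float total = 0.0;

    // Arrays are indexed like C arrays, but they are values, not pointers.
    for (int i = 0; i < 8; i++)
        total += lightpower[i];

    // float4 is a native vector type; .rgb selects its first three components.
    float3 rgb = mul(colorMatrix, mycolor.rgb) * total;
    return float4(rgb, mycolor.a);
}
```

On less capable profiles the compiler may have to unroll the loop, which is one way the hardware restrictions mentioned above can surface.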

Function Overloading
- Examples:
    float myfuncA(float3 x);
    float myfuncA(half3 x);
    float myfuncB(float2 a, float2 b);
    float myfuncB(float3 a, float3 b);
    float myfuncB(float4 a, float4 b);
- Very useful with so many data types.
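A short sketch of how overload resolution might look in practice. The function `lengthSquared` and the entry point `overloadDemoPS` are hypothetical names chosen for illustration; only `dot` and the vector types come from the language itself.

```hlsl
// Hypothetical overloads: the compiler selects one by argument type.
float lengthSquared(float2 v) { return dot(v, v); }
float lengthSquared(float3 v) { return dot(v, v); }
float lengthSquared(float4 v) { return dot(v, v); }

// Hypothetical pixel shader entry point that exercises all three.
float4 overloadDemoPS(float4 uv : TEXCOORD0) : COLOR
{
    float a = lengthSquared(uv.xy);   // float2 overload
    float b = lengthSquared(uv.xyz);  // float3 overload
    float c = lengthSquared(uv);      // float4 overload
    return float4(a, b, c, 1.0);
}
```

The caller never names the vector width; the swizzle on the argument decides which body runs.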

Different Constant-Typing Rules
- In C, it’s easy to accidentally use high precision
    half x, y;
    x = y * 2.0;     // Multiply is at float precision!
- Not in HLSL
    x = y * 2.0;     // Multiply is at half precision (from y)
- Unless you want to
    x = y * 2.0f;    // Multiply is at float precision

Support for Vectors and Matrices
- Component-wise + - * / for vectors
- Dot product
    dot(v1, v2);    // returns a scalar
- Matrix multiplications, assuming a float4x4 M and a float4 v:
    matrix-vector: mul(M, v);    // returns a vector
    vector-matrix: mul(v, M);    // returns a vector
    matrix-matrix: mul(M, N);    // returns a matrix
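The two vector forms of `mul` are easy to confuse, so the sketch below puts them side by side. The function name `transformBothWays` is hypothetical, and `M` is assumed to be supplied by the application.

```hlsl
float4x4 M;   // assumed to be set by the application

// Hypothetical helper contrasting the two multiplication orders.
float4 transformBothWays(float4 v)
{
    float4 asColumn = mul(M, v);   // v treated as a column vector: M * v
    float4 asRow    = mul(v, M);   // v treated as a row vector:    v * M

    // asRow is the same as mul(transpose(M), v), so the two results
    // differ unless M is symmetric.
    float  d      = dot(asColumn, asRow);   // scalar
    float4 scaled = asColumn * asRow;       // component-wise product

    return scaled + d.xxxx;                 // component-wise sum
}
```

Picking one order and using it consistently for every transform in a shader avoids an accidental transpose.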

New Operators
- Swizzle operator extracts elements from vector or matrix
    a = b.xxyy;
- Examples:
    float4 vec1 = float4(4.0, -2.0, 5.0, 3.0);
    float2 vec2 = vec1.yx;        // vec2 = (-2.0, 4.0)
    float scalar = vec1.w;        // scalar = 3.0
    float3 vec3 = scalar.xxx;     // vec3 = (3.0, 3.0, 3.0)
    float4x4 myMatrix;
    // Set myFloatScalar to myMatrix[3][2]
    float myFloatScalar = myMatrix._m32;
- Vector constructor builds vector
    a = float4(1.0, 0.0, 0.0, 1.0);
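Swizzles can also appear on the left-hand side of an assignment, and the color names r, g, b, a work as an alternative to x, y, z, w. The sketch below shows both; the function name `swapRedBlue` is hypothetical.

```hlsl
// Hypothetical helper: reorders and overwrites components with swizzles.
float4 swapRedBlue(float4 c)
{
    float4 result = c.bgra;   // read swizzle: swap red and blue
    result.a = 1.0;           // write a single component
    result.rg = c.gg;         // write two components at once
    return result;
}
```

The two naming sets cannot be mixed within one swizzle, so `c.xg` would not compile.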

Examples

Sample Shaders

Looking Through a Shader
- Demonstration in FX Composer

HLSL FX Framework

The Problem with Just a Shading Language
- A shading language describes how the vertex or fragment processor should behave
- But how about:
  - Texture state?
  - Blending state?
  - Depth test?
  - Alpha test?
- All are necessary to really encapsulate the notion of an “effect”
- Need to be able to apply an “effect” to any arbitrary set of geometry and textures
- Solution: .fx file format

HLSL FX
- Powerful shader specification and interchange format
- Provides several key benefits:
  - Encapsulation of multiple shader versions
    - Level of detail
    - Functionality
    - Performance
  - Editable parameters and GUI descriptions
  - Multipass shaders
  - Render state and texture state specification
- FX shaders use HLSL to describe shading algorithms
- For OpenGL, similar functionality is available in the form of CgFX (shader code is written in Cg)
- No GLSL effect format yet, but may appear eventually

Using Techniques
- Each .fx file typically represents an effect
- Techniques describe how to achieve the effect
- Can have different techniques for:
  - Level of detail
  - Graphics hardware with different capabilities
  - Performance
- A technique is specified using the technique keyword
- Curly braces delimit the technique’s contents
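A minimal sketch of one effect offering two techniques for hardware with different capabilities. Every name here (`passThroughVS`, `stripesPS`, `flatPS`, `Detailed`, `Fallback`) is hypothetical, and `WorldViewProj` is assumed to be set by the application.

```hlsl
float4x4 WorldViewProj;   // assumed to be set by the application

struct VSOut {
    float4 pos : POSITION;
    float2 uv  : TEXCOORD0;
};

// Shared vertex shader: transforms the position and forwards the UVs.
VSOut passThroughVS(float4 pos : POSITION, float2 uv : TEXCOORD0)
{
    VSOut o;
    o.pos = mul(pos, WorldViewProj);
    o.uv  = uv;
    return o;
}

// Detailed version: procedural stripes computed with sin().
float4 stripesPS(float2 uv : TEXCOORD0) : COLOR
{
    float s = 0.5 + 0.5 * sin(uv.x * 40.0);
    return float4(s, s, s, 1.0);
}

// Cheap version: a constant gray.
float4 flatPS() : COLOR
{
    return float4(0.5, 0.5, 0.5, 1.0);
}

technique Detailed
{
    pass p0
    {
        VertexShader = compile vs_2_0 passThroughVS();
        PixelShader  = compile ps_2_0 stripesPS();
    }
}

technique Fallback
{
    pass p0
    {
        VertexShader = compile vs_1_1 passThroughVS();
        PixelShader  = compile ps_1_1 flatPS();
    }
}
```

The application would then pick whichever technique the installed hardware can run, so the same .fx file serves both cases.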

Multipass
- Each technique may contain one or more passes
- A pass is defined by the pass keyword
- Curly braces delimit the pass contents
- You can set different graphics API state in each pass
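To make the per-pass state concrete, here is a sketch of a two-pass technique in which the second pass is blended additively over the first. The shader and technique names are hypothetical, the colors are arbitrary, and `WorldViewProj` is assumed to be set by the application.

```hlsl
float4x4 WorldViewProj;   // assumed to be set by the application

float4 basicVS(float4 pos : POSITION) : POSITION
{
    return mul(pos, WorldViewProj);
}

float4 baseColorPS() : COLOR { return float4(0.2, 0.2, 0.2, 1.0); }
float4 glowColorPS() : COLOR { return float4(0.5, 0.3, 0.0, 1.0); }

technique TwoPass
{
    // First pass: opaque base color with the depth test enabled.
    pass base
    {
        VertexShader     = compile vs_1_1 basicVS();
        PixelShader      = compile ps_1_1 baseColorPS();
        ZEnable          = true;
        AlphaBlendEnable = false;
    }

    // Second pass: same geometry, added on top of the first pass.
    pass additive
    {
        VertexShader     = compile vs_1_1 basicVS();
        PixelShader      = compile ps_1_1 glowColorPS();
        AlphaBlendEnable = true;
        SrcBlend         = One;
        DestBlend        = One;
    }
}
```

Only the blend state differs between the passes, which is the kind of API state a bare shading language has no way to express.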

An Example: SimpleTexPs.fx
    /************* TWEAKABLES **************/
    float4x4 WorldIT : WorldInverseTranspose < string UIWidget = "None"; >;
    float4x4 WorldViewProj : WorldViewProjection < string UIWidget = "None"; >;
    float4x4 World : World < string UIWidget = "None"; >;
    float4x4 ViewI : ViewInverseTranspose < string UIWidget = "None"; >;
    ///////////////
    float3 LightPos : Position <
        string Object = "PointLight";
        string Space = "World";
    > = {-10.0f, 10.0f, -10.0f};
    float3 AmbiColor : Ambient = {0.1f, 0.1f, 0.1f};

An Example: SimpleTexPs.fx (Cont’d)
    texture ColorTexture : DIFFUSE <
        string ResourceName = "default_color.dds";
        string TextureType = "2D";
    >;
    sampler2D cmap = sampler_state
    {
        Texture = <ColorTexture>;
        MinFilter = Linear;
        MagFilter = Linear;
        MipFilter = None;
    };

An Example: SimpleTexPs.fx (Cont’d)
    /* data from application vertex buffer */
    struct appdata {
        float3 Position : POSITION;
        float4 UV       : TEXCOORD0;
        float4 Normal   : NORMAL;
    };
    /* data passed from vertex shader to pixel shader */
    struct vertexOutput {
        float4 HPosition : POSITION;
        float2 TexCoord0 : TEXCOORD0;
        float4 diffCol   : COLOR0;
    };

An Example: SimpleTexPs.fx (Cont’d)
    /*********** vertex shader ******/
    vertexOutput lambVS(appdata IN)
    {
        vertexOutput OUT;
        float3 Nn = normalize(mul(IN.Normal, WorldIT).xyz);
        float4 Po = float4(IN.Position.xyz, 1);
        OUT.HPosition = mul(Po, WorldViewProj);
        float3 Pw = mul(Po, World).xyz;
        float3 Ln = normalize(LightPos - Pw);
        float ldn = dot(Ln, Nn);
        float diffComp = max(0, ldn);
        OUT.diffCol = float4((diffComp.xxx + AmbiColor), 1);
        OUT.TexCoord0 = IN.UV.xy;
        return OUT;
    }

An Example: SimpleTexPs.fx (Cont’d)
    /********* pixel shader ********/
    float4 myps(vertexOutput IN) : COLOR {
        float4 texColor = tex2D(cmap, IN.TexCoord0);
        float4 result = texColor * IN.diffCol;
        return result;
    }

An Example: SimpleTexPs.fx (Cont’d)
    technique t0
    {
        pass p0
        {
            VertexShader = compile vs_1_1 lambVS();
            ZEnable = true;
            ZWriteEnable = true;
            CullMode = None;
            PixelShader = compile ps_1_1 myps();
        }
    }

HLSL .fx Example
- Demonstrations in FX Composer

Questions?

developer.nvidia.com: The Source for GPU Programming
- Latest documentation
- SDKs
- Cutting-edge tools
  - Performance analysis tools
  - Content creation tools
- Hundreds of effects
- Video presentations and tutorials
- Libraries and utilities
- News and newsletter archives
EverQuest content courtesy Sony Online Entertainment Inc.

GPU Gems: Programming Techniques, Tips, and Tricks for Real-Time Graphics
- Practical real-time graphics techniques from experts at leading corporations and universities
- Great value:
  - Full color (300 diagrams and screenshots)
  - Hard cover
  - 816 pages
  - CD-ROM with demos and sample code
- For more, visit: http://developer.nvidia.com/GPUGems

“GPU Gems is a cool toolbox of advanced graphics techniques. Novice programmers and graphics gurus alike will find the gems practical, intriguing, and useful.”
Tim Sweeney, Lead programmer of Unreal at Epic Games

“This collection of articles is particularly impressive for its depth and breadth. The book includes product-oriented case studies, previously unpublished state-of-the-art research, comprehensive tutorials, and extensive code samples and demos throughout.”
Eric Haines, Author of Real-Time Rendering
