Introduction To CUDA C - Artificial Intelligence Computing


Introduction to CUDA C
San Jose Convention Center, September 20, 2010

Who Am I?
Jason Sanders
Senior Software Engineer, NVIDIA
Co-author of CUDA by Example

What is CUDA?
CUDA Architecture
— Expose general-purpose GPU computing as first-class capability
— Retain traditional DirectX/OpenGL graphics performance
CUDA C
— Based on industry-standard C
— A handful of language extensions to allow heterogeneous programs
— Straightforward APIs to manage devices, memory, etc.
This talk will introduce you to CUDA C

Introduction to CUDA C
What will you learn today?
— Start from “Hello, World!”
— Write and launch CUDA C kernels
— Manage GPU memory
— Run parallel kernels in CUDA C
— Parallel communication and synchronization
— Race conditions and atomic operations

CUDA C Prerequisites
You (probably) need experience with C or C++
You do not need any GPU experience
You do not need any graphics experience
You do not need any parallel programming experience

CUDA C: The Basics
Terminology:
Host – the CPU and its memory (host memory)
Device – the GPU and its memory (device memory)
[Figure: host and device shown side by side; not to scale]
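The slides do not show it, but the “straightforward APIs to manage devices” mentioned earlier include simple device-management calls. A minimal sketch (my own, assuming the CUDA runtime headers and at least one CUDA-capable GPU) that lists the visible devices:

    #include <stdio.h>
    #include <cuda_runtime.h>

    int main( void ) {
        int count = 0;
        cudaGetDeviceCount( &count );              // how many CUDA devices are visible
        for( int i = 0; i < count; i++ ) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties( &prop, i );   // fill in properties for device i
            printf( "Device %d: %s\n", i, prop.name );
        }
        return 0;
    }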

Hello, World!

    int main( void ) {
        printf( "Hello, World!\n" );
        return 0;
    }

This basic program is just standard C that runs on the host
NVIDIA’s compiler (nvcc) will not complain about CUDA programs with no device code
At its simplest, CUDA C is just C!

Hello, World! with Device Code

    __global__ void kernel( void ) {
    }

    int main( void ) {
        kernel<<< 1, 1 >>>();
        printf( "Hello, World!\n" );
        return 0;
    }

Two notable additions to the original “Hello, World!”

Hello, World! with Device Code

    __global__ void kernel( void ) {
    }

The CUDA C keyword __global__ indicates that a function
— Runs on the device
— Is called from host code
nvcc splits the source file into host and device components
— NVIDIA’s compiler handles device functions like kernel()
— A standard host compiler (gcc, Microsoft Visual C++) handles host functions like main()

Hello, World! with Device Code

    int main( void ) {
        kernel<<< 1, 1 >>>();
        printf( "Hello, World!\n" );
        return 0;
    }

Triple angle brackets mark a call from host code to device code
— Sometimes called a “kernel launch”
— We’ll discuss the parameters inside the angle brackets later
This is all that’s required to execute a function on the GPU!
The function kernel() does nothing, so this is fairly anticlimactic

A More Complex Example
A simple kernel to add two integers:

    __global__ void add( int *a, int *b, int *c ) {
        *c = *a + *b;
    }

As before, __global__ is a CUDA C keyword meaning
— add() will execute on the device
— add() will be called from the host

A More Complex Example
Notice that we use pointers for our variables:

    __global__ void add( int *a, int *b, int *c ) {
        *c = *a + *b;
    }

add() runs on the device, so a, b, and c must point to device memory
How do we allocate memory on the GPU?

Memory Management
Host and device memory are distinct entities
— Device pointers point to GPU memory
  May be passed to and from host code
  May not be dereferenced from host code
— Host pointers point to CPU memory
  May be passed to and from device code
  May not be dereferenced from device code
Basic CUDA API for dealing with device memory
— cudaMalloc(), cudaFree(), cudaMemcpy()
— Similar to their C equivalents, malloc(), free(), memcpy()
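As a minimal sketch of that API (not on the slides; the value and sizes are arbitrary), allocating a device buffer, copying data in and out, and freeing it looks much like its C counterparts:

    int h_value = 42;        // host copy
    int *d_value = NULL;     // device pointer

    cudaMalloc( (void**)&d_value, sizeof( int ) );                          // allocate on the GPU
    cudaMemcpy( d_value, &h_value, sizeof( int ), cudaMemcpyHostToDevice ); // host -> device
    cudaMemcpy( &h_value, d_value, sizeof( int ), cudaMemcpyDeviceToHost ); // device -> host
    cudaFree( d_value );                                                    // release GPU memory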

A More Complex Example: add()
Using our add() kernel:

    __global__ void add( int *a, int *b, int *c ) {
        *c = *a + *b;
    }

Let’s take a look at main()

A More Complex Example: main()

    int main( void ) {
        int a, b, c;                  // host copies of a, b, c
        int *dev_a, *dev_b, *dev_c;   // device copies of a, b, c
        int size = sizeof( int );     // we need space for an integer

        // allocate device copies of a, b, c
        cudaMalloc( (void**)&dev_a, size );
        cudaMalloc( (void**)&dev_b, size );
        cudaMalloc( (void**)&dev_c, size );

        a = 2;
        b = 7;

A More Complex Example: main() (cont)

        // copy inputs to device
        cudaMemcpy( dev_a, &a, size, cudaMemcpyHostToDevice );
        cudaMemcpy( dev_b, &b, size, cudaMemcpyHostToDevice );

        // launch add() kernel on GPU, passing parameters
        add<<< 1, 1 >>>( dev_a, dev_b, dev_c );

        // copy device result back to host copy of c
        cudaMemcpy( &c, dev_c, size, cudaMemcpyDeviceToHost );

        cudaFree( dev_a );
        cudaFree( dev_b );
        cudaFree( dev_c );
        return 0;
    }
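The slides omit error handling for brevity. In practice every runtime call returns a cudaError_t, and a small helper (a sketch of my own, not part of the original example) keeps the checks readable:

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    // Hypothetical helper: print a message and exit if a CUDA runtime call fails
    static void check( cudaError_t err, const char *what ) {
        if( err != cudaSuccess ) {
            printf( "%s failed: %s\n", what, cudaGetErrorString( err ) );
            exit( 1 );
        }
    }

    // usage:
    // check( cudaMalloc( (void**)&dev_a, size ), "cudaMalloc dev_a" );
    // check( cudaMemcpy( dev_a, &a, size, cudaMemcpyHostToDevice ), "cudaMemcpy to dev_a" );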

Parallel Programming in CUDA C
But wait… GPU computing is about massive parallelism
So how do we run code in parallel on the device?
The solution lies in the parameters between the triple angle brackets:

    add<<< 1, 1 >>>( dev_a, dev_b, dev_c );
    add<<< N, 1 >>>( dev_a, dev_b, dev_c );

Instead of executing add() once, add() is executed N times in parallel

Parallel Programming in CUDA C
With add() running in parallel, let’s do vector addition
Terminology: each parallel invocation of add() is referred to as a block
A kernel can refer to its block’s index with the variable blockIdx.x
Each block adds a value from a[] and b[], storing the result in c[]:

    __global__ void add( int *a, int *b, int *c ) {
        c[blockIdx.x] = a[blockIdx.x] + b[blockIdx.x];
    }

By using blockIdx.x to index the arrays, each block handles different indices

Parallel Programming in CUDA C
We write this code:

    __global__ void add( int *a, int *b, int *c ) {
        c[blockIdx.x] = a[blockIdx.x] + b[blockIdx.x];
    }

This is what runs in parallel on the device:

    Block 0: c[0] = a[0] + b[0];
    Block 1: c[1] = a[1] + b[1];
    Block 2: c[2] = a[2] + b[2];
    Block 3: c[3] = a[3] + b[3];

Parallel Addition: add()
Using our newly parallelized add() kernel:

    __global__ void add( int *a, int *b, int *c ) {
        c[blockIdx.x] = a[blockIdx.x] + b[blockIdx.x];
    }

Let’s take a look at main()

Parallel Addition: main()

    #define N 512

    int main( void ) {
        int *a, *b, *c;               // host copies of a, b, c
        int *dev_a, *dev_b, *dev_c;   // device copies of a, b, c
        int size = N * sizeof( int ); // we need space for 512 integers

        // allocate device copies of a, b, c
        cudaMalloc( (void**)&dev_a, size );
        cudaMalloc( (void**)&dev_b, size );
        cudaMalloc( (void**)&dev_c, size );

        a = (int*)malloc( size );
        b = (int*)malloc( size );
        c = (int*)malloc( size );

        random_ints( a, N );
        random_ints( b, N );
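random_ints() is a helper the slides never show. A minimal sketch of what it could look like (my assumption, not the presenter’s code):

    #include <stdlib.h>

    // Fill an array with small pseudo-random integers
    void random_ints( int *x, int n ) {
        for( int i = 0; i < n; i++ )
            x[i] = rand() % 100;
    }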

Parallel Addition: main() (cont)

        // copy inputs to device
        cudaMemcpy( dev_a, a, size, cudaMemcpyHostToDevice );
        cudaMemcpy( dev_b, b, size, cudaMemcpyHostToDevice );

        // launch add() kernel with N parallel blocks
        add<<< N, 1 >>>( dev_a, dev_b, dev_c );

        // copy device result back to host copy of c
        cudaMemcpy( c, dev_c, size, cudaMemcpyDeviceToHost );

        free( a ); free( b ); free( c );
        cudaFree( dev_a );
        cudaFree( dev_b );
        cudaFree( dev_c );
        return 0;
    }

Review
Difference between “host” and “device”
— Host = CPU
— Device = GPU
Using __global__ to declare a function as device code
— Runs on the device
— Called from the host
Passing parameters from host code to a device function

Review (cont)
Basic device memory management
— cudaMalloc()
— cudaMemcpy()
— cudaFree()
Launching parallel kernels
— Launch N copies of add() with: add<<< N, 1 >>>();
— Used blockIdx.x to access the block’s index

Threads
Terminology: a block can be split into parallel threads
Let’s change vector addition to use parallel threads instead of parallel blocks:

    __global__ void add( int *a, int *b, int *c ) {
        c[threadIdx.x] = a[threadIdx.x] + b[threadIdx.x];
    }

We use threadIdx.x instead of blockIdx.x in add()
main() will require one change as well

Parallel Addition (Threads): main()

    #define N 512

    int main( void ) {
        int *a, *b, *c;               // host copies of a, b, c
        int *dev_a, *dev_b, *dev_c;   // device copies of a, b, c
        int size = N * sizeof( int ); // we need space for 512 integers

        // allocate device copies of a, b, c
        cudaMalloc( (void**)&dev_a, size );
        cudaMalloc( (void**)&dev_b, size );
        cudaMalloc( (void**)&dev_c, size );

        a = (int*)malloc( size );
        b = (int*)malloc( size );
        c = (int*)malloc( size );

        random_ints( a, N );
        random_ints( b, N );

Parallel Addition (Threads): main() (cont)

        // copy inputs to device
        cudaMemcpy( dev_a, a, size, cudaMemcpyHostToDevice );
        cudaMemcpy( dev_b, b, size, cudaMemcpyHostToDevice );

        // launch add() kernel with N threads
        add<<< 1, N >>>( dev_a, dev_b, dev_c );

        // copy device result back to host copy of c
        cudaMemcpy( c, dev_c, size, cudaMemcpyDeviceToHost );

        free( a ); free( b ); free( c );
        cudaFree( dev_a );
        cudaFree( dev_b );
        cudaFree( dev_c );
        return 0;
    }

Using Threads And Blocks
We’ve seen parallel vector addition using
— Many blocks with 1 thread apiece
— 1 block with many threads
Let’s adapt vector addition to use lots of both blocks and threads
After using threads and blocks together, we’ll talk about why threads
First, let’s discuss data indexing

Indexing Arrays With Threads And Blocks
No longer as simple as just using threadIdx.x or blockIdx.x as indices
To index an array with 1 thread per entry (using 8 threads/block):

    threadIdx.x:  0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7
    blockIdx.x:         0                 1                 2                 3

If we have M threads/block, a unique array index for each entry is given by

    int index = threadIdx.x + blockIdx.x * M;

just as a 2D position (x, y) maps to a 1D index:

    int index = x + y * width;

Indexing Arrays: Example
In this example, the red entry would have an index of 21:
[Figure: array entries 0 through 21, with M = 8 threads/block; the highlighted entry sits in blockIdx.x = 2]

    int index = threadIdx.x + blockIdx.x * M;
              = 5 + 2 * 8;
              = 21;

Addition with Threads and Blocks
blockDim.x is a built-in variable for threads per block:

    int index = threadIdx.x + blockIdx.x * blockDim.x;

A combined version of our vector addition kernel to use blocks and threads:

    __global__ void add( int *a, int *b, int *c ) {
        int index = threadIdx.x + blockIdx.x * blockDim.x;
        c[index] = a[index] + b[index];
    }

So what changes in main() when we use both blocks and threads?
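One detail the slides can sidestep because N is an exact multiple of the block size: when it is not, the last block contains threads whose index falls past the end of the arrays. A common variant (a sketch, not the presenter’s code) passes the element count and guards the access:

    __global__ void add( int *a, int *b, int *c, int n ) {
        int index = threadIdx.x + blockIdx.x * blockDim.x;
        if( index < n )                 // threads past the end do nothing
            c[index] = a[index] + b[index];
    }

    // launched with enough blocks to cover n elements:
    // add<<< (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK, THREADS_PER_BLOCK >>>( dev_a, dev_b, dev_c, n );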

Parallel Addition (Blocks/Threads): main()

    #define N (2048*2048)
    #define THREADS_PER_BLOCK 512

    int main( void ) {
        int *a, *b, *c;               // host copies of a, b, c
        int *dev_a, *dev_b, *dev_c;   // device copies of a, b, c
        int size = N * sizeof( int ); // we need space for N integers

        // allocate device copies of a, b, c
        cudaMalloc( (void**)&dev_a, size );
        cudaMalloc( (void**)&dev_b, size );
        cudaMalloc( (void**)&dev_c, size );

        a = (int*)malloc( size );
        b = (int*)malloc( size );
        c = (int*)malloc( size );

        random_ints( a, N );
        random_ints( b, N );

Parallel Addition (Blocks/Threads): main() (cont)

        // copy inputs to device
        cudaMemcpy( dev_a, a, size, cudaMemcpyHostToDevice );
        cudaMemcpy( dev_b, b, size, cudaMemcpyHostToDevice );

        // launch add() kernel with blocks and threads
        add<<< N/THREADS_PER_BLOCK, THREADS_PER_BLOCK >>>( dev_a, dev_b, dev_c );

        // copy device result back to host copy of c
        cudaMemcpy( c, dev_c, size, cudaMemcpyDeviceToHost );

        free( a ); free( b ); free( c );
        cudaFree( dev_a );
        cudaFree( dev_b );
        cudaFree( dev_c );
        return 0;
    }

Why Bother With Threads?
Threads seem unnecessary
— They add a level of abstraction and complexity
— What did we gain?
Unlike parallel blocks, parallel threads have mechanisms to
— Communicate
— Synchronize
Let’s see how

Dot Product
Unlike vector addition, the dot product is a reduction from vectors to a scalar:

    c = a · b
    c = (a0, a1, a2, a3) · (b0, b1, b2, b3)
    c = a0 b0 + a1 b1 + a2 b2 + a3 b3
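For reference, the whole computation on the host is just a loop; this serial sketch (not part of the slides) is what the parallel kernels below reproduce:

    // Serial dot product on the CPU
    int dot_serial( int *a, int *b, int n ) {
        int sum = 0;
        for( int i = 0; i < n; i++ )
            sum += a[i] * b[i];
        return sum;
    }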

Dot Product
Parallel threads have no problem computing the pairwise products a[i] * b[i]
So we can start a dot product CUDA kernel by doing just that:

    __global__ void dot( int *a, int *b, int *c ) {
        // Each thread computes a pairwise product
        int temp = a[threadIdx.x] * b[threadIdx.x];

Dot Product
But we need to share data between threads to compute the final sum:

    __global__ void dot( int *a, int *b, int *c ) {
        // Each thread computes a pairwise product
        int temp = a[threadIdx.x] * b[threadIdx.x];

        // Can’t compute the final sum
        // Each thread’s copy of ‘temp’ is private
    }

Sharing Data Between Threads
Terminology: a block of threads shares memory called shared memory
Extremely fast, on-chip memory (user-managed cache)
Declared with the __shared__ CUDA keyword
Not visible to threads in other blocks running in parallel
[Figure: Block 0, Block 1, and Block 2 each have their own threads and their own shared memory]

Parallel Dot Product: dot()
We perform parallel multiplication, serial addition:

    #define N 512

    __global__ void dot( int *a, int *b, int *c ) {
        // Shared memory for results of multiplication
        __shared__ int temp[N];
        temp[threadIdx.x] = a[threadIdx.x] * b[threadIdx.x];

        // Thread 0 sums the pairwise products
        if( 0 == threadIdx.x ) {
            int sum = 0;
            for( int i = 0; i < N; i++ )
                sum += temp[i];
            *c = sum;
        }
    }

Parallel Dot Product Recap
We perform parallel, pairwise multiplications
Shared memory stores each thread’s result
We sum these pairwise products from a single thread
Sounds good… but we’ve made a huge mistake

Faulty Dot Product Exposed!
Step 1: in parallel, each thread writes a pairwise product into __shared__ int temp[]
Step 2: thread 0 reads and sums the products from __shared__ int temp[]
But there’s an assumption hidden in Step 1

Read-Before-Write Hazard
Suppose thread 0 finishes its write in Step 1
Then thread 0 reads index 12 in Step 2
What if thread 12 hasn’t yet written to index 12 in Step 1?
This read returns garbage!

Synchronization
We need threads to wait between the sections of dot():

    __global__ void dot( int *a, int *b, int *c ) {
        __shared__ int temp[N];
        temp[threadIdx.x] = a[threadIdx.x] * b[threadIdx.x];

        // * NEED THREADS TO SYNCHRONIZE HERE *
        // No thread can advance until all threads
        // have reached this point in the code

        // Thread 0 sums the pairwise products
        if( 0 == threadIdx.x ) {
            int sum = 0;
            for( int i = 0; i < N; i++ )
                sum += temp[i];
            *c = sum;
        }
    }

__syncthreads()
We can synchronize threads with the function __syncthreads()
Threads in the block wait until all threads have hit the __syncthreads()
[Figure: each thread in a block reaches __syncthreads() at a different time; none proceeds until all have arrived]
Threads are only synchronized within a block

Parallel Dot Product: dot()

    __global__ void dot( int *a, int *b, int *c ) {
        __shared__ int temp[N];
        temp[threadIdx.x] = a[threadIdx.x] * b[threadIdx.x];

        __syncthreads();

        if( 0 == threadIdx.x ) {
            int sum = 0;
            for( int i = 0; i < N; i++ )
                sum += temp[i];
            *c = sum;
        }
    }

With a properly synchronized dot() routine, let’s look at main()

Parallel Dot Product: main()

    #define N 512

    int main( void ) {
        int *a, *b, *c;               // host copies of a, b, c
        int *dev_a, *dev_b, *dev_c;   // device copies of a, b, c
        int size = N * sizeof( int ); // we need space for 512 integers

        // allocate device copies of a, b, c
        cudaMalloc( (void**)&dev_a, size );
        cudaMalloc( (void**)&dev_b, size );
        cudaMalloc( (void**)&dev_c, sizeof( int ) );

        a = (int *)malloc( size );
        b = (int *)malloc( size );
        c = (int *)malloc( sizeof( int ) );

        random_ints( a, N );
        random_ints( b, N );

Parallel Dot Product: main() (cont)

        // copy inputs to device
        cudaMemcpy( dev_a, a, size, cudaMemcpyHostToDevice );
        cudaMemcpy( dev_b, b, size, cudaMemcpyHostToDevice );

        // launch dot() kernel with 1 block and N threads
        dot<<< 1, N >>>( dev_a, dev_b, dev_c );

        // copy device result back to host copy of c
        cudaMemcpy( c, dev_c, sizeof( int ), cudaMemcpyDeviceToHost );

        free( a ); free( b ); free( c );
        cudaFree( dev_a );
        cudaFree( dev_b );
        cudaFree( dev_c );
        return 0;
    }
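Not shown on the slides, but a quick way to convince yourself the kernel is right is to compare against a serial loop before freeing the buffers. A hedged sketch, relying on the dot_serial() helper sketched earlier:

    #include <stdio.h>

    // Hypothetical check: compare the GPU result against the serial reference
    void verify_dot( int *a, int *b, int n, int gpu_result ) {
        int expected = dot_serial( a, b, n );
        if( gpu_result != expected )
            printf( "Mismatch: GPU %d vs CPU %d\n", gpu_result, expected );
        else
            printf( "Dot product verified: %d\n", expected );
    }

    // usage, just after copying dev_c back into c:
    // verify_dot( a, b, N, *c );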

Review
Launching kernels with parallel threads
— Launch add() with N threads: add<<< 1, N >>>();
— Used threadIdx.x to access the thread’s index
Using both blocks and threads
— Used (threadIdx.x + blockIdx.x * blockDim.x) to index input/output
— N/THREADS_PER_BLOCK blocks and THREADS_PER_BLOCK threads gave us N threads total

Review (cont)
Using __shared__ to declare memory as shared memory
— Data is shared among threads in a block
— Not visible to threads in other parallel blocks
Using __syncthreads() as a barrier
— No thread executes instructions after __syncthreads() until all threads have reached the __syncthreads()
— Needs to be used to prevent data hazards

Multiblock Dot Product
Recall our dot product launch:

    // launch dot() kernel with 1 block and N threads
    dot<<< 1, N >>>( dev_a, dev_b, dev_c );

Launching with one block will not utilize much of the GPU
Let’s write a multiblock version of the dot product

Multiblock Dot Product: Algorithm
Each block computes a sum of its pairwise products, like before:
[Figure: Block 0 multiplies a0…a3 with b0…b3 and accumulates a partial sum; Block 1 does the same for a512…a515 and b512…b515]

Multiblock Dot Product: Algorithm
And then contributes its sum to the final result:
[Figure: each block’s partial sum (Block 0, Block 1, …) is added into the single value c]

Multiblock Dot Product: dot()

    #define N (2048*2048)
    #define THREADS_PER_BLOCK 512

    __global__ void dot( int *a, int *b, int *c ) {
        __shared__ int temp[THREADS_PER_BLOCK];
        int index = threadIdx.x + blockIdx.x * blockDim.x;
        temp[threadIdx.x] = a[index] * b[index];

        __syncthreads();

        if( 0 == threadIdx.x ) {
            int sum = 0;
            for( int i = 0; i < THREADS_PER_BLOCK; i++ )
                sum += temp[i];
            *c += sum;
        }
    }

But we have a race condition on *c += sum
We can fix it with one of CUDA’s atomic operations: atomicAdd( c, sum )

Race Conditions
Terminology: a race condition occurs when program behavior depends upon relative timing of two (or more) event sequences
What actually takes place to execute the line in question: *c += sum;
— Read value at address c
— Add sum to value
— Write result to address c
Terminology: Read-Modify-Write
What if two threads are trying to do this at the same time?
Thread 0, Block 0
— Read value at address c
— Add sum to value
— Write result to address c
Thread 0, Block 1
— Read value at address c
— Add sum to value
— Write result to address c

Global Memory Contention
Two blocks execute *c += sum, with c starting at 0, Block 0’s sum = 3, Block 1’s sum = 4
If the two read-modify-write sequences happen to run back to back:

    Block 0: reads 0 → computes 0 + 3 → writes 3
    Block 1: reads 3 → computes 3 + 4 → writes 7

c ends up as 7, the correct result

Global Memory Contention
But if the two read-modify-write sequences interleave:

    Block 0: reads 0 → computes 0 + 3 → writes 3
    Block 1: reads 0 → computes 0 + 4 → writes 4

Block 1 read c before Block 0’s write landed, so c ends up as 4 instead of 7

Atomic Operations
Terminology: a read-modify-write is uninterruptible when atomic
Many atomic operations on memory are available with CUDA C:
— atomicAdd()   atomicInc()
— atomicSub()   atomicDec()
— atomicMin()   atomicExch()
— atomicMax()   atomicCAS()
Predictable result when simultaneous access to memory is required
We need to atomically add sum to c in our multiblock dot product
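As a small illustration of the same idea outside the dot product (a sketch of my own, not from the slides), atomicAdd() lets many threads safely update a single counter at once:

    // Count how many elements are negative; many threads update *count concurrently
    __global__ void count_negative( int *data, int n, int *count ) {
        int index = threadIdx.x + blockIdx.x * blockDim.x;
        if( index < n && data[index] < 0 )
            atomicAdd( count, 1 );    // uninterruptible read-modify-write
    }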

Multiblock Dot Product: dot()

    __global__ void dot( int *a, int *b, int *c ) {
        __shared__ int temp[THREADS_PER_BLOCK];
        int index = threadIdx.x + blockIdx.x * blockDim.x;
        temp[threadIdx.x] = a[index] * b[index];

        __syncthreads();

        if( 0 == threadIdx.x ) {
            int sum = 0;
            for( int i = 0; i < THREADS_PER_BLOCK; i++ )
                sum += temp[i];
            atomicAdd( c, sum );
        }
    }

Now let’s fix up main() to handle a multiblock dot product
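The serial loop in thread 0 leaves most of the block idle. A common refinement, not covered in this talk but in the spirit of CUDA by Example, is a shared-memory tree reduction; this sketch assumes THREADS_PER_BLOCK is a power of two:

    __global__ void dot( int *a, int *b, int *c ) {
        __shared__ int temp[THREADS_PER_BLOCK];
        int index = threadIdx.x + blockIdx.x * blockDim.x;
        temp[threadIdx.x] = a[index] * b[index];
        __syncthreads();

        // Halve the number of active threads each step, pairwise-summing into the lower half
        for( int stride = blockDim.x / 2; stride > 0; stride /= 2 ) {
            if( threadIdx.x < stride )
                temp[threadIdx.x] += temp[threadIdx.x + stride];
            __syncthreads();
        }

        if( 0 == threadIdx.x )
            atomicAdd( c, temp[0] );   // one atomic per block, as before
    }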

Parallel Dot Product: main()

    #define N (2048*2048)
    #define THREADS_PER_BLOCK 512

    int main( void ) {
        int *a, *b, *c;               // host copies of a, b, c
        int *dev_a, *dev_b, *dev_c;   // device copies of a, b, c
        int size = N * sizeof( int ); // we need space for N ints

        // allocate device copies of a, b, c
        cudaMalloc( (void**)&dev_a, size );
        cudaMalloc( (void**)&dev_b, size );
        cudaMalloc( (void**)&dev_c, sizeof( int ) );

        a = (int *)malloc( size );
        b = (int *)malloc( size );
        c = (int *)malloc( sizeof( int ) );

        random_ints( a, N );
        random_ints( b, N );

Parallel Dot Product: main() (cont)

        // copy inputs to device
        cudaMemcpy( dev_a, a, size, cudaMemcpyHostToDevice );
        cudaMemcpy( dev_b, b, size, cudaMemcpyHostToDevice );

        // launch dot() kernel
        dot<<< N/THREADS_PER_BLOCK, THREADS_PER_BLOCK >>>( dev_a, dev_b, dev_c );

        // copy device result back to host copy of c
        cudaMemcpy( c, dev_c, sizeof( int ), cudaMemcpyDeviceToHost );

        free( a ); free( b ); free( c );
        cudaFree( dev_a );
        cudaFree( dev_b );
        cudaFree( dev_c );
        return 0;
    }

Review
Race conditions
— Behavior depends upon relative timing of multiple event sequences
— Can occur when an implied read-modify-write is interruptible
Atomic operations
— CUDA provides read-modify-write operations guaranteed to be atomic
— Atomics ensure correct results when multiple threads modify memory

To Learn More CUDA C
Check out CUDA by Example
— Parallel Programming in CUDA C
— Thread Cooperation
— Constant Memory and Events
— Texture Memory
— Graphics Interoperability
— Atomics
— Str
