Parallel Computing - University of Southern California


Parallel Computing
CSCI 201 - Principles of Software Development
Jeffrey Miller, Ph.D.
jeffrey.miller@usc.edu

Outline
› Parallel Computing
› Program

Parallel Computing
› Distributed computing studies software systems in which components located on networked computers communicate through message passing
	› Individual threads have only partial knowledge of the problem
› Parallel computing is a term used for programs that operate within a shared memory space on multiple processors or cores

Redundant Hardware Determination
› To determine what CPU or CPUs you have in your computer:
	› On Windows, open Run (Windows key + R) and type msinfo32
	› On a Mac, go to the Apple menu - About This Mac - System Report
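Inside a Java program, the same count can be read at runtime; a minimal sketch (class name is ours) using the standard Runtime API:

	public class CoreCount {
		public static void main(String[] args) {
			// Logical cores available to the JVM; this is also the
			// default parallelism used by new ForkJoinPool()
			int cores = Runtime.getRuntime().availableProcessors();
			System.out.println("Available processors: " + cores);
		}
	}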

Fork/Join Framework
› The fork/join framework is an implementation of the ExecutorService interface that allows a program to take advantage of multiple cores or multiple processors
› It is designed for work that can be broken into smaller pieces recursively, so that all available processing power is used
› We can create threads and add them to a thread pool
› The ForkJoinPool class implements the core algorithm for parallel computing in Java and can execute ForkJoinTask processes

Fork/Join Framework Structure
› Pseudocode for the fork/join framework is:

	if (condition)
		do the work directly
	else
		split the work into multiple pieces
		invoke the pieces and wait for the results

› The above code should be inside a subclass of ForkJoinTask
	› RecursiveTask can return a result
	› RecursiveAction does not return a result
› Create the object that represents the work to be done and pass it to the invoke() or execute() method of a ForkJoinPool
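As a concrete illustration of that structure, here is a minimal sketch of a RecursiveTask following the same if/else pattern; the class name ArraySum and the THRESHOLD cutoff are our own illustration, not from the lecture:

	import java.util.concurrent.RecursiveTask;

	class ArraySum extends RecursiveTask<Long> {
		private static final int THRESHOLD = 1000; // "small enough" cutoff; value is arbitrary
		private final int[] data;
		private final int lo, hi;

		ArraySum(int[] data, int lo, int hi) {
			this.data = data;
			this.lo = lo;
			this.hi = hi;
		}

		protected Long compute() {
			if (hi - lo <= THRESHOLD) {           // do the work directly
				long sum = 0;
				for (int i = lo; i < hi; i++) {
					sum += data[i];
				}
				return sum;
			}
			int mid = (lo + hi) / 2;              // split work into multiple pieces
			ArraySum left = new ArraySum(data, lo, mid);
			ArraySum right = new ArraySum(data, mid, hi);
			left.fork();                          // run the left half asynchronously
			return right.compute() + left.join(); // invoke the pieces and wait for results
		}
	}

A ForkJoinPool would then run it with something like long total = new ForkJoinPool().invoke(new ArraySum(nums, 0, nums.length));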

ForkJoinTask Class
› The ForkJoinTask class has a method compute() that should be overridden
› The compute() method will contain the main computation performed by the task
› The fork() method allows asynchronous execution (it starts the task, which calls compute())
› The join() method will not proceed until the task's compute() method has completed as a result of the call to fork()
› The invoke() method on a ForkJoinPool will execute the compute() method and return the value (through a fork() then a join() call)
› The execute() method on a ForkJoinPool will execute the compute() method asynchronously but will not return any value (through a fork() call with no join())
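Assuming the ArraySum sketch above (our example, not the lecture's), the two ForkJoinPool entry points compare like this:

	ForkJoinPool pool = new ForkJoinPool();

	// invoke(): runs the task and blocks until compute() finishes, returning its result
	long total = pool.invoke(new ArraySum(nums, 0, nums.length));

	// execute(): submits the task and returns immediately with no result;
	// call join() later to wait for, and retrieve, the result
	ArraySum task = new ArraySum(nums, 0, nums.length);
	pool.execute(task);
	long totalAgain = task.join();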

Multi-Threading vs. Parallel Comparison
› Class/interface from which to inherit
	› Multi-threading: Thread / Runnable
	› Parallel: RecursiveAction or RecursiveTask
› Method to override
	› Multi-threading: void run()
	› Parallel: protected abstract void compute() (RecursiveAction), protected abstract V compute() (RecursiveTask)
› Method called to start the thread
	› Multi-threading: public void start()
	› Parallel: public final ForkJoinTask<V> fork()
› Managing class
	› Multi-threading: ExecutorService
	› Parallel: ForkJoinPool
› Method to call on the managing class to start a thread
	› Multi-threading: void execute(Runnable)
	› Parallel: public void execute(ForkJoinTask<?>) (RecursiveAction), public <T> T invoke(ForkJoinTask<T>) (RecursiveTask)
› Method to stop the current thread from continuing until another thread completes
	› Multi-threading: public final void join()
	› Parallel: public final V join()
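To make the mapping concrete, a minimal side-by-side sketch (class and variable names are ours): the same trivial job written once as a Runnable handed to an ExecutorService and once as a RecursiveAction handed to a ForkJoinPool:

	import java.util.concurrent.ExecutorService;
	import java.util.concurrent.Executors;
	import java.util.concurrent.ForkJoinPool;
	import java.util.concurrent.RecursiveAction;

	public class SideBySide {
		public static void main(String[] args) {
			// Multi-threading: implement run(), hand it to an ExecutorService
			ExecutorService es = Executors.newFixedThreadPool(2);
			es.execute(() -> System.out.println("Runnable running"));
			es.shutdown();

			// Parallel: override compute(), hand it to a ForkJoinPool
			ForkJoinPool pool = new ForkJoinPool();
			RecursiveAction action = new RecursiveAction() {
				protected void compute() {
					System.out.println("RecursiveAction computing");
				}
			};
			pool.invoke(action); // blocks until compute() completes
		}
	}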

Running Time
› Any time you fork a task, there is overhead in moving it to another CPU, executing it, and getting the response back
› Just because you are parallelizing code does not mean you will see an improvement in execution speed
› If you fork more threads than you have CPUs, the threads will execute concurrently on each CPU (time-slicing, similar to multi-threading)

Program
› Write a program to add all of the numbers between 0 and 1,000,000,000
› Do this in a single-threaded manner and then in a parallel manner
› Compare the running times

Not Parallel

public class SumNoParallel {
	public static void main(String[] args) {
		long minNumber = 0;
		long maxNumber = 1_000_000_000;
		long before = System.currentTimeMillis();
		Sum sum = new Sum(minNumber, maxNumber);
		long num = sum.compute();
		long after = System.currentTimeMillis();
		System.out.println("time without parallelism = " + (after - before));
		System.out.println("SUM(" + minNumber + ".." + maxNumber + ") = " + num);
	}
}

class Sum {
	private long minNum;
	private long maxNum;

	Sum(long minNum, long maxNum) {
		this.minNum = minNum;
		this.maxNum = maxNum;
	}

	protected Long compute() {
		// sum every value from minNum through maxNum inclusive
		long sum = 0;
		for (long i = minNum; i <= maxNum; i++) {
			sum += i;
		}
		return sum;
	}
}

Sample output from three runs:
	time without parallelism = 902
	SUM(0..1000000000) = 500000000500000000
	time without parallelism = 928
	SUM(0..1000000000) = 500000000500000000
	time without parallelism = 812
	SUM(0..1000000000) = 500000000500000000

Parallel

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumParallel {
	public static void main(String[] args) {
		long minNumber = 0;
		long maxNumber = 1_000_000_000;
		int numThreads = 4;
		long before = System.currentTimeMillis();
		ForkJoinPool pool = new ForkJoinPool();
		SumP[] sum = new SumP[numThreads];
		long start = minNumber;
		long end = maxNumber / numThreads;
		for (int i = 0; i < numThreads; i++) {
			sum[i] = new SumP(start, end);
			start = end + 1;
			end += (maxNumber / numThreads);
			pool.execute(sum[i]); // no return value, so we will join later
		}
		long num = 0;
		for (int i = 0; i < numThreads; i++) {
			num += sum[i].join();
		}
		long after = System.currentTimeMillis();
		System.out.println("time with parallelism = " + (after - before));
		System.out.println("SUM(" + minNumber + ".." + maxNumber + ") = " + num);
	}
}

class SumP extends RecursiveTask<Long> {
	private long minNum;
	private long maxNum;
	public static final long serialVersionUID = 1;

	public SumP(long minNum, long maxNum) {
		this.minNum = minNum;
		this.maxNum = maxNum;
	}

	protected Long compute() {
		long sum = 0;
		for (long i = minNum; i <= maxNum; i++) {
			sum += i;
		}
		return sum;
	}
}

Sample output from three runs:
	time with parallelism = 457
	SUM(0..1000000000) = 500000000500000000
	time with parallelism = 436
	SUM(0..1000000000) = 500000000500000000
	time with parallelism = 433
	SUM(0..1000000000) = 500000000500000000

Not Parallel Mergesort

public class NonParallelMergeSort {
	public static void main(String[] args) {
		int SIZE = 2_000_000;
		int[] list = new int[SIZE];
		for (int i = 0; i < SIZE; i++) {
			list[i] = (int)(Math.random() * Integer.MAX_VALUE);
		}
		long timing = 0;
		long sum = 0;
		// run it 8 times to see if there are variations
		for (int i = 0; i < 8; i++) {
			timing = nonParallelMergeSort((int[])list.clone());
			System.out.println(timing + " ms");
			sum += timing;
		}
		System.out.println("average = " + (sum / 8) + " ms");
	}

	public static long nonParallelMergeSort(int[] list) {
		long before = System.currentTimeMillis();
		new SortTask(list).compute();
		long after = System.currentTimeMillis();
		long time = after - before;
		return time;
	}

	private static class SortTask {
		private int[] list;

		SortTask(int[] list) {
			this.list = list;
		}

		protected void compute() {
			if (list.length < 2) return; // base case
			// split into halves
			int[] firstHalf = new int[list.length / 2];
			System.arraycopy(list, 0, firstHalf, 0, list.length / 2);
			int secondLength = list.length - list.length / 2;
			int[] secondHalf = new int[secondLength];
			System.arraycopy(list, list.length / 2, secondHalf, 0, secondLength);
			// recursively sort the two halves
			new SortTask(firstHalf).compute();
			new SortTask(secondHalf).compute();
			// merge halves together
			merge(firstHalf, secondHalf, list);
		}
	}

	public static void merge(int[] list1, int[] list2, int[] merged) {
		int i1 = 0, i2 = 0, i3 = 0; // index in list1, list2, out
		while (i1 < list1.length && i2 < list2.length) {
			merged[i3++] = (list1[i1] < list2[i2]) ? list1[i1++] : list2[i2++];
		}
		// any trailing ends
		while (i1 < list1.length) {
			merged[i3++] = list1[i1++];
		}
		while (i2 < list2.length) {
			merged[i3++] = list2[i2++];
		}
	}
}

Sample output:
	458 ms
	416 ms
	412 ms
	451 ms
	401 ms
	378 ms
	417 ms
	397 ms
	average = 416 ms

Parallel Merge Sort

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ParallelMergeSort {
	public static void main(String[] args) {
		int SIZE = 2_000_000;
		int[] list = new int[SIZE];
		for (int i = 0; i < SIZE; i++) {
			list[i] = (int)(Math.random() * Integer.MAX_VALUE);
		}
		int numProcessors = Runtime.getRuntime().availableProcessors();
		System.out.println("num processors: " + numProcessors);

		// timing[i] : time to sort on i processors
		long[] timing = new long[numProcessors * 2 + 1];
		for (int i = 1; i <= numProcessors * 2; i++) {
			timing[i] = parallelMergeSort((int[])list.clone(), i);
			System.out.println(i + " processors " + timing[i] + " ms");
		}
	}

	public static long parallelMergeSort(int[] list, int proc) {
		long before = System.currentTimeMillis();
		ForkJoinPool pool = new ForkJoinPool(proc);
		pool.invoke(new SortTask(list));
		pool.shutdown();
		while (!pool.isTerminated()) {
			Thread.yield();
		}
		long after = System.currentTimeMillis();
		long time = after - before;
		return time;
	}

	private static class SortTask extends RecursiveAction {
		public static final long serialVersionUID = 1;
		private int[] list;

		SortTask(int[] list) {
			this.list = list;
		}

		protected void compute() {
			if (list.length < 2) return; // base case
			// split into halves
			int[] firstHalf = new int[list.length / 2];
			System.arraycopy(list, 0, firstHalf, 0, list.length / 2);
			int secondLength = list.length - list.length / 2;
			int[] secondHalf = new int[secondLength];
			System.arraycopy(list, list.length / 2, secondHalf, 0, secondLength);
			// recursively sort the two halves in parallel, then wait for both
			SortTask st1 = new SortTask(firstHalf);
			SortTask st2 = new SortTask(secondHalf);
			st1.fork();
			st2.fork();
			st1.join();
			st2.join();
			// merge halves together
			merge(firstHalf, secondHalf, list);
		}
	}

	public static void merge(int[] list1, int[] list2, int[] merged) {
		int i1 = 0, i2 = 0, i3 = 0; // index in list1, list2, out
		while (i1 < list1.length && i2 < list2.length) {
			merged[i3++] = (list1[i1] < list2[i2]) ? list1[i1++] : list2[i2++];
		}
		// any trailing ends
		while (i1 < list1.length) {
			merged[i3++] = list1[i1++];
		}
		while (i2 < list2.length) {
			merged[i3++] = list2[i2++];
		}
	}
}

Sample output from two runs:
	num processors: 4
	1 processors 902 ms
	2 processors 704 ms
	3 processors 269 ms
	4 processors 257 ms
	5 processors 214 ms
	6 processors 240 ms
	7 processors 249 ms
	8 processors 236 ms

	num processors: 4
	1 processors 912 ms
	2 processors 732 ms
	3 processors 273 ms
	4 processors 255 ms
	5 processors 228 ms
	6 processors 224 ms
	7 processors 261 ms
	8 processors 235 ms

Outline
› Parallel Computing
› Program

Program
› Modify the ParallelMergeSort code to split the array into the same number of sub-arrays as processors/cores on your computer. Does that provide a lower execution time than splitting the array into two sub-arrays? Why or why not?
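One possible shape for that modification, as a sketch only: class and variable names are ours, and each chunk is sorted with Arrays.sort for brevity where the exercise would hand each chunk to the lecture's SortTask. Measuring and explaining the timing is still the exercise:

	import java.util.Arrays;
	import java.util.concurrent.ForkJoinPool;
	import java.util.concurrent.RecursiveAction;

	public class ChunkedMergeSort {
		public static void main(String[] args) {
			int[] list = new int[2_000_000];
			for (int i = 0; i < list.length; i++) {
				list[i] = (int)(Math.random() * Integer.MAX_VALUE);
			}
			int n = Runtime.getRuntime().availableProcessors();
			ForkJoinPool pool = new ForkJoinPool(n);
			long before = System.currentTimeMillis();

			// split into one sub-array per processor and sort each in its own task
			int chunk = (list.length + n - 1) / n; // ceiling division
			int[][] parts = new int[n][];
			RecursiveAction[] tasks = new RecursiveAction[n];
			for (int i = 0; i < n; i++) {
				int from = i * chunk;
				int to = Math.min(from + chunk, list.length);
				parts[i] = Arrays.copyOfRange(list, from, to);
				final int idx = i;
				tasks[i] = new RecursiveAction() {
					protected void compute() {
						Arrays.sort(parts[idx]); // stand-in for the lecture's SortTask
					}
				};
				pool.execute(tasks[i]);
			}
			for (RecursiveAction t : tasks) {
				t.join();
			}

			// merge the n sorted chunks back together pairwise
			int[] merged = parts[0];
			for (int i = 1; i < n; i++) {
				int[] out = new int[merged.length + parts[i].length];
				merge(merged, parts[i], out);
				merged = out;
			}
			long after = System.currentTimeMillis();
			System.out.println(n + " chunks: " + (after - before) + " ms");
		}

		// same merge as in ParallelMergeSort
		public static void merge(int[] list1, int[] list2, int[] merged) {
			int i1 = 0, i2 = 0, i3 = 0;
			while (i1 < list1.length && i2 < list2.length) {
				merged[i3++] = (list1[i1] < list2[i2]) ? list1[i1++] : list2[i2++];
			}
			while (i1 < list1.length) merged[i3++] = list1[i1++];
			while (i2 < list2.length) merged[i3++] = list2[i2++];
		}
	}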
