ANSYS Solvers: Usage And Performance


ANSYS Solvers: Usage and Performance
ANSYS equation solvers: usage and guidelines
Gene Poole, ANSYS Solvers Team, April 2002

Outline
- Basic solver descriptions
  – Direct and iterative methods
  – Why so many choices?
- Solver usage in ANSYS
  – Available choices and defaults
  – How do I choose a solver?
- Practical usage considerations
  – Performance issues
  – Usage rules of thumb
  – Usage examples
  – How do I choose the fastest solver?

Solver Basics: Ax = b

Direct Methods
- Factor:  A = L D L^T          (compute matrix L)
- Solve:   L z = b
           z <- D^-1 z
           L^T x = z            (solve triangular systems)
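The factor/solve split above can be sketched in a few lines of dense NumPy. This is an illustrative toy, not the BCSLIB implementation: `ldlt` and `ldlt_solve` are hypothetical helper names, and a real sparse solver adds equation reordering, pivoting, and out-of-core storage of L.

```python
import numpy as np

def ldlt(A):
    """Factor symmetric A = L D L^T (no pivoting; assumes nonzero pivots)."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - (L[j, :j] ** 2) @ d[:j]
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - (L[i, :j] * L[j, :j]) @ d[:j]) / d[j]
    return L, d

def ldlt_solve(L, d, b):
    """Solve L z = b, scale z <- D^-1 z, then L^T x = z."""
    z = np.linalg.solve(L, b)    # forward substitution (dense stand-in)
    z = z / d                    # diagonal scaling
    return np.linalg.solve(L.T, z)  # backward substitution

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 1.0],
              [2.0, 1.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
L, d = ldlt(A)
x = ldlt_solve(L, d, b)
```

Once L and D are computed, any number of right-hand sides can be solved with only the cheap triangular sweeps, which is why direct methods pay off for multiple load cases.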

Solver Basics: Ax = b

Direct Methods
- Factor:  A = L D L^T          (compute matrix L)
- Solve:   L z = b;  z <- D^-1 z;  L^T x = z     (solve triangular systems)

Iterative Methods

Stationary Methods (guess and go)
- Choose x_0
- Iterate: x_{k+1} = G x_k + c
- Until |x_{k+1} - x_k| < e

Projection Methods (project and minimize)
- Choose x_0;  r_0 = A x_0 - b;  p_0 = r_0
- Iterate: compute A p_{k-1}                  (sparse A*x product)
           x_k = x_{k-1} + a_k p_{k-1}        (vector updates)
           r_k = r_{k-1} - a_k A p_{k-1}
           p_k = r_k + b_k p_{k-1}
- Until |r_k| < eps
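The projection iteration on this slide is the conjugate gradient method. A minimal dense sketch, written in the standard textbook sign convention (r = b - Ax) and without the preconditioner that an industrial PCG solver adds:

```python
import numpy as np

def cg(A, b, x0, tol=1e-10, maxit=500):
    """Unpreconditioned conjugate gradients for symmetric positive definite A."""
    x = x0.copy()
    r = b - A @ x          # residual
    p = r.copy()           # first search direction
    rs = r @ r
    for _ in range(maxit):
        Ap = A @ p         # the sparse A*x product dominates the cost
        alpha = rs / (p @ Ap)
        x = x + alpha * p  # vector updates
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg(A, b, np.zeros(2))
```

Note that nothing here forms or stores a factor of A: the memory footprint is a handful of vectors, which is the trade-off the limitations slide below describes.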

Solver Basics: Limitations

Direct Methods
- Factor is expensive
  – Memory & lots of flops
  – Huge file to store L
- Solve is I/O intensive
  – Forward/backward read of huge L file

Iterative Methods
- Sparse A*x multiply is cheap but slow
  – Memory-bandwidth and cache limited
  – Harder to parallelize
- Preconditioners are not always robust
- Convergence is not guaranteed

ANSYS Direct Advantage
- Enhanced BCSLIB version 4.0
  – Parallel factorization
  – Reduced memory requirements for equation reordering
  – Support for U/P formulation
- Sparse solver interface improvements
  – Dynamic memory uses feedback for optimal I/O performance
  – Sparse assembly, including direct elimination of CEs

Multi-Point Constraints: Direct Elimination Method

Constraint:  x1 = G^T x2 + g

  [ A11    A12 ] [x1]   [b1]
  [ A12^T  A22 ] [x2] = [b2]

Solve:
  (G A11 G^T + G A12 + A12^T G^T + A22) x2 = b2 + G b1 - A12^T g - G A11 g
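The reduced system above can be checked numerically in NumPy. The block sizes and the random SPD construction below are arbitrary illustration choices; the point is that after solving the reduced system for x2 and recovering x1 from the constraint, the combination G*(row 1) + (row 2) of the original block equations is satisfied exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 3, 4  # eliminated and retained DOFs (toy sizes)

# Random symmetric positive definite blocks for illustration.
A11 = rng.standard_normal((n1, n1)); A11 = A11 @ A11.T + n1 * np.eye(n1)
A22 = rng.standard_normal((n2, n2)); A22 = A22 @ A22.T + n2 * np.eye(n2)
A12 = rng.standard_normal((n1, n2))
b1, b2 = rng.standard_normal(n1), rng.standard_normal(n2)

# Constraint: x1 = G^T x2 + g, so G is n2 x n1.
G = rng.standard_normal((n2, n1))
g = rng.standard_normal(n1)

# Reduced operator and right-hand side from the slide.
K = G @ A11 @ G.T + G @ A12 + A12.T @ G.T + A22
f = b2 + G @ b1 - A12.T @ g - G @ (A11 @ g)

x2 = np.linalg.solve(K, f)
x1 = G.T @ x2 + g  # recover the eliminated DOFs

# G*(block row 1 residual) + (block row 2 residual) must vanish.
res = G @ (A11 @ x1 + A12 @ x2 - b1) + (A12.T @ x1 + A22 @ x2 - b2)
```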

ANSYS Iterative Advantage
- PowerSolver has a proprietary and robust preconditioner
  – Parallel matrix/vector multiply
  – Wide usage, robust
- Many additional iterative solvers for complex systems, non-symmetric matrices, etc.
- New high-performance parallel solvers
  – AMG: Algebraic Multigrid
  – DDS: Domain Decomposition Solver
- Ongoing efforts to utilize and enhance the AMG and DDS solvers where applicable

Solver Usage
- The sparse, PCG, and ICCG solvers cover 95% of all ANSYS applications
- The sparse solver is now the default in most cases, for robustness and efficiency reasons

Solver Usage: Choices
- Sparse direct solver (BCSLIB)
- PCG solver (PowerSolver)
- Frontal solver
- ICCG
- JCG

Listed in order of usage popularity. ANSYS now chooses the sparse direct solver in nearly all applications for robustness and efficiency.

Solver Usage: -pp Choices
- AMG – Algebraic Multigrid
  – Good for ill-conditioned problems
  – Best shared-memory parallel performance of the ANSYS iterative solvers
  – Good for nonlinear problems – can solve indefinite matrices
- DDS – Domain Decomposition Solver
  – Exploits MPP cluster computing for the solver portion of the analysis
  – Solver time scales even on many processors
  – Still under intensive development

Solver Usage: Sparse Solver
- Real and complex, symmetric and non-symmetric matrices
- Positive definite and indefinite matrices (indefinite systems occur in nonlinear analyses and the eigensolver)
- Supports Block Lanczos
- Supports the substructure USE pass
- Substructure generation pass (beta in 6.1)
- Supports ALL physics, including some CFD
- Large numbers of CEs
- Support for mixed U-P formulation with Lagrange multipliers (efficient methods are used to support this)
- Pivoting and partial pivoting (EQSLV,sparse,0.01,-1)

Solver Usage: PCG Solver
- Real symmetric matrices
- Positive definite and indefinite matrices; supporting indefinite matrices is a unique feature in our industry
- Power Dynamics modal analyses, based on PCG subspace
- Substructure USE pass and expansion pass
- All structural analyses and some other field problems
- Large numbers of CEs
- NOT for mixed U-P formulation Lagrange multiplier elements
- NO pivoting or partial pivoting capability

Solver Usage: ICCG Suite
- Collection of iterative solvers for special cases
- Complex symmetric and non-symmetric systems
- Good for multiphysics, e.g. EMAG
- Not good for general usage

Usage Guidelines: Sparse
Capabilities:
- Adapts to available memory
- ANSYS interface strives for optimal I/O memory allocation
- Uses machine-tuned BLAS kernels that operate at near peak speed
- Uses ANSYS file splitting for very large files
- Parallel performance 2X to 3.5X faster on 4- to 8-processor systems
- 3X to 6X speedup possible on high-end server systems (IBM, HP, SGI, ...)

Usage Guidelines: Sparse
Resource requirements:
- Total factorization time depends on model geometry and element type
  – Shell models are best
  – Bulky 3-D models with higher-order elements are more expensive
- System requirements
  – 1 Gbyte of memory per million DOFs
  – 10 Gbytes of disk per million DOFs
- Eventually runs out of resources; at 10 million DOFs:
  – 100 Gbyte factor file
  – 100 Gbytes x 3 = 300 Gbytes of I/O
  – 300 Gbytes @ 30 Mbytes/sec = approx. 10,000 seconds of I/O wait time
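The I/O-wait arithmetic above is easy to reproduce. The rule-of-thumb constants (10 Gbytes of factor file per million DOFs, three passes over the factor, 30 Mbytes/sec sustained disk rate) come straight from the slide and are 2002-era figures, so treat them as illustrative defaults rather than current guidance.

```python
def sparse_io_wait_seconds(mdofs, gb_per_mdof=10.0, passes=3, mb_per_sec=30.0):
    """Rough I/O wait time for an out-of-core sparse factor/solve."""
    file_gb = mdofs * gb_per_mdof          # factor file size in Gbytes
    total_mb = file_gb * passes * 1000.0   # total traffic (1 GB = 1000 MB, as on the slide)
    return total_mb / mb_per_sec

# 10 million DOFs -> 100 GB file -> 300 GB of traffic -> 10,000 s of I/O wait
print(sparse_io_wait_seconds(10))  # prints 10000.0
```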

Usage Guidelines: PCG
Capabilities:
- Runs in-core; supports out-of-core (you don't need to do this)
- Parallel matrix/vector multiply achieves 2X on 4- to 8-processor systems
- Memory-saving element-by-element technology for SOLID92 (and SOLID95, beta in 6.1)

Usage Guidelines: PCG
Resource requirements:
- 1 Gbyte of memory per million DOFs
- Memory grows automatically for large problems
- I/O requirement is minimal
- Convergence is best for meshes with good aspect ratios
- 3-D cube elements converge better than thin shells or high-aspect solids
- Over 500k DOFs shows the best performance compared to the sparse solver

Usage Guidelines: Substructuring
- EQSLV,SPAR in the generation pass (beta feature in 6.1 only; no unsymmetric matrices, no damping)
  – Requires PCG or sparse in the expansion pass
- USE pass uses the sparse solver by default
  – May fail in symbolic assembly (try asso,,front)
- PCG or sparse in the expansion pass
  – Avoids large tri files

Performance Summary
- Where to look
  – PCG solver: file.PCS
  – Sparse solver: output file; add bcsopt ,,, ,,, -5 (undocumented option)
- What to look for
  – Degrees of freedom
  – Memory usage
  – Total iterations (iterative only)

Usage Guidelines
Tuning sparse solver performance:
- bcsopt command (undocumented)
- Optimal I/O for the largest jobs
- In-core for large-memory systems and small to medium jobs (up to ~250,000 DOFs)
- Use parallel processing

User Control of Sparse Solver Options
Sparse solver control using the undocumented command:

  bcsopt, ropt, mopt, msiz ,,, dbg

- ropt: sets the equation reordering method (mmd, metis, sgi, wave)
- mopt: forces or limits solver memory space in Mbytes (forc or limit)
- msiz: nnnn Mbytes, up to 2048
- dbg: -5 prints performance stats

Solvers and Modal Analyses
- Modal analyses are the most demanding in ANSYS
- Block Lanczos is most robust
  – Requires all of the sparse solver resources plus additional space for eigenvectors
  – Requires multiple solves during Lanczos iterations
- Subspace is good for very large jobs and few eigenvalues
  – Uses the PCG solver, or the frontal solver
  – Not as robust as Block Lanczos

Some Solver Examples
- Some benchmarks: 5.7 vs 6.0
- Typical large sparse solver jobs
- Sparse solver memory problem
- PCG solver example
- AMG solver examples

Benchmark Study: Static Analysis
[Chart residue: total solution time and peak memory, ANSYS 5.7 vs 6.0 sparse solver; values garbled in transcription]

Benchmark Study (cont.)
[Chart residue: total solution time and peak memory for the 5.7 vs 6.0 sparse solver benchmarks; values garbled in transcription]

Sparse Solver Memory Usage Example 1
2 million DOF sparse solver job, SGI O2000 16-CPU system

MultiSolution: Sparse Assembly Option. Call No. 1
  ANSYS largest memory block available    10268444 :    9.79 Mbytes
  ANSYS memory in use                   1323917280 : 1262.59 Mbytes
End of PcgEnd
  ANSYS largest memory block available   588214172 :  560.96 Mbytes
  ANSYS memory in use                    256482560 :  244.60 Mbytes
Total Time (sec) for Sparse Assembly      63.53 cpu    69.02 wall
Heap space available at start of BCSSL4: nHeap = 75619667 D.P. words  576.93 Mbytes

577 Mbytes available for the sparse solver.

Sparse Solver Memory Usage Example 1 (cont.)
Carrier 2M DOF model

SPARSE MATRIX DIRECT SOLVER.
  Number of equations = 2090946, Maximum wavefront = 275

ANSYS 6.0 memory allocation:
  Heap space available at start of bcs_mem0: nHeap = 61665329 D.P. words    470.47 Mbytes
  Estimated work space needed for solver: min_siz = 256932078 D.P. words   1960.24 Mbytes
  (initial memory increased to 800 Mbytes)
  Work space needed for solver: start_siz = 110399416 D.P. words            842.28 Mbytes
  Heap space setting at start of bcs_mem0: nHeap = 110399416 D.P. words     842.28 Mbytes
  Initial BCS workspace memory = 110399416 D.P. words                       842.28 Mbytes
Total Reordering Time (cpu,wall) = 537.670  542.897
  Increasing memory request for BCS work to 67802738 D.P. words             517.29 Mbytes
  Initial BCS workspace is sufficient
  Memory available for solver                  842.28 MB
  Memory required for in-core                    0.00 MB
  Optimal memory required for out-of-core      517.29 MB
  Minimum memory required for out-of-core      162.39 MB

800 Mbytes exceeds the optimal I/O setting; the initial guess easily runs in optimal I/O mode.

Sparse Solver Memory Usage Example 1 (cont.)
Carrier2 2M DOF model

  number of equations / nonzeroes in A and the factor L / front sizes /
  factor flops / factor and solve times and mflop rates / i/o statistics
  [detailed listing garbled in transcription]

  Sparse Matrix Solver CP Time (sec)      = 14468.280
  Sparse Matrix Solver ELAPSED Time (sec) = 15982.407

Annotations:
- Nonzeros in K: 40 per DOF; nonzeros in L: 1142 per DOF (29x K); trillions of F.P. ops for the factor
- Factored matrix file LN09: 2.4 billion D.P. words, 18 Gbytes; 59 Gbytes transferred
- File LN32 not used
- Elapsed time close to CPU time (4.5 hours): good processor utilization, reasonable I/O performance

Engine Block Analysis
- 410,977 SOLID45 elements
- 16,304 COMBIN40 elements
- 1,698,525 equations
- 20,299 multi-point CEs

Engine Block Analysis
Sparse Solver Interface Statistics

Sparse CE interface matrix (dim / coefs / mxcolmlth):
  Original A22:   dim 1698525
  Constraints G, H = G*A11 + A12^T, HG^T: [values garbled in transcription]
  Modified A22:   dim 1698525, coefs 58304862, mxcolmlth 404
# of columns modified by direct elimination of CEs: 132849

Over 20,000 CEs processed with minimal additional memory required.

  Memory available for solver                  547.22 MB
  Memory required for in-core                 9417.10 MB
  Optimal memory required for out-of-core      527.29 MB
  Minimum memory required for out-of-core      127.25 MB

Memory available is sufficient to run in optimal I/O mode.

Engine Block Analysis
Sparse Solver Performance Summary
SGI O2000, 16 x 300 MHz processors, 3-CPU run

  structure input / ordering / symbolic factor / value input / numeric factor
  times and mflop rates; i/o statistics
  [detailed listing garbled in transcription]

Annotations:
- Good sustained rate on factorization: nearly 600 mflops
- I/O always shows up in the i/o statistics

Sparse Solver Example 2: What Can Go Wrong
Customer example: excessive elapsed time
High-performance HP 2-CPU desktop system

  Release 6.0    UP20010919    HPPA 8000-64
  Maximum Scratch Memory Used = 252053628 words = 961.508 MB
  CP Time (sec)      =  6323.090    Time 23:36:41
  Elapsed Time (sec) = 27575.000    Date 01/10/2002

Sparse Solver Example 2 (cont.)
FEM model of a large radiator
- 650k degrees of freedom
- 68,000 SOLID95 elements
- 2,089 SURF154 elements
- 3,400 constraint equations
Initial memory setting: -m 1000 -db 300

Sparse Solver Example 2 (cont.)

MultiSolution: Sparse Assembly Option. Call No. 1
  ANSYS largest memory block available    73741452 :  70.33 Mbytes
  ANSYS memory in use                    612110368 : 583.75 Mbytes   <- 584 Mbytes in use during sparse assembly
Sparse Solver Interface Adding CEs. Call No. 1
  ANSYS largest memory block available    73741164 :  70.33 Mbytes
  ANSYS memory in use                    612110656 : 583.75 Mbytes

  Sparse CE interface matrix       dim       coefs   mxcolmlth
    Original A22                648234     4141599        3461
    Constraints G                 3471      232228
    H = G*A11 + A12^T             3471      409194         219
    HG^T                        648234      781339         668

The initial memory allocation (-m) has been exceeded.
Supplemental memory allocations are being used.       <- needs more memory to process CEs
No. of columns modified by direct elimination of CEs: 42558
    Modified A22                648234     4397422        5692

  ANSYS largest memory block available   288465472 : 275.10 Mbytes
  ANSYS memory in use                    179570288 : 171.25 Mbytes
Total Time (sec) for processing CEs       38.33 cpu   61.73 wall
End of PcgEnd
  ANSYS largest memory block available   575083952 : 548.44 Mbytes   <- 548 Mbytes available after sparse assembly
  ANSYS memory in use                    133219536 : 127.05 Mbytes
Total Time (sec) for Sparse Assembly      38.36 cpu   61.77 wall

Sparse Solver Example 2 (cont.)
Minimum-core memory run: 650k DOFs

  Memory available for solver                  488.21 MB   <- 488 Mbytes available is less than optimal I/O
  Memory required for in-core                 7348.80 MB
  Optimal memory required for out-of-core      651.66 MB
  Minimum memory required for out-of-core       63.18 MB

  [factor/solve timing listing garbled in transcription]

  Sparse Matrix Solver CP Time (sec)      =  5956.170
  Sparse Matrix Solver ELAPSED Time (sec) = 27177.617   <- elapsed time 5X larger than CPU

Annotations:
- Factored matrix file LN09: 838M D.P. words, 6.4 Gbytes; 21 Gbytes transferred
- Large front spillover file I/O is the culprit: 77M D.P. words, 110 billion words transferred (over 3/4 of a terabyte!)

Sparse Solver Example 2 (cont.)
Optimal out-of-core memory run: 650k DOFs

  Memory available for solver                  660.21 MB   <- 660 Mbytes available achieves optimal I/O
  Memory required for in-core                 7348.80 MB
  Optimal memory required for out-of-core      651.66 MB
  Minimum memory required for out-of-core       63.18 MB

  [factor/solve timing listing garbled in transcription; ~1 Gflop sustained]

  Sparse Matrix Solver CP Time (sec)      = 5405.520
  Sparse Matrix Solver ELAPSED Time (sec) = 5122.035   <- elapsed time 5X faster than the minimum-memory run

Annotations:
- Factored matrix file LN09: 838M D.P. words, 6.4 Gbytes; 21 Gbytes transferred
- File LN32 not used

Sparse Solver NT System Example: What Can Go Wrong
Customer example: NT memory problems
Dell system, 2 P4 processors, 2 Gbytes memory
- Default memory run failed
- -m 925 -db 100 failed before solver
- -m 1100 -db 100 interactive failed
- -m 1100 -db 100 batch mode worked
Why so memory sensitive?

Sparse Solver NT System Example (cont.)
FEM model of a turbine blade
- 772k degrees of freedom
- 114,000 SOLID45 elements
- 46,627 SOLID95 elements
- 6,173 SOLID92 elements
- 18,118 SURF154 elements
- 3,400 constraint equations
Lots of CEs used to impose cyclic symmetry conditions.

Sparse Solver NT System Example (cont.)
NT system run, 772k DOF turbine blade

MultiSolution: Sparse Assembly Option. Call No. 1
  ANSYS largest memory block available   288061264 :  274.72 Mbytes
  ANSYS memory in use                    562923008 :  536.85 Mbytes   <- 537 Mbytes in use before CEs
Sparse Solver Interface Adding CEs. Call No. 1
  ANSYS largest memory block available   288061024 :  274.72 Mbytes
  ANSYS memory in use                    562923248 :  536.85 Mbytes

  Sparse CE interface matrix       dim       coefs   mxcolmlth
    Original A22                772125     2856612
    Constraints G                16533      717060
    H = G*A11 + A12^T            16533     8956850
    HG^T                        772125     8364601
    Modified A22                772125     6158724

The initial memory allocation (-m) has been exceeded.
Supplemental memory allocations are being used.       <- needs more memory to process CEs
No. of columns modified by direct elimination of CEs: [value garbled in transcription]

  ANSYS largest memory block available     4971036 :    4.74 Mbytes
  ANSYS memory in use                   1502114112 : 1432.53 Mbytes   <- 1432 Mbytes in use after CEs
  ANSYS largest memory block available   804449536 :  767.18 Mbytes
  ANSYS memory in use                    185689952 :  177.09 Mbytes
Total Time (sec) for Sparse Assembly      79.95 cpu   80.48 wall

1400 Mbytes is well over the initial allocation!

Sparse Solver NT System Example (cont.)
Optimal I/O run on a fast NT system: 772k DOFs, using the optimal out-of-core memory setting

Initial BCS workspace is sufficient
  Memory available for solver                  719.93 MB   <- 720 Mbytes available achieves optimal I/O
  Memory required for in-core                 6944.94 MB
  Optimal memory required for out-of-core      623.68 MB
  Minimum memory required for out-of-core       77.61 MB

  [factor/solve timing listing garbled in transcription; ~1.3 Gflops sustained]

Annotations:
- Factored matrix file LN09: 774M D.P. words, 5.7 Gbytes; 18 Gbytes transferred
- File LN32 not used
- Excellent performance once the memory issue is resolved!


Solving NT Memory Issues
- Try default memory management
- Maximize solver memory
  – Use a larger db for prep and post only
  – Reduce db memory for solve
  – Run in batch mode
- Read output file memory messages
  – Leave room for supplemental memory allocations
  – Try bcsopt,,forc,msiz as a last resort

How to Get Optimal I/O Memory
- Prior to 6.1
  – Increase -m, decrease -db
  – Force sparse memory with bcsopt
- Version 6.1
  – Automatic in most cases
  – Tuning possible using bcsopt
- WINTEL 32-bit limitations
  – Total process space 2 Gbytes
  – Keep db space small to maximize sparse solver memory
  – Don't start -m too small for large jobs
  – Use msave,on for the PCG solver

Sparse Solver Example 3
ANSYS 6.1 example: 2 million DOF engine block

Start of BCS MEM1: msglvl = 2
  need_in  =         0 D.P. words  (   0.00 Mbytes)
  need_opt = 221585885 D.P. words  (1690.57 Mbytes)
  need_ooc =  20333932 D.P. words  ( 155.14 Mbytes)
  nHold0   = 202309239 D.P. words  (1543.50 Mbytes)
  nHeap    =  11789065 D.P. words  (  89.94 Mbytes)
  navail   = 202309239 D.P. words  (1543.50 Mbytes)
  mem_siz  =         0             (   0.00 Mbytes)

Sparse solver memory is just below optimal; grow memory to the optimal setting.
Increasing memory request for BCS work to 221585885 D.P. words  1690.57 Mbytes
The initial memory allocation (-m) has been exceeded.
Supplemental memory allocations are being used.
After Realloc: pdHold = -1575830551  hHold0 = 324  nHold = 221585885

  Memory available for solver                 1690.57 MB
  Memory required for in-core                    0.00 MB
  Optimal memory required for out-of-core     1690.57 MB
  Minimum memory required for out-of-core      155.14 MB

Sparse Solver Example 3 (cont.)
ANSYS 6.1 engine block example: SGI O2000 system

  number of equations                        2149066
  no. of nonzeroes in lower triangle of A   78007632
  no. of nonzeroes in the factor L        ~2.7 billion
  no. of floating point ops for factor     1.9072E+13
  [remaining timing/mflops listing garbled in transcription]

  Sparse Matrix Solver CP Time (sec)      = 32283.010
  Sparse Matrix Solver ELAPSED Time (sec) = 34199.480

Annotations:
- 2.1M DOFs; nonzeros in K: 37 per DOF; nonzeros in L: 1286 per DOF (35x K); ~19 trillion F.P. ops
- Factored matrix file LN09: 2.7 billion D.P. words, 20 Gbytes; 63 Gbytes transferred
- File LN32 not used
- Elapsed time close to CPU time (10 hours)

PCG Solver Example
- PCG memory grows dynamically in non-contiguous blocks
- Msave,on skips global assembly of the stiffness matrix for SOLID92 and SOLID95 elements
- The PCG solver can do the largest problems in the least memory

PCG Solver Example (cont.)
Wing job example, 500k DOFs, SOLID45 elements

File.PCS output:
  Degrees of Freedom:              477792
  DOF Constraints:                   4424
  Elements:                        144144  (Assembled: 144144, Implicit: 0)
  Nodes:                           159264
  Number of Load Cases:                 1

  Nonzeros in Upper Triangular part of Global Stiffness Matrix: 18350496
  Nonzeros in Preconditioner:     7017045
  Total Operation Count:          3.71503e+10
  Total Iterations In PCG:        343          <- good convergence (1000 or more is bad)
  Average Iterations Per Load Case: 343
  Input PCG Error Tolerance:      1e-06
  Achieved PCG Error Tolerance:   9.90796e-07

  DETAILS OF SOLVER CP TIME (secs)      User   System
    Assembly                            23.9      3.6
    Preconditioner Construction          8.7      1.8
    Preconditioner Factoring             0.9      0
  Total PCG Solver CP Time: User: 320.9 secs; System: 9.9 secs

  Estimate of Memory Usage In CG:  240.191 MB   <- memory usage and disk I/O low
  Estimate of Disk Usage:          247.919 MB
  CG Working Set Size with matrix out-of-core: 65.0977 MB

  Multiply with A MFLOP Rate:      168.24 MFlops   <- mflops performance lower than the sparse solver
  Solve With Precond MFLOP Rate:   111.946 MFlops

PCG Solver Example (cont.)
Wing job example, 228k DOFs, SOLID95 elements

Msave,on:
  Degrees of Freedom:   228030
  DOF Constraints:        3832
  Elements:              16646  (Assembled: 0, Implicit: 16646)
  Nonzeros in Upper Triangular part of Global Stiffness Matrix: 0
  Nonzeros in Preconditioner: 4412553
  Total Operation Count: 1.06317e+10
  Total PCG Solver CP Time: User: 809.6 secs
  Estimate of Memory Usage In CG: 30.6945 MB
  Estimate of Disk Usage:         36.5936 MB
  *** Implicit Matrix Multiplication ***
  Multiply with A MFLOP Rate:     0 MFlops
  Solve With Precond MFLOP Rate:  81.7201 MFlops

Default:
  Degrees of Freedom:   228030
  DOF Constraints:        3832
  Elements:              16646  (Assembled: 16646, Implicit: 0)
  Nonzeros in Upper Triangular part of Global Stiffness Matrix: 18243210
  Nonzeros in Preconditioner: 4412553
  Total Operation Count: 4.60199e+10
  Total PCG Solver CP Time: User: 850.2 secs
  Estimate of Memory Usage In CG: 208.261 MB
  Estimate of Disk Usage:         215.985 MB
  Multiply with A MFLOP Rate:     62.3031 MFlops
  Solve With Precond MFLOP Rate:  53.5653 MFlops

Msave,on saves 170 Mbytes out of 200 Mbytes; solve time is comparable to the assembled run.
Works only for SOLID92s and SOLID95s in 6.1.

AMG Performance
[Chart residue: solver time (sec) vs number of processors, 0.8M DOF model, 1 to 10 processors]

AMG vs PowerSolver
Advantages:
- Insensitive to matrix ill-conditioning; performance doesn't deteriorate for high-aspect-ratio elements, rigid links, etc.
- 5x faster than the PowerSolver for difficult problems on a single processor
- Scalable up to 8 processors (shared memory only); 5 times faster with 8 processors

AMG vs PowerSolver
Disadvantages:
- 30% more memory required than the PowerSolver
- 20% slower than the PowerSolver for well-conditioned problems on a single processor
- Doesn't work on distributed-memory architectures (neither does the PowerSolver)
- Scalability is limited by memory bandwidth (so is the PowerSolver)

AMG vs Sparse Solver
ANSYS 6.1 example: 2 million DOF engine block

AMG ITERATIVE SOLVER:
  Number of equations       = 2157241
  Number of processors used = 8
  Reading parameters from file amg_params.dat
    anis_hard = 4
    hard = 1
  end reading parameters                <- AMG parameters tuned for an ill-conditioned problem

  AMG NO. OF ITER = 102   ACHIEVED RESIDUAL NORM = 0.90170E-05
  AMG ITERATIVE SOLVER ELAPSED TIME       =  1758.000
  Sparse Matrix Solver CP Time (sec)      = 32283.010
  Sparse Matrix Solver ELAPSED Time (sec) = 34199.480

AMG is 19 times faster than sparse in this example.

Comparative Performance May Vary
But, your results may vary

Large Industrial Example
Linear Static Analysis with Nonlinear Contact
- ANSYS DesignSpace
- Detailed solid model
- Finite element model: 119,000 elements, 590,000 DOFs

Large Industrial Example
Linear Static Analysis with Nonlinear Contact
SGI O2000, 16 x 300 MHz processors

[Chart residue: solver elapsed time (sec) for NP = 1, 2, 3, 4 processors; values garbled in transcription]

AMG shows superior convergence and scaling for this problem, BUT the sparse direct solver is best for this problem.

Wing Example Job
- 2-D mesh: 282 nodes, 233 elements, 646 DOFs
- Extrude the 2-D mesh to obtain 50,000 DOFs (-dofs 50)
- Elements sized to maintain nice aspect ratios

Wing Static Analyses
- -stype pcg, frontal, spar
  – PowerSolver (pcg)
  – Frontal direct solver (frontal)
  – Sparse direct solver (spar)
- Fixed B.C.s at z = 0.0
- Small negative y displacement at the opposite end

Sparse Solvers Comparison
Solve Time Comparison
[Chart residue: solve time (sec, 0-5000) vs degrees of freedom (134k, 245k, 489k) for spar, amg, and pcg at aspect ratios 1, 10, and 25]
Static analysis; HP L-Class, four 550 MHz CPUs, 4 Gbytes memory

Sparse Solvers Comparison
Solve Time Comparison
[Chart residue: same solve-time chart as the previous slide]
Static analysis; HP L-Class, four 550 MHz CPUs, 4 Gbytes memory
Increasing aspect ratio makes the matrices ill-conditioned.

Sparse Solvers Comparison
Parallel Performance Comparison
[Chart residue: solve time (sec, 0-2000) for 1, 2, and 3 CPUs; solvers spar, amg, pcg]
Static analysis; HP L-Class, four 550 MHz CPUs, 4 Gbytes memory

Summary
- ANSYS has industry-leading solver technology to support robust and comprehensive simulation capability
- Attention to solver capabilities and performance characteristics will extend analysis capabilities
- Future improvements will include increased parallel processing capabilities and new breakthrough solver technologies

