CS 5154 Regression Testing Techniques - Cornell University

CS 5154 Regression Testing Techniques Spring 2021 Owolabi Legunsen

Continuous Integration (CI): rapid test/release cycles

[Figure: CI loop. Developers commit changes to version control; the CI server fetches the changes and reports pass/fail; passing builds are released/deployed.]

Builds per day:
- Facebook: 60K (Android only)
- Google: 17K
- HERE: 100K
- Microsoft: 30K
- Single open-source projects: up to 80

Releases per day:
- Etsy: 50

Sources: Facebook: https://bit.ly/2CAPvN9 ; Google: https://bit.ly/2SYY4rR ; HERE: https://oreil.ly/2T0EyeK ; Microsoft: https://bit.ly/2HgjUpw ; Etsy: https://bit.ly/2IiSOJP

Several important problems exist in these cycles

- P1: Passing tests miss bugs
  S1: Find more bugs from tests that developers already have
- P2: Failed tests, no buggy changes
  S2: Find bugs more reliably by detecting such failures
- P3: Testing can be very slow
  S3: Find bugs faster by speeding up testing
- P4: How to test in new domains?
  S4: Find bugs in emerging application domains

The problem that we’ll talk about today

- Problem: Testing can be very slow
- Solution: Techniques that can help speed up regression testing

Re-running tests can be very slow

Test execution time    Number of tests
5min                   1667
10min                  641534
45min                  1296
45min                  361
45min                  631
4h                     4975
17h                    8663

These test suites are run many times each day.

What are your ideas for speeding up testing?

In this lecture

- Speed up regression testing
  - Detect regression faults as soon as possible
  - Reduce cost of testing
- Common techniques:
  - Regression Test Selection
  - Test-Suite Reduction (Minimization)
  - Test-Case Prioritization

Regression Test Selection (RTS)

[Figure: the test suite T0, T1, T2, T3, ..., TN at Rev 1733 and again at Rev 1734, after a code change.]

How RTS works

[Figure: pipeline. Code, tests, and changes feed "Find Dependencies"; the resulting dependencies feed "Analyze Dependencies", which outputs the affected tests.]

- An affected test can behave differently due to code changes
- A test is affected if any of its dependencies changed

RTS at Google (Target/Module Level)

[Figure: dependency graph of build targets. Test targets: buzz client tests, gmail client tests, gmail server tests, buzz server tests. Code targets: buzz client, youtube client, buzz server, gmail client, gmail server, youtube server, common collections util.]

Can we select fewer?

Class-level RTS

- Track dependencies between classes (in Java)
- Collect changes at class level
- Connect relationships between classes
- Select test classes (run all test methods in selected test class)

How do we track test dependencies? How do we track changes?

Class-level Dynamic RTS (Ekstazi [1])

- Find Dependencies: dynamically track classes used while running each test class
  - Instrument classes to figure out which classes are used/loaded when running tests in some test class
- Changes: classes whose .class (bytecode) files differ
- Analyze Dependencies: select test classes for which any of their dependencies changed
- Maintain dependencies between versions

[1] Gligoric et al., Practical Regression Test Selection with Dynamic File Dependencies. ISSTA 2015, https://github.com/gliga/ekstazi
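The "Changes" step above can be sketched as a checksum comparison between two revisions. This is a minimal illustration, not Ekstazi's actual implementation: the use of SHA-256, the function names, and the byte strings standing in for .class files are all assumptions, as is the choice to treat added or removed classes as changed.

```python
import hashlib


def checksum(bytecode: bytes) -> str:
    """Digest of one compiled class file (SHA-256 is an assumption)."""
    return hashlib.sha256(bytecode).hexdigest()


def changed_classes(old: dict, new: dict) -> set:
    """Names whose bytes differ between revisions, plus added/removed classes."""
    return {
        name
        for name in old.keys() | new.keys()
        if name not in old
        or name not in new
        or checksum(old[name]) != checksum(new[name])
    }


# Hypothetical stand-ins for the contents of .class files.
old_rev = {"A": b"a-v1", "B": b"b-v1", "C": b"c-v1"}
new_rev = {"A": b"a-v1", "B": b"b-v1", "C": b"c-v2", "G": b"g-v1"}
print(sorted(changed_classes(old_rev, new_rev)))  # ['C', 'G']
```

In practice the checksums from the previous revision would be stored alongside the dependencies, so only the new revision's files need to be hashed.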

Ekstazi Example

Rev 1733, Ekstazi dependencies:
- T0: {A,B,C,D}
- T1: {B}
- T2: {B,C,D}
- T3: {E}
- TN: {C,F}

Rev 1734, change: {C}

Rev 1734, Ekstazi dependencies:
- T0: {A,B,C,D}
- T1: {B}
- T2: {B,C,D,G}
- T3: {E}
- TN: {C,F,G}
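The example can be replayed in a few lines. The dependency sets and the change {C} are taken from the slide; the function name `select_affected` and the dictionary layout are illustrative assumptions.

```python
def select_affected(deps, changed):
    """A test is affected if any of its dependencies changed."""
    return [test for test, used in deps.items() if used & changed]


# Dependencies recorded at Rev 1733 (from the slide).
deps_1733 = {
    "T0": {"A", "B", "C", "D"},
    "T1": {"B"},
    "T2": {"B", "C", "D"},
    "T3": {"E"},
    "TN": {"C", "F"},
}
changed = {"C"}

selected = select_affected(deps_1733, changed)
print(selected)  # ['T0', 'T2', 'TN']

# Only the selected tests run at Rev 1734, so only their dependencies are
# re-collected; the others are carried over unchanged.
recollected = {
    "T0": {"A", "B", "C", "D"},
    "T2": {"B", "C", "D", "G"},
    "TN": {"C", "F", "G"},
}
deps_1734 = {**deps_1733, **recollected}
```

T1 and T3 are skipped because neither depends on C, which is why their dependency sets are identical in both revisions.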

Class-level STAtic RTS (STARTS [1])

- First, statically build a class dependency graph
  - Each class has an edge to its direct superclass/interface and to referenced classes
- Find Dependencies: classes reachable from the test class in the graph
- Changes: computed in the same way as Ekstazi
- Analyze Dependencies: select test classes that reach a changed class in the graph

[1] Legunsen et al., An Extensive Study of Static Regression Test Selection in Modern Software Evolution. FSE 2016

STARTS Example

[Figure: class dependency graph over test classes T0, T1, T2, T3, TN and classes A, B, C, D, E, F, drawn with use edges and inheritance edges.]

STARTS dependencies (transitive closure):
- T0: {A,B,C,D}
- T1: {B,C}
- T2: {B,C,D}
- T3: {E}
- TN: {C,E,F}
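Computing STARTS-style dependencies amounts to graph reachability. The sketch below is an illustration under assumptions: the edges are hypothetical (the slide's graph is not recoverable from the transcription) and were chosen only so that the transitive closure reproduces the dependency sets listed above. The helper name `reachable` is also illustrative.

```python
from collections import deque


def reachable(graph, start):
    """All classes reachable from `start` by following dependency edges."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, ()))
    return seen


# Hypothetical use/inheritance edges (assumption, see lead-in).
graph = {
    "T0": ["A"], "T1": ["B"], "T2": ["B", "D"], "T3": ["E"], "TN": ["E", "F"],
    "A": ["B", "D"], "B": ["C"], "F": ["C"],
}
tests = ["T0", "T1", "T2", "T3", "TN"]
deps = {t: reachable(graph, t) for t in tests}

changed = {"C"}
selected = [t for t in tests if deps[t] & changed]
print(selected)  # ['T0', 'T1', 'T2', 'TN']
```

Compared with the dynamic dependencies in the Ekstazi example, T1 is now selected for a change to C because B references C statically, whether or not T1 exercises that path at run time.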

Important RTS Considerations

[Figure: timeline comparing "Run All Tests" against RTS ("Find Dependencies", "Analyze", "Run Affected Tests"); the difference is the time savings.]

- End-to-end time for RTS: for Ekstazi, this includes the time to run tests and collect coverage/dependencies
- RTS is safe if it selects to rerun all affected tests
- RTS is precise if it selects to rerun only affected tests
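Safety and precision are both subset relations between the selected tests and the truly affected tests, and the time saving is a simple inequality. A minimal sketch; the function names and the sample numbers are illustrative assumptions.

```python
def is_safe(selected: set, affected: set) -> bool:
    """Safe: every affected test is selected."""
    return affected <= selected


def is_precise(selected: set, affected: set) -> bool:
    """Precise: only affected tests are selected."""
    return selected <= affected


def rts_saves_time(analysis_s: float, selected_run_s: float, run_all_s: float) -> bool:
    """RTS pays off only if its end-to-end time beats running everything."""
    return analysis_s + selected_run_s < run_all_s


affected = {"T0", "T2", "TN"}
selected = {"T0", "T1", "T2", "TN"}  # one extra test selected
print(is_safe(selected, affected), is_precise(selected, affected))  # True False
print(rts_saves_time(analysis_s=20, selected_run_s=60, run_all_s=300))  # True
```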

Pros and cons of static vs. dynamic RTS?

Dynamic vs Static

Dynamic:
- Pro: Gets exactly what tests depend on
- Con: Requires executing tests to collect dependencies (overhead)

Static:
- Pro: Quick analysis without needing to execute tests
- Con: Can over-approximate affected tests due to static analysis
- Con: May miss dependencies (reflection!)

Finer Granularity?

- Why not track dependencies at an even finer granularity?
  - Method-level? Statement-level?
- Collecting such dependencies (correctly) is harder
- More time to collect dependencies
  - Is the extra time worth it?
- Can actually be unsafe!

Class-level vs Target/Module-level

- Class-level test selection should be more precise than target/module-level test selection
  - Selects to run all tests in an affected test class, not all tests in an affected test target/module
- Why do companies not use class-level test selection?

Some RTS tools you can use today

- Built by researchers:
  - STARTS
  - Ekstazi
- Built by industry:
  - Microsoft Test Impact Analysis
  - OpenClover Test Optimization

Test-Suite Reduction (TSR)

[Figure: the test suite T0, T1, T2, T3, ..., TN at Rev 1733 and at Rev 1734, after a change.]

- T0 has the same “behavior” as T1
- T2 does not help the test suite overall

Test-Suite Reduction (TSR)

- Create a smaller, reduced test suite to run
  - Run fewer tests overall across many revisions
  - Analysis happens once/infrequently, so it is okay to spend more time on it
- The reduced test suite should not fail to detect any fault
  - Ideal: all tests that would fail and detect a fault are kept in the reduced test suite
  - Just as good: at least one test that can detect each fault is kept in the reduced test suite

TSR versus RTS

How are tests chosen to run?
- Test-Suite Reduction: redundancy (one revision)
- Regression Test Selection: changes (two revisions)

How often is analysis performed?
- Test-Suite Reduction: infrequently
- Regression Test Selection: every revision

Can it miss failing tests from the original test suite?
- Test-Suite Reduction: yes
- Regression Test Selection: no (if safe)

TSR Process

Heuristic: tests that cover the same elements as the original test suite are just as good

[Figure: matrix of tests T0 to T4 against classes C0 to C4 (coverage) and mutants M1 to M4 (kills).]

Legend:
- T: Tests
- C: Classes (could be statements, methods, branches, etc.)
- M: Mutants (could be other fault-like requirements)

Result:
- Reduced test suite R = {T2,T4}
- Size: 40%
- Fault-detection capability loss: 25%
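The two numbers reported for a reduced suite can be computed from a kill matrix. This is a sketch under assumptions: the kill matrix below is hypothetical (the slide's matrix is not recoverable), chosen so that R = {T2,T4} yields the slide's 40% and 25%, and the second number is read here as the fraction of killed mutants that the reduced suite no longer kills.

```python
def reduction_metrics(kills, reduced):
    """Return (relative size, fault-detection loss) of a reduced suite."""
    original_killed = set().union(*kills.values())
    reduced_killed = set().union(*(kills[t] for t in reduced))
    size = len(reduced) / len(kills)
    loss = len(original_killed - reduced_killed) / len(original_killed)
    return size, loss


# Hypothetical kill matrix: which mutants each test kills.
kills = {
    "T0": {"M1"},
    "T1": {"M1"},
    "T2": {"M2", "M3"},
    "T3": {"M4"},
    "T4": {"M1"},
}
size, loss = reduction_metrics(kills, ["T2", "T4"])
print(size, loss)  # 0.4 0.25
```

Here the reduced suite keeps 2 of 5 tests and misses M4, which only the removed test T3 kills.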

TSR Algorithms

- TSR is essentially the set cover problem (NP-complete)
- Algorithms that approximate a minimal test suite:
  - Greedy (Total vs Additional)
  - GRE
  - GE
  - HGS
  - ILP (Integer Linear Programming; can find an actual minimal suite)

Greedy Algorithm (Total)

- Greedy heuristic: select the test that covers the most elements
- Iteratively make the greedy choice of test
  - Tie-break: Random? Sorted by name?
- Stop when the chosen tests cover all elements

Greedy (Total) Example

[Figure: coverage matrix of tests T0 to T4 against classes C0 to C4.]

req(O) = {C0,C1,C2,C4}

The reduced suite R grows one test per step:
- Step 1: R = {T0}
- Step 2: R = {T0,T1}
- Step 3: R = {T0,T1,T2}
- Step 4: R = {T0,T1,T2,T4}
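A minimal sketch of Greedy (Total). The coverage matrix is hypothetical, since the slide's matrix did not survive transcription; it was chosen so that the run ends with the same R as the example. Ties are broken by test name, which is one of the two tie-breaks the slide suggests.

```python
def greedy_total(coverage):
    """Pick tests in decreasing order of total coverage until all is covered."""
    required = set().union(*coverage.values())
    by_total = sorted(coverage, key=lambda t: (-len(coverage[t]), t))
    reduced, covered = [], set()
    for test in by_total:
        if covered >= required:
            break
        reduced.append(test)
        covered |= coverage[test]
    return reduced


# Hypothetical coverage matrix (assumption, see lead-in).
coverage = {
    "T0": {"C0", "C1"},
    "T1": {"C0", "C1"},
    "T2": {"C1", "C2"},
    "T3": {"C4"},
    "T4": {"C2", "C4"},
}
print(greedy_total(coverage))  # ['T0', 'T1', 'T2', 'T4']
```

Note that T1 is kept although it adds nothing beyond T0: the total heuristic ignores what has already been covered.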

Greedy Algorithm (Additional)

- Greedy heuristic: select the test that covers the most uncovered elements
- Keep track of what has been covered so far and only consider the yet-to-be-covered elements
- The rest is the same as Greedy (Total)

Greedy (Additional) Example

[Figure: coverage matrix of tests T0 to T4 against classes C0 to C4.]

req(O) = {C0,C1,C2,C4}

The reduced suite R grows one test per step:
- Step 1: R = {T0}
- Step 2: R = {T0,T3}
- Step 3: R = {T0,T3,T4}
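A minimal sketch of Greedy (Additional). The coverage matrix is hypothetical and is not the one on the slide, so the resulting R differs from the example above; ties are broken by test name.

```python
def greedy_additional(coverage):
    """Repeatedly pick the test covering the most not-yet-covered elements."""
    uncovered = set().union(*coverage.values())
    reduced = []
    while uncovered:
        # max() keeps the first maximum, so ties go to the smallest name.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        reduced.append(best)
        uncovered -= coverage[best]
    return reduced


# Hypothetical coverage matrix (assumption, see lead-in).
coverage = {
    "T0": {"C0", "C1"},
    "T1": {"C0", "C1"},
    "T2": {"C1", "C2"},
    "T3": {"C4"},
    "T4": {"C2", "C4"},
}
print(greedy_additional(coverage))  # ['T0', 'T4']
```

On this matrix the additional heuristic needs two tests where the total heuristic would keep four, because after T0 is chosen, T1 contributes nothing new and is never picked.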

GRE

- Iteratively select essential tests
  - An essential test uniquely covers some elements that no other test covers
  - Select the “most” essential test: the one that covers the most unique elements
  - Selecting some essential tests may make other tests essential
- If there are no essential tests, make the Greedy (Additional) choice

GRE Example

[Figure: coverage matrix of tests T0 to T4 against classes C0 to C4.]

req(O) = {C0,C1,C2,C4}

The reduced suite R grows one test per step:
- Step 1: R = {T4}
- Step 2: R = {T4,T2}
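A sketch of GRE as described on the slide. Two things here are assumptions: the coverage matrix is hypothetical (so the result differs from the example above), and the redundancy-removal step, which drops tests whose remaining coverage is contained in another test's, is one way to realize "selecting some essential tests may make other tests essential"; the slide does not spell that step out.

```python
def gre(coverage):
    """Essential-first reduction with a Greedy (Additional) fallback."""
    remaining = {t: set(c) for t, c in coverage.items()}
    uncovered = set().union(*coverage.values())
    reduced = []
    while uncovered:
        # Keep only what each test still contributes.
        remaining = {t: c & uncovered for t, c in remaining.items() if c & uncovered}
        # Drop tests subsumed by another test (name order breaks exact ties).
        names = sorted(remaining)
        redundant = {
            t for t in names for u in names
            if t != u and (remaining[t] < remaining[u]
                           or (remaining[t] == remaining[u] and u < t))
        }
        for t in redundant:
            del remaining[t]
        # Elements covered by exactly one remaining test make that test essential.
        unique = {t: set() for t in remaining}
        for element in uncovered:
            owners = [t for t in remaining if element in remaining[t]]
            if len(owners) == 1:
                unique[owners[0]].add(element)
        if any(unique.values()):
            pick = max(sorted(remaining), key=lambda t: len(unique[t]))
        else:
            pick = max(sorted(remaining), key=lambda t: len(remaining[t]))
        reduced.append(pick)
        uncovered -= remaining.pop(pick)
    return reduced


# Hypothetical coverage matrix (assumption, see lead-in).
coverage = {
    "T0": {"C0", "C1"},
    "T1": {"C0", "C1"},
    "T2": {"C1", "C2"},
    "T3": {"C2"},
    "T4": {"C2", "C4"},
}
print(gre(coverage))  # ['T0', 'T4']
```

In the first round T0 and T4 are both essential (C0 and C4 have a single remaining owner each); once T0 is chosen, T2 only contributes C2 and is subsumed by T4, which is then picked.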

Open Questions

- Can the reduced test suite replace the original test suite for future revisions?
  - Can the reduced test suite fail when the original test suite does?
- Are removed tests truly redundant?
  - Would you trust the algorithm to remove some of your tests?
- What heuristics should we use to determine redundancy?
  - Does evaluation with seeded faults/mutants on the current version predict effectiveness in the future?

Test-Case Prioritization (TCP)

[Figure: the test suite T0, T1, T2, T3, ..., TN at Rev 1733; at Rev 1734, after a change, the order of tests is changed so that T3 runs first.]

- T3 is “better” at finding faults, so run T3 first

Test-Case Prioritization (TCP)

- Run all tests, but in decreasing order of likelihood of revealing faults
- As tests run, debug and fix faults discovered by early test failures
- Runs all tests, so there is no risk of missing any test failure, but the overall cost does not go down
- “Poor person’s” test selection: stop running when the budget is exceeded

How to Evaluate TCP Orders?

- Similar to TSR, evaluate using other requirements, e.g., faults/mutants
- Measure the “speed” of detecting all faults

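APFD

Average Percentage of Faults Detected (APFD):

$$\mathrm{APFD} = 1 - \frac{TF_1 + TF_2 + \cdots + TF_m}{n \cdot m} + \frac{1}{2n}$$

where $n$ is the number of tests, $m$ is the number of faults, and $TF_i$ is the position in the order of the first test that detects fault $i$.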
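APFD can be computed directly from a test order and a fault matrix. The sketch below uses the standard definition, with n tests, m faults, and TF_i the 1-based position of the first test that detects fault i. The function name, the fault matrix, and the assumption that every fault is detected by at least one test are illustrative.

```python
def apfd(order, detects):
    """APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2 * n)."""
    n = len(order)
    position = {test: i for i, test in enumerate(order, start=1)}
    faults = set().union(*detects.values())
    m = len(faults)
    first_detection = [
        min(position[t] for t in order if fault in detects.get(t, set()))
        for fault in faults
    ]
    return 1 - sum(first_detection) / (n * m) + 1 / (2 * n)


# Hypothetical fault matrix: which faults each test detects.
detects = {"T3": {"F1", "F2"}, "T2": {"F3"}}

good = apfd(["T3", "T0", "T1", "T2"], detects)
bad = apfd(["T2", "T1", "T0", "T3"], detects)
print(good, bad)  # 0.625 0.375
```

Running T3 first detects two of the three faults immediately, so that order scores higher; values closer to 1 mean faults are detected earlier.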

TCP Algorithms

- Prioritize based on coverage
  - Greedy (Total vs Additional)
  - Adaptive Random
- Prioritize based on source code
  - Order tests based on source code differences
  - Information retrieval
- Simple (yet effective!) prioritization
  - Quickest test first
  - Most frequently failing (historically) test first
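The two "simple" prioritizations need only test durations and past outcomes. A minimal sketch; the function names and the sample data are hypothetical, and ties are broken by test name.

```python
def quickest_first(durations):
    """Order tests by increasing past execution time (seconds)."""
    return sorted(durations, key=lambda t: (durations[t], t))


def most_failing_first(history):
    """Order tests by decreasing number of past failures (True = failed)."""
    return sorted(history, key=lambda t: (-sum(history[t]), t))


# Hypothetical data from previous runs.
durations = {"T0": 3.0, "T1": 0.5, "T2": 1.2}
history = {
    "T0": [False, False, True],
    "T1": [False, False, False],
    "T2": [True, True, False],
}
print(quickest_first(durations))    # ['T1', 'T2', 'T0']
print(most_failing_first(history))  # ['T2', 'T0', 'T1']
```

Neither ordering needs coverage collection or source analysis, which is what makes them cheap to apply.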

Greedy TCP

- Similar to the previous Greedy algorithms for TSR
- Greedy choice: the test that covers the most (uncovered) elements
- Eventually, all tests still get run
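For prioritization, the additional heuristic has to keep going after everything is covered, because every test must appear in the order. The sketch below resets the uncovered set at that point, which is a common variant and an assumption here; the slide does not say how the remaining tests are ordered. The coverage matrix is hypothetical.

```python
def greedy_additional_order(coverage):
    """Order all tests by additional coverage, resetting once all is covered."""
    all_elements = set().union(*coverage.values())
    remaining = sorted(coverage)
    uncovered = set(all_elements)
    order = []
    while remaining:
        best = max(remaining, key=lambda t: len(coverage[t] & uncovered))
        if not coverage[best] & uncovered and uncovered != all_elements:
            uncovered = set(all_elements)  # nothing new to gain: start over
            continue
        order.append(best)
        remaining.remove(best)
        uncovered -= coverage[best]
    return order


# Hypothetical coverage matrix (assumption, see lead-in).
coverage = {
    "T0": {"C0", "C1"},
    "T1": {"C0", "C1"},
    "T2": {"C1", "C2"},
    "T3": {"C4"},
    "T4": {"C2", "C4"},
}
print(greedy_additional_order(coverage))  # ['T0', 'T4', 'T1', 'T2', 'T3']
```

The first two tests are exactly the reduced suite that Greedy (Additional) would produce for TSR on this matrix; the difference is that the other tests are appended rather than dropped.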

Adaptive Random TCP

- Start with a random test
- Order the next tests based on greatest “dissimilarity” with the already prioritized tests
  - E.g., for coverage, which test covers the most elements that differ from those covered by the tests already prioritized?
  - Measure distance between covered elements (e.g., Jaccard distance)
  - Maximum distance? Minimum distance?

Adaptive Random TCP Example

[Figure: coverage matrix of tests T0 to T4 against classes C0 to C4, annotated with the distances 2/3, 2/3, 0, 0, 0, 1/2, 1/2, 1/2, 1.]

Resulting order: [T1,T4,T0,T3,T2]
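A sketch of adaptive random prioritization with Jaccard distance. It is a simplification under assumptions: every remaining test is considered as a candidate at each step (published variants typically sample a random candidate set), the distance of a candidate to the prioritized tests is aggregated with `min` by default (the slide leaves "maximum or minimum" open), and the coverage matrix is hypothetical rather than the one on the slide.

```python
import random


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


def adaptive_random_order(coverage, rng, aggregate=min):
    """Start at a random test, then repeatedly add the most dissimilar test."""
    remaining = sorted(coverage)
    first = rng.choice(remaining)
    order = [first]
    remaining.remove(first)
    while remaining:
        def score(test):
            return aggregate(
                jaccard_distance(coverage[test], coverage[p]) for p in order
            )
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order


# Hypothetical coverage matrix (assumption, see lead-in).
coverage = {
    "T0": {"C0", "C1"},
    "T1": {"C0", "C1"},
    "T2": {"C1", "C2"},
    "T3": {"C4"},
    "T4": {"C2", "C4"},
}
print(adaptive_random_order(coverage, random.Random(0)))
```

Passing the random generator in makes runs reproducible; passing `aggregate=max` switches to the other distance choice the slide asks about.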

Information Retrieval

- Information Retrieval (IR): rank text documents based on relevance to a query

[Figure: documents and a query feed an IR model, which outputs ranked documents: 1. Doc3, 2. Doc0, 3. Doc2.]

Information Retrieval TCP

- Information Retrieval (IR)-Based TCP [1]: rank tests based on relevance to changed code
- Change-aware

[Figure: tests and changes feed an IR model, which outputs ranked tests: 1. T3, 2. T0, 3. T2.]

[1] Saha et al., “An information retrieval approach for regression test prioritization based on program changes”, ICSE 2015
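The idea of treating tests as documents and the change as a query can be sketched with a small TF-IDF and cosine-similarity ranker. This is a swapped-in textbook model for illustration, not the model from the cited paper; the tokenizer, the smoothing in the IDF formula, and the test and change texts are all assumptions.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphabetic words; real tools also split identifiers."""
    return re.findall(r"[a-z]+", text.lower())


def rank_tests(test_docs, change_text):
    """Rank tests by TF-IDF cosine similarity to the changed code's text."""
    docs = {t: Counter(tokens(d)) for t, d in test_docs.items()}
    n = len(docs)
    df = Counter(term for counts in docs.values() for term in counts)
    idf = {term: math.log((n + 1) / (df[term] + 1)) + 1 for term in df}
    query = Counter(tokens(change_text))

    def score(counts):
        dot = sum(counts[t] * idf[t] * query[t] * idf[t] for t in counts if t in query)
        norm_d = math.sqrt(sum((c * idf[t]) ** 2 for t, c in counts.items()))
        norm_q = math.sqrt(sum((c * idf.get(t, 1.0)) ** 2 for t, c in query.items()))
        return dot / (norm_d * norm_q) if norm_d and norm_q else 0.0

    return sorted(docs, key=lambda t: (-score(docs[t]), t))


# Hypothetical test texts and change description.
test_docs = {
    "T0": "test login user password",
    "T2": "test cart add item",
    "T3": "test checkout payment card total",
}
change = "fix rounding of payment total in checkout"
print(rank_tests(test_docs, change))  # ['T3', 'T0', 'T2']
```

T3 ranks first because it shares the terms checkout, payment, and total with the change; T0 and T2 share none and are ordered by name.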

Open Questions

- How to best model the likelihood of a test failing?
  - Is coverage a good heuristic?
  - Is “diversity” what we want?
  - Can it be done quickly? Change-aware?
- Should TCP be used in companies?
  - Does it make sense to prioritize tests in order of likely failing?
  - Does it make sense to have a mindset of debugging as soon as a test fails, even as tests keep running?

Summary

- Regression testing can be a huge cost of software development
- Regression testing techniques aim to reduce that cost
- Many other techniques exist for reducing the cost:
  - Test parallelization
  - Test mocking
  - Test slicing
  - Optimizing test placement
  - Machine learning approaches for RTS, TCP, etc.
