

Detecting Design Patterns in Object-Oriented Program Source Code by using Metrics and Machine Learning

Special Issue - Software Design Pattern

Satoru Uchiyama†, Atsuto Kubo†, Hironori Washizaki†, Yoshiaki Fukazawa†

SUMMARY
Detecting well-known design patterns in object-oriented program source code can help maintainers understand the design of a program. Through the detection, the understandability, maintainability, and reusability of object-oriented programs can be improved. There are automated detection techniques; however, many existing techniques are based on static analysis and use strict conditions composed of class structure data. Hence, it is difficult for them to detect and distinguish design patterns in which the class structures are similar. Moreover, it is difficult for them to deal with diversity in design pattern applications. To solve these problems in existing techniques, we propose a design pattern detection technique using source code metrics and machine learning. Our technique judges candidates for the roles that compose design patterns by using machine learning and measurements of several metrics, and it detects design patterns by analyzing the relations between candidates. It suppresses false negatives and distinguishes patterns in which the class structures are similar. As a result of experimental evaluations with a set of programs, we confirmed that our technique is more accurate than two conventional techniques.

keywords: Design patterns, Software metrics, Machine learning, Object-oriented programming, Software maintenance.

1. Introduction

A design pattern is an abstracted repeatable solution to a commonly occurring software design problem under a certain context.
Among the large number of reported design patterns extracted from well-designed software, the 23 Gang-of-Four (GoF) design patterns [1] are particularly well known and widely used in object-oriented design. Design patterns targeting object-oriented design are usually defined as partial designs composed of classes that describe the roles and abilities of objects. For example, Figure 1 shows a GoF pattern named the State pattern [1]. This pattern is composed of roles named Context, State, and ConcreteState.

Existing programs implemented by a third party and open source software may take a lot of time to understand, and patterns may be applied without explicit class names, comments, or attached documents. Thus, pattern detection is expected to improve the understandability of programs. However, manually detecting patterns in existing programs is inefficient, and patterns may be overlooked.

Many studies on pattern detection that address the above problems have used static features of patterns. However, such static analysis has difficulty in identifying patterns in which class structures are similar. In addition, software developers might still overlook variations of patterns if they use a technique that relies on predefined strict structural conditions; patterns are sometimes applied in ways that vary slightly from the predefined conditions.

† The authors are with Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, 169-8555 Japan.

We propose a pattern detection technique that uses source code metrics (hereafter, metrics) and machine learning to detect first roles and then patterns as structures of those roles. Although our technique can be classified as a type of static analysis, unlike conventional detection techniques it detects patterns by identifying characteristics of roles derived by machine learning based on the measurement of metrics, without using strict condition descriptions (class structural data, etc.).
A metric is a quantitative measure of a software property that can be used to evaluate software development. For example, one such metric, number of methods (NOM), refers to the number of methods in a class [2]. Moreover, using machine learning, we can in some cases obtain previously unknown characteristics of roles for identification by combinations of various metrics. To cover a diverse range of pattern applications, our method uses a variety of learning data because the results of our technique may depend on the type and number of learning data used during the machine learning process.

Finally, we conducted experiments comparing our technique with two conventional techniques and found that our approach was the most accurate of the three for both the small-scale and the large-scale programs used in the experiments.

Fig. 1 State pattern.
Fig. 2 Strategy pattern.

2. Conventional Techniques and Their Problems

Most of the existing detection techniques are based on static analysis [3][4]. These techniques chiefly analyze information such as class structures that satisfy certain conditions. If an implementation varies even slightly from the intended strict conditions, or if two or more roles are assigned to a class, these techniques might overlook patterns. For example, many of the conventional techniques based on static analysis can detect the Singleton pattern [1] in the typical implementation shown in Figure 3. However, the specialized implementation using a boolean variable, shown in Figure 4, cannot be detected as a Singleton pattern by the conventional techniques based on static analysis. On the other hand, our technique successfully detects the Singleton pattern in both implementations owing to the flexibility of machine learning over metric measurements for identifying roles, and to a process composed of multiple steps that include judging candidate roles and detecting patterns.

Distinguishing the State pattern (shown in Figure 1) from the Strategy pattern (shown in Figure 2) is also difficult for conventional techniques based on static analysis because their class structures are similar. Unlike these techniques, we distinguish patterns whose structures are similar by first identifying the roles using various metrics and machine learning, and then detecting patterns as structures of those roles.

There is another static analysis technique that detects patterns based on the degrees of similarity between graphs of the pattern structure and graphs of the programs to be analyzed [3]. This technique is available to the public.
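
Since Figures 3 and 4 are not reproduced here, the following Java sketch illustrates the kind of difference discussed above for the Singleton pattern: a typical implementation guarded by a null check, and a specialized one guarded by a boolean variable. All class, field, and method names are hypothetical and are not taken from the original figures.

```java
// Hypothetical sketch: names are illustrative, not those of Figures 3 and 4.

/** Typical form: a static field holds the instance and is checked for null. */
final class TypicalSingleton {
    private static TypicalSingleton instance;

    private TypicalSingleton() { }

    static synchronized TypicalSingleton getInstance() {
        if (instance == null) {
            instance = new TypicalSingleton();
        }
        return instance;
    }
}

/** Specialized form: a boolean flag, not a null check, guards creation. */
final class FlagSingleton {
    private static FlagSingleton instance;
    private static boolean created = false;

    private FlagSingleton() { }

    static synchronized FlagSingleton getInstance() {
        if (!created) {
            instance = new FlagSingleton();
            created = true;
        }
        return instance;
    }
}

class SingletonDemo {
    public static void main(String[] args) {
        // Both variants hand out the same object on every call.
        System.out.println(TypicalSingleton.getInstance() == TypicalSingleton.getInstance());
        System.out.println(FlagSingleton.getInstance() == FlagSingleton.getInstance());
    }
}
```

Both classes behave as singletons, yet a detector that looks only for the null-check idiom would plausibly miss the second one, which is the kind of variation the metrics-based approach is meant to tolerate.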
However, this graph-based technique has difficulty distinguishing patterns that have similar structures, as mentioned above.

There is also a technique that outputs candidate patterns based on features derived from metric measurements [5]. However, it requires manual confirmation; this technique can roughly identify candidate patterns, but the final choice depends on the developer's skill. Our technique detects patterns without manual filtering using metrics and machine learning based on class structure analysis. Moreover, this technique uses general metrics concerning an object-oriented system without using metrics for each role. Our technique uses metrics that are specialized for each role.

Another existing technique improves precision by filtering out false hits from pattern detection results obtained by an existing static analysis approach [6]. Although this technique is similar to ours in that both utilize machine learning and require some heuristics in determining parameters such as thresholds in machine learning, the designs of the entire detection processes are quite different. This technique utilizes machine learning only for filtering results obtained by another technique, so its final recall cannot exceed that of the original results. On the other hand, our technique utilizes machine learning not for filtering but for detecting patterns. Therefore, our technique may be superior to this technique in terms of discriminating similar patterns and detecting a variety of pattern applications, as mentioned above; at the least, the two techniques are expected to produce different detection results.

Fig. 3 Example of typical implementation of Singleton pattern in Java.
Fig. 4 Example of specialized implementation of Singleton pattern in Java.

Yet another approach detects patterns from the class structure and behavior of a system after classifying its patterns [8][9].
It is difficult to use, however, when multiple patterns are applied to the same location or when pattern application is diverse. In contrast, our technique copes well with both of these challenges.

Other detection techniques use dynamic analysis. These methods identify patterns by referring to the execution path information of programs [10][11]. However, it is difficult to analyze the entire execution path and use fragmentary class sets in an analysis. Additionally, the results of dynamic analysis depend on the representativeness of the execution sequences.

Some detection techniques use a multilayered (multiphase) approach [12][13]. Lucia et al. use a two-phase, static analysis approach [12]. This method has difficulty, however, in detecting creational and behavioral patterns because it analyzes pattern structures and source code level conditions. Guéhéneuc and Antoniol use DeMIMA, an approach that consists of three layers: two layers to recover an abstract model of the source code, including binary class relationships, and a third layer to identify patterns in the abstract model [13]. However, distinguishing the State pattern from the Strategy pattern is difficult because their structures are almost identical. Our technique can detect patterns in all categories and distinguish the State pattern from the Strategy pattern using metrics and machine learning.

Finally, one existing technique detects patterns using formal OWL (Web Ontology Language) definitions [14]. However, false negatives arise because this technique does not accommodate the diversity in pattern applications. This technique [14] is available to the public via the web as an Eclipse plug-in.

We suppress false negatives by using metrics and machine learning to accommodate diverse pattern applications and to distinguish patterns in which the class structures are similar. Note that, of the techniques discussed above, only those in [3] and [14] have been released as publicly accessible tools; Table 1 shows details of these publicly available tools.

Table 1 Details of publicly available detection tools.

Name | Patterns to be detected
TSAN (Tsantalis et al. [3]) | Factory Method, Prototype, Singleton, Composite, Decorator, Proxy, Template Method, Observer, Visitor, Adapter/Command, State/Strategy
DIET (Dietrich et al. [14]) | Abstract Factory, Builder, Singleton, Adapter, Bridge, Composite, Proxy, Template Method

3. Machine-Learning-Based Detection

The process of our technique is composed of the following five steps classified into two phases, as shown in Figure 5: a learning phase and a detection phase. Each process is described below, with pattern specialists and developers included as the parties concerned. Pattern specialists are people with knowledge about the patterns. Developers are people who maintain the object-oriented software. Our technique currently uses Java as the target programming language.

Fig. 5 Process of our technique.

The learning phase consists of the following steps.

P1. Define Patterns: Pattern specialists determine the detectable patterns and define the structures and roles composing these patterns.
P2. Decide Metrics: Pattern specialists determine useful metrics to judge the roles defined in P1 using the Goal Question Metric method.
P3. Machine Learning: Pattern specialists input programs containing patterns into the metric measurement system and obtain measurements for each role. They also input these measurements into the machine learning simulator to learn. After machine learning, they verify the judgment for each role, and if the verification results are unsatisfactory, they return to P2 and revise the metrics.

The detection phase consists of the following steps.

P4. Candidate Role Judgment: Developers input the programs to be analyzed into the metric measurement system and obtain measurements for each class. They then input these measurements into the machine learning simulator. The machine learning simulator identifies candidate roles.
P5. Pattern Detection: Developers input the candidate roles judged in P4 into the pattern detection system using the pattern structure definitions defined in P1. This system detects patterns automatically.

In the following subsections, we will explain these phases in detail.
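
As a rough illustration of what the metric measurement system used in P3 and P4 might compute for one class, the following Java sketch counts NOM (number of methods) and NOAM (number of abstract methods). It is only a sketch under stated assumptions: it uses reflection over compiled classes as a stand-in for source code analysis, it counts NOM as all declared methods, and the `RoleMetrics` and `Report` names are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class RoleMetrics {
    /** NOM: number of methods declared in the class (compiler-generated ones excluded). */
    static int nom(Class<?> type) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (!m.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    /** NOAM: number of declared methods that are abstract. */
    static int noam(Class<?> type) {
        int count = 0;
        for (Method m : type.getDeclaredMethods()) {
            if (!m.isSynthetic() && Modifier.isAbstract(m.getModifiers())) {
                count++;
            }
        }
        return count;
    }

    /** Hypothetical sample class: one concrete method that calls two abstract steps. */
    abstract static class Report {
        void print() {
            header();
            body();
        }

        abstract void header();

        abstract void body();
    }

    public static void main(String[] args) {
        System.out.println("NOM=" + nom(Report.class) + " NOAM=" + noam(Report.class));
    }
}
```

For the sample class, the sketch reports three declared methods, two of which are abstract. A vector of such measurements per class is the kind of input the machine learning step consumes.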

3.1 Learning Phase

P1. Define Patterns

The 23 GoF patterns were originally classified into three types: creational, structural, and behavioral [1]. To clarify the usefulness of our technique for each type, we choose at least one pattern from each type: Singleton from the creational patterns, Adapter from the structural ones, and Template Method, State, and Strategy from the behavioral ones. Currently, our technique considers these five patterns and 12 roles.

P2. Decide Metrics

Pattern specialists decide on useful metrics to judge roles using the Goal Question Metric method [15] (hereafter, GQM). With our technique, the pattern specialists set the accurate judgment of each role as a goal. To achieve this goal, they define a set of questions to be evaluated. Next, they decide on useful metrics to help answer the questions they have established. They can decide on questions by reading the description of the pattern definition and paying attention to the attributes and operations of the roles.

A lack of questions might occur because GQM is qualitative. Therefore, if the machine learning results are unsatisfactory owing to the diverse values of metric measurements, it is preferable to go back to P2 in order to reconsider the metrics, including those concerning behavior. Currently, such a return path is not systematically supported in our technique; in the future, we will consider supporting it by, for example, indicating inappropriate goals, questions, and/or metrics according to the machine learning results.

Fig. 6 Example of GQM application result (AbstractClass role).
Fig. 7 Example of source code (AbstractClass role).
Fig. 8 Neural network.
Fig. 9 Back propagation.

For example, Figure 6 illustrates the goal of making a judgment about the AbstractClass role in the Template Method pattern.
AbstractClass roles have abstract methods, or methods whose logic calls abstract methods, as shown in Figure 7. The AbstractClass role can be distinguished by the ratio of the number of abstract methods (NOAM) to the number of concrete methods (NOM) because for this role the former is supposed to exceed the latter. Therefore, NOAM and NOM are useful metrics for judging this role. Figure 14 in the Appendix shows the results of applying GQM to the roles of all detection targets. The metrics are described in detail in Table 2 in subsection 4.1.

P3. Machine Learning

Machine learning is a technique that analyzes sample data by computer and acquires useful rules with which to make forecasts about unknown data. We used machine learning so as to be able to evaluate patterns with a variety of application forms. Machine learning is expected to suppress false negatives and achieve extensive detection.

Our technique uses a neural network [16] algorithm because it outputs values for all roles, taking into consideration the interdependence among the different metrics. Therefore, it can deal with cases in which one class has two or more roles.

A neural network is composed of an input layer, hidden layers, and an output layer, as shown in Figure 8, and each layer is composed of elements called units. Values are given a weight when they move from unit to unit, and a judgment rule is acquired by changing the weights. A typical algorithm for adjusting weights is back propagation. Back propagation calculates the error margin between the output result y and the correct answer T, and it sequentially adjusts weights from the layer nearest the output to the input layer, as shown in Figure 9. These weights are adjusted until the output error margin of the network reaches a certain value.

Our technique uses a hierarchical neural network simulator [17]. This simulator uses back propagation. The number of layers in the neural network is set to three, the number of units in the input layer is set to the number of decided metrics, and the number of units in the output layer is set to the number of roles being judged. For the hidden layer, we tentatively use the same number of units as in the input layer, which keeps the network structure simple and the memory consumption low in the repeated experiments described in Section 4. In the future, we will consider optimizing the number of units in the hidden layer using various information criteria [23].

As the transfer function in the neural network, we use the sigmoid function instead of a step function since the sigmoid function is a widely accepted choice for computing continuous output in a multi-layer neural network [16][17].

The input consists of the metric measurements of each role in a program to which patterns have already been applied, and the output is the expected role. Pattern specialists obtain measurements for each role using the metric measurement system, and they input these measurements into the machine learning simulator to learn. The repetition of learning ceases when the error margin curve of the simulator converges. At present, specialists verify the convergence of the error margin curve manually.
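
The network configuration described above (three layers, a hidden layer as large as the input layer, sigmoid units, and weight adjustment by back propagation) can be sketched in Java as follows. This is not the simulator of [17], whose internals are not described here. The class name `TinyRoleNetwork`, the bias weights, the initial weight range, the learning rate, the random seed, and the training sample are all assumptions made for illustration.

```java
import java.util.Random;

/** Three-layer sigmoid network trained by back propagation (illustrative only). */
class TinyRoleNetwork {
    private final double[][] wHidden; // [hidden unit][input unit + bias]
    private final double[][] wOutput; // [output unit][hidden unit + bias]

    TinyRoleNetwork(int metrics, int roles, long seed) {
        Random rnd = new Random(seed);
        int hidden = metrics; // hidden layer has as many units as the input layer
        wHidden = randomWeights(hidden, metrics + 1, rnd);
        wOutput = randomWeights(roles, hidden + 1, rnd);
    }

    private static double[][] randomWeights(int rows, int cols, Random rnd) {
        double[][] w = new double[rows][cols];
        for (double[] row : w) {
            for (int j = 0; j < cols; j++) {
                row[j] = rnd.nextDouble() - 0.5; // assumed range [-0.5, 0.5)
            }
        }
        return w;
    }

    private static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    private static double[] layer(double[][] w, double[] x) {
        double[] y = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            double sum = w[i][x.length]; // bias weight
            for (int j = 0; j < x.length; j++) {
                sum += w[i][j] * x[j];
            }
            y[i] = sigmoid(sum);
        }
        return y;
    }

    /** Forward pass: one output value between 0 and 1 per role. */
    double[] predict(double[] x) {
        return layer(wOutput, layer(wHidden, x));
    }

    /** One back-propagation step; returns the squared error before the update. */
    double train(double[] x, double[] target, double rate) {
        double[] h = layer(wHidden, x);
        double[] y = layer(wOutput, h);
        double error = 0.0;
        double[] dOut = new double[y.length];
        for (int k = 0; k < y.length; k++) {
            double diff = y[k] - target[k];
            error += diff * diff;
            dOut[k] = diff * y[k] * (1 - y[k]);
        }
        // Propagate the output deltas back to the hidden layer before changing weights.
        double[] dHid = new double[h.length];
        for (int i = 0; i < h.length; i++) {
            double sum = 0.0;
            for (int k = 0; k < y.length; k++) {
                sum += dOut[k] * wOutput[k][i];
            }
            dHid[i] = sum * h[i] * (1 - h[i]);
        }
        update(wOutput, dOut, h, rate);
        update(wHidden, dHid, x, rate);
        return error;
    }

    private static void update(double[][] w, double[] delta, double[] input, double rate) {
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < input.length; j++) {
                w[i][j] -= rate * delta[i] * input[j];
            }
            w[i][input.length] -= rate * delta[i]; // bias weight
        }
    }

    public static void main(String[] args) {
        TinyRoleNetwork net = new TinyRoleNetwork(2, 2, 42L);
        double[] metrics = {3, 2};  // made-up sample: NOM = 3, NOAM = 2
        double[] role = {1.0, 0.0}; // made-up target: the first role is the correct one
        double first = net.train(metrics, role, 0.1);
        double last = first;
        for (int epoch = 0; epoch < 2000; epoch++) {
            last = net.train(metrics, role, 0.1);
        }
        System.out.println("error decreased: " + (last < first));
    }
}
```

In practice the training set would contain many labeled role instances rather than one sample, and the stopping rule would follow the convergence of the error curve as described above.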
After machine learning, they verify the judgment for each role, and if the verification results are unsatisfactory, they return to P2 and revise the metrics.

3.2 Detection Phase

P4. Candidate Role Judgment

Developers input the programs to be analyzed into the metric measurement system and obtain measurements for each class, and then they input these measurements into the machine learning simulator. This simulator outputs values between 0 and 1 for all roles to be judged. The output values are normalized such that the sum of all values becomes 1, since the sum of the output values could be different for each input in the neural network; with this normalization, a common threshold can be used for comparison. The normalized output values are called role agreement values. A larger role agreement value means that the candidate role is more likely to be correct. The reciprocal of the number of roles to be detected is set as a threshold; the roles whose role agreement values are higher than the threshold are taken to be candidate roles. The threshold is 1/12 (i.e., approximately 0.0833) because we treat 12 roles at present.

For example, Figure 10 shows the candidate role judgment results for a class that has the following metric measurement values: NOM is 3, NOAM is 2, and other measurement values are 0. In Figure 10, the output value of AbstractClass is the highest. By normalizing the values in Figure 10, the candidate roles of the class are judged to be AbstractClass and Target.

P5. Pattern Detection

Developers input the candidate roles judged in P4 into the pattern detection system using the pattern structure definitions defined in P1. This system detects patterns by matching the direction of the relations between candidate roles in the programs and the roles of patterns. The matching moves sequentially from the candidate role with the highest role agreement value to that with the lowest value; the system searches all combinations of candidate roles that are in agreement with the pattern structures.
If the directions of the relations between candidate roles agree with the pattern structure, and the candidate roles agree with the roles at both ends of the relations, the system detects the pattern.

Currently, our method deals with inheritance, interface implementation, and aggregation relations. To clarify the difference of these relation types, we introduce the relation
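
The candidate role judgment of P4 and the relation matching of P5 can be sketched together in Java as follows. This is a simplified illustration and not the authors' detection system: the network output values, the class names `Report` and `PdfReport`, and the `DetectionSketch` and `Relation` types are hypothetical, only three roles are used (so the threshold is 1/3 and not 1/12), and the ordering of the search by role agreement value is omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DetectionSketch {
    /** A directed relation between two names, e.g. a subclass inheriting from a superclass. */
    static final class Relation {
        final String from;
        final String kind;
        final String to;

        Relation(String from, String kind, String to) {
            this.from = from;
            this.kind = kind;
            this.to = to;
        }
    }

    /** P4: divide each output by the sum so the role agreement values add up to 1. */
    static Map<String, Double> roleAgreement(Map<String, Double> outputs) {
        double sum = 0.0;
        for (double v : outputs.values()) {
            sum += v;
        }
        Map<String, Double> agreement = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : outputs.entrySet()) {
            agreement.put(e.getKey(), e.getValue() / sum);
        }
        return agreement;
    }

    /** P4: keep roles whose agreement value exceeds 1 / (number of roles). */
    static List<String> candidateRoles(Map<String, Double> agreement) {
        double threshold = 1.0 / agreement.size();
        List<String> roles = new ArrayList<>();
        for (Map.Entry<String, Double> e : agreement.entrySet()) {
            if (e.getValue() > threshold) {
                roles.add(e.getKey());
            }
        }
        return roles;
    }

    /** P5: collect program relations whose kind, direction, and end roles match the pattern relation. */
    static List<Relation> match(Relation pattern, List<Relation> program,
                                Map<String, List<String>> candidates) {
        List<Relation> hits = new ArrayList<>();
        for (Relation r : program) {
            boolean sameKind = r.kind.equals(pattern.kind);
            boolean fromOk = candidates.getOrDefault(r.from, List.of()).contains(pattern.from);
            boolean toOk = candidates.getOrDefault(r.to, List.of()).contains(pattern.to);
            if (sameKind && fromOk && toOk) {
                hits.add(r);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up network outputs for one class over three roles.
        Map<String, Double> outputs = new LinkedHashMap<>();
        outputs.put("AbstractClass", 0.8);
        outputs.put("Target", 0.5);
        outputs.put("Singleton", 0.1);
        List<String> roles = candidateRoles(roleAgreement(outputs));
        System.out.println(roles);

        Map<String, List<String>> candidates = Map.of(
                "Report", roles,
                "PdfReport", List.of("ConcreteClass"));
        // Template Method structure: ConcreteClass inherits from AbstractClass.
        Relation templateMethod = new Relation("ConcreteClass", "inheritance", "AbstractClass");
        List<Relation> program = List.of(new Relation("PdfReport", "inheritance", "Report"));
        System.out.println(match(templateMethod, program, candidates).size());
    }
}
```

With these made-up outputs, the normalized values of AbstractClass and Target exceed the threshold while that of Singleton does not, so one class can carry more than one candidate role into the matching step.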

