

Recent Researches in Engineering Education and Software Engineering
ISBN: 978-1-61804-070-1

Mixing SNA and Classical Software Metrics for Sub-Projects Analysis

ROBERTO TONELLI
University of Cagliari
DIEE
P.zza D'Armi, 09100 Cagliari
ITALY
roberto.tonelli@dsf.unica.it

GIUSEPPE DESTEFANIS
University of Cagliari
DIEE
P.zza D'Armi, 09100

Abstract: We present a preliminary study of the joint application of network and software metrics to the analysis of subprojects of a large software system. The aim of this paper is to provide new tools for the analysis of software system subprojects, using very recent results obtained in the new research field of software networks. We present an empirical and exploratory analysis of the evolution of software subprojects over time, as well as a comparison among subprojects with similar or different aims and functionalities. Our preliminary results show how the joint application of traditional and network software metrics may be used to identify subprojects developed with similar functionalities and scopes.

Key-Words: Software Metrics; Social Network Analysis; Software Networks; PCA

1 Introduction

The recent literature on software engineering shows how it is possible to describe large software systems through a complex graph [1] [2], where the graph nodes are the software modules (packages, files, classes or other software entities), and the graph edges are the relationships between modules [14].

In the graphs associated with Object Oriented (OO) software systems the nodes are the classes, which are related to each other through different kinds of binary relationships, such as inheritance, composition and dependence. Several authors have already investigated some properties of software networks, like the distribution of Fan-in or Fan-out of network nodes [6] [9] [11], finding features characteristic of complex networks [8]. The complex network structure may thus be analyzed as a Social Network, introducing metrics which are relatively new for software, but are already used in Social Network Analysis (SNA) with different meanings [10] [12]. The advantage of this approach is that it provides new metrics for software systems, which describe how the classes interact with each other, and which may be related to software quality [15] [5]. Up to now, all these studies have regarded a software system as a whole.

On the other hand, more traditional software metrics, like the CK suite [3], have historically been related to software quality and, from a completely different point of view with respect to software network theory, also tackle the problem of describing interrelationships among classes.

In this paper we perform an empirical analysis of the Eclipse sub-projects with the joint application of CK metrics, SNA metrics, and other network metrics, in order to detect similarities and differences among some subprojects depending on their scope and functionalities. The joint use of the different sets of metrics allows us to explore the structure of such software networks, exploiting the relationships existing among the sub-systems [4], [?].

2 The Software Systems and the Metrics Analyzed

The Eclipse system is a standard dataset for software engineering studies. It provides different sub-projects large enough for statistical significance of the results, and a number of different releases, which allows the time evolution of such subprojects to be analyzed. It is an Open Source OO software system whose source code is available for public download from the Eclipse web site. The releases analyzed are 2.0, 2.1, 3.0, 3.1, 3.2, 3.3 and 3.4.
Subsequent releases are evolutions of the previous ones, providing new functionalities while keeping the older ones. Thus, the number of sub-projects varies from 80 in Eclipse 2.0 to 407 in Eclipse 3.4. Among all these sub-projects, some are too small to be meaningfully represented as a software network and provide degenerate values for some SNA metrics. Consequently, we analyzed only subprojects with more than 40 classes.

Furthermore, in order to carry out the analysis of the subprojects' time evolution, we study only those subsystems which belong to all the examined releases, which are 22.

For each class in the sub-systems we consider the following set of metrics: Locs, Fanin, Fanout, cbo, rfc, wmc, lcom, noc, dit, reachEfficiency, effSize, closeness, dwReach, infoCentrality, Size, Ties. Two are internal metrics, namely Locs and lcom. The others are network metrics: they depend on the structure of the software system, that is, they are computed on the software graph. The last seven are metrics taken from Social Network Analysis (SNA), and provide indications on how the classes interact with each other. Some of the SNA metrics are defined on a subgraph, the 'EGO' network (from the Latin EGO, meaning 'self'), which is the subnetwork obtained by considering one node, named the EGO node, and only the nodes directly connected to it.

In all our software networks the links among nodes, due to the different relationships among the classes, are undirected. Thus, our software graphs are undirected graphs. The definition of each metric is the following:

• Coupling Between Object Classes (CBO): the number of other classes the given class is coupled to. It is a CK metric that denotes class dependency on other classes in the system.
• Lack of Cohesion of Methods (LCOM): the difference between the number of non-cohesive method pairs and the number of cohesive pairs, a CK metric.
• In-degree (Fan-in): the number of in-links of the class in the class graph, denoting how much the class is used by other classes in the system.
• Out-degree (Fan-out): the number of out-links of the class in the class graph, denoting how much the class uses other classes in the system. It is strictly correlated to the CBO CK metric, measuring the coupling of a class with other classes of the system.
• Size: size of the EGO network related to the considered node (i.e. class); it is the number of nodes of the EGO network.
• Ties: number of edges of the EGO network related to the node.
• EffectiveSize (effSize): effective size of the EGO network; the number of nodes in the EGO network minus one, minus the average number of ties that each node has to other nodes of the EGO network.
• Lines of Code (Locs): the number of lines of code of the class, excluding blank lines and comments. Although some authors have pointed out the deficiencies of Locs, it is still the most commonly used size measure in practice because of its simplicity.
• reachEfficiency: the percentage of nodes within two-step distance from a node, divided by the EGO Size.
• Weighted Methods per Class (WMC): the number of methods of the class or interface. It is a measure of class complexity. Note that we set the weighting factor to one, as is the case in most published computations of WMC. In general, it has been shown that the larger the WMC, the greater the development effort and the higher the probability of defects.
• Closeness: the reciprocal of the Farness, where the Farness is defined as the sum of the lengths of all shortest paths from the node to all other nodes.
• Information Centrality (infoCentrality): the harmonic mean of the length of paths starting from all nodes of the network and ending at the node, multiplied by the total number of nodes.
• Number of Children (NOC): the number of immediate subclasses of the class, a CK metric.
• dwReach: the sum of all nodes of the network that can be reached from the node, each weighted by the inverse of its geodesic distance. The weights are thus 1/1, 1/2, 1/3, and so on.
• Depth of Inheritance Tree (DIT): the number of inheritance levels from the top of the object hierarchy to the given class.
• Response for Class (RFC): a CK metric computed as the sum of the number of methods defined in the class and the cardinality of the set of methods called by them and defined in external classes.

3 Data Analysis

Our set of metrics is possibly exhaustive for describing software systems, but it is also cumbersome and presents many correlations. For example, Size = Fanin + Fanout + 1 and CBO = Fanin + Fanout; Fanout is correlated with Locs, since the larger the number of lines of code, the higher the likelihood that the class depends on other classes; and so on.

In order to retain only the relevant information carried by the metrics and to eliminate the redundancies due to their correlations, we performed a PCA on all the sub-projects with more than 40 classes, using as input the metrics for each class. The output is a set of measures for each class, of which only a few uncorrelated ones, the so-called principal components, retain the original relevant information [7].

The results show that all the systems present a few main principal components explaining the majority of the variance, and that usually more than eighty percent of the system variability is accounted for by the first three principal components. This suggests that only a few principal components are needed to describe the system. A typical situation is: pca1 55%, pca2 18%, pca3 10%, for a cumulative percentage of 83% of the variance explained by the first three PCs.

Using the principal components, each class may be identified by means of only a few coordinates which carry the relevant information contained in the original set of metrics. This reduced set allows us to represent each class as a point in the PCA space, and each entire subproject as a set of points in the PCA space. One immediate consequence is that as the software system evolves, and the number of classes in a subproject grows, the number of points in the PCA space also grows.
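
As a minimal sketch of the reduction described above, the following Python code standardizes a class-by-metric matrix and projects it onto its leading principal components through an SVD. The function name pca_scores, the z-score standardization and the synthetic data are assumptions made for illustration only; the paper does not state which PCA implementation or normalization was used.

```python
import numpy as np

def pca_scores(X, n_components=3):
    """Project the rows of X (classes x metrics) onto the leading PCs.

    Columns are standardized to zero mean and unit variance first. This
    z-score step is an assumption, not something the paper specifies.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # keep constant columns at zero instead of dividing by 0
    Z = (X - X.mean(axis=0)) / std
    # SVD of the standardized matrix: the rows of Vt are the principal axes.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)      # share of variance per component
    scores = Z @ Vt[:n_components].T     # coordinates of each class in PCA space
    return scores, explained[:n_components]

# Hypothetical data: 200 classes, 16 metrics driven by 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 16))
X = latent @ mixing + 0.3 * rng.normal(size=(200, 16))

scores, explained = pca_scores(X)
print(scores.shape)         # (200, 3)
print(explained.cumsum())   # cumulative share of variance for PC1..PC3
```

Each row of scores is one class seen as a point in the PCA space, so plotting the first column against the second gives the kind of pattern discussed in this section.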
If the metrics of the existing classes change, the principal components will also change, and the points representing the classes will change position in the PCA space.

The pattern of the class distribution in the PCA space is thus representative of the main structure of the metrics for each subproject.

In particular, it is possible to follow the time evolution of the sub-projects during software development, and to easily detect major changes of the whole structure, or minor changes related to the modification of a few classes. Through this analysis it is possible to detect relationships among the patterns of different sub-projects, and to see how the patterns change as the same sub-project evolves in time.

Figs. 1 and 2 show the patterns in six releases of the sub-projects 'jtd.core' and 'pde.ui' (we did not show the Eclipse 3.0 release of 'pde.ui' and the Eclipse 3.3 release of 'external.ui.tool' only for reasons of space, the patterns for these releases being identical to those of the other ones).

[Figure 1: First and second Principal Components for six releases of the sub-project 'jtd.core', evolving from Eclipse 2.0 to Eclipse 3.4. Panels: Eclipse 2.0, 2.1, 3.0, 3.1, 3.2 and 3.4. Horizontal axis: 1st Principal Component; vertical axis: 2nd Principal Component.]

The figures show that these patterns are consistent for each sub-project across the releases. This property is not trivial, since the PCA takes into account normalized values; thus even changes in a single class may influence the position of all the others.
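
The effect of normalization mentioned above can be illustrated with a small sketch. It assumes that the normalization is a per-metric z-score, which the paper does not specify, and it uses made-up values for two metrics of five hypothetical classes. Changing a single value of one class alters the mean and standard deviation of that metric, and therefore the normalized coordinate of every class.

```python
import numpy as np

def standardize(X):
    """Per-metric z-score: subtract the column mean, divide by the column std."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical sub-project: 5 classes, 2 metrics (say Locs and Fanout).
X = np.array([[100.0, 3.0],
              [250.0, 5.0],
              [ 80.0, 2.0],
              [400.0, 9.0],
              [150.0, 4.0]])

X_changed = X.copy()
X_changed[3, 0] = 1200.0  # only class 3 changes, and only in the first metric

shift = standardize(X_changed) - standardize(X)
print(np.abs(shift[:, 0]))  # non-zero for every class, not just class 3
print(np.abs(shift[:, 1]))  # zero: the second metric was left untouched
```

Since the principal components are computed from these normalized values, a shift in all rows can propagate to the position of every point in the PCA space, which is why stable patterns across releases are not a trivial outcome.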
On the other hand, there may be sub-projects that undergo major modifications, for example because the number of classes increases significantly from one release to another, or because they undergo refactoring, where the code and the relationships among the classes are substantially changed.

In the first case, it is not guaranteed that the new classes are still distributed according to the pattern exhibited by the older classes. Thus, the persistence of the same pattern when the number of classes varies significantly may be the signature of an existing structure associated with the specific software system.

In the second case, refactorings may significantly change the metrics, especially those computed on the software graph. This may be reflected in major changes of the pattern of the corresponding subproject during the evolution from one release to another.

The situation is illustrated in Figures 1 and 2. In Fig. 1 the project jtd.core starts with about 800 classes and ends with about 1150. The pattern remains the same across the evolution even though the number of classes increases, so the new classes are arranged according to the same pattern. In Fig. 2 we show the project pde.ui, with about 370 classes in Eclipse 2.0, 480 in Eclipse 2.1, and about 1020 in Eclipse 3.4. The final number of classes is about three times the initial one, yet the pattern does not change during the evolution, and is different from the jtd.core pattern. We also examined the other subprojects, always finding the same patterns through the various releases.

[Figure 2: First and second Principal Components for six releases of the sub-project 'pde.ui', evolving from Eclipse 2.0 to Eclipse 3.4. Panels: Eclipse 2.0, 2.1, 3.1, 3.2, 3.3 and 3.4. Horizontal axis: 1st Principal Component; vertical axis: 2nd Principal Component.]

Figures 1 and 2 also show that the patterns are well defined for each sub-project, but differ between the two sub-projects. Consequently we decided to compare the patterns of all the 22 sub-projects belonging to the same release, finding that they are in general similar for sub-projects whose names end with the same suffix, which is related to the project functionalities. Figures 3 and 4 show the patterns exhibited in Eclipse 3.4 by the sub-projects whose names end with '.core' and with '.ui', respectively. The latter are projects related to the user interface, and are conceived to provide functionalities clearly different from those of the '.core' projects.

[Figure 3: First and second Principal Components of different 'core' projects in Eclipse 3.4 release. Panels: debug.core, pde.core, jtd.core, update.core. Horizontal axis: 1st Principal Component; vertical axis: 2nd Principal Component.]

We found that sub-projects developed with the same scope exhibit in general the same patterns. This is also confirmed by the analysis of the cumulative percentage of the variance related to the principal components.
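
For reference, the cumulative percentage of variance used in this comparison can be written as follows. The notation is ours and not the paper's: we assume that \lambda_i denotes the variance carried by the i-th principal component (the i-th eigenvalue of the covariance or correlation matrix of the metrics) and that p = 16 is the number of input metrics listed in Section 2.

```latex
\mathrm{CPV}(k) \;=\; \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{j=1}^{p} \lambda_j},
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0 .
% Typical values quoted in Section 3:
% \mathrm{CPV}(3) = 0.55 + 0.18 + 0.10 = 0.83
```

Under this reading, two sub-projects have similar variance profiles when their individual ratios \lambda_i / \sum_j \lambda_j are close for the first few components.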
For all the '.core' projects the first PC has a lower relative weight with respect to the others, and a relatively large contribution to the variance is also due to the second PC. For all the '.ui' projects the first PC has a higher relative weight with respect to the others, while the second PC provides a minor contribution to the total variance, comparable to the contribution of the remaining PCs.

[Figure 4: First and second Principal Components of different 'ui' projects in Eclipse 3.4 release. Panels: debug.ui, team.ui, update.ui, jtd.ui, pde.ui, jtd.debug.ui. Horizontal axis: 1st Principal Component; vertical axis: 2nd Principal Component.]

Finally, we reversed the analysis to understand which original metrics contribute most to the first principal components, and hence to the patterns. The varimax rotation and the resulting biplots, not reported for reasons of space, allow us to determine the contribution of each metric to the first two PCs, and to verify that the obtained patterns are not an artifact of the PCA, but are directly determined by the metric values, and are different for sub-projects written for different purposes.

In particular, for the .core sub-projects, the metrics Size, Ties, effSize and Fanin contribute roughly the same amount to PC1 and to PC2, while the metrics cbo, wmc, rfc, Locs and Fanout give a contribution to PC2 opposite to that of the previous ones. In contrast, for the .ui sub-projects, the contribution of each metric to the two PCs is not well defined, and each metric provides a contribution to the first and second PC that is generally different from that of the other metrics.

4 Threats to Validity

First, Eclipse may not be representative of all Java systems. Further studies on different systems, both open source and commercial, are needed to extend the validity of our study.

Second, Eclipse is not representative of systems written in other languages. Thus, a full investigation of the possibility of using the PCA of the metrics to describe the complex network structure of a whole software system must include a comprehensive study spanning several languages and systems.

Third, we analyzed the subset of the 22 projects belonging to all the releases. The analysis of the similarities among patterns of sub-projects with the same scope must be extended to cover also the sub-projects introduced in subsequent releases. In this case one may expect that User Interface sub-projects will in general keep the same patterns, since these are due to the particular values of some metrics which, for interfaces, may in general be different. For the core sub-projects this may no longer hold, since they provide basic functionalities which may differ between sub-projects present since the beginning of the whole software system and core sub-projects added along the way.

5 Conclusion

We reported a preliminary and exploratory analysis of the Eclipse subprojects, using a joint application of SNA and traditional software metrics. The entire set of metrics has been summarized by performing a PCA and obtaining a very small number of uncorrelated principal components, which allow the classes to be represented in a space where they show typical patterns. These patterns may be useful in providing information about the subprojects' functionalities and their time evolution through different releases. Since the Principal Components are linear combinations of the original metrics measured for classes, the patterns are determined by these original metrics.

Our methodology may thus be useful in monitoring the evolution of subprojects in time, and may be indicative of similarities in the functionalities of some subprojects. The relationship between metrics and patterns suggests that, with the introduction of a "bugs metric" into the input set of metrics for the PCA, this methodology might eventually provide indications about software quality.

References:

[1] A. Barabasi, R. Albert, and H. Jeong. Scale-free characteristics of random networks: the topology of the world wide web. Phys. A, 281:69–77, 2000.
[2] A.-L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 1999.
[3] S. Chidamber and C. Kemerer. A metric suite for object-oriented design. IEEE Trans. Software Eng., 20:476–493, 1994.
[4] G. Concas, M. Marchesi, A. Murgia, S. Pinna, and R. Tonelli. Assessing traditional and new metrics for object-oriented systems. In Proceedings of the 2010 ICSE Workshop on Emerging Trends in Software Metrics, WETSoM '10, pages 24–31, New York, NY, USA, 2010. ACM.
[5] M. Melis, I. Turnu, A. Cau, and G. Concas. Evaluating the impact of test-first programming and pair programming through software process simulation. Software Process Improvement and Practice, 11(4):345–360, 2006.
[6] G. Concas, M. Marchesi, A. Murgia, and R. Tonelli. An empirical study of social networks metrics in object oriented software. Advances in Software Engineering, Volume 2010 (Article ID 729826):20, 2010.
[7] I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, 1986, pp. 487.
[8] S. Milgram. The small world problem. Psych. Today, 2:60–67, 1967.
[9] C. R. Myers. Software systems as complex networks: Structure, function, and evolvability of software collaboration graphs. Phys. Rev. E, 68(4):046116, Oct 2003.
[10] J. P. Scott. Social network analysis. Sociology, 22(1):109–127, 1988.
[11] C. Song, S. Havlin, and H. Makse. Self-similarity of complex networks. Nature, 433:392–395, 2005.
[12] A. Tosun, B. Turhan, and A. Bener. Validation of network measures as indicators of defective modules in software systems. In Proceedings of the 1st International Conference on Predictor Models (PROMISE), 2009.
[13] I. Turnu, G. Concas, M. Marchesi, S. Pinna, and R. Tonelli. A modified Yule process to model the evolution of some object-oriented system properties. Information Sciences, 181:883–902, 2011.
[14] S. Valverde, R. Ferrer-Cancho, and R. Solé. Scale-free networks from optimal design. Europhysics Letters, 60:512–517, 2002.
[15] T. Zimmermann and N. Nagappan. Predicting defects using network analysis on dependency graphs. In Proceedings of the 30th International Conference on Software Engineering, May 2008.
