A Study on the Use of Non-Parametric Tests for Analyzing the Evolutionary Algorithms' Behaviour: A Case Study on the CEC'2005 Special Session on Real Parameter Optimization

Salvador García, Daniel Molina, Francisco Herrera, Manuel Lozano

(This work was supported by Project TIN2005-08386-C05-01.)

Abstract

In recent years, there has been a growing interest in experimental analysis in the field of evolutionary algorithms. This is noticeable in the numerous papers which analyze and propose different types of problems, such as benchmarks for the experimental comparison of algorithms, comparison methodologies, or statistical techniques for comparing algorithms.

In this paper, we focus our study on the use of statistical techniques in the analysis of the behaviour of evolutionary algorithms over optimization problems. A study of the conditions required for a statistical analysis of the results is presented, using some models of evolutionary algorithms for real-coded optimization. This study is conducted in two ways: single-problem analysis and multiple-problem analysis. The results obtained show that a parametric statistical analysis may not be appropriate, especially when we deal with multiple-problem results. In multiple-problem analysis, we propose the use of non-parametric statistical tests, given that they are less restrictive than parametric ones and can be applied to small samples of results. As a case study, we analyze the published results for the algorithms presented in the CEC'2005 Special Session on Real Parameter Optimization by using non-parametric test procedures.

Keywords: Statistical analysis of experiments, evolutionary algorithms, parametric tests, non-parametric tests.

Introduction

The "no free lunch" theorem (Wolpert and Macready, 1997) demonstrates that it is not possible to find one algorithm whose behaviour is better for every problem. On the other hand, we know that we can work with different degrees of
(S. García, M. Lozano and F. Herrera are with the Department of Computer Science and Artificial Intelligence, University of Granada, Granada 18071, Spain; e-mails: {salvagl,lozano,herrera}@decsai.ugr.es. D. Molina is with the Department of Computer Engineering, University of Cádiz, Cádiz, Spain; e-mail: daniel.molina@uca.es. S. García holds an FPU scholarship from the Spanish Ministry of Education and Science.)

knowledge about the problem we expect to solve, and that working without knowledge about the problem (the hypothesis of the "no free lunch" theorem) is not the same as working with partial knowledge about it, knowledge that allows us to design algorithms with specific characteristics which can make them more suitable for solving the problem.

Given this partial knowledge of the problem, and the need to have algorithms available for its solution, the question arises of deciding when one algorithm is better than another. In the case of evolutionary algorithms, this may be decided according to efficiency and/or effectiveness criteria. When theoretical results that would allow the comparison of the behaviour of the algorithms are not available, we have to focus on the analysis of empirical results.

In recent years, there has been a growing interest in the analysis of experiments in the field of evolutionary algorithms. The work of Hooker is a pioneer in this line; it shows an interesting study on what we must and must not do when we set out to analyze the behaviour of a metaheuristic on a problem (Hooker, 1995).

In relation to the analysis of experiments, we can find three types of works: the study and design of test problems, the statistical analysis of experiments, and experimental design.

– Different authors have focused their interest on the design of test problems appropriate for comparative studies among algorithms. Focusing our attention on continuous optimization problems, which will be used in this paper, we can point out the pioneering papers of Whitley and co-authors on the design of complex test functions for continuous optimization (Whitley et al., 1995, 1996), and the recent works of Gallagher and Yuan (2006); Yuan and Gallagher (2003). In the same way, we can find papers that present test cases for different types of problems.
– Centred on the statistical analysis of the results: if we analyze the papers published in specialized journals, we find that the majority of articles compare results based on the average values of a set of executions over a concrete case. In proportion, only a small set of works uses statistical procedures to compare results, although their use has recently been growing and is being requested as a need by many reviewers. When we find statistical studies, they are usually based on the average and variance, by using parametric tests (ANOVA, t-test, etc.) (Czarn et al., 2004; Ozcelik and Erzurumlu, 2006; Rojas et al., 2002). Recently, non-parametric statistical procedures have been considered for the analysis of results (García et al., 2007; Moreno-Pérez, Campos-Rodríguez and Laguna, 2007). A similar situation can be found in the machine learning community (Demšar, 2006).

– Experimental design consists of a set of techniques comprising methodologies for adjusting the parameters of the algorithms depending on the settings used and the results obtained (Bartz-Beielstein, 2006; Kramer, 2007). In our study, we are not interested in this topic; we assume that the algorithms in a comparison have obtained the best possible results, depending on an optimal adjustment of their parameters on each problem.

We are interested in the use of statistical techniques for the analysis of the behaviour of evolutionary algorithms over optimization problems, analyzing the use of both parametric and non-parametric statistical tests (Sheskin, 2003; Zar, 1999). We will analyze the conditions required for the usage of parametric tests, and we will carry out an analysis of results by using non-parametric tests.

The study in this paper is organized into two parts. The first one, which we will denote single-problem analysis, corresponds to the study of the conditions required for a safe use of parametric statistical procedures when comparing algorithms over a single problem. The second one, denoted multiple-problem analysis, studies the same required conditions when considering a comparison of algorithms over more than one problem simultaneously.

The single-problem analysis is usually found in the specialized literature (Bartz-Beielstein, 2006; Ortiz-Boyer, Hervás-Martínez and García-Pedrajas, 2007). Although the conditions required for using parametric statistics are usually not fulfilled, as we will see in this paper, a parametric statistical study could obtain conclusions similar to a non-parametric one. However, in the multiple-problem analysis, due to the dissimilarities in the results obtained and the small size of the sample to be analyzed, a parametric test may reach erroneous conclusions. In recent papers, authors have started using single-problem and multiple-problem analysis simultaneously (Ortiz-Boyer, Hervás-Martínez and García-Pedrajas, 2007).

Non-parametric tests can be used for comparing algorithms whose results represent average values for each problem, in spite of the inexistence of relationships among them. Given that non-parametric tests do not require explicit conditions to be conducted, it is advisable that the sample of results be obtained following the same criterion, that is, computing the same aggregation (average, mode, etc.)
over the same number of runs for each algorithm and problem. They are used here for analyzing the results of the CEC'2005 Special Session on Real Parameter Optimization (Suganthan et al., 2005) over all the test problems, for which the average results of the algorithms on each function are published. We will show significant statistical differences among the algorithms compared in the CEC'2005 Special Session on Real Parameter Optimization, supporting the conclusions obtained in this session.

The paper is organized as follows. In Section 1, we describe the setting of the CEC'2005 Special Session: algorithms, test functions and parameters. Section 2 shows the study on the conditions required for the safe use of parametric tests, considering single-problem and multiple-problem analysis. We analyze the published results of the CEC'2005 Special Session on Real Parameter Optimization by using non-parametric tests in Section 3. Section 4 points out some considerations on the use of non-parametric tests. The conclusions of the paper are presented in Section 5. An introduction to statistics and a complete description of the non-parametric test procedures are given in Appendix, Section A. The published average results of the CEC'2005 Special Session are shown in Appendix, Section B.

1 Preliminaries: Settings of the CEC'2005 Special Session

In this section we briefly describe the algorithms compared, the test functions, and the characteristics of the experimentation in the CEC'2005 Special Session.

1.1 Evolutionary Algorithms

In this section we enumerate the eleven algorithms which were presented in the CEC'2005 Special Session. For more details on the description and the parameters used for each one, please refer to the respective contributions. The algorithms are: BLX-GL50 (García-Martínez and Lozano, 2005), BLX-MA (Molina, Herrera and Lozano, 2005), CoEVO (Pošík, 2005), DE (Rönkkönen, Kukkonen and Price, 2005), DMS-L-PSO (Liang and Suganthan, 2005), EDA (Yuan and Gallagher, 2005), G-CMA-ES (Auger and Hansen, 2005a), K-PCX (Sinha, Tiwari and Deb, 2005), L-CMA-ES (Auger and Hansen, 2005b), L-SaDE (Qin and Suganthan, 2005), SPC-PNX (Ballester et al., 2005).

1.2 Test Functions

In the following we present the set of test functions designed for the Special Session on Real Parameter Optimization organized in the 2005 IEEE Congress on Evolutionary Computation (CEC 2005) (Suganthan et al., 2005). The complete description of the functions can be consulted in Suganthan et al. (2005); furthermore, the source code is included in the associated link. The set of test functions is composed of the following functions:

• 5 unimodal functions:
  – Displaced sphere function.
  – Displaced Schwefel's problem 1.2.
  – Rotated widely conditioned elliptical function.
  – Displaced Schwefel's problem 1.2 with noise in the fitness.
  – Schwefel's problem 2.6 with global optimum on the frontier.

• 20 multimodal functions:
  – 7 basic functions:
    · Displaced Rosenbrock function.
    · Displaced and rotated Griewank function without frontiers.
    · Displaced and rotated Ackley function with the global optimum on the frontier.
    · Displaced Rastrigin function.
    · Displaced and rotated Rastrigin function.
    · Displaced and rotated Weierstrass function.
    · Schwefel's problem 2.13.

  – 2 expanded functions.
  – 11 hybrid functions. Each of them has been defined through compositions of 10 out of the 14 previous functions (different in each case).

All functions have been displaced in order to ensure that their optima can never be found in the centre of the search space. In two functions, in addition, the optima cannot be found within the initialization range, and the domain of search is not limited (the optimum is out of the range of initialization).

1.3 Characteristics of the experimentation

The experiments were done following the instructions indicated in the document associated with the competition. The main characteristics are:

• Each algorithm is run 25 times for each test function, and the average error of the best individual of the population is computed.

• We use the study with dimension D = 10, in which the algorithms perform 100,000 evaluations of the fitness function. In the mentioned competition, experiments with dimensions D = 30 and D = 50 have also been done.

• Each run stops either when the error obtained is less than 10^-8, or when the maximal number of evaluations is achieved.

2 Study of the Required Conditions for the Safe Use of Parametric Tests

In this section, we describe and analyze the conditions that must be satisfied for the safe usage of parametric tests (Subsection 2.1). To do so, we collect the overall set of results obtained by the algorithms BLX-MA and BLX-GL50 on the 25 functions considering dimension D = 10. With them, we first analyze the indicated conditions over the complete sample of results for each function, in a single-problem analysis (see Subsection 2.2). Finally, we consider the average results for each function to compose a sample of results for each of the two algorithms.
With these two samples we check again the conditions required for the safe use of parametric tests, now in a multiple-problem scheme (see Subsection 2.3).

2.1 Conditions for the safe use of parametric tests

In Sheskin (2003), the distinction between parametric and non-parametric tests is based on the level of measure represented by the data to be analyzed. In this sense, a parametric test uses data composed of real values.

The latter does not imply that whenever we dispose of this type of data we should use a parametric test. There are other initial assumptions for the safe usage of parametric tests, and the non-fulfillment of these conditions might cause a statistical analysis to lose credibility.

In order to use the parametric tests, it is necessary to check the following conditions (Sheskin, 2003; Zar, 1999):

• Independence: In statistics, two events are independent when the fact that one occurs does not modify the probability of the other one occurring.

• Normality: An observation is normal when its behaviour follows a normal or Gaussian distribution with a certain average µ and variance σ. A normality test applied over a sample can indicate the presence or absence of this condition in the observed data. We will use three normality tests:

  – Kolmogorov-Smirnov: it compares the accumulated distribution of the observed data with the accumulated distribution expected from a Gaussian distribution, obtaining the p-value based on the discrepancy between both.

  – Shapiro-Wilk: it analyzes the observed data to compute its level of symmetry and kurtosis (shape of the curve), and then computes the difference with respect to a Gaussian distribution, obtaining the p-value from the sum of the squares of these discrepancies.

  – D'Agostino-Pearson: it first computes the skewness and kurtosis to quantify how far the distribution is from Gaussian in terms of asymmetry and shape. It then calculates how far each of these values differs from the value expected for a Gaussian distribution, and computes a single p-value from the sum of these discrepancies.

• Heteroscedasticity: This property indicates a violation of the hypothesis of equality of variances. Levene's test is used to check whether or not k samples present this homogeneity of variances (homoscedasticity). When the observed data do not fulfil the normality condition, this test's result is more reliable than that of Bartlett's test (Zar, 1999), which checks the same property.

In our case, the independence of the events is obvious, given that they are independent runs of the algorithm with randomly generated initial seeds.
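The conditions above can be checked outside SPSS as well. The sketch below uses SciPy's implementations on two synthetic samples of 25 run results each (the data is an illustrative assumption, not the paper's results; note also that calling the plain Kolmogorov-Smirnov test with the sample mean and deviation plugged in is the naive variant of the test):

```python
# Sketch of the condition checks with SciPy (the paper used SPSS).
# The two samples below are synthetic stand-ins for 25 runs per algorithm.
import numpy as np
from scipy import stats

ALPHA = 0.05  # level of significance used throughout the paper

def check_conditions(sample_a, sample_b):
    """p-values for the three normality tests (per sample) and for
    Levene's homoscedasticity test (between the two samples)."""
    report = {}
    for name, x in (("A", np.asarray(sample_a)), ("B", np.asarray(sample_b))):
        report[name] = {
            "kolmogorov-smirnov": stats.kstest(
                x, "norm", args=(x.mean(), x.std(ddof=1))).pvalue,
            "shapiro-wilk": stats.shapiro(x).pvalue,
            "dagostino-pearson": stats.normaltest(x).pvalue,
        }
    report["levene"] = stats.levene(sample_a, sample_b).pvalue
    return report

rng = np.random.default_rng(1)
normal_runs = rng.normal(10.0, 2.5, size=25)     # roughly Gaussian errors
skewed_runs = rng.lognormal(0.0, 2.0, size=25)   # heavily skewed errors

report = check_conditions(normal_runs, skewed_runs)
for test, p in report["B"].items():
    print(test, "rejects normality" if p < ALPHA else "does not reject")
```

Rejecting normality (p < α) in any of the three tests for any sample, or rejecting homoscedasticity with Levene's test, is the signal that a parametric comparison would be unsafe.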
In the following, we carry out the normality analysis by using the Kolmogorov-Smirnov, Shapiro-Wilk and D'Agostino-Pearson tests in single-problem and multiple-problem analysis, and the heteroscedasticity analysis by means of Levene's test.

2.2 On the study of the required conditions over single-problem analysis

With the samples of results obtained from running the algorithms BLX-GL50 and BLX-MA 25 times for each function, we can apply statistical tests to determine whether or not they satisfy the normality and homoscedasticity properties. We have seen before that the independence condition is easily satisfied in this type of experiments. The number of runs may be low for carrying out a statistical analysis, but it was a requirement in the CEC'2005 Special Session.

All the tests used in this section obtain the associated p-value, which represents the dissimilarity of the sample of results with respect to the normal

shape. Hence, a low p-value points out a non-normal distribution. In this study, we consider a level of significance α = 0.05, so a p-value greater than α indicates that the condition of normality is fulfilled. All the computations have been performed with the statistical software package SPSS.

Table 1 shows the results of the Kolmogorov-Smirnov test, where the symbol "*" indicates that normality is not satisfied and the p-value is shown in brackets. Table 2 shows the results of applying the Shapiro-Wilk test of normality, and Table 3 displays the results of the D'Agostino-Pearson test.

Table 1: Test of Normality of Kolmogorov-Smirnov

      BLX-GL50  BLX-MA          BLX-GL50  BLX-MA          BLX-GL50  BLX-MA
f1    (.20)     * (.01)    f10  (.10)     (.20)      f19  * (.00)   * (.00)
f2    * (.04)   * (.00)    f11  (.20)     * (.00)    f20  * (.00)   * (.00)
f3    * (.00)   * (.01)    f12  * (.00)   * (.00)    f21  * (.00)   * (.00)
f4    (.14)     * (.00)    f13  (.20)     (.20)      f22  * (.00)   * (.00)
f5    * (.00)   * (.00)    f14  (.20)     * (.02)    f23  * (.00)   * (.00)
f6    * (.00)   (.16)      f15  * (.00)   * (.00)    f24  * (.00)   * (.00)
f7    * (.04)   (.20)      f16  * (.00)   (.20)      f25  * (.00)   * (.02)
f8    (.20)     * (.00)    f17  (.20)     (.20)
f9    * (.00)   * (.00)    f18  * (.00)   * (.00)

Table 2: Test of Normality of Shapiro-Wilk

      BLX-GL50  BLX-MA          BLX-GL50  BLX-MA          BLX-GL50  BLX-MA
f1    * (.03)   * (.00)    f10  (.07)     (.31)      f19  * (.00)   * (.00)
f2    (.06)     * (.00)    f11  (.25)     * (.00)    f20  * (.00)   * (.00)
f3    * (.00)   * (.01)    f12  * (.00)   * (.00)    f21  * (.00)   * (.00)
f4    * (.03)   * (.00)    f13  (.39)     (.56)      f22  * (.00)   * (.00)
f5    * (.00)   * (.00)    f14  (.41)     * (.01)    f23  * (.00)   * (.00)
f6    * (.00)   (.05)      f15  * (.00)   * (.00)    f24  * (.00)   * (.00)
f7    * (.01)   (.27)      f16  * (.00)   (.25)      f25  * (.00)   * (.02)
f8    (.23)     * (.03)    f17  (.12)     (.72)
f9    * (.00)   * (.00)    f18  * (.00)   * (.00)

Table 3: Test of Normality of D'Agostino-Pearson

      BLX-GL50  BLX-MA          BLX-GL50  BLX-MA          BLX-GL50  BLX-MA
f1    (.10)     * (.00)    f10  (.17)     (.89)      f19  (.05)     * (.00)
f2    (.06)     * (.00)    f11  (.19)     * (.00)    f20  (.05)     * (.00)
f3    * (.00)   (.22)      f12  * (.00)   * (.03)    f21  (.06)     (.25)
f4    (.24)     * (.00)    f13  (.79)     (.38)      f22  * (.01)   * (.00)
f5    * (.00)   * (.00)    f14  (.47)     (.16)      f23  * (.00)   * (.00)
f6    * (.00)   * (.00)    f15  * (.00)   * (.00)    f24  * (.00)   * (.00)
f7    (.28)     (.19)      f16  * (.00)   (.21)      f25  (.11)     (.20)
f8    (.21)     (.12)      f17  (.07)     (.54)
f9    * (.00)   * (.00)    f18  * (.03)   * (.04)

In addition to this general study, we
show the sample distribution in three cases, with the objective of illustrating representative cases in which the normality tests obtain different results.

From Figure 1 to Figure 3, different examples of graphical representations of histograms and Q-Q plots are shown. A histogram represents a statistical variable by using bars, so that the area of each bar is proportional to the frequency of the represented values. A Q-Q plot represents a confrontation between the quantiles of the observed data and those of the normal distribution.

In Figure 1 we can observe a general case in which the non-normality of the sample is clearly present. On the contrary, Figure 2 illustrates a sample whose distribution follows a normal shape, and the three normality tests employed verify this fact. Finally, Figure 3 shows a special case where the similarity between both distributions, the sample of results and the normal one, is not confirmed by all the normality tests. In such a case, one normality test may work better than another, depending on the type of data, the number of ties or the number of

results collected. Due to this fact, we have employed three well-known normality tests for studying the normality condition. The choice of the most appropriate normality test depending on the problem is out of the scope of this paper.

Figure 1: Example of non-normal distribution: Function f20 and BLX-GL50 algorithm: Histogram and Q-Q Graphic.

Figure 2: Example of normal distribution: Function f10 and BLX-MA algorithm: Histogram and Q-Q Graphic.

With respect to the study of the homoscedasticity property, Table 4 shows the results of applying Levene's test, where the symbol "*" indicates that the variances of the distributions of the different algorithms for a certain function are not homogeneous (we reject the null hypothesis at a level of significance α = 0.05).

Table 4: Test of Heteroscedasticity of Levene (based on means)

f1   (.07)     f10  (.99)     f19  * (.01)
f2   (.07)     f11  * (.00)   f20  * (.00)
f3   * (.00)   f12  (.98)     f21  * (.01)
f4   * (.04)   f13  (.18)     f22  (.47)
f5   * (.00)   f14  (.87)     f23  (.28)
f6   * (.00)   f15  * (.00)   f24  * (.00)
f7   * (.00)   f16  * (.00)   f25  * (.00)
f8   (.41)     f17  (.24)
f9   * (.00)   f18  (.21)

Clearly, in both cases, the non-fulfillment of the normality and homoscedasticity conditions is noticeable. In most functions, the normality condition is not verified in a single-problem analysis. The homoscedasticity also depends on the number of algorithms studied, because it checks the relationship among the

Figure 3: Example of a special case: Function f21 and BLX-MA algorithm: Histogram and Q-Q Graphic.

variances of all the population samples. Even though in this case we only analyze this condition on the results of two algorithms, the condition is also not fulfilled in many cases.

A researcher may think that the non-fulfillment of these conditions is not crucial for obtaining adequate results. By using the same samples of results, we will show an example in which some results offered by a parametric test, the paired t-test, do not agree with the ones obtained through a non-parametric test, Wilcoxon's test. Table 5 presents the difference of average error rates, on each function, between the algorithms BLX-GL50 and BLX-MA (if it is negative, the best performing algorithm is BLX-GL50), and the p-values obtained by the paired t-test and the Wilcoxon test.

As we can see, the p-values obtained by the paired t-test are in general very similar to the ones obtained by the Wilcoxon test. However, in three cases, they are quite different. We enumerate them:

• In function f4, the Wilcoxon test considers that both algorithms behave differently, whereas the paired t-test does not. This example perfectly fits a non-practical case: the difference of error rates is less than 10^-7, which, in a practical sense, has no significant effect.

• In function f15, the situation is the opposite of the previous one. The paired t-test obtains a significant difference in favour of BLX-MA. Is this result reliable? As the normality condition is not verified in the results of f15 (see Tables 1, 2, 3), the results obtained by the Wilcoxon test are theoretically more reliable.
• Finally, in function f22, although the Wilcoxon test obtains a p-value greater than the level of significance α = 0.05, both p-values are again very different.

In 3 of the 25 functions, there are observable differences in the application of the paired t-test and the Wilcoxon test. Moreover, in these 3 functions, the conditions required for the safe usage of parametric statistics are not verified. In principle, we could suggest the usage of the non-parametric test of Wilcoxon in single-problem analysis. This is one alternative, but there exist other ways of ensuring that the results obtained are valid for a parametric statistical analysis.

Table 5: Difference of error rates and p-values for paired t-test and Wilcoxon test in single-problem analysis
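The kind of disagreement discussed above is easy to reproduce. The sketch below is illustrative, with synthetic paired samples of 25 runs (not the CEC'2005 data), assuming SciPy's `ttest_rel` and `wilcoxon`; it also computes Cohen's index d' = t/√n, used in the text as a guard against the size effect:

```python
# Illustrative single-problem comparison: a few extreme runs pull the
# paired t-test and the Wilcoxon signed-rank test apart (synthetic data).
import math
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
errors_a = rng.normal(1.0, 0.1, size=25)                # 25 runs of A
errors_b = errors_a + rng.normal(0.05, 0.01, size=25)   # B slightly worse...
errors_b[:3] = 1e-8                                     # ...except 3 lucky runs

t_res = stats.ttest_rel(errors_a, errors_b)
w_res = stats.wilcoxon(errors_a, errors_b)

# Cohen's index d' = t / sqrt(n); values near 0.5 suggest meaningful
# differences, below 0.25 insignificant ones (thresholds from the text).
d_prime = abs(t_res.statistic) / math.sqrt(len(errors_a))

print(f"paired t-test p = {t_res.pvalue:.3f}")
print(f"Wilcoxon      p = {w_res.pvalue:.3f}")
print(f"Cohen's d'      = {d_prime:.2f}")
```

Here the t-test is dominated by the three extreme runs, while the rank-based Wilcoxon test is not, so the two p-values can fall on opposite sides of α.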

• Obtaining new results is not very difficult in single-problem analysis: we only have to run the algorithms again to get larger samples of results. The Central Limit Theorem confirms that the sum of many identically distributed random variables tends to a normal distribution. Nevertheless, the number of runs carried out must not be very high, because of the negative influence of the size effect: if the sample of results is too large, a statistical test could detect insignificant differences as significant. For controlling the size effect, we can use Cohen's index d',

    d' = t / √n,

  where t is the t-test statistic and n is the number of results collected. If d' is near 0.5, then the differences are significant. A value of d' lower than 0.25 indicates insignificant differences, and the statistical analysis should not be taken into account.

• The application of transformations for obtaining normal distributions, such as logarithm, square root, reciprocal and power transformations (Patel and Read, 1982).

• In some situations, skipping outliers, but this technique must be used with great care.

These alternatives could solve the normality condition, but the homoscedasticity condition may be difficult to solve. Some parametric tests, such as ANOVA, are very influenced by the homoscedasticity condition.

2.3 On the study of the required conditions over multiple-problem analysis

When tackling a multiple-problem analysis, the data to be used is an aggregation of results obtained from the individual algorithms' runs. In this aggregation, there must be only one result representing each problem or function. This result could be obtained by averaging the results over all runs or something similar, but the procedure followed must be the same for each function; i.e., in this paper we have used the average of the 25 runs of an algorithm on each function. The size of the sample of results to be analyzed, for each algorithm, is equal to the number of problems.
In this way, a multiple-problem analysis allows us to compare two or more algorithms over a set of problems simultaneously.

We can use the results published in the CEC'2005 Special Session to perform a multiple-problem analysis. Indeed, we will follow the same procedure as in the previous subsection: we will analyze the conditions required for the safe usage of parametric tests over the sample of results obtained by averaging the error rate on each function.

Table 6 shows the p-values of the normality tests over the sample of results obtained by BLX-GL50 and BLX-MA. Figures 4 and 5 represent the histograms and Q-Q plots for such samples.

Obviously, the normality condition is not satisfied, because the sample of results is composed of 25 average error rates computed on 25 different problems.

Table 6: Normality tests over multiple-problem analysis

Algorithm   Kolmogorov-Smirnov   Shapiro-Wilk   D'Agostino-Pearson
BLX-GL50    * (.00)              * (.00)        (.10)
BLX-MA      * (.00)              * (.00)        * (.00)

Figure 4: BLX-GL50 algorithm: Histogram and Q-Q Graphic.

Figure 5: BLX-MA algorithm: Histogram and Q-Q Graphic.
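The multiple-problem scheme just described — one average error per function and algorithm, then a pairwise test over the vectors of averages — can be sketched as follows (synthetic 25-function × 25-run matrices stand in for the published CEC'2005 results; `ttest_rel` and `wilcoxon` are SciPy's):

```python
# Multiple-problem analysis sketch: average the 25 runs per function,
# then compare the two vectors of 25 per-function averages (synthetic data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_functions, n_runs = 25, 25

# Error scales differ by orders of magnitude across functions, which is
# exactly what breaks the normality of the aggregated sample.
scales = rng.lognormal(0.0, 2.0, size=n_functions)
runs_a = rng.gamma(2.0, 1.0, size=(n_functions, n_runs)) * scales[:, None]
runs_b = runs_a * rng.uniform(1.0, 1.5, size=(n_functions, n_runs))  # B worse

avg_a = runs_a.mean(axis=1)   # one average error per function for A
avg_b = runs_b.mean(axis=1)   # one average error per function for B

t_p = stats.ttest_rel(avg_a, avg_b).pvalue
w_p = stats.wilcoxon(avg_a, avg_b).pvalue
print(f"paired t-test p = {t_p:.3f}, Wilcoxon p = {w_p:.6f}")
```

Although B is worse on every function here, the per-function averages span several orders of magnitude, so the t-test is dominated by the largest-scale functions while the rank-based Wilcoxon test detects the systematic difference.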

We compare the behaviour of the two algorithms by means of pairwise statistical tests:

• The p-value obtained with a paired t-test is p = 0.318. The paired t-test does not consider that there is a difference in performance between the algorithms.

• The p-value obtained with the Wilcoxon test is p = 0.089. The Wilcoxon test does not consider that there is a difference in performance between the algorithms either, but it considerably reduces the minimal level of significance needed for detecting differences. If the level of significance considered were α = 0.10, Wilcoxon's test would confirm that BLX-GL50 is better than BLX-MA.

The average results for these two algorithms indicate this behaviour: BLX-GL50 usually performs better than BLX-MA (see Table 13 in Appendix B), but a paired t-test cannot appreciate this fact. In multiple-problem analysis it is not possible to enlarge the sample of results, unless new functions / problems are added. Applying transformations or skipping outliers cannot be used either, because we would be changing the results for certain problems and not for other problems.

These facts may induce us to use non-parametric statistics for analyzing the results over multiple problems. Non-parametric statistics do not need prior assumptions about the sample of data to be analyzed and, in the example shown in this section, we have seen that they can obtain reliable results.

3 A Case Study: On the Use of Non-parametric Statistics for Comparing the Results of the CEC'2005 Special Session on Real Parameter Optimization

In this section, we study the results obtained in the CEC'2005 Special Session on Real Parameter Optimization as a case study on the use of non-parametric tests. As we have mentioned, we will focus on dimension D = 10.

We will divide the set of functions into two subgroups, according to the suggestion given in Hansen (2005) about their degrees of difficulty.
• The first group is composed of the unimodal functions (from f1 to f5), in which all participant algorithms in the CEC'2005 competition normally achieve the optimum, and the multimodal functions (from f6 to f14), in which at least one run of a participant algorithm achieves the optimum.

• The second group contains the remaining functions, from f15 to f25. In these functions, no participant algorithm has achieved the optimum.

This division is carried out with the objective of showing the differences in the statistical analysis co
