Generalized Fuzzy Clustering Model With Fuzzy C-Means

3y ago

28 Views

2 Downloads

543.76 KB

11 Pages

Last View : 25d ago

Last Download : 3m ago

Upload by : Jerry Bolanos

Report this link

Download PDF

Transcription

Generalized Fuzzy Clustering Modelwith Fuzzy C-MeansHong Jiang11Computer Science and Engineering, University of South Carolina,Columbia, SC 29208, USjiangh@cse.sc.eduhttp://www.cse.sc.edu/ jiangh/Abstract. This paper extends the traditional Fuzzy C-Means clustering methodto a generalized fuzzy clustering model. According to most applications, thisfuzzy clustering model briefly includes 3 parts: feature extractor transfers original objects information to desired feature data; fuzzy cluster analyzer gets cluster information from the feature data; and post treatment obtains the final resultsbased on the cluster information. Among them, fuzzy cluster analyzer is encapsulated to 5 parts instead of traditional E-step and M-step: Initialization, U α ,E-step, Distance Calculation, and M-step. This model makes each part keeprelatively independent and easy to improve by just replacing one or severalparts in needs. An implementation of this model is supplied, and 3 examplesare given to test the properties. Moreover, potential optimizations are analyzedand listed.1 IntroductionThe capacity of classifying patterns is one of the most fundamental characteristics ofhuman intelligence. Cluster Analysis or clustering thus becomes to one of the mostfundamental issues in it. Since the early 1950s, pattern recognition has become to afield of study. In the mid-1960s, fuzzy set theory started to be used in pattern recognition and cluster analysis. Now, the literatures involving fuzzy clustering are alreadyquite extensive, partly because the vague boundaries are desirable for most categorieswe commonly encounter.On the other hand, the applicability of cluster analysis is not necessarily restrictedto the pattern recognition. It could be used to classify documents in information retrieval, or used in social groupings based on various criteria. In one word, it is applicable in most areas only if you want to classify some objects into several categories.Further more, there are lots of researches on the fuzzy clustering. However, most ofthem focus on the optimization on some fuzzy clustering algorithms or application insome special cases. Personally, I did not find any literatures or research involvinggeneralizing the fuzzy clustering model in convenience of applications and researches.Thus, based on most applications and researches, this project focuses on buildingup a generalized model for fuzzy clustering. This model reorganizes the traditional

idea of fuzzy clustering, and extent it to make it suitable for most applications. It iswell capsulated so that each part keeps enough independence and flexibility. Thisdesign also takes most potential optimizations into account. That is, the way that thismodel is capsulated is suitable for most possible optimizations. Based on this model,researchers can focus on the study more by simply replacing one small part in thismodel.To test the working status of this model, an implementation is supplied, and 3 experiment results are given in this paper as well. According to the structure of themodel and the experiment results, potential optimizations are also analyzed and listedin this paper.2 Generalized Fuzzy Clustering ModelThis section will introduce the basic idea of the generalized fuzzy clustering model,and the details of each part in this model.Before that, I would like to give some useful definitions first: Pattern Recognition: A process by which we search for structures in data andclassify these structures into categories such that the degree of association ishigh among structures of the same category and low between structures of different categories [3]. Clustering: Given a finite set of data X, the problem of clustering in X is tofind several cluster centers that can properly characterize relevant classes of X[3].First of all, this generalized fuzzy clustering model is based on one of the fuzzyclustering algorithm ---- Fuzzy C-means. The goal of building this model is to extendthe traditional fuzzy c-means to a generalized model in convenience of application andresearch.2.1 Fuzzy C-MeansThe basic idea of fuzzy c-means is to find a fuzzy pseudo-partition to minimize thecost function.A brief description is as follows:(1)In above formula, xi is the feature data to be clustered; mk is the center of each cluster; uik is the fuzzy partition corresponding to the feature data; n describes the numberof the feature data; K is the number of the clusters; and a is the exponent used to adjust the fuzzy degree. Generally, a should be greater than 1, and when a is tend toinfinity, the fuzzy degree is increasing. This cost function is used as a control on theupdating. That is, we get final result uik and stop the updating by minimizing the cost

function. Moreover, the uik has the range from 0 to 1 is the main difference with hardc-means which can only have value 0 or 1.The updating steps are defined as:(2)(3)E-Step is used to obtain the new center of each cluster and M-Step is used to updatethe fuzzy partition. By repeating E-step and M-step, cluster center m and fuzzy partition u are updated, until the cost function reaches the minimal value, or can’t be reduced anymore, we can get the final cluster information.2.2 Generalized Fuzzy Clustering Model with FCMTo make the model suitable for most applications and convenient to optimization,the generalized fuzzy clustering model with fuzzy c-means is separated to 3 functionparts and 4 data flow parts as in figure 1.For the functions parts, the first one is feature extractor, which is used to transferoriginal objects information to desired feature data; the second one is the fuzzy clusteranalyzer, which gets cluster information from the feature data; the last part is posttreatment, which is used to obtain the desired final results based on the cluster information.The data flow parts are connected by above 3 function parts. The first one is original objects information, or the representation of input data, which is obtained bymeasurements on objects that are to be recognized. It may be any kind of data information in any kind of data structure. The next is feature information; it is the characteristic features extracted from the input data in terms of which the dimensionality ofpattern vectors can be reduced. The features should be characterizing attributes bywhich the given pattern classes are well discriminated. The third part is cluster information, or category information, which is obtained through cluster analysis. The lastone is goal object information, it is the final desired result, for some special cases, itmay not be necessary always.In this model, the fuzzy cluster analyzer is the key part and most work is also focused on this part. In convenience of optimization, it is encapsulated to 5 parts insteadof traditional E-step and M-step only. It is as in figure 1 as well.For the fuzzy cluster analyzer, the input data mainly includes 3 data information.The first part is the feature data to be clustered, with dimension (n x d). n describeshow many data want to be clustered, and d is the number of dimension of each featuredata. The second input is cluster number K, which describes how many clusters aredesired for the clustering. The next is exponent, which is used to adjust the degree of

fuzzy, usually grater than one. If it’s one, the result is equal to the hard c-means or kmeans. In most applications, the exponent value is rFuzzy ClusterAnalyzerFeatureData(n x K)InitializeuGoalObjectsExponent(a)uaE-stepC (K x d)Calculate D (K x n)DistanceM-stepCU (K x n)U: fuzzy partition matrix;C: center matrix;D: distance matrix.Fig. 1. Structure of Generalized Fuzzy Clustering Model with FCM. The lower part is theflowchart to describe the fuzzy cluster analyzer. Among it, n describes the number of the feature data; K is the number of the clusters; and a is the exponent used to adjust the fuzzy degree.The function parts for the fuzzy clustering analyzer are separated to 5 parts as follows. The first one is initialization, which generate initial fuzzy partition matrix forclustering. The second one is ua, which get the matrix after exponential modification.E-step is used to get the new center matrix, as described in formula (2). The forth partis distance calculation, which calculates the distance of the cluster center with inputfeature data. In general case, the Euclidean distance is used. The fifth part is M-step,which get new fuzzy partition matrix and cost function value. Among them, the costfunction value is used to control the iterations in the implementation. The final fuzzypartition matrix U is the result that we want, which includes the information of cluster

and with dimension (K x n). K is the cluster number, and n is the number of the feature data.2.3 How Does the Model Work?Assume what we have is the original object information. It could be some imageswith digital information or some continues signals or some other kind of information.Based on some criteria or prior knowledge, we need to classify it to several categoriesor several regions.The detail steps for how this model works are described as follows:Step 1. Input the original object information to the feature extractor. The featureextractor extracts feature data based on the criteria or prior knowledge. It could bebriefly considered as a kind of mathematical transformation. The desired feature datashould with exact form ---- a matrix with dimension: n (feature data number) x d (feature dimension). The features should be characterizing attributes by which the givenpattern classes are well discriminated.Step 2. Input the obtained feature data and desired cluster number into the fuzzycluster analyzer. If needs be, you could also adjust the fuzzy degree by inputting theexponent value, and set termination condition to control the iterations.Step 2.1. Initialize the fuzzy partition matrix U, small letter u is used to describethe element of the matrix here;Step 2.2. Calculate the matrix after exponential modification ---- Ua;Step 2.3. Calculate the cluster center matrix C with dimension (K x d), amongwhich K is the desired cluster number and d is the dimension of feature;Step 2.4. Calculate the distance between center and input feature data. Obtaineddistance matrix is with dimensions same with the fuzzy partition matrix.Step 2.5. Calculate new fuzzy partition matrix and cost function value. If the difference of the cost function value is less than or equal to termination condition,stop; otherwise, repeat step2.2. to 2.5. until cost function value satisfies the termination condition.Step 3. Based on above obtained fuzzy partition matrix and requirements for eachspecial case, do post treatment to get the goal object. In most cases, this post treatmentcould be considered as the inverse transform of the first step.3 Potential OptimizationsFollowing lists the main potential optimizations:1. Optimizations on feature data: Feature data obtaining: well discriminated feature data is also the keyto obtain good clustering result. Thus, feature data should best describe the difference between the clusters. How to obtain this featuredata is always an important research issue.

2.3.4.Normalization problem: un-normalized data often makes the clustering time-consuming, and some time even make the cost function veryhard to reach the minimal point.Optimization on cluster number: How to determine the cluster number is always a problem. In manycases, the desired cluster number is not so clear. We could do someresearch to determine the cluster number.Optimizations on Ua The exponent is used to adjust the degree of fuzzy. Generally, when itis close to 1, the fuzzy c-means converges to hard c-means. When ittends to infinity, all cluster centers tend towards the center of the dataset. However, there is no theoretical basis for an optimal choice.Thus, some research could focus on it. On the other hand, to calculate an item with exponent is timeconsuming in computer, there is probably another way to define it.Optimization on distance computation: Generally, we use the Euclidean distance in traditional fuzzy c-mean,but you might also define other kind of distance.4 Application Example and ResultThis section shows 3 examples to test the working status of this generalized fuzzyclustering model.4.1 Example 1This example focuses on testing the working of the fuzzy cluster analyzer. 15 feature data with 2 dimension feature are supplied directly. This data set are displayed asin following figures with sign “*”, “o” and “v” to show the desired clusters. The cluster number is 3, exponent is 2 here. The cluster center is described by “x” sign. Thestep 0, 1 show the iterations number of the updating. The data sets are with 3 initialized centers in step 0, and the others subfigure with steps show the centers’ movingwith the repeat steps, or the iteration of the updating. The details as in following figure2.

step: 0step: 1step: 10step: 15step: 20step: 25Fig. 2. This figure shows how the cluster center moves to get desired clusters. The data set aredisplayed as in the figures with sign “*”, “o” and “v” to show the desired clusters. “x” signsdescribe the 3 cluster center. The cluster number is 3, exponent is 2 here. Step 0 shows the datasets with 3 initialized centers, the others show the centers’ moving with the repeat steps.The final fuzzy partition result is as follows:0.0031 0.9952 0.00170.0161 0.9735 0.01050.0230 0.9650 0.01200.0006 0.9991 0.0004

om above figure and results, we can find that: first, a data point may belong tomultiple clusters with different degrees of membership. That is, it reflects some kindof fuzzy. Next the sum of the degrees of memberships is 1. This reflects the requirement of clustering. Moreover, the fuzzy partition value reflects the how close it belongs to the cluster. Thus, from the greatest value, we can get the cluster result.Above all, the result shows that this model works great on this case.4.2 Example 2This example focuses on the working of the whole parts together. It works on animage, and this image displays some information of bone structure. In this case, thefeature data is the histogram extracted based on the gray value of each pixel in theimage. That is, the feature extractor here works just to get the histogram informationof the image. The fuzzy clustering analyzer is using the reorganized fuzzy c-means asdescribed in this paper. The post treatment can be looks as the inverse of the first step,just redefine each pixel with a new gray value based on the cluster information obtained through the fuzzy clustering analyzer.The testing results are as in following figure 3. The original image is as in subfigure(a). Subfigure (b) shows the results with cluster number 2, and the subfigure (c) showsthe results with cluster number 4. The value of exponent is 2 here also.From the results, we can see that the whole parts do work great. While, there arealso some noise existed in the results. This is due to the noise in the original image. Ifwe could do some pretreatment to smooth it before we extract the feature information,the result might be improved a lot. So, in the real applications, the pretreatment sometime is very important too. However, in this example, I just want to simply test thewhole model’s working. Thus, the result is ok.

(a)(b)(c)Fig. 3. (a) Original image; (b) Clustering result with 2 clusters; (c) Clustering result with 4clusters.4.3 Example 3This example is used to test whether each part in this model keeps flexible and independent enough, so that each part can be replaced by other similar part. Results areas follows figure 4.The image, the feature data and the post treatment method used in this test are directly obtained from internet1. The feature is based on some texture instead of the puregray value. That is, comparing with example 2, the feature extractor is replaced, and1http://vulcan.ee.iastate.edu/ dickerson/classes/ee571x/homework/hw4soln/hw4.html

so is the post treatment part. Though we don’t know exactly how the feature is obtained based on the texture information, we can still replace the corresponding part,and test the working status.(a)(b)Fig. 4. (a) Original image; (b) Clustering result with cluster number 4. The image, the featuredata and the post treatment method used in this test are directly obtained from internet, but theresult shows that it does work fine with the replaced parts.From above results, we can see that the model also works great with replaced parts.On this way, the result does reflect some kind of texture information. For example,comparing the image on the upper-right corner with the one on the lower-right corner,the former focuses on the texture information of the face, while the later focuses moreon the texture information of the eyes.Above all it does keep some kind of independence, and works fine with the replaced parts.

5 ConclusionIn this paper, to simplify applications and research, I extend the traditional Fuzzy CMeans clustering method to a generalized fuzzy clustering model. This generalizedfuzzy clustering model is simplified to 3 function parts: Feature extractor, Fuzzy cluster analyzer and the Post treatment, and 4 data flow parts: original object information,feature data, cluster information and goal object information. Among the 3 functionparts, the fuzzy cluster analyzer is based on Fuzzy C-Means, and encapsulated to 5parts instead of traditional E-step and M-step.An implementation of this model is supplied, and 3 examples are given to test theworking status of it. The properties of this model and the test results show us that:1. This model could be used to most applications;2. Each part of this model is well capsulated, keeps independent and flexible. Eachpart could be replaced by some other corresponding function parts with similar function.3. It is convenient to potential optimizations. We just need to change a simple unitin this model to realize some optimization.More over, some major potential optimizations are analyzed and listed in this paper, future optimizations or research may be based on them.References1. A. Baraldi, and P. Blonda, "A survey of fuzzy clustering algorithms for pattern recognition(1998)", IEEE Transactions on Systems, Man and Cybernetics, Part B survey.html2. F. Masulli, A. Schenone, and A.M. Massone, "Fuzzy clustering methods for the segmentation of multimodal medical images", www.ge.infm.it/ masulli/papers/masulli-fsm2000.pdf3. J.K.George, and Y. bo, Fuzzy sets and fuzzy logic theory and applications, New Jersey:Prentice Hall, 19954. J. Jantzen: "Neurofuzzy Modelling", Technical University of Denmark: Oersted-DTU, Techreport no 98-H-874 (nfmod), 1998. http://fuzzy.iau.dtu.dk/download/nfmod.pdf5. Yingkang Hu, and Richard J. Hathaway, “On Efficiency of Optimization in Fuzzy ations/OptFCM-NPSC.pdf

the traditional fuzzy c-means to a generalized model in convenience of application and research. 2.1 Fuzzy C-Means The basic idea of fuzzy c-means is to find a fuzzy pseudo-partition to minimize the cost function. A brief description is as follows: (1) In above formula, x i is the feature data to be clustered; m k is the center of each clus-ter; u

Related Documents:

fclust: An R Package for Fuzzy Clustering

with ellipsoidal shape. Then, a fuzzy clustering algorithm for relational data is described (Davé and Sen,2002) Fuzzy k-means algorithm The most known and used fuzzy clustering algorithm is the fuzzy k-means (FkM) (Bezdek,1981). The FkM algorithm aims at discovering the best fuzzy

49 Views

2y ago

Takagi-Sugeno Fuzzy Modeling for Process Control

ing fuzzy sets, fuzzy logic, and fuzzy inference. Fuzzy rules play a key role in representing expert control/modeling knowledge and experience and in linking the input variables of fuzzy controllers/models to output variable (or variables). Two major types of fuzzy rules exist, namely, Mamdani fuzzy rules and Takagi-Sugeno (TS, for short) fuzzy .

94 Views

3y ago

A Survey of Fuzzy Clustering - Old Dominion University

methods based on the fuzzy relation. These can be found in (10,12-161. Fuzzy clustering, based on fuzzy relations, was first proposed by Tamura et al. [12]. They proposed an N-step procedure by using the composition of fuzzy relations beginning with a reflexive and symmetric fuzzy relation R in X. They showed that there is an n such that

18 Views

1y ago

MICROCONTROLLER BASED IMPLEMENTATION OF A FUZZY KNOWLEDGE ...

fuzzy controller that uses an adaptive neuro-fuzzy inference system. Fuzzy Inference system (FIS) is a popular computing framework and is based on the concept of fuzzy set theories, fuzzy if and then rules, and fuzzy reasoning. 1.2 LITERATURE REVIEW: Implementation of fuzzy logic technology for the development of sophisticated

55 Views

3y ago

A New Hendecagonal Fuzzy Number for Optimization Problems

Different types of fuzzy sets [17] are defined in order to clear the vagueness of the existing problems. D.Dubois and H.Prade has defined fuzzy number as a fuzzy subset of real line [8]. In literature, many type of fuzzy numbers like triangular fuzzy number, trapezoidal fuzzy number, pentagonal fuzzy number,

21 Views

1y ago

Time Series Clustering - UC3M

Caiado, J., Maharaj, E. A., and D’Urso, P. (2015) Time series clustering. In: Handbook of cluster analysis. Chapman and Hall/CRC. Andrés M. Alonso Time series clustering. Introduction Time series clustering by features Model based time series clustering Time series clustering by dependence Introduction to clustering

67 Views

3y ago

Clonal Selection based Fuzzy C-Means Algorithm for Clustering

the data set. In graph-theoretic fuzzy clustering, the graph representing the data structure is a fuzzy graph and di erent notions of connectivity lead to di erent types of clusters. The idea of fuzzy graphs is rst mentioned in [10] whereby the fuzzy analogues of several basic graph-theoretic concepts

26 Views

3y ago

Day-to-Day Banking

your banking, reduce your working capital needs and possibly save you money. So whether you need to find more efficient ways of paying your suppliers or offer alternative ways for your customers to pay you, improve your accounts reconciliation process, or simply maximise the return on your surplus funds, we have the solutions and experience to help. Managing the cash flowing through your .

69 Views

3y ago

Recent Views

12 PUBLIC LAW AND PRIVATE LAW - Home: The National .

INTRODUCTION TO LAW MODULE - 3 Public Law and Private Law Classification of Law 164 Notes z define Criminal Law; z list the differences between Public and Private Law; and z discuss the role of Judges in shaping Law 12.1 MEANING AND NATURE OF PUBLIC LAW Public Law is that part of law, which governs relationship between the State

3y ago

745 Views

Dr. Ram Manohar Lohiya National Law University, Lucknow

2. Health and Medicine Law 3. Int. Commercial Arbitration 4. Law and Agriculture IXth SEMESTER 1. Consumer Protection Law 2. Law, Science and Technology 3. Women and Law 4. Land Law (UP) Xth SEMESTER 1. Real Estate Law 2. Law and Economics 3. Sports Law 4. Law and Education **Seminar Courses Xth SEMESTER (i) Law and Morality (ii) Legislative .

3y ago

496 Views

Faculty of Juridical, Social and Political Sciences Year .

Law L Law IV 8 Drept procesual civil II / Civil Procedure Law II 5 Law L Law IV 8 Dreptul comerțului internațional / International ommercial Law 4 Law L Law IV 8 riminalistică / Forensics 4 Law L Law IV 8 Practică de cercetare pentru elaborarea lucrării de lincență(3 săptămân

2y ago

384 Views

Companies Law - Cayman Islands dollar

Law 1 of 1971-15th December, 1970 Law 7 of 2000- 20th July, 2000 Law 7 of 1973-28th June, 1973 Law 5 of 2001-20th April, 2001 Law 24 of 1974-22nd November, 1974 Law 10 of 2001-25th May, 2001 Law 25 of 1975-9th December, 1975 Law 29 of 2001-26th September, 2001 Law 19 of 1977-10th November, 1977 Law 46 of 2001-14th January, 2002

3y ago

454 Views

It’s the Law!

ciples stated in Boyle’s Law, Charles’ Law, Gay-Lussac’s Law, Henry’s Law, and Dalton’s Law. Students will be able to explain the application of Boyle’s Law, Charles’ Law, Gay-Lussac’s Law, Henry’s Law, and Dalton’s Law to observations or events related to SCUBA diving. MateriaLs None audio/visuaL MateriaLs None teachinG tiMe

2y ago

378 Views

WHAT LAW IS ? An Introduction to Law

common law system civil law system!! sources of law in civil law !! a1. primary: statutes (written law) enacted by legislative power are the principal source of law. ! a2. two subsidiary sources of law: ! a2.1 administrative regulations a.2.2 customs!! ! sources of law in common law !!! b1. two primary sources of

2y ago

385 Views

Ohm ’s Law

Ohm ’s Law Ohm's law states that, in an electrical circuit, the current passing through most materials is directly proportional to the potential difference applied across them. 3-1—3-3: Ohm ’s Law Formulas There are three forms of Ohm’s Law: I V/R V IR R V/I where:File Size: 1MBPage Count: 40Explore furtherOhm's Law Quiz MCQs with Answers Ohm Lawohmlaw.comOhm’s Law Worksheet - Basic Electricity - All About omohms law worksheet - eering.orgOhm’s Law Worksheet - Richmond County School Systemwww.rcboe.orgOhm's Law with Examples - Physics Problems with Solutions ended to you b

2y ago

295 Views

Intermediate Law Law and You Worksheet 3: Australian law - Home Affairs

4. There are different kinds of law to deal with different kinds of problems. Four important kinds of law are civil law, criminal law, family law and administrative law. Civil law deals with disputes between individuals; for example, if someone sells you goods that are faulty, or that cause you injury or damage, you can take that person to court.

4m ago

110 Views

Public International Law

3. "International Law' is part of the law of the land' - Discuss. Refer to the leading cases & to the British & U.S .practice. 4. State the importance of codification and the steps taken to codify International Law. Assess the contributions of the International Law Commission. 5. "States only are the subjects of International Law' Discuss. 6.

1y ago

140 Views

A Trail Guide to Careers in Environmental Law

law, constitutional law, property law, bankruptcy law, criminal law, food and drug law, land use planning law, and international law. A distinctive aspect of environmental practice is the role of science in advocacy efforts.

3y ago

241 Views

Public International Law: Treaties and International Organizations

of public international law. Public international law is a body of law that defines the relationships, rights, and responsibilities of states. You can think of it as a set of rules for how states interact and associate with each other. Public international law is composed of international treaties, customs, organizations, and even legal .

1y ago

142 Views

The Philosophy of International Law - University of São Paulo

law, such as human rights law, international economic law, international criminal law, international environmental law, and the laws of war. Of course, the volume is not exhaustive and many more issues could have been addressed in an even longer book. This volume is distinguished by its 'dialogical' methodology: there are two essays

1y ago

142 Views

Kingsbury Gal Dimensions of International Organizations Law

Law", 20 European Journal of International Law (2009) pp. 23-57. Symposia on GAL have been published in: 68:3-4 Law and Contemporary Problems (2005); 37:4 New York University Journal of International Law and Politics (2005); 17 European Journal of International Law 1 (2006).

5m ago

52 Views

Accounting Technicians Diploma (ATD) Examination Syllabus

Apply law of contract and tort in various scenarios Apply general principles of business law in practice. CONTENT 2.1 Elements of the legal system 2.1.1 Nature, purpose and classification of law - Meaning of law - Nature of law - Purpose of law - Classification of law - Law and morality 2.1.2 Sources of law - The Constitution

3y ago

216 Views

PRINCIPLES OF BUSINESS LAW - DPHU

ABE Diploma in Business Administration Study Manual PRINCIPLES OF BUSINESS LAW Contents Study Unit Title Page Syllabus i 1 Nature and Sources of Law 1 Nature of Law 3 Historical Origins 6 Sources of Law 9 The European Community and UK Law: An Overview 13 2 Common Law, Equity and Statute Law 23 Custom 25 Case Law 26 Nature of Equity 32

3y ago

285 Views

Generalized Fuzzy Clustering Model With Fuzzy C-Means

It looks like you're using an ad-blocker