تاثير الضوضاء على نتائج خوارزمية التجميع == Noise Effect on The Clustering Algorithm Results
Author name:
هدى قاسم جبار
Supervisor name:
جاسم طعمه سرسوح | كاظم مهدي هاشم
General topic:
Computer Science
Specific topic:
Computer Science
Degree:
Master
University:
Mustansiriyah University - College Of Science - Department Of Computer
Language:
English
University location:
Baghdad
First pages:
28T850 - p.pdf
Abstract:
Clustering, the partitioning of data into groups of similar objects, has a wide range of applications. In many cases, unstructured data makes up a significant part of the input. Attempting to cluster this part of the data, which can be referred to as noise, can disturb the clustering of the remaining domain points. Despite the practical need for a clustering framework that allows a portion of the data to remain unclustered, little research has been done so far in that direction. In this thesis, we take a step towards addressing the issue of clustering in the presence of noise. Clustering is widely used in many applications, including medical and financial ones. It may be applied to a database using various approaches based on distance, density, hierarchy, or partitioning. A data item that is not relevant to data mining is called noise. Noise is a major problem in cluster analysis: it degrades the performance of various clustering algorithms in terms of efficiency and time. The objective of this thesis is to study the effect of noise on the performance of various clustering algorithms and to propose a new clustering algorithm on which the effect of noise is very low compared with other clustering algorithms. Our purpose is to study how the efficiency of the proposed algorithm responds to noise. The K-means algorithm and our proposed clustering algorithm are used, based on partitional or hierarchical clustering. Different types of noise are added to a selected database, and the effect of that noise on the results of each clustering algorithm (the proposed algorithm and K-means) is then measured. The robustness of our proposed algorithm to noise is studied by computing the efficiency, the time, and the number of clusters obtained. The percentage of noise is then varied, and the efficiency and the time required for clustering are calculated.
The observed results are used to compare the efficiency and processing time of the algorithms. The proposed clustering algorithm and the other algorithms used in our study were implemented in MATLAB R2014b, and the programs run under the 32-bit Windows 7 Ultimate Service Pack 1 operating system. The tests were run on a personal computer (Core i5 processor, 2.60 GHz, 6 GB RAM). A real-world database containing 200 images was constructed during the thesis work. We carry out experiments to demonstrate the power of the proposed algorithm. The results showed that the efficiency of the proposed clustering algorithm is not affected by the presence of noise, although its efficiency decreases as the noise ratio increases.
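
The experimental procedure described in the abstract (add noise at a varying percentage, cluster, then record efficiency and time) can be illustrated with a minimal sketch. This is not the thesis code: the thesis used MATLAB R2014b and an image database, whereas the sketch below uses standard-library Python, synthetic 2-D points, uniform background noise, plain K-means only, and cluster purity as a stand-in for "efficiency". All function names and parameter values are hypothetical choices made for illustration, and the proposed algorithm itself is not reproduced because the abstract does not describe it.

```python
import random
import time


def make_clusters(centers, points_per_cluster, spread, rng):
    """Return (points, labels): Gaussian blobs around the given 2-D centers."""
    points, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(points_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels


def add_uniform_noise(points, noise_ratio, bounds, rng):
    """Append round(noise_ratio * len(points)) uniformly scattered noise points."""
    low, high = bounds
    n_noise = int(round(noise_ratio * len(points)))
    noise = [(rng.uniform(low, high), rng.uniform(low, high)) for _ in range(n_noise)]
    return points + noise


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return assignment


def purity(assignment, labels, k):
    """Fraction of labelled points that fall in their cluster's majority class.

    Noise points are appended after the labelled points, so zip() stops at
    len(labels) and the noise points are ignored when scoring.
    """
    correct = 0
    for c in range(k):
        counts = {}
        for a, y in zip(assignment, labels):
            if a == c:
                counts[y] = counts.get(y, 0) + 1
        if counts:
            correct += max(counts.values())
    return correct / len(labels)


def run_experiment(noise_ratios, seed=0):
    """Cluster the same clean data at each noise ratio; return (ratio, purity, seconds)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]  # assumed toy layout
    clean, labels = make_clusters(centers, 50, 1.0, rng)
    results = []
    for ratio in noise_ratios:
        noisy = add_uniform_noise(clean, ratio, (-5.0, 15.0), rng)
        start = time.perf_counter()
        assignment = kmeans(noisy, len(centers), rng)
        elapsed = time.perf_counter() - start
        results.append((ratio, purity(assignment, labels, len(centers)), elapsed))
    return results


if __name__ == "__main__":
    for ratio, score, seconds in run_experiment([0.0, 0.1, 0.3, 0.5]):
        print(f"noise={ratio:.0%}  purity={score:.3f}  time={seconds * 1000:.2f} ms")
```

Running the script prints one line per noise ratio. Purity is computed only over the original, labelled points, so it measures how much the added noise disturbs the clustering of the clean data rather than how the noise points themselves are grouped. Timings from a single run are noisy; averaging over several seeds would give a fairer comparison.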