Distributed Data Clustering

Sundas Charagh; Ayesha Saleem; Amnah Mukhtar

doi:https://doi.org/10.14445/22492593/IJCOT-V21P306

Research Article | Open Access

Volume 5 | Issue 3 | Year 2015 | Article Id. IJCOT-V21P306 | DOI : https://doi.org/10.14445/22492593/IJCOT-V21P306

Citation :

Sundas Charagh, Ayesha Saleem, Amnah Mukhtar, "Distributed Data Clustering," International Journal of Computer & Organization Trends (IJCOT), vol. 5, no. 3, pp. 36-39, 2015. Crossref, https://doi.org/10.14445/22492593/IJCOT-V21P306

Abstract

In the modern era, the volume of data grows day by day, and it has become impossible to handle this data without data mining. Data mining offers several techniques, and clustering is one of them. Clustering is the process of grouping objects of the same type. In distributed data clustering, these groups are formed over different sites and then centralized at a global site. The purpose of distributing the clusters is to address efficiency, performance, communication cost, and storage limits. Many techniques and algorithms are available for distributed data clustering. These algorithms are divided into two categories, synchronous and asynchronous, which in turn have sub-categories such as k-means, k-harmonic means, DBSCAN, PCA-based methods, and many more. The paper also describes some important merits and demerits of distributed data clustering.
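To make the idea of local sites feeding a global site concrete, the following is a minimal sketch of one possible synchronous scheme, assuming a k-means variant in which each site sends only per-cluster coordinate sums and counts to the global site. It is an illustration under those assumptions, not necessarily the formulation the paper surveys; the function names `local_statistics`, `global_update`, and `distributed_kmeans` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Stats = Tuple[List[List[float]], List[int]]


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_statistics(points: Sequence[Point], centroids: Sequence[Point]) -> Stats:
    """Runs at one site: assign local points to the nearest centroid and
    return per-cluster coordinate sums and counts (no raw data leaves the site)."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        k = min(range(len(centroids)), key=lambda j: _sq_dist(p, centroids[j]))
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += p[d]
    return sums, counts


def global_update(site_stats: Sequence[Stats], centroids: Sequence[Point]) -> List[Point]:
    """Runs at the global site: merge the statistics from all sites and
    recompute each centroid as the mean of the points assigned to it."""
    dim = len(centroids[0])
    updated: List[Point] = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in site_stats)
        if total == 0:
            updated.append(tuple(old))  # empty cluster: keep the old centroid
            continue
        merged = [sum(sums[k][d] for sums, _ in site_stats) for d in range(dim)]
        updated.append(tuple(v / total for v in merged))
    return updated


def distributed_kmeans(sites: Sequence[Sequence[Point]],
                       centroids: Sequence[Point],
                       max_rounds: int = 100) -> List[Point]:
    """Synchronous rounds: every site reports, then the global site updates.
    The loop over sites stands in for network communication."""
    current = [tuple(c) for c in centroids]
    for _ in range(max_rounds):
        stats = [local_statistics(site, current) for site in sites]
        updated = global_update(stats, current)
        if updated == current:  # centroids stopped moving
            break
        current = updated
    return current


# Three hypothetical sites holding disjoint parts of the data set.
sites = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 10.0), (10.0, 12.0)],
    [(0.0, 1.0)],
]
result = distributed_kmeans(sites, [(0.0, 0.0), (10.0, 10.0)])
```

In this sketch the communication cost per round depends on the number of clusters and dimensions rather than on the number of points at each site, which illustrates the communication-cost motivation mentioned in the abstract.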

Keywords

Data mining, clustering, efficiency, performance.
