Show simple item record

dc.contributor.author: Ullah, Rafi
dc.contributor.author: Khan, Ayaz H.
dc.contributor.author: Emaduddin, S.m.
dc.date.accessioned: 2020-06-23T11:12:50Z
dc.date.available: 2020-06-23T11:12:50Z
dc.date.issued: 2019-08-14
dc.identifier.citation: ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 8 (2019)
dc.identifier.issn: 2255-2863
dc.identifier.uri: http://hdl.handle.net/10366/143315
dc.description.abstract: k-Nearest Neighbor (k-NN) is a non-parametric algorithm widely used for the estimation and classification of data points, especially when the dataset is distributed across several classes. It is considered a lazy machine learning algorithm because most of the computation is performed during the testing phase rather than during training. Hence it is practically inefficient and infeasible for processing huge datasets, i.e. Big Data. On the other hand, the results of clustering techniques (unsupervised learning) are strongly affected by normalization or standardization, and the value of "k" is difficult to determine. In this paper, some novel techniques are proposed as a pre-processing stage for the state-of-the-art k-NN classification algorithm. The proposed mechanism applies an unsupervised clustering algorithm to a large dataset before running k-NN on the resulting clusters, which may reside on a single machine, on multiple machines, or on different nodes of a cluster in a distributed environment. Initially, the dataset, possibly multi-dimensional, is passed through a clustering technique (K-Means) at a master node or controller to obtain a number of clusters equal to the number of nodes in the distributed system or the number of cores in the machine. Each cluster is then assigned to exactly one node or core, which applies k-NN locally; each node or core reports its best result, and a selector chooses the best and nearest possible class from all the options. One of the gold-standard distributed frameworks will be used. We believe that our proposed mechanism can be applied to big data, and that the architecture can also be implemented on multiple GPUs or FPGAs to exploit k-NN on large or huge datasets where traditional k-NN is very slow.
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.publisher: Ediciones Universidad de Salamanca (España)
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Computación
dc.subject: Informática
dc.subject: Computing
dc.subject: Information Technology
dc.title: ck-NN: A Clustered k-Nearest Neighbours Approach for Large-Scale Classification
dc.type: info:eu-repo/semantics/article
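The abstract describes a two-stage pipeline: K-Means partitions the dataset at a master node, each partition runs plain k-NN locally, and a selector picks the final class. Below is a minimal, single-process Python sketch of that idea under assumptions of my own (naive K-Means initialization, Euclidean distance, majority vote, and routing the query to the partition with the nearest centroid as a stand-in for the selector); it is an illustration, not the paper's distributed implementation.

```python
# Hypothetical sketch of the ck-NN idea: K-Means partitions the data
# (one partition per node/core in the paper), then k-NN runs only on
# the partition whose centroid is closest to the query.
import math
from collections import Counter

def dist(a, b):
    # Euclidean distance between two points (tuples of floats)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    # Naive K-Means: first k points as initial centroids (assumption)
    centroids = [points[i] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid
        assign = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Recompute each centroid as the mean of its members
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return centroids, assign

def knn_label(query, data, labels, k=3):
    # Plain k-NN: majority vote among the k nearest points
    nearest = sorted(range(len(data)), key=lambda i: dist(query, data[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def cknn(query, points, labels, n_parts=2, k=3):
    # Stage 1: partition the dataset (master node's job in the paper)
    centroids, assign = kmeans(points, n_parts)
    # Selector stand-in: route the query to the nearest partition
    best = min(range(n_parts), key=lambda c: dist(query, centroids[c]))
    part_pts = [points[i] for i in range(len(points)) if assign[i] == best]
    part_lbl = [labels[i] for i in range(len(points)) if assign[i] == best]
    # Stage 2: local k-NN on that partition only
    return knn_label(query, part_pts, part_lbl, k)
```

The payoff is that each k-NN search scans only one partition instead of the whole dataset, which is what makes the scheme attractive for distributed nodes, multiple cores, or (as the abstract suggests) GPUs and FPGAs.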

