Classifying Arabic text using KNN classifier.

Al-Badarenah, Amer; Al-Shawakfa, Emad; Al-Rababah, Khaleel; Shatnawi, Safwan; Bani-Ismail, Basel

doi:10.14569/IJACSA.2016.070633

Authors

Amer Al-Badarenah

Emad Al-Shawakfa

Khaleel Al-Rababah

Safwan Shatnawi

Basel Bani-Ismail

Abstract

With the tremendous amount of electronic documents available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. Because the selection of important terms is vital to classifier performance, feature set reduction techniques such as stop word removal, stemming, and term thresholding are used in this paper. Three term-selection techniques are applied to a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using full-word, stem, and root term indexing methods. A k-nearest-neighbors (KNN) classifier is used in this study. The averages over all folds for recall, precision, fallout, and error rate were calculated. The results of the experiments carried out on the dataset show the importance of using k-fold testing, since it reveals the variation in the averages of recall, precision, fallout, and error rate for each category over the 10 folds.
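
The four measures named in the abstract can be illustrated with a minimal sketch. It assumes the common one-vs-rest definitions (recall = TP/(TP+FN), precision = TP/(TP+FP), fallout = FP/(FP+TN), error rate = (FP+FN)/N), which may differ in detail from the formulas used in the article. The function names, the two toy folds, and the category labels "sport" and "economy" are hypothetical and are not taken from the paper's corpus.

```python
from statistics import mean


def category_metrics(true_labels, predicted_labels, category):
    """One-vs-rest measures for a single category within one test fold."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    tn = len(pairs) - tp - fp - fn

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    return {
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "fallout": ratio(fp, fp + tn),
        "error_rate": ratio(fp + fn, len(pairs)),
    }


def average_over_folds(fold_results, category):
    """Average each measure over a list of (true, predicted) fold pairs."""
    per_fold = [category_metrics(t, p, category) for t, p in fold_results]
    return {name: mean(m[name] for m in per_fold) for name in per_fold[0]}


# Hypothetical data: two tiny folds with made-up labels and predictions.
folds = [
    (["sport", "sport", "economy", "economy"],
     ["sport", "economy", "economy", "economy"]),
    (["sport", "economy", "economy", "sport"],
     ["sport", "sport", "economy", "sport"]),
]
averages = average_over_folds(folds, "sport")
print(averages)
```

In a 10-fold setting, `folds` would hold ten such pairs, one per held-out fold, and the per-fold dictionaries expose the fold-to-fold variation that the abstract highlights.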

Citation

AL-BADARENAH, A., AL-SHAWAKFA, E., AL-RABABAH, K., SHATNAWI, S. and BANI-ISMAIL, B. 2016. Classifying Arabic text using KNN classifier. International journal of advanced computer science and applications [online], 7(6), pages 259-268. Available from: https://doi.org/10.14569/IJACSA.2016.070633

Journal Article Type	Article
Acceptance Date	Jul 1, 2016
Online Publication Date	Jun 30, 2016
Publication Date	Jul 1, 2016
Deposit Date	Jul 4, 2016
Publicly Available Date	Jul 4, 2016
Journal	International journal of advanced computer science and applications
Print ISSN	2158-107X
Electronic ISSN	2156-5570
Publisher	SAI Organization
Peer Reviewed	Peer Reviewed
Volume	7
Issue	6
Pages	259-268
DOI	https://doi.org/10.14569/IJACSA.2016.070633
Keywords	Categorisation; Arabic; KNN; Stemming; Cross validation
Public URL	http://hdl.handle.net/10059/1526
Contract Date	Jul 4, 2016

Files

AL-BADARENAH 2016 Classifying Arabic text (1.1 Mb)
PDF

Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
