Classifying Arabic text using KNN classifier.

Al-Badarenah, Amer; Al-Shawakfa, Emad; Al-Rababah, Khaleel; Shatnawi, Safwan; Bani-Ismail, Basel

Abstract

With the tremendous number of electronic documents available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. Because the selection of important terms is vital to classifier performance, this paper applies feature-set reduction techniques such as stop-word removal, stemming, and term thresholding. Three term-selection techniques are applied to a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using full-word, stem, and root term-indexing methods. A k-nearest-neighbors (KNN) classifier is used in this study. Recall, precision, fallout, and error rate are averaged over all folds. The results of the experiments carried out on the dataset show the importance of k-fold testing, since it presents the variation in the averages of recall, precision, fallout, and error rate for each category over the 10 folds.
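The evaluation procedure summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cosine similarity measure, the value k=3, the fold assignment, and the function names (cosine, knn_predict, category_metrics, cross_validate) are all assumptions made for illustration, and the four measures use the standard contingency-table definitions, which the abstract does not spell out.

```python
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, doc_terms, k=3):
    """Majority vote among the k training documents most similar to doc_terms.

    train is a list of (terms, category) pairs. The terms may be full words,
    stems, or roots, depending on the indexing method being compared.
    """
    query = Counter(doc_terms)
    ranked = sorted(train, key=lambda pair: cosine(Counter(pair[0]), query),
                    reverse=True)
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]


def category_metrics(truth, predicted, category):
    """Recall, precision, fallout, and error rate for one category
    (assumed standard definitions; undefined ratios are reported as 0.0)."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == category and p == category for t, p in pairs)
    fp = sum(t != category and p == category for t, p in pairs)
    fn = sum(t == category and p != category for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "fallout": fp / (fp + tn) if fp + tn else 0.0,
        "error_rate": (fp + fn) / len(pairs) if pairs else 0.0,
    }


def cross_validate(docs, category, folds=10, k=3):
    """Average the four measures for one category over all folds."""
    totals = {"recall": 0.0, "precision": 0.0, "fallout": 0.0, "error_rate": 0.0}
    for i in range(folds):
        # Fold i holds every document whose index is congruent to i mod folds.
        test = [d for j, d in enumerate(docs) if j % folds == i]
        train = [d for j, d in enumerate(docs) if j % folds != i]
        truth = [c for _, c in test]
        predicted = [knn_predict(train, terms, k) for terms, _ in test]
        for name, value in category_metrics(truth, predicted, category).items():
            totals[name] += value
    return {name: total / folds for name, total in totals.items()}
```

Calling cross_validate once per category, and once per indexing method (full word, stem, root), would give the kind of per-category fold averages the abstract refers to. Inspecting the per-fold values before averaging shows how much the measures vary from fold to fold.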

Citation

AL-BADARENAH, A., AL-SHAWAKFA, E., AL-RABABAH, K., SHATNAWI, S. and BANI-ISMAIL, B. 2016. Classifying Arabic text using KNN classifier. International journal of advanced computer science and applications [online], 7(6), pages 259-268. Available from: https://doi.org/10.14569/IJACSA.2016.070633

Journal Article Type: Article
Acceptance Date: Jul 1, 2016
Online Publication Date: Jun 30, 2016
Publication Date: Jul 1, 2016
Deposit Date: Jul 4, 2016
Publicly Available Date: Jul 4, 2016
Journal: International journal of advanced computer science and applications
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publisher: SAI Organization
Peer Reviewed: Peer Reviewed
Volume: 7
Issue: 6
Pages: 259-268
DOI: https://doi.org/10.14569/IJACSA.2016.070633
Keywords: Categorisation; Arabic; KNN; Stemming; Cross validation
Public URL: http://hdl.handle.net/10059/1526