Classifying Arabic text using KNN classifier.
Authors
Amer Al-Badarenah
Emad Al-Shawakfa
Khaleel Al-Rababah
Safwan Shatnawi
Basel Bani-Ismail
Abstract
With the tremendous number of electronic documents now available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. Because the selection of important terms is vital to classifier performance, feature-set reduction techniques such as stop-word removal, stemming, and term thresholding were used in this paper. Three term-selection techniques are applied to a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using full-word, stem, and root term-indexing methods. A k-nearest-neighbors (KNN) classifier is used in this study. The averages over all folds for recall, precision, fallout, and error rate were calculated. The results of the experiments carried out on the dataset show the importance of k-fold testing, since it exposes the variation in the averages of recall, precision, fallout, and error rate for each category across the 10 folds.
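The evaluation pipeline summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes documents arrive as already-preprocessed token lists (stop words removed, terms stemmed or reduced to roots), uses cosine similarity over raw term counts as the KNN distance, and uses the standard one-vs-rest definitions of recall, precision, fallout, and error rate. All function names and the values of `k` are hypothetical.

```python
from collections import Counter, defaultdict
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, doc_terms, k=3):
    """Label a document by majority vote among its k most similar training documents.

    `train` is a list of (token_list, label) pairs (an assumed data layout).
    """
    query = Counter(doc_terms)
    ranked = sorted(train, key=lambda item: cosine(Counter(item[0]), query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def category_metrics(y_true, y_pred, category):
    """One-vs-rest recall, precision, fallout, and error rate for one category."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == category and p == category for t, p in pairs)
    fp = sum(t != category and p == category for t, p in pairs)
    fn = sum(t == category and p != category for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "fallout": fp / (fp + tn) if fp + tn else 0.0,
        "error_rate": (fp + fn) / len(pairs) if pairs else 0.0,
    }


def k_fold_averages(docs, category, folds=10, k=3):
    """Average the per-fold metrics for one category over `folds` folds."""
    totals = defaultdict(float)
    for i in range(folds):
        # Every `folds`-th document (offset i) forms the held-out test fold.
        test = docs[i::folds]
        train = [d for j, d in enumerate(docs) if j % folds != i]
        y_true = [label for _, label in test]
        y_pred = [knn_predict(train, terms, k) for terms, _ in test]
        for name, value in category_metrics(y_true, y_pred, category).items():
            totals[name] += value
    return {name: total / folds for name, total in totals.items()}
```

Calling `k_fold_averages(docs, "sports", folds=10)` on a list of `(tokens, label)` pairs would return the four averaged measures for the hypothetical category `"sports"`. Inspecting the per-fold values before averaging is what shows the fold-to-fold variation the abstract refers to.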
Citation
AL-BADARENAH, A., AL-SHAWAKFA, E., AL-RABABAH, K., SHATNAWI, S. and BANI-ISMAIL, B. 2016. Classifying Arabic text using KNN classifier. International journal of advanced computer science and applications [online], 7(6), pages 259-268. Available from: https://doi.org/10.14569/IJACSA.2016.070633
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Jul 1, 2016 |
Online Publication Date | Jun 30, 2016 |
Publication Date | Jul 1, 2016 |
Deposit Date | Jul 4, 2016 |
Publicly Available Date | Jul 4, 2016 |
Journal | International journal of advanced computer science and applications |
Print ISSN | 2158-107X |
Electronic ISSN | 2156-5570 |
Publisher | SAI Organization |
Peer Reviewed | Peer Reviewed |
Volume | 7 |
Issue | 6 |
Pages | 259-268 |
DOI | https://doi.org/10.14569/IJACSA.2016.070633 |
Keywords | Categorisation; Arabic; KNN; Stemming; Cross validation |
Public URL | http://hdl.handle.net/10059/1526 |
Contract Date | Jul 4, 2016 |
Files
AL-BADARENAH 2016 Classifying Arabic text (PDF, 1.1 MB)
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/