The Efficiency of IsiNdebele Part of Speech Tagger: A Quantitative Analysis

DOI:

https://doi.org/10.51415/ajims.v8i1.3170

Keywords:

accuracy, F1 score, part of speech tagger, precision, recall

Abstract

This study evaluates the performance of the isiNdebele part of speech tagger developed by the National Centre for Human Language Technologies as part of the Nguni core technologies. A sample of 522 words was randomly selected from government documents and isiNdebele literary works, and a mixed-methods approach was used to analyse the data. The raw data were processed automatically with the tagger, and the output was compared against the gold standard to calculate the tagger’s accuracy for each part of speech. Accuracy was 86% for nouns, 66% for verbs, 59% for adverbs, 90% for pronouns, 14% for adjectives, 33% for conjunctions, 83% for copulatives, 50% for relatives, 90% for possessives and 71% for demonstratives, and 0% for ideophones, interjections, prepositions, question words and auxiliary verbs. Recall and precision were calculated using Python 3.0, from which the researchers derived the F1 score. Recall, precision and F1 score were, respectively, 0.86, 0.55 and 0.67 for nouns; 0.66, 0.7 and 0.68 for verbs; 0.5, 0.46 and 0.48 for relatives; 0.63, 0.86 and 0.73 for adverbs; 0.9, 0.56 and 0.69 for possessives; 0.71, 0.86 and 0.78 for demonstratives; 0.14, 0.67 and 0.23 for adjectives; 0.9, 0.95 and 0.92 for pronouns; 0.83, 1.0 and 0.91 for copulatives; and 0.36, 0.83 and 0.5 for conjunctions. These findings underscore the need to improve the isiNdebele part of speech tagger.
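The abstract does not reproduce the authors' script, so the following is only a minimal sketch of how per-tag precision, recall and F1 score are conventionally computed when tagger output is compared against a gold standard. The function names (`f1_score`, `per_tag_metrics`) and the toy tag sequences are hypothetical and are not taken from the study. The F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R).

```python
from collections import Counter


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def per_tag_metrics(gold: list[str], predicted: list[str]) -> dict[str, dict[str, float]]:
    """Compute precision, recall and F1 for every tag that occurs in either sequence.

    Assumes `gold` and `predicted` are aligned word for word (hypothetical input format).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted sequences must have the same length")

    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_pos[g] += 1
        else:
            false_pos[p] += 1  # tagger assigned tag p to a word that is not p
            false_neg[g] += 1  # tagger missed a word whose gold tag is g

    metrics = {}
    for tag in set(gold) | set(predicted):
        tp, fp, fn = true_pos[tag], false_pos[tag], false_neg[tag]
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        metrics[tag] = {
            "precision": precision,
            "recall": recall,
            "f1": f1_score(precision, recall),
        }
    return metrics


# Toy, made-up example (not data from the study): one verb is mis-tagged as a noun.
toy_gold = ["N", "V", "N", "ADJ"]
toy_pred = ["N", "N", "N", "ADJ"]
toy_metrics = per_tag_metrics(toy_gold, toy_pred)

# Recomputing F1 from the recall and precision values reported in the abstract.
noun_f1 = round(f1_score(precision=0.55, recall=0.86), 2)
adjective_f1 = round(f1_score(precision=0.67, recall=0.14), 2)
```

Applied to the reported figures, this formula yields 0.67 for nouns and 0.23 for adjectives, consistent with the F1 scores given in the abstract.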

Published

18-02-2026

How to Cite

Matfunjwa, M., & Skosana, N. (2026). The Efficiency of IsiNdebele Part of Speech Tagger: A Quantitative Analysis. African Journal of Inter Multidisciplinary Studies, 8(1), 1–12. https://doi.org/10.51415/ajims.v8i1.3170
