Suffix-stripping Algorithms and Transducers for the Fulani Language
Zouleiha Alhadji Ibrahima, Dayang Paul, Kolyang, Guidana Gazawa Frederic
Pages - 1 - 17     |    Revised - 31-05-2022     |    Published - 30-06-2022
Volume - 13   Issue - 1    |    Publication Date - June 2022
Peul, Fulani, Suffix-stripping, Stemming, Linguistic, Transducers.
Because of the large and constantly growing amount of information available on the Internet, users face diverse challenges when trying to satisfy their information needs. The objective of today's information retrieval systems is no longer merely to provide access to information but to search for and filter relevant information. The language used for searching plays a major role, and the situation is even more challenging for resource-scarce local or national languages, a group that includes many African languages. There is therefore a need to explore and build more specialised information systems that enable speakers of African languages to discover valuable information across linguistic and cultural barriers. Peul, also called Fulani, is one of the most widely dispersed languages in Africa, yet its computerisation and automatic processing are severely handicapped by the lack of digital and linguistic resources. Although dedicated care and attention are important for conserving languages and guaranteeing their sustainability, few studies and computerisation efforts have been carried out on African languages such as Fulani. The aim of this work is to lay some bricks towards tools for the automatic processing of the Fulani language. Fulani is spoken across several dialectal areas, and there are almost no digital documents in the Fulani of the Adamaoua dialectal area. The originality of this work lies, among other things, in the digital processing of Dominique Noye's Fulani dictionary from North Cameroon. We then studied stemming approaches for Fulani words using transducers that show clearly how to remove classifiers from words in order to obtain the stem. To do so, we grouped all the classifiers that are suffixes by number (singular and plural) and by classifier degree. An example of the suffix-removal process is described in this article.
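The idea of removing a classifier suffix to obtain a stem can be sketched as a longest-match lookup. This is only a minimal illustration: the suffix table below, the `MIN_STEM_LEN` threshold and the function name `strip_suffix` are hypothetical placeholders, not the classifier inventory or the transducers from the article, and the sketch ignores classifier degree and any consonant alternation.

```python
# Hypothetical classifier suffixes grouped by number.
# Assumption: illustrative only, NOT the table used in the article.
SUFFIXES = {
    "singular": ["ngal", "ngel", "nde", "ndu", "o"],
    "plural": ["ɓe", "ɗe", "ɗi"],
}

# Assumption: refuse to strip if fewer than this many characters would remain.
MIN_STEM_LEN = 2


def strip_suffix(word):
    """Return (stem, suffix, number) using longest-match suffix removal.

    If no listed suffix applies, the word is returned unchanged with an
    empty suffix and number None.
    """
    candidates = [
        (suffix, number)
        for number, group in SUFFIXES.items()
        for suffix in group
    ]
    # Try longer suffixes first so that "ngal" wins over a shorter ending.
    candidates.sort(key=lambda c: len(c[0]), reverse=True)
    for suffix, number in candidates:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)], suffix, number
    return word, "", None


if __name__ == "__main__":
    for w in ["pullo", "fulɓe", "o"]:
        print(w, "->", strip_suffix(w))
```

Trying the longest suffix first mirrors the single-pass, longest-match strategy associated with Lovins-style stemmers; a Porter-style adaptation would instead apply several smaller rule steps in sequence.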
To date, no research has aimed at processing the Fulani language or native African languages similar to Fulani. Stemming is crucial in all information retrieval systems because it supports the translation and classification of documents as well as the indexing of words. To specify the stemming approaches, we adapted the stemming algorithms of Lovins and Porter to the Peul language, since they are the best known in the literature and have the advantage of having been applied to other languages. Finally, these stemming methods were evaluated using the method of Chris Paice. Based on the principle that words sharing the same stem are likely to share a unity of meaning, we undertook a morphological analysis of 5186 Fulani words from the Fulani dictionary of Dominique Noye. The results, obtained by calculating the over-stemming, under-stemming and truncation error rates, show that both algorithms are efficient for stemming the Fulani language.
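The over-stemming and under-stemming rates mentioned above can be illustrated with a short sketch of the two indices commonly attributed to Paice's evaluation method. It assumes the usual pair-counting definitions: the under-stemming index is the share of same-group word pairs that the stemmer fails to merge, and the over-stemming index is the share of different-group word pairs that it wrongly merges. The function name `paice_indices`, the toy word groups and the toy stemmer are hypothetical and are not data or code from the article; the truncation-based error rate is not computed here.

```python
from collections import Counter


def paice_indices(groups, stem):
    """Compute (under_stemming_index, over_stemming_index).

    groups: list of lists of words; each inner list is one concept group
            whose members should ideally share a stem.
    stem:   callable mapping a word to its stem.
    """
    total_words = sum(len(g) for g in groups)
    desired_merges = 0.0      # pairs that should share a stem
    desired_nonmerges = 0.0   # pairs that should not share a stem
    unachieved_merges = 0.0   # same-group pairs left with different stems
    stem_members = {}         # stem -> Counter(group index -> word count)

    for gi, group in enumerate(groups):
        n = len(group)
        desired_merges += 0.5 * n * (n - 1)
        desired_nonmerges += 0.5 * n * (total_words - n)
        counts = Counter(stem(w) for w in group)
        unachieved_merges += 0.5 * sum(u * (n - u) for u in counts.values())
        for s, u in counts.items():
            stem_members.setdefault(s, Counter())[gi] += u

    wrong_merges = 0.0        # different-group pairs given the same stem
    for by_group in stem_members.values():
        ns = sum(by_group.values())
        wrong_merges += 0.5 * sum(v * (ns - v) for v in by_group.values())

    ui = unachieved_merges / desired_merges if desired_merges else 0.0
    oi = wrong_merges / desired_nonmerges if desired_nonmerges else 0.0
    return ui, oi


if __name__ == "__main__":
    # Toy data and a toy stemmer that keeps the first two characters.
    toy_groups = [["aa1", "aa2", "ab3"], ["ab4", "cc5"]]
    ui, oi = paice_indices(toy_groups, lambda w: w[:2])
    print(f"UI={ui:.3f} OI={oi:.3f}")
```

A lower value is better for both indices; comparing the pair of values for two stemmers on the same grouped word list shows whether one stemmer is heavier (more over-stemming) or lighter (more under-stemming) than the other.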
Mrs. Zouleiha Alhadji Ibrahima
Department of Mathematics and Computer Science, Faculty of Science, the University of Ngaoundéré - Cameroon
Mr. Dayang Paul
Department of Mathematics and Computer Science, Faculty of Science, the University of Ngaoundéré - Cameroon
Mr. Kolyang
Department of Computer Science, Higher Teachers' Training College, the University of Maroua - Cameroon
Mr. Guidana Gazawa Frederic
Department of Mathematics and Computer Science, Faculty of Science, the University of Ngaoundéré - Cameroon

