3P-6
Indexing the Tigrinya Web
○Omer Osman Ibrahim, Yoshiki Mikami (Nagaoka University of Technology)
The Web has become the most valuable source of information for everyone. However, because of its huge size and constant change, finding information on it is not a straightforward task. The most feasible way to find information on the Web is to use a search engine, but not all search engines fully support searching in minority languages. Tigrinya is one such language: it is not fully or efficiently accessible through the available search engines.
In this paper, we report the indexing issues involved in the design and development of a search engine for the Tigrinya language. Indexing facilitates information retrieval by storing documents in a data structure that makes searching them easy and accurate. Indexing Tigrinya documents involves several analysis steps that are specific to the language. Language-specific features that affect Tigrinya information retrieval include correct tokenization, lexical analysis, stop word removal, compound word handling, handling of short forms (abbreviations), and stemming. We also report the progress we have made and the results achieved. Our work has produced an original analyzer specific to Tigrinya documents, which can be used for indexing the Tigrinya Web or in any other application that requires efficient Tigrinya search.
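
To make the analysis steps above concrete, the following is a minimal sketch of an analyzer feeding an inverted index. It is not the analyzer developed in this work. The function names, the three-word stop list, and the identity stemmer are assumptions chosen for illustration, and compound word and abbreviation handling are omitted. The only language-specific fact relied on is that the Ethiopic punctuation marks occupy Unicode code points U+1361 to U+1368.

```python
import re
from collections import defaultdict

# Split on whitespace, Ethiopic punctuation (U+1361 wordspace through
# U+1368 paragraph separator), and common ASCII punctuation.
TOKEN_SPLIT = re.compile("[\\s\u1361-\u1368.,;:!?()'\"]+")

# Hypothetical stop word list, for illustration only; a real list would
# be derived from corpus statistics.
STOP_WORDS = {"ናይ", "ኣብ", "እዩ"}


def tokenize(text):
    """Return the non-empty tokens of text."""
    return [tok for tok in TOKEN_SPLIT.split(text) if tok]


def analyze(text, stop_words=STOP_WORDS, stem=lambda tok: tok):
    """Tokenize, drop stop words, then stem.

    The default stemmer is the identity function (an assumption); a real
    Tigrinya stemmer would be passed in through the stem parameter.
    """
    return [stem(tok) for tok in tokenize(text) if tok not in stop_words]


def build_index(docs):
    """Map each analyzed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every analyzed query term."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Illustrative sample documents (not taken from the paper's data).
docs = {1: "ናይ ትግርኛ መርበብ", 2: "ትግርኛ፡ቋንቋ።"}
index = build_index(docs)
```

Running the query through the same `analyze` function as the documents keeps query terms and index terms in the same normalized form, which is why the stemmer and stop list are parameters of the analyzer and not of the index.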
