Bayan: An Arabic Text Database Management System.
Roger King, Ali Morfeq:
Most existing databases lack features which allow
for the convenient manipulation of text. It is even more
difficult to use them if the text language is not based on the
Roman alphabet. The Arabic language is a very good example
of this case. Many projects have attempted to use
conventional database systems for Arabic data manipulation
(including text data), but because of Arabic's many
differences from English, these projects have met with limited
success. In the Bayan project, the approach has been
different. Instead of simply trying to adapt an environment
to Arabic, the properties of the Arabic language were the
starting point and everything was designed to meet the
needs of Arabic, thus avoiding the shortcomings of other
projects. A text database management system was designed
to overcome the shortcomings of conventional database
management systems in manipulating text data. Bayan's
data model is based on an object-oriented approach, which
makes the system easier to extend in the future. In Bayan,
we designed the database with the Arabic text properties
in mind. We designed it to support the way Arabic
words are derived, classified, and constructed. Furthermore,
linguistic algorithms (for word generation and morphological
decomposition of words) were designed, leading
to a formalization of rules of Arabic language writing and
sentence construction. A user interface was designed on
top of this environment. A new representation of the Arabic
characters was designed, a complete Arabic keyboard
layout was created, and a window-based Arabic user interface
was also designed.
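
The abstract does not give the word generation or morphological decomposition algorithms themselves. As a rough illustration of the general idea only, the sketch below uses the standard root-and-pattern view of Arabic derivation: a consonantal root is slotted into a vowel template. Everything here is an assumption made for illustration and not Bayan's actual design: the Latin transliteration, the three-entry `PATTERNS` table, and the function names `generate` and `decompose`.

```python
# Hypothetical sketch of root-and-pattern morphology; NOT Bayan's algorithm.
# Words are written in a simplified Latin transliteration (long vowels doubled).
# In a pattern, the digits 1, 2, 3 mark the slots for the three root consonants.
PATTERNS = {
    "perfect_verb": "1a2a3a",         # k-t-b -> kataba  ("he wrote")
    "active_participle": "1aa2i3",    # k-t-b -> kaatib  ("writer")
    "passive_participle": "ma12uu3",  # k-t-b -> maktuub ("written")
}


def generate(root, pattern):
    """Word generation: fill the numbered slots of a pattern with root consonants."""
    return "".join(root[int(ch) - 1] if ch.isdigit() else ch for ch in pattern)


def decompose(word, patterns=PATTERNS):
    """Naive decomposition: return every (root, pattern name) consistent with word."""
    results = []
    for name, pattern in patterns.items():
        if len(pattern) != len(word):
            continue
        slots = {}
        matched = True
        for p, w in zip(pattern, word):
            if p.isdigit():
                # A slot must hold the same consonant each time it appears.
                if slots.setdefault(p, w) != w:
                    matched = False
                    break
            elif p != w:
                # Fixed pattern letters must match the word exactly.
                matched = False
                break
        if matched and len(slots) == 3:
            results.append(("".join(slots[d] for d in "123"), name))
    return results


if __name__ == "__main__":
    root = ("k", "t", "b")
    for name, pattern in PATTERNS.items():
        print(name, generate(root, pattern))
    print(decompose("maktuub"))
```

A real system would have to handle much more than this sketch does, for example weak and doubled roots, affixes and clitics, and the Arabic script itself.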
