Producing Algorithmically Standard Romanization of Arabic Names Using Hints From Non-Standards

International Journal of Computer Processing of Oriental Languages, Vol. 17, No. 3 (2004) 165-180
Chinese Language Computer Society & World Scientific Publishing Company
FAWAZ S. AL-ANZI
Department of Computer Engineering, Kuwait University, P.O. Box 5969, Safat, Postal Code 13060, Kuwait alanzif@eng.kuniv.edu.kw
This article addresses the problem of standard Romanization of Arabic names using their undiacritized Arabic forms and their corresponding non-standard Romanization. The Romanization of Arabic names has long been studied and standardized. Nevertheless, large databases of non-standard Romanized Arabic names exist and are in use in many private and government agencies. Examples include passport holder databases, phone directories, and geographic names databases. Dealing with such databases can be inefficient and can produce inconsistent results. Converting them to their standard Romanization can help solve these problems. In this paper, we present an efficient algorithmic software implementation that produces the standard Romanization of names written in the Arabic alphabet by utilizing the hints in the existing non-standard Romanized databases. The results of the software implementation are very promising.
Keywords: Arabic; Names; Romanization; Transliteration; Standard; Database; Search; Security; Geographic; Map.
1. Introduction
Handling Romanized Arabic names has many applications. Examples include passport holder databases, phone directories, and geographic names databases. Processing such databases will not produce accurate and consistent results if the Romanized Arabic names stored in them are not consistent. This inconsistency arises from the practice of Romanizing names without referring to the standard procedures for Romanizing Arabic names. Unfortunately, this practice has been going on for quite some time, and a large number of databases have been generated and deployed as official sources of information in many private and government agencies.
Figure 1. Example of geographic Arabic names Romanization.
This inconsistent representation of Romanized Arabic names can handicap many serious uses of these databases in the future. For example, consider the security hazard of different Romanizations of the same Arabic name appearing in passports, and how this can hamper the ability of non-Arabic countries to keep track of such a person for security reasons. As another example, consider searching for a person's phone number in a phone directory using a non-standard Romanized Arabic name as the search key. Also, consider the difficulty of finding a Romanized geographic name on a map if you do not know the correct Romanized form of that Arabic place name (see Figure 1). Researchers have discussed and attempted many approaches to solve or reduce the difficulty of these problems. Promising solutions have emerged among these attempts, such as phonetic representation and cross-language phonetic search [1-4]. However, the simplest and most accurate way to solve these problems is to produce a standard and uniform Romanization of Arabic names. Arabic is written from right to left. In contrast to several other languages, uniform results in the Romanization (transliteration) of Arabic are difficult to obtain, since vowel points and some of the diacritical marks necessary for certain identification of Arabic words are usually omitted from both handwritten and printed Arabic texts. It follows that the person doing the Romanization must be able to identify the words used in the names and must know their standard
written Arabic spelling, their proper vowel pointing, and how to eliminate peculiarities resulting from dialectal and idiosyncratic variation. The problem gets more complicated when dealing with special names such as geographic names. This is because, in most Arabic-speaking regions, a large proportion of the common and geographic names is not available in Arabic script. This applies to Arabic names as well as to those of non-Arabic origin. Even when the Arabic script is available, it is not always possible to determine the proper vowels from dictionaries and other reference tools. The Romanization is generally reversible, though there are some ambiguous letter sequences (dh, kh, sh, th), which may also point to combinations of Arabic characters in addition to the respective single characters. Arabic text is quasi-stenographic. It is usually presented without diacritical marks, which denote short vowels and geminated consonants. Relying on different types of linguistic and textual redundancy, the reader has to substitute for the missing diacritics [5]. Non-diacritization is a deeply seated property of Arabic orthography. Attempts to produce tools that generate diacritics for general text are underway [6-9]. Most of the tools developed in this area concentrate on understanding the text (sentence) and produce the proper diacritics of a word according to its position in the text. Such tools are still at an early stage. It will be quite some time before an efficient general-purpose tool for diacritics generation is produced. A narrower, although not necessarily easier, problem is to produce a tool for diacritics generation of Arabic names. This would be an essential tool for producing an accurate Romanization of Arabic names from Arabic script. Until such a tool is perfected, an alternative way to produce the Romanization of Arabic names must be explored.
In this paper, we present an efficient algorithmic implementation that produces a standard Romanization of names written in the Arabic alphabet by utilizing the hints in the existing non-standard Romanized databases. This implementation can be used to build software that generates consistent results regardless of the skill of the Romanization personnel who use the system. This paper is divided into six sections. Section 1 is the introduction. Section 2 gives a historical background of the Romanization of Arabic names. Section 3 presents the model that utilizes hints in the non-standard Romanization of Arabic names to produce standard Romanization results. Section 4 presents the algorithmic formulation of the BGN/PCGN-1956 System. Section 5 presents the results of testing the system on the phone directory of the Ministry of Communication in Kuwait. Finally, Section 6 presents some conclusions.
2. Romanization of Arabic Names

The BGN/PCGN-1956 System [10] for the Arabic alphabet was designed for use in Romanizing (transliterating) names written in Arabic. It was adopted by the BGN in 1946 and by the PCGN in 1956 and has been applied by Syria, Lebanon, Jordan, Iraq, the Arabian Peninsula, Egypt, Libya, Sudan, and Tunisia in 1976. The system was designed to bring about uniformity in the spelling of geographic names in Arabic-speaking regions by eliminating deviations resulting from pronunciation that differs among dialects, except as stated in its special rules section. The Romanization (transliteration) of Arabic consonants, vowels, and diphthongs is presented in the BGN/PCGN-1956 System. Documentation is presented through two sets of tables, general notes and special rules. Romanization personnel must understand these documents very thoroughly to produce satisfactory results. There is still some chance of inconsistent results if two Romanization personnel with different levels of skill work on the same set of Arabic names. This may happen because the tables, general notes and special rules are interrelated and presented across different parts of the document. One contribution of this paper is an algorithmic implementation of the BGN/PCGN-1956 System that can be used to build software that guarantees consistent results regardless of the skill of the Romanization (transliteration) personnel who use the system. (See Sec. 4 for more details.) The BGN/PCGN-1956 system is otherwise identical to the UN system; the only difference lies in the treatment of articles. The original transliteration table contains examples (but no explicit rules) where the definite article is always written with a small initial and connected by a hyphen to the main part of the name, e.g. al-Başrah, ar-Riyāḑ.
The practice of the BGN and the PCGN, however, is not to use hyphens between articles and names and to capitalize the first definite article in a name, e.g. Al Başrah, Ar Riyāḑ.

The United Nations recommended Romanization system was approved in 1972 (resolution II/8), based on the system adopted by Arabic experts at a conference held in Beirut in 1971, with the practical amendments carried out and agreed upon by the representatives of the Arabic-speaking countries at this conference [11]. The table was published in volume II of the conference report [12]. The UN resolution specifically pointed out that the system was recommended for the Romanization of the geographical names within those Arabic-speaking countries where this system is officially acknowledged. It cannot be definitely ascertained which of the Arabic-speaking countries have adopted this system officially. Judging by the use of names in international cartographic products that rely mostly on national sources, it appears that the UN system is more or less current in Iraq, Kuwait, the Libyan Arab Jamahiriya, Saudi Arabia [13], the United Arab Emirates, Yemen, and some other countries (the system is often used without diacritical marks). For the geographical names of the Syrian Arab Republic, international maps favor the UN system while local usage seems to prefer a French-oriented Romanization. In Egypt and Sudan there are also local Romanization schemes or practices that are used side by side with the UN system. The geographical names of Algeria, Djibouti, Mauritania, Morocco and Tunisia are generally rendered in the traditional manner that conforms to the principles of French orthography.

Resolution 7 of the Seventh UN Conference on the Standardization of Geographical Names (1998) recommended that the League of Arab States should, through its specialized structures, continue its efforts to organize a conference with a view to considering the difficulties encountered in applying the amended Beirut system of 1972 for the Romanization of Arabic script, and submit, as soon as possible, a solution to the United Nations Group of Experts on Geographical Names. At the Eighth UN Conference on the Standardization of Geographical Names (2002), the Arabic Division of the UN Group of Experts announced that it had finalized the proposed modifications to the UN recommended Romanization system. These proposals would be submitted to the League of Arab States for approval.

2.1. Other systems of Romanization

Some proposed changes (2002) to the UN system were agreed to by the Arab delegations at the Eighth UN Conference on the Standardization of Geographical Names in Berlin (2002) [14]. These include Romanizing the character ظ as dh instead of z̧, and replacing the cedilla by a sub-macron in all characters with cedillas. Some less widely known systems of Romanization of Arabic names also exist. For the benefit of the reader, we mention some of them. The I.G.N. System 1973 (sometimes also called Variant B of the Amended Beirut System) uses an amended Beirut System [15]. In these systems minor amendments are used to resolve local problems and some pronunciation considerations. The transliteration ISO 233:1984 gives every character and diacritical mark a unique equivalent, e.g. the long vowels ā, ī and ū are consequently written as a, iy and uw, respectively, in the ISO transliteration.

The Royal Jordanian Geographic Centre (RJGC) System [16] is essentially the same as the amended Beirut system, except that the sub-macron is used instead of the cedilla. In the Survey of Egypt System (SES) of Romanization, the variants in parentheses are used depending on pronunciation and tradition. The article is always written as el- (El-Kafr el-Qadim, Sharm el-Sheikh). In Algeria, there is at present no official Romanization system; the prospects of establishing such a system are being discussed in the Permanent Commission for Taxonomy (CPST) at the National Council of Geographical Information (CNIG) [17]. A system that is used in Lebanon, close to the I.G.N. 1973 System, is mentioned in ISO 3166-2:1998 (Codes for the representation of names of countries and their subdivisions. Part 2: Country subdivision code): Principles for Romanization from Lebanese Arabic to Latin Characters (National Ministry of Defense of the Lebanese Republic, 1963). However, in 2002 Lebanon submitted a document in which all geographical names were Romanized using the UN system [18]. In Mauritania, the Romanized name forms in official maps edited since 1969 have been rendered in accordance with a simplified version of the I.G.N. system [19]. In Morocco, the official Romanization system for the Arabic script dates from June 17, 1932, although changes to it are being planned [20]. In Tunisia, the Directorate of Topography and Cartography officially adopted the amended Beirut system with minor modifications in 1983 (e.g., adding a letter 9 to the table).
3. The Proposed Model

In the proposed model, generating the standard Romanization of an Arabic name requires that every name in the database be available both in non-diacritized Arabic script and in a non-standard Romanized form. Since the main obstacle to producing a standard Romanization of Arabic names is the absence of proper diacritics in the Arabic script, the first phase of the proposed model tries to generate these diacritics using the knowledge stored in the non-standard Romanization of the same names. The next phase applies the standard Romanization algorithm. Hence, the model can be summarized in the following two phases:
Phase I: Algorithm I uses the hints in the non-standard Romanization of Arabic names to generate the diacritics for the Arabic letters.
Phase II: Algorithm II uses the diacritized Arabic names to generate the standard Romanization of the Arabic name.

In the proposed model, we have to differentiate between two types of strings: undiacritized and diacritized. Let us denote the undiacritized letter sequence by $a = a_1 a_2 \cdots a_n$. A possible diacritization of $a$ is a diacritized letter sequence $a' = a'_1 a'_2 \cdots a'_n$ such that $a_i$ is the same as $a'_i$ with the diacritics removed. To convert a letter sequence from the second type to the first, we only have to remove all diacritics. We also use the following functions:
Mid(i,k,S) returns the k letters (with diacritics if available) starting at position i of a string S, in the proper order.
Length(S) returns the number of letters in a string S.

3.1. Algorithm I: Diacritizing Arabic alphabets using hints from non-standard Romanization

In this algorithm, the knowledge is stored in the non-standard Romanization of the Arabic name. The algorithm Diacritize(A,E) works on two strings: A is the non-diacritized Arabic name and E is the non-standard Romanization of the same name. The algorithm starts by mapping every non-vowel letter i in the Arabic name A to the proper corresponding letter in the Roman string E and stores the position in the hash L(i). The possible correspondences are given in Table 1. Next, we test the possible diacritics of letter i by examining the substring M between the letter and the letter following it in the Roman string, i.e., between positions L(i) and L(i + 1) − 1 in the string E. The possible corresponding Roman letters for diacritics are given in Table 2. For example, consider the information pair of the name (A = محمد, E = mohammad) that represents the non-diacritized Arabic letters and their non-standard Romanization.
We start by computing the L( ) hash of the Arabic alphabet representation as follows:
L(1) = 1, i.e., the letter corresponding to م in E is at position 1, letter m
L(2) = 3, i.e., the letter corresponding to ح in E is at position 3, letter h
L(3) = 5, i.e., the letter corresponding to م in E is at position 5, letters mm
L(4) = 8, i.e., the letter corresponding to د in E is at position 8, letter d
Since the letter at position 3, م, of the Arabic name is matched with mm as a TASHDID letter in Table 1, the letter should be followed by the TASHDID diacritic. Next, we compute the hint M that follows every letter.

Table 1. Single and TASHDID Arabic letters and their possible match(es) in Romanization.
# | Arabic letter | Romanization possible match | TASHDID (doubled)
1 |  | a, e, o, u, i |
2 | ب | b | bb
3 | ت | t | tt
4 | ث | th | tth
5 | ج | j, g | jj, gg
6 | ح | h | hh
7 | خ | k, kh | kk, kkh
8 | د | d | dd
9 | ذ | th | tth
10 | ر | r | rr
11 | ز | z | zz
12 | س | s | ss
13 | ش | sh | ssh
14 | ص | s | ss
15 | ض | d, dh, z | dd, ddh, zz
16 | ط | t | tt
17 | ظ | d, dh, z | dd, ddh, zz
18 | ع | a, e, o, u, i, ` (and similar apostrophe marks) |
19 | غ | g, gh, k | gg, ggh, kk
20 | ف | f | ff
21 | ق | g, q, k | gg, qq, kk
22 | ك | k | kk
23 | ل | l | ll
24 | م | m | mm
25 | ن | n | nn
26 | ه | h | hh
27 | ة | h, t | hh, tt
28 | و | w, o, u | ww, oo, ou, uu
29 | ي | y, e, i | yy, ee, ie
30 |  | a, e, o, u, i, y, e, i | yy, ee, ie
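The mapping step that produces L(i) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: `TABLE1` holds only the three Table 1 rows needed for the example name, the helper names `doubled` and `match_positions` are hypothetical, and the search pointer skips the whole matched string rather than a single position.

```python
# Hypothetical subset of Table 1: Arabic letter -> possible single Roman matches.
TABLE1 = {
    "\u0645": ["m"],  # MIM
    "\u062D": ["h"],  # HA
    "\u062F": ["d"],  # DAL
}


def doubled(match):
    """TASHDID form of a consonant match: repeat its first character (m -> mm)."""
    return match[0] + match


def match_positions(arabic, roman):
    """Return (L, tashdid): 1-based positions L(i) in `roman`, plus a doubling flag per letter."""
    positions, tashdid = [], []
    p = 0  # 0-based index in `roman` where the next search starts
    for letter in arabic:
        best = None
        for cand in TABLE1[letter]:
            j = roman.find(cand, p)
            if j != -1 and (best is None or j < best[0]):
                best = (j, cand)  # keep the earliest match
        if best is None:
            raise ValueError(f"no match for {letter!r} in {roman!r}")
        j, cand = best
        is_doubled = roman.startswith(doubled(cand), j)
        positions.append(j + 1)  # the paper counts positions from 1
        tashdid.append(is_doubled)
        # Assumption: skip the entire matched (possibly doubled) string.
        p = j + (len(doubled(cand)) if is_doubled else len(cand))
    return positions, tashdid


print(match_positions("\u0645\u062D\u0645\u062F", "mohammad"))
```

For the pair (محمد, mohammad) this yields the positions 1, 3, 5, 8 and flags only the third letter as doubled, in line with the worked example.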
Table 2. Arabic diacritic possible equivalency for Roman letters.
# | Roman letter | Arabic diacritic equivalency
1 | A | FATHAH
2 | E | KASRAH
3 | I | KASRAH
4 | U | FATHAH
5 | O | DHAMAH
6 | W | DHAMAH
7 | Y | KASRAH
For the example name, the hints are:
M1 = o, i.e., the hint for the diacritic following the letter at position 1 is DHAMAH
M2 = a, i.e., the hint for the diacritic following the letter at position 2 is FATHAH
M3 = a, i.e., the hint for the diacritic following the letter at position 3 is FATHAH
M4 = (empty), i.e., the hint for the letter at position 4 is no diacritic
Hence, the output of the diacritization process for the given name is مُحَمَّد. A formal description of the algorithm is given below:

Procedure Diacritize(A,E)
    p = 0
    For i = 1 to Length(A)
        L(i) = Match(p,i,A,E)
    Next i
    L(Length(A) + 1) = Length(E) + 1
    For i = 1 to Length(A)
        M = Mid(L(i), L(i + 1) − L(i), E)
        Add diacritics to the letter at position i in string A as Dk(M)
    Next i
End procedure

Procedure Match(p,i,A,E)
    Find the first location j of a possible match, as in Table 1, of the letter Mid(i,1,A) in the string E at positions greater than or equal to p
    Check whether the possible match of the letter is doubled. If so, add the diacritic TASHDID to the letter Mid(i,1,A) in string A
    If no match is found then print Error
    Set p = j + 1
    Return j
End procedure

Function Dk(M)
    Return the diacritic equivalent to the first letter of the string M that appears in Table 2
End function
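The second half of Algorithm I, turning the positions L(i) into diacritics, can be sketched as below. It is an illustrative sketch, not the paper's code: the positions and doubling flags are passed in precomputed, the function names are hypothetical, the Table 2 mapping is copied as printed (including U mapped to FATHAH), and the hint window runs from L(i) up to but not including L(i + 1). The `strip_diacritics` helper checks the property that removing the diacritics recovers the original string.

```python
import unicodedata

FATHAH, DAMMAH, KASRAH, SHADDAH = "\u064E", "\u064F", "\u0650", "\u0651"

# Table 2 as printed in the source: Roman letter -> diacritic hint.
TABLE2 = {"a": FATHAH, "e": KASRAH, "i": KASRAH, "u": FATHAH,
          "o": DAMMAH, "w": DAMMAH, "y": KASRAH}


def mid(i, k, s):
    """Mid(i, k, S): the k characters of S starting at 1-based position i."""
    return s[i - 1:i - 1 + k]


def dk(m):
    """Dk(M): diacritic for the first character of M found in Table 2, else ''."""
    for ch in m.lower():
        if ch in TABLE2:
            return TABLE2[ch]
    return ""


def diacritize(arabic, roman, positions, tashdid):
    """Attach TASHDID and vowel diacritics to each Arabic letter."""
    ends = positions[1:] + [len(roman) + 1]  # sentinel: one past the end of E
    out = []
    for letter, start, end, dbl in zip(arabic, positions, ends, tashdid):
        hint = mid(start, end - start, roman)  # matched consonant plus vowels after it
        out.append(letter + (SHADDAH if dbl else "") + dk(hint))
    return "".join(out)


def strip_diacritics(s):
    """Drop combining marks, giving back the undiacritized sequence."""
    return "".join(c for c in s if not unicodedata.combining(c))


name = "\u0645\u062D\u0645\u062F"
result = diacritize(name, "mohammad", [1, 3, 5, 8], [False, False, True, False])
print(result, strip_diacritics(result) == name)
```

Because the hint window starts at the matched consonant itself, a consonant that also appears in Table 2 (w or y) would be read as a vowel hint; a fuller version would need to skip the matched letters first.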
4. Algorithm II: BGN/PCGN-1956 Standard Romanization (Transliteration)

In this section, we present a formal algorithmic description of the way the BGN/PCGN-1956 standard works. It is presented in the following algorithm, which uses two tables (Tables 3 and 4). The algorithm resolves the ambiguity caused by the interrelation of the tables, general notes and special rules, which are presented across different parts of the original BGN/PCGN document.

Procedure BGN-PCGN-1956(A)
    Set i = 1
    While i ≤ Length(A)
        If the following two conditions apply:
            the string at position i is the same as column "Arabic alphabets" in Table 3
            the "Condition" column applies at the same row
        Then
            Produce the "Output" column of the same row in Table 3
            Advance i by the length of the string in column "Arabic alphabets"
        Else If the following two conditions apply:
            the string at position i is the same as column "Arabic alphabet" in Table 4
            the "Condition" column applies at the same row
        Then
            Produce the "Output" column of the same row in Table 4
            Advance i to i + 1
        End If
        End If
    End While
End procedure
Table 3. Special rules for Arabic alphabets/diacritics transliteration.
Arabic alphabets | Name | Condition | Output
ـَيْ | FATHAH YA SUKUN | TRUE | ay
ـَوْ | FATHAH WAW SUKUN | TRUE | aw
ـَا | FATHAH ALIF | TRUE | ā
ـَى | FATHAH ALIF MAQSURAH | TRUE | á
ـَ | FATHAH | TRUE | a
ـُو | DAMMAH WAW | TRUE | ū
ـٌ | TANWIN DAMMAH | TRUE | un
ـٍ | TANWIN KASRAH | TRUE | in
ـً | TANWIN FATHAH | TRUE | an
ـُ | DAMMAH | TRUE | u
ـِي | KASRAH YA | TRUE | ī
ـِ | KASRAH | TRUE | i
ـْ | SUKUN (JAZMAH) | TRUE | omit
ـّ | TASHDID | Letter at i − 1 is a sun letter and the two letters ا, ل at positions i − 2 and i − 1 and a white space at position i − 3 | omit l and sun letter is doubled
 |  | ELSE | double letter at i − 1
آ | ALIF MADDAH | i = 1 | ā
 |  | ELSE | 'ā
ٱ | HAMZAT AL WASL | TRUE | '
Table 4. Rules for Arabic alphabets Romanization.
Arabic alphabet | Name | Condition | Output
ء | HAMZAH | i = 1 | omit
 |  | ELSE | '
ا | ALIF | TRUE | omit
ب | BA | TRUE | b
ت | TA | TRUE | t
ث | THA | TRUE | th
ج | JIM | TRUE | j
ح | HA | TRUE | ḩ
خ | KHA | TRUE | kh
د | DAL | TRUE | d
ذ | DHAL | TRUE | dh
ر | RA | TRUE | r
ز | ZAY | TRUE | z
س | SIN | TRUE | s
ش | SHIN | TRUE | sh
ص | SAD | TRUE | ş
ض | DAD | TRUE | ḑ
ط | TA | TRUE | ţ
ظ | ZA | TRUE | z̧
ع | AYN | TRUE | ‘
غ | GHAYN | TRUE | gh
ف | FA | TRUE | f
ق | QAF | TRUE | q
ك | KAF | TRUE | k
ل | LAM | TRUE | l
م | MIM | TRUE | m
ن | NUN | If letter at i − 1 is ب, and letters at i − 2 and i + 1 are white spaces | Bin
 |  | ELSE | n
ه | HA | If letter at i − 1 is any of the letters ت, د, س, ك | ·h
 |  | ELSE | h
و | WAW | TRUE | w
ي | YA | TRUE | y
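The table-driven loop of Algorithm II can be sketched as follows. This is a simplified illustration, not the full standard: only a handful of rows from Tables 3 and 4 are included, the conditional rows (sun letters, HAMZAH, the Bin and HA special cases) are left out, TASHDID is handled by repeating the previous output, and the names `TABLE3`, `TABLE4` and `romanize` are hypothetical.

```python
FATHAH, DAMMAH, KASRAH = "\u064E", "\u064F", "\u0650"
SUKUN, SHADDAH = "\u0652", "\u0651"
YA, WAW, ALIF = "\u064A", "\u0648", "\u0627"

# Subset of Table 3 (diacritic sequences); longer patterns are listed first
# so that they are tried before their one-character prefixes.
TABLE3 = [
    (FATHAH + YA + SUKUN, "ay"),
    (FATHAH + WAW + SUKUN, "aw"),
    (FATHAH + ALIF, "\u0101"),  # a with macron
    (DAMMAH + WAW, "\u016B"),   # u with macron
    (KASRAH + YA, "\u012B"),    # i with macron
    (FATHAH, "a"), (DAMMAH, "u"), (KASRAH, "i"), (SUKUN, ""),
]

# Subset of Table 4 (single letters whose condition is TRUE).
TABLE4 = {
    "\u0628": "b", "\u062D": "\u1E29", "\u062F": "d", "\u0631": "r",
    "\u0643": "k", "\u0644": "l", "\u0645": "m",
}


def romanize(a):
    """Romanize a diacritized Arabic string using the rule subsets above."""
    out = []
    i = 0
    while i < len(a):  # 0-based scan over the whole string
        if a[i] == SHADDAH:  # TASHDID: double the letter before it
            if out:
                out.append(out[-1])
            i += 1
            continue
        for pattern, roman in TABLE3:
            if a.startswith(pattern, i):
                if roman:
                    out.append(roman)
                i += len(pattern)  # advance by the length of the matched rule
                break
        else:
            out.append(TABLE4[a[i]])  # KeyError for letters outside this subset
            i += 1
    return "".join(out)


print(romanize("\u0645\u064F\u062D\u064E\u0645\u0651\u064E\u062F"))
```

Trying the Table 3 patterns before the single-letter table mirrors the order of the two If branches in the procedure, and listing multi-character patterns first keeps KASRAH YA from being read as a plain KASRAH followed by YA.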
5. Results
In this section, we present our experience of applying the proposed model to produce the standard (uniform) Romanization for the Ministry of Communication phone directory in Kuwait. The directory consists of 50,885 Arabic names, each with its non-diacritized Arabic form and a non-standard Romanization. The model failed on only 1,439 names, which constitutes about 2.83% of the total. This means that the proposed model succeeded in producing the correct standard Romanization for 97.17% of the directory. The rest of the directory needs to be processed manually. Most of the failures in the automatic Romanization were due to errors in the directory itself, where the non-standard Romanization was missing a letter or had some letter positions interchanged. Table 5 shows examples of the results obtained using our model. Notice that the model successfully produced a standard Romanization of the non-diacritized Arabic names by utilizing the hints in the given non-standard Romanization.
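The reported percentages follow directly from the two counts given above; the short check below recomputes them (the variable names are illustrative only).

```python
total, failed = 50_885, 1_439  # directory size and names needing manual work

failure_rate = 100 * failed / total
success_rate = 100 - failure_rate

print(f"{failure_rate:.2f}% failed, {success_rate:.2f}% succeeded")
```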
Table 5. Sample results of applying the proposed model.
# | Non-standard Romanization | Standard Romanization
1 | abalqelob | balqilub
2 | abdulgafour | bdalghafur
3 | gharzaldeen | gharzaldīn
4 | gharghani | garganī
5 | mohamad | muhamad
6 | mohamed | muhamid
7 | mohammad | muhammad
8 | mohammed | muhammid
9 | muhamed | muhamid
10 | yassain | yassn
6. Conclusions
This paper addressed the Romanization of Arabic names and its standardization. We presented an efficient algorithmic software implementation that produces the standard Romanization of names written in the Arabic alphabet by utilizing the hints in the existing non-standard Romanized databases. The model has been formally formulated and implemented. The results of the software implementation are very promising. This research has a direct impact on Arabic-speaking countries, since the results can be applied in many Arabic text-processing databases. Large non-standard databases of Romanized Arabic names exist and are in use in many private and government agencies. Dealing with such databases can be made more efficient and can produce more consistent results by applying our model. Converting such databases to their standard Romanization can help solve many problems in cases such as generating Roman names for passports, searching phone directories, and generating Roman-based maps from Arabic ones.
Acknowledgements
This research was supported by Kuwait University Research Administration project number EE 06/00.
References
[1] R. Kneser and H. Ney, Improved clustering techniques for class-based statistical language modeling, in Proc. European Conf. on Speech Technology, 1993, pp. 973-976.
[2] B. Merialdo, Tagging text with a probabilistic model, in Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Toronto, 1991, pp. 809-812.
[3] E. G. Schukat-Talamazzini, H. Niemann, W. Eckert, T. Kuhn and S. Rieck, Acoustic modelling of subword units in the ISADORA speech recognizer, in Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, San Francisco, 1992, pp. 577-580.
[4] P. Witschel and G. Niedermair, Experiments in dialogue context dependent language modeling, in G. Gorz, editor, KONVENS 92, Springer, Berlin, 1992, pp. 395-399.
[5] Abdoh, Dawood, Pupils' Weaknesses in Written Arabic Texts, Symposium of Arabic Language Problem at University Levels, Kuwait University, Kuwait, 1979, pp. 5-10.
[6] Ali, Nabil, Arabic Language and Computing, Arabization, Kuwait, 1988.
[7] Sadany, T. and Hashish, M., Semi-Automatic Vowelization of Arabic Verbs, 12th Computer Conference, Saudi Arabia, 1988.
[8] Ali, Nabil, Parsing and automatic diacritization of written Arabic: A breakthrough, Proceedings of 13th National Computer Conference, Riyadh, Vol. 28, Dec. 2, 1992.
[9] Saliba, Basel and Al-Danan, Abdullah, An approach to automatic vowelization of Arabic texts, Second Conference on Arabic Computational Linguistics, Kuwait, Nov. 26-29, 1989.
[10] Bahrain, Kuwait, Qatar, and United Arab Emirates Official Standard Names, United States Board on Geographic Names, Defense Mapping Agency Topographic Center, Washington, DC, March 1976.
[11] Report on the Current Status of United Nations Romanization Systems for Geographical Names, compiled by the UNGEGN Working Group on Romanization Systems, Version 2.2, January 2003.
[12] Second United Nations Conference on the Standardization of Geographical Names, London, 10-31 May 1972, Vol. II, Technical papers, p. 170.
[13] Geographic Names Transliteration in GDMS (Saudi Arabia), Eighth United Nations Conference on the Standardization of Geographical Names, Berlin, 27 August-5 September 2002, Document E/CONF.94/INF.77.
[14] Minutes of the meeting of the Arab Delegations at the Eighth United Nations Conference on the Standardization of Geographical Names, Berlin, 27 August-1 September 2002. [Signed by Dr. Abdul Hadi Tazi, Chief of the Arab Delegations. A copy was given to the Convener of the UNGEGN Working Group on Romanization Systems.]
[15] Présentation de la Variante B du Système de Translittération de l'arabe «Beyrouth amendé», UNGEGN, 17th Session, New York, 13-24 June 1994, WP No. 61.
[16] Activities in Jordan on the Standardization of Geographical Names, UNGEGN, 18th Session, Geneva, 12-23 August 1996, WP No. 86.
[17] Rapport de l'Algérie, Huitième Conférence des Nations Unies sur la normalisation des noms géographiques, Berlin, 27 August-5 September 2002, E/CONF.94/INF.37.
[18] Rapport sur la toponymie, la normalisation et la romanisation des noms géographiques au Liban, Huitième Conférence des Nations Unies sur la normalisation des noms géographiques, Berlin, 27 August-5 September 2002, E/CONF.94/INF.7.
[19] Report of the Working Group on a Single Romanization System for Each Non-Roman Writing System: Activities from 1 June 1972 to 16 August 1977, Third United Nations Conference on the Standardization of Geographical Names, Athens, 17 August-7 September 1977, Vol. II, Technical papers, pp. 402-403.
[20] Rapport national sur la toponymie (Maroc), Huitième Conférence des Nations Unies sur la normalisation des noms géographiques, Berlin, 27 August-5 September 2002, E/CONF.94/INF.76.
