Revision: 0.21
Since Mandrake Linux is an Open Source product, it needs your contribution. Arabic support in a GNU/Linux distribution is badly needed, so it's up to the community of users to ensure its health. Do you want to help the Mandrake Arabization project become more robust and powerful? Would you like to see Mandrake Linux holding the Platinum Certificate? MAPS is meant to be a fairly comprehensive and detailed practical guide that helps novices quickly get Arabic support on a Mandrake GNU/Linux system.
Copyright (C) 2003 by Munzir Taha Obeid, Arabeyes Project.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license can be found at GNU's Site.
This document is meant to serve as a one-stop location that covers all Arabic-related issues with Mandrake. The information in this document is, to the best of my knowledge, correct. However, MAPS is written by a human, so mistakes, bugs, etc. may turn up from time to time.
This doc is something more than a HOWTO. It serves as a place for coordinating the Arabization efforts in Mandrake. The latest version of this document can be found at arabeyes.
This document was generated using LyX with DocBook article (SGML) class, checked with `nsgmls -s mdkarabicsupport.sgml' and compiled into various formats using docbook-utils.
As with any effort for the Linux project, there are a number of people who contributed to this project. Without their help, this document would not exist.
Thanks to:
All the folks on the arabeyes.org mailing lists who contributed suggestions, answered my questions, and gently showed me the error of my ways.
Mr. Pablo Saratxaga of the Mandrake cooker-i18n mailing list for his many great contributions, numerous bug fixes and modifications.
If you are able to translate this document into another language, we would be grateful and we will also do our best to assist you. Please notify us.
Translation to Arabic (ar) by:
Translation to French (fr) by:
Please send any questions, comments, suggestions, corrections, or contributions via e-mail to the `doc' mailing list at http://lists.arabeyes.org/mailman/listinfo/doc. This document will be updated regularly.
Mandrake 9.2 “FiveStar”, which is based on kernel 2.4.22 (and ready for Linux 2.6.0), supports the UTF-8 encoding in the console and ships KDE 3.1.3, Gnome 2.4.0, OpenOffice 1.1 and Mozilla 1.4, all of which have good Arabic support. Mandrake is now the best distro in its support of the Arabic language out of the box. However, Arabic fonts are still not usable by DrakX, so Arabic is disabled completely in the installer in this version.
How you configure your computer to support Arabic depends on whether you chose Arabic as a language during the installation process. If you choose Arabic while installing Mandrake Linux, every needed package is installed by default. Besides choosing Arabic as the primary language during install (the install will then be done in Arabic, once the install font problems are solved in the next version), you can also choose Arabic as a supplementary language (click the “Advanced” button and check `Arabic'). In either case, everything needed for Arabic is installed (fonts, translations, etc.). However, if you want to add Arabic support after finishing a default installation, some programs need to be installed. Follow these steps:
Install locales-ar-2.3.2-5mdk.i586 from CD1 (Base files for localization). You can switch your default language (locale) to Arabic by running
$ localedrake
on the command line.
There are several locale environments that need to be defined in order for certain applications to function the way you expect them to, with regard to Arabic.
Main Menu->Configuration->Configure your computer->Hardware->Keyboarddrake
or
/usr/sbin/keyboarddrake
only allows you to choose one keyboard layout, and when you choose a non-Latin one, it adds the US qwerty ("us") layout as the first one. In fact, you can use any comma-separated list of layouts. People from Morocco, Algeria, Tunisia, etc. may want "fr,ar(digits)" instead; someone from Egypt who does Greek translations may want "us,ar(digits),el", and so on. To change the layouts, edit the value of the XkbLayout option in /etc/X11/XF86Config-4.
Section "InputDevice"
    Identifier "Keyboard1"
    Driver "Keyboard"
    Option "XkbModel" "pc105"
    Option "XkbLayout" "us,ar(digits)"
    Option "XkbOptions" "grp:alt_shift_toggle,grp_led:scroll"
EndSection
Warning: always put in the list at least one Latin layout, or you won't be able to type shell commands, email addresses, URLs, etc.
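As a minimal sketch of checking the warning above before restarting X, the following shell functions read the XkbLayout value out of a config file and test whether the list contains a Latin layout. The function names and the short list of layouts treated as Latin (us, fr, de, es, gb) are assumptions made for illustration; they are not part of any Mandrake tool.

```shell
#!/bin/sh
# Hypothetical helper: print the XkbLayout value(s) found in an
# XF86Config-style file given as $1.
xkb_layouts() {
    sed -n 's/.*Option[[:space:]]*"XkbLayout"[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Hypothetical helper: succeed if the comma-separated layout list in $1
# contains at least one layout from an assumed list of Latin layouts.
# Variants such as "(digits)" are stripped before comparing.
has_latin_layout() {
    printf '%s\n' "$1" | tr ',' '\n' | sed 's/(.*//' | grep -qxE 'us|fr|de|es|gb'
}
```

A possible use is `has_latin_layout "$(xkb_layouts /etc/X11/XF86Config-4)" || echo "add a Latin layout"`. To try a layout list in a running session without editing the file, `setxkbmap -layout "us,ar(digits)" -option grp:alt_shift_toggle` should have a similar effect, assuming setxkbmap is installed.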
If it's not installed already, go ahead and install fonts-ttf-arabic-1.1-3mdk (Free Arabic TrueType fonts) from CD1.
For installing more fonts:
Main Menu->Configuration->Configure your computer->System->DrakFont->Advanced Options->Add
You can also install windows fonts in case of dual boot systems:
Main Menu->Configuration->Configure your computer->System->DrakFont->Get Windows Fonts
We look forward to the day when XFree86 solves the problem of memory optimization and includes the bitmap font they have.
MCC, rpmdrake, and any other gtk2 program use the gtk2 default font. By default it is set to “Sans”, which is an alias for a list of several fonts. Interestingly, if you install an “Arial” TTF file that contains Arabic glyphs, it will be used. You can also change the default gtk font to whatever you like by launching gnome-font-properties:
Main Menu->Configuration->Font
from Gnome desktop or
$ gnome-font-properties
from a terminal.
You can now already use any Unicode characters in file names. No kernel or file utilities need modifications. This is the general theory, as long as your files stay inside Linux. On filesystems which are used from other operating systems, you have mount options to control conversion of filenames to/from UTF-8. For more info take a look at the Unicode-HOWTO.
It is possible to read/write filenames in Arabic via many applications, shells and GUI file managers. Shells allow for Arabic input and display with some shaping/joining problems.
Mandrake supports reading/writing Arabic filenames on Windows partitions (FAT, FAT32, NTFS) out of the box. If you did not enable Arabic during the installation, just make sure that the entries for those partitions in /etc/fstab contain the iocharset=utf8 mount option.
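The check described above can be sketched in shell. The function below lists the mount points of vfat and ntfs entries that lack iocharset=utf8 in an fstab-style file. The function name is hypothetical, and the sketch assumes the usual whitespace-separated fstab columns (device, mount point, type, options).

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every vfat/ntfs entry in
# the fstab-style file $1 whose options do not include iocharset=utf8.
missing_utf8() {
    awk '$1 !~ /^#/ && ($3 == "vfat" || $3 == "ntfs") &&
         $4 !~ /(^|,)iocharset=utf8(,|$)/ { print $2 }' "$1"
}
```

Running `missing_utf8 /etc/fstab` prints nothing when every Windows partition already carries the option; any mount point it prints needs iocharset=utf8 added to its options column.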
FriBidi is a library that handles bidirectional scripts (e.g. Hebrew, Arabic), so that the display is done in the proper way while the text data itself is always written in logical order. The library uses Unicode internally. Install fribidi-0.10.4-3mdk.rpm.
Start Applications->Configuration->Configure your desktop->Accessibility->Keyboard Layout
Please note that currently the keyboard switcher only works correctly if you chose Arabic as your language during installation or at login. If your default language is English and you haven't installed the Arabic language locale, you won't be able to use Arabic. Thus, if you need to type in Arabic, you need not only to select the Arabic keyboard but also to make sure locales-ar-2.3.2-5mdk.i586 is installed.
Install kde-i18n-ar-3.1.3-1mdk from CD3 (Arabic language support for KDE).
Start Applications->Configuration->Configure your desktop->Accessibility->Country - Region & Language->Add Language->Arabic
or
Start Applications->Configuration->Other->Localedrake
Main Menu->Configuration->Configure your computer->System->KeyboardDrake
The function of acon is to display Arabic text from right to left, and process it to change the letter shape according to its position in the word. Install acon-1.0.5-1mdk. Akka is the replacement for acon but not yet complete.
http://www.arabeyes.org/project.php?proj=akka
Akka intercepts all input and output to and from the terminal to give the user the ability to read Arabic text. This means that any application that can support the Arabic character set (or UTF-8) can and should be able to work under Akka. Akka is still under development and is not stable yet. It needs more work.
Pablo: I was unable to compile it last time I checked. It would be nice if it used configure (autoconf). Also, it would be very nice if endianness and word size were thought about (that is: not presupposing 32-bit x86; Mandrake Linux also has official PPC and IA64 versions, and GNU/Linux runs on a much wider set of architectures); acon doesn't compile on IA64.
mlterm-2.7.0-1mdk.rpm was the first X terminal to support Arabic and bidi in a satisfactory fashion. This binary distribution has UTF-8 and bidi support. mlterm is not included on the CDs but can be downloaded from Mandrake contribs.
Requires:
libmkf13-2.7.0-1mdk from the contribs
libkik9-2.7.0-1mdk from the contribs
We need someone to make an rpm for Mandrake 9.2.
The SPEC file is at http://cvs.mandrakesoft.com/cgi-bin/cvsweb.cgi/contrib-SPECS/katoob/
Yudit is a Unicode text editor for the X Window System. It does not need a localized environment or Unicode fonts. Install yudit-2.7.5-1mdk.rpm.
KEdit is a simple text editor, without formatting such as bold or italics, that supports Arabic very well. Install kdeutils-kedit-3.1.3-20mdk.
To install the Arabic interface for Mozilla, there is an installable language pack (*.xpi): one small compressed archive containing just the localized resources and an installer script. A user with Mozilla installed can have Arabic available with just one click at:
http://www.4-sms.com/users/mozilla/index.php?action=get&file=langar-1.4-0.1.xpi
I am not sure why I can't see mozilla-l10n rpm packages.
xchat-2.0.4-6mdk.rpm supports different character sets including UTF-8 and CP1256 (ARABIC).
OpenOffice.org-l10n-ar-1.1-0.rc4.2mdk.i586 should be installed. This package contains the localization of OpenOffice.org in Arabic. It contains the user interface, the templates and the autotext features. You can switch user interface language using the standard locales system.
koffice-1.3-0.beta3.6mdk is available but you need to download and install koffice-i18n-ar-1.3-0.beta3.1mdk for the Arabic support.
MySQL-4.0.15-1mdk and MySQL-client-4.0.15-1mdk support Unicode and hence, Arabic.
phpMyAdmin is intended to handle the administration of MySQL over the web. You have to install php-iconv-4.3.3-1mdk.i586.rpm, which is in the contrib area, to add iconv support to PHP.
License: Non commercial use (see readme.txt)
Don't install tetex-latex-arab-3.09f-4mdk.noarch (files for processing Arabic LaTeX documents) from CD3. This version is now old and has some problems with LyX. There is a newer version (3.11d) of ArabTeX on CTAN. I am about to cook an rpm for Mandrake.
gimp-1.2.5-6mdk which is included in Mandrake 9.2 doesn't support Arabic. gimp-1.3.21 does. Download gimp-1.3.21-1mdk from the mirrors.
BiDi in Wine is still being actively worked on, and is nowhere near complete.
This part serves as a place where we test bug-like behavior and decide whether or not to file a bug report. If the behavior turns out to be a real bug, it will be filed and people will be asked to vote for it. This will make the job of the developers easier.
If one chooses the Arabic keyboard during the installation, the BACKSPACE key doesn't work.
Pablo: During install old xmodmap system is used; I will look at the xmodmap.ar file what is the problem. I don't see absolutely anything wrong! I attach you the xmodmap file, you can load it with xmodmap command (eg: xmodmap filename). I load it and the backspace key works fine for me... after install, the Arabic layout is from
/usr/X11R6/lib/X11/xkb/symbols/pc/ar
If the Arabic description of a package in Mandrake Control Center and some other message boxes (if not all) contain a comma, the Arabic translation should use an ARABIC COMMA (U+060C) not the one used now.
Can I switch the language using KDE Keyboard Tool on the panel while still using Alt+Shift or whatever to switch the language? Enabling one disables the other for me!
The PDF output of KWord and OpenOffice.org gives some garbage or nasty output in case of some imported windows fonts like Times New Roman, Courier New, Tahoma, ... But nice output in case of fonts like Traditional Arabic, Andalus, kacst fonts, ...!
Some characters are missing from Urdu Nastaliq Unicode, e.g. ALEF WITH HAMZA ABOVE (U+0623), YEH (U+064A), TEH MARBUTA (U+0629), ALEF MAKSURA (U+0649), .... This is a good font that is worth looking at and should be added to the khotot project.
In KWord, OpenOffice, ..., the shortcut keys don't work if the language chosen is Arabic, e.g. if you are writing in Arabic, you can't use Ctrl+C, Ctrl+V, Ctrl+A, ... to copy, paste, select all, ...!
These are recent bugs. Please feel free to hunt for more bugs, vote for existing bugs, report problems, ask distribution maintainers to provide required apps versions for download, help in any possible way, or make sure it's not also valid in your favourite distro.
Upon installation one is faced with the dialog box “Please choose a language to use.” Choosing Arabic as the primary language should enable Arabic during the installation. In Mandrake 9.1 (Bamboo) Arabic fonts were not usable by DrakX, and hence Arabic is disabled completely in drakxtools in 9.2 (FiveStar). This should be solved instead. Bug 5181
The problem is not with DrakX itself (it can use and display the fonts), but with the way the installation stage is done (at boot it creates a virtual system in RAM to run a small Linux system from which X and DrakX are launched, etc.). There are severe size limitations, and the problem is that the Arabic fonts were a bit too big. To fix it, the installation stage (not DrakX, but the way the running Linux is put in memory) has to be completely modified. It's way too late for this version. However, while the installation itself will display in English, the system after installation should be fully localized. For versions after 9.2 the install method will be rethought to overcome that problem (which also affects several other languages).
I'm still in search of a freely distributable font covering all the Arabic letters (or at least those used by Arabic and Farsi, as we have translations for those), or someone able to make the needed changes with pfaedit to one of the GPL'ed fonts available. The fonts available are for Arabic only. They don't have the supplementary glyphs needed for other languages like Farsi and Urdu (for Farsi particularly it is a big penalty, as it has a very good translation level but no suitable font...). The Arabic fonts are under the GPL, so anyone with font knowledge could add the missing glyphs (most of them are easy, as they are just variations in the number of points over/under base letters; only 3-4 letters need specific drawing), but I can't do it myself, as I don't know enough about scalable font editing. If more glyphs could be added it would be perfect. At www.unicode.org there are good tables showing the presentation forms of the various chars. Particularly interesting and easy (no new form, just dots, strokes or a small tah, whose glyph is already present) are: U0679, U067E, U0686, U0688, U0691, U0698, U06A9, U06AF, U06C7, and the digits U06F0-U06F9 (only 4, 5, 6 differ from the standard Arabic ones). Those that need a bit more design are: U06BA, U06CC, U06D2, U06D5. With those 23 extra characters it would also cover Farsi and Urdu.
Mr. Nadim: I think it's more than the 23 characters: you need to account for their variations as well (in Presentation Forms-A, I'm guessing). In other words, look for the above glyph encodings in http://www.unicode.org/charts/PDF/U0600.pdf and http://www.unicode.org/charts/PDF/UFB50.pdf. What is required here is someone who is aware of the requirements and is able to read/modify/write font files, or at a minimum someone who can mail me the appropriate glyphs to include (all of them). The Khotot project has all its files in CVS, so it should be painless to do this given the working examples out there to look at.
me: The Arabic font kacst-qr, which is used by default in Mandrake for the Arabic interface, has some serious issues that need to be corrected. The mapping of characters doesn't seem to conform to Unicode, e.g. square brackets, asterisks, dashes, ... have been mapped to other symbols (sometimes Koranic annotation signs). The problem is serious because this font causes commands (like pstree) that use the chars \342 \224 \234 \342 \224 \200 to display garbage.
Arabic text in the Mandrake First Time Wizard dialog box is left-justified; it should be right-justified. The same mirroring problem affects the position of buttons: `Trash' and `Cancel' when you “Move to Trash” a file, or `Delete' and `Cancel' when you “Delete” a file. Other dialog boxes face the same problem, e.g. localedrake's `OK' and `Cancel' buttons should be in the reverse order in Arabic. Also, in the password dialog box (kdesu program), the "keep password" checkbox should go to the right. Bug 5184 (Update: localedrake's `OK' and `Cancel' are in the reverse order in the English version itself, so this has nothing to do with the bug. But in kedit, if you modify a document and try to close it without saving, the dialog box's `Yes', `No' and `Cancel' buttons should be in reverse order when the interface is Arabic, and this is not the case.)
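To make the byte sequence above concrete: the six octal bytes are the UTF-8 encoding of two box-drawing characters, U+251C and U+2500, which pstree uses for its tree lines in a UTF-8 locale. The following sketch simply prints them; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emit the six bytes quoted above.
# \342\224\234 is U+251C (BOX DRAWINGS LIGHT VERTICAL AND RIGHT)
# \342\224\200 is U+2500 (BOX DRAWINGS LIGHT HORIZONTAL)
pstree_branch() {
    printf '\342\224\234\342\224\200\n'
}
```

In a UTF-8 terminal with a font that maps these code points correctly, `pstree_branch` should show a tree branch followed by a horizontal line; with a font whose glyph mapping is wrong, other symbols appear instead.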
Deleting Arabic text in terminals in response to an interactive command using BACKSPACE will delete the question itself!
When you type any interactive command in the console (e.g. rm), you will be faced with a question such as:
rm: remove regular file `filename'?
Now, if you switch to Arabic, type some Arabic letters and then press BACKSPACE to delete them, you will also delete as many letters from the question itself as the number of Arabic letters typed. It looks as if this is related to Arabic letters taking two bytes each in the UTF-8 encoding. This problem occurs in konsole, xterm, mlterm, gnome-terminal, ... Bug 5645
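The two-bytes-per-letter observation can be checked from the shell. This is only a sketch supporting the guess above, not a diagnosis of the bug; the function name is hypothetical, and the script assumes it is saved and run as UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: print how many bytes the string $1 occupies.
utf8_bytes() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

utf8_bytes 'a'      # one Latin letter
utf8_bytes 'ب'      # one Arabic letter
utf8_bytes 'باب'    # three Arabic letters
```

If a program erases one screen cell per byte instead of one per character, three typed Arabic letters (six bytes) would erase three extra cells, which matches the symptom described.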
tetex-latex-arab in Mandrake 9.2 contains a lot of redundant files that eat valuable space; besides, it's an old version.
In the Mandrake 9.2RC2 distro, arabtex is packaged as tetex-latex-arab. The package file list (rpm -ql tetex-latex-arab) gives something like this:
/usr/share/texmf/tex/arabtex
/usr/share/texmf/tex/arabtex/abidir.sty
/usr/share/texmf/tex/arabtex/abjad.sty
...
/usr/share/texmf/tex/latex/arabtex
/usr/share/texmf/tex/latex/arabtex/abidir.sty
/usr/share/texmf/tex/latex/arabtex/abjad.sty
...
As you can see all the files are repeated in two places. This bug eats valuable space from Mandrake. Also I want to point out that this version is a very old one which has many problems. As a Mandrake Club member I voted for version 3.11b to be included in 9.2 since it solves all the problems I am aware of, but couldn't see any movement towards this. I hope someone will make the rpm package available. Bug 5712
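One rough way to spot such repetition is to look for base names that occur more than once in a package file list. The sketch below reads a list of paths on standard input; the function name is hypothetical, and a repeated name only suggests a duplicate (comparing checksums would be needed to confirm identical content).

```shell
#!/bin/sh
# Hypothetical helper: read file paths on stdin and print every base name
# that appears under more than one directory.
dup_basenames() {
    sed 's|.*/||' | sort | uniq -d
}
```

For the package discussed here it could be used as `rpm -ql tetex-latex-arab | dup_basenames`.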
rm -i (or just rm, since rm is aliased to rm -i by default in Mandrake) will not work because of certain environment variables related to Arabic. If /etc/sysconfig/i18n contains LC_ALL=ar_SA.UTF-8, LC_ALL=ar, LC_MESSAGES=ar, ... or we export these variables manually, the rm -i command and other commands will not work.
Pablo: It is fixed in the upcoming locales package, locales-2.3.2-5mdk. To see what the problem is, you can issue the command:
locale -c yesexpr noexpr
me: It's not fixed yet. As you can see, the Arabic locales only define Arabic letters for those. The yY and nN were not added in the correct order (hint: ?|?yY??). It's not fixed yet in glibc!
The Arabic community suggests that Mandrake would be much better if it considered the following:
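The effect of yesexpr can be sketched as follows: it is a regular expression, and a typed answer counts as "yes" only if it matches. The function name is hypothetical, and the Arabic-only pattern used in the note below is an assumed stand-in for illustration, not the actual value shipped in the locale.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the answer $2 matches the yes-expression $1,
# roughly the way an interactive tool decides whether the user said "yes".
answer_is_yes() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}
```

With a pattern that lists only an Arabic letter, such as `'^[ن]'`, the call `answer_is_yes '^[ن]' y` fails, so typing `y` at the rm prompt would be taken as "no". A pattern that also lists the Latin letters, such as `'^[نyY]'`, accepts both.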
CVS repository for managing the translation of Mandrake. I am still waiting for Mr Pablo's response...
The day when we say bugs? What are these? ;)
This table shows the Arabic translation status for the Mandrake Linux specific tools. It doesn't show translation status for other programs; however, links are given to similar pages for Gnome, KDE and to "the translation project" (it does mainly GNU text utilities, and some other packages). All data is shown as percentage of translated messages, and you can download the latest .po or .pot files from this page
- errors, like straws, upon the surface flow. One who would search for pearls, must dive below.
Of course translation work needs revision. Quality Assurance is vital to keep a large project like the Mandrake translation coherent, accurate and error-free.
There are several levels of testing, which can iteratively improve the overall quality of the translation:
Use Mandrake! Yes, this is of course the best practice.
Run duali on any translated file before submitting it to CVS.
Feedback from users, since it is usually easier for a third person to see mistakes made by someone else. If you come across any errors, please contact the Mandrake Arabization team.
MkLiveCD (mklivecd) is a collection of scripts that allows you to generate a LiveCD, à la Knoppix, from an existing Mandrake Linux 9.2+ installation.
Localization support.
Build your own LiveCD from a working installation using a systematic, easy way. This may prove much better than hacking another LiveCD distro.
The compression algorithm compresses the LiveCD root to approximately 40% of its original size, i.e. a 1GB initial filesystem should compress to about 400MB.
The creation script has two options that allow you to tune the compression algorithm for either a faster LiveCD or a smaller one.
$ su
# mkdir -p /tmp/aramix
# urpmi basesystem devfsd harddrake --root /tmp/aramix
... Adding additional packages ...
# chroot /tmp/aramix /usr/sbin/pwconv
# chroot /tmp/aramix
# echo 'root' | passwd --stdin root
# exit
$ mklivecd --rootdir /tmp/aramix aramix.iso
Burn the resulting aramix.iso to CD-R and enjoy :)
In /etc/fstab what is the difference between using iocharset=utf8 without a code page or using codepage=850 with it?
Pablo: The codepage is supposed to be used for storing old MS-DOS filesystem names (8.3 names); it is not very useful nowadays (vfat is used instead), and it also seems not to work very well with iocharset=utf8. For non-UTF-8 encodings, iocharset and codepage are supposed to cover the same character set, but with different encodings (for some languages they are in fact identical, notably CJK languages), e.g. iocharset=iso-8859-1, codepage=437; iocharset=koi8-r, codepage=866, etc. The iocharset is what is used for display on the UNIX side; the codepage is what is used when writing to the MS-DOS formatted disk. There is no codepage corresponding to UTF-8. When doing experiments some time ago I noticed that when I defined a codepage, I was limited in what I could write, even on vfat, while if I used iocharset=utf8 with no codepage defined, I could write any UTF-8 filename. Maybe it was a kernel bug back then? I don't know; if you find a difference that justifies defining a codepage value with iocharset, tell me.
When one chooses Saudi Arabia as one's country, one will be faced with zone info options that include: Riyadh, Riyadh87, Riyadh88 and Riyadh89. What are these?
This list of choices is taken from the data provided by the zoneinfo package. They have the very same timezone, which is 7 mins and 4 secs different from Riyadh (that is, 3:07:04 from GMT) according to http://www.timezoneconverter.com/cgi-bin/tzc.tzc. In the late fifties Saudis reset their watches every day at sunset. Maghrib prayer is by definition at 12:00 o'clock. People living in Saudi Arabia had to change their clocks each day in a subtle way.
Riyadh is defined in "glibc-2.2.3/timezone/asia" file as this:
# Saudi Arabia
# Zone  NAME         GMTOFF   RULES  FORMAT  [UNTIL]
Zone    Asia/Riyadh  3:06:52  -      LMT     1950
                     3:00     -      AST
That is, until 1950 it used a quite special offset of 3h 06min 52sec. After 1950 it uses 3:00. The other names come from glibc-2.2.3/timezone/solar8[789]; they define a very special, progressive "timelight saving" mode, with each day having a different offset (instead of having 6 months at +3h and 6 months at +2h, there is something like +3h 0min 5sec on 1st January, +3h 1min 2sec on 2nd January, and so on). Each of solar87, 88 and 89 defines each and every one of the 365 days of the year with a unique rule for each day. The trigonometric and astronomic formulas for the calculations are given as comments. So, the timezone for those three years was different on each and every day of the year! Anyway, it is for past years.
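The offsets quoted above are easier to compare once converted to seconds. The following is a small sketch of that arithmetic; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert an H:MM[:SS] offset such as 3:06:52 to seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

lmt=$(hms_to_seconds 3:06:52)   # Riyadh local mean time, used until 1950
ast=$(hms_to_seconds 3:00)      # offset used after 1950
echo "$lmt $ast $((lmt - ast))" # prints: 11212 10800 412
```

So the pre-1950 offset is 412 seconds (6 min 52 s) ahead of the plain 3:00 offset, while the 3:07:04 base offset mentioned for the Riyadh87-89 zones works out to 11224 seconds, or 7 min 4 s ahead of it.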
The following Web sites provide checklists and information that is related to Arabic support in GNU/Linux:
Red Hat Arabization Status Project page by Muhammad Alkarouri, arabeyes.org
Linux Arabization Standard Project