Mandrake GNU/Linux Arabization Project Status (MAPS)

Revision: 0.21

Since Mandrake Linux is an Open Source product, it needs your contribution. Arabic support in a GNU/Linux distribution is badly needed, so it is up to the community of users to keep it healthy. Do you want to help the Mandrake Arabization project become more robust and powerful? Would you like to see Mandrake Linux holding the Platinum Certificate? MAPS is meant to be a fairly comprehensive, detailed and practical guide that helps novices quickly get Arabic support working on a Mandrake GNU/Linux system.


Table of Contents
1. Introduction
1.1. Legal Notice
1.2. Information About this Document
1.2.1. Purpose
1.2.2. Tools used in The Making of This Manual
1.3. Credits
1.4. Translations
1.5. Feedback
2. Brief Overview
3. Setup Essentials
3.1. Locale setup
3.1.1. Files and the kernel
3.1.2. Locale environment variables
3.2. Keyboard layout
3.3. Installing Fonts
3.3.1. Installing fonts from the CD
3.3.2. Installing fonts from other resources
3.3.3. Installing Windows fonts
3.3.4. Changing gtk2 programs font
3.4. Arabic Filenames
3.4.1. Linux
3.4.2. Windows
3.5. Installing Libraries
3.5.1. Fribidi
4. Desktop Environments
4.1. KDE
4.1.1. Enable Keyboard Layout
4.1.2. Localize Interface
4.2. GNOME
4.2.1. Enable Keyboard Layout
4.2.2. Localize Interface
5. Console Setup
5.1. acon
5.2. Akka
5.3. Terminals Setup
5.3.1. Mlterm
5.3.2. Konsole
5.3.3. XTerm
5.4. Shells Setup
5.4.1. bash, sh, csh, tcsh
6. X Graphical Applications Setup
6.1. Editors
6.1.1. Katoob
6.1.2. Vim
6.1.3. Emacs
6.1.4. Yudit
6.1.5. GEdit
6.1.6. KEdit
6.2. Browsers
6.2.1. Mozilla
6.2.2. Konqueror
6.3. Mailers
6.3.1. KMail
6.3.2. Mutt
6.4. IRC
6.4.1. Xchat
6.5. Instant messaging
6.5.1. Gaim
6.6. Word Processors
6.6.1. OpenOffice.org
6.6.2. AbiWord
6.6.3. KOffice 1.3.0
6.7. Databases
6.7.1. MySQL
6.7.2. phpMyAdmin
6.8. DTP
6.8.1. LyX
6.8.2. ArabTeX
6.8.3. Scribus
6.9. Graphics
6.9.1. The Gimp
6.10. Emulators
6.10.1. Wine
7. Arabic Issues (Testing, and Quality Assurance)
8. Known Bugs (with Fixes when available)
9. ToDo
10. Wishlist
11. Distro Localization (l10n)
11.1. Arabic translation status for Mandrake Linux
11.2. QA the translation
12. MkLiveCD - Mandrake LiveCD build scripts
12.1. What is MkLiveCD?
12.2. Why mklivecd?
12.3. How?
13. Frequently Asked Questions (FAQs)
14. Additional Resources

1. Introduction

1.1. Legal Notice

Copyright (C) 2003 by Munzir Taha Obeid, Arabeyes Project.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license can be found at GNU's Site.


1.2. Information About this Document

1.2.1. Purpose

This document is meant to serve as a one-stop reference for all Arabic-related issues with Mandrake. The information in this document is, to the best of my knowledge, correct. However, MAPS is written by a human, so mistakes, bugs, etc. may turn up from time to time.

This document is more than a HOWTO: it also serves as a place for coordinating the Arabization efforts in Mandrake. The latest version of this document can be found at arabeyes.


1.2.2. Tools used in The Making of This Manual

This document was generated using LyX with DocBook article (SGML) class, checked with `nsgmls -s mdkarabicsupport.sgml' and compiled into various formats using docbook-utils.


1.3. Credits

As with any effort for the Linux project, there are a number of people who contributed to this project. Without their help, this document would not exist.

Thanks to:

  • All the folks on the arabeyes.org mailing lists who contributed suggestions, answered my questions, and gently showed me the error of my ways.

  • Mr. Pablo Saratxaga of Mandrake cooker-i18n mailing list for his many great contributions, numerous bug fixes and modifications.


1.4. Translations

If you are able to translate this document into another language, we would be grateful and we will also do our best to assist you. Please notify us.

  • Translation to Arabic (ar) by:

  • Translation to French (fr) by:


1.5. Feedback

Please send any questions, comments, suggestions, corrections, or contributions via e-mail to the `doc' mailing list at http://lists.arabeyes.org/mailman/listinfo/doc. This document will be updated regularly.


2. Brief Overview

Mandrake 9.2 “FiveStar”, which is based on Linux kernel 2.4.22 (and ready for Linux 2.6.0), supports the UTF-8 encoding in the console and ships KDE 3.1.3, Gnome 2.4.0, OpenOffice 1.1 and Mozilla 1.4, all of which have good Arabic support. Mandrake is now the best distro in its support of the Arabic language out of the box. However, Arabic fonts are still not usable by DrakX, so Arabic is disabled completely in the installer in this version.


3. Setup Essentials

How you configure your computer to support Arabic depends on whether you chose Arabic as a language during the installation process. If you choose Arabic while installing Mandrake GNU/Linux, every needed package is installed by default. Besides choosing Arabic as the primary language (in which case the install itself is done in Arabic, or rather will be once the installer font problems are solved in the next version), you can choose Arabic as a supplementary language: click the "Advanced" button and check `Arabic'. In that case, everything needed for Arabic is installed (fonts, translations, etc.). However, if you want to add Arabic support after a default installation, some packages need to be installed. Follow these steps:


3.1. Locale setup

Install locales-ar-2.3.2-5mdk.i586 from CD1 (Base files for localization). You can switch your default language (locale) to Arabic by running

$ localedrake
   

on the command line.


3.1.1. Files and the kernel



3.1.2. Locale environment variables

There are several locale environment variables that need to be defined for certain applications to handle Arabic the way you expect.
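A minimal sketch of what such a definition can look like in a shell start-up file. The locale name ar_SA.UTF-8 is an assumption (any Arabic locale provided by the locales-ar package would do), and keeping messages in English through LC_MESSAGES is only an illustrative choice, not something this document requires:

```shell
# Assumption: ar_SA.UTF-8 is installed; substitute the Arabic locale you use.
LANG=ar_SA.UTF-8
export LANG

# Optional (illustrative): keep program messages in English while
# character handling, collation and date formats follow LANG.
LC_MESSAGES=en_US.UTF-8
export LC_MESSAGES
```

Running the locale command afterwards shows the value each LC_* category resolved to. Note that LC_ALL, if set, overrides both variables above.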


3.2. Keyboard layout

Main Menu->Configuration->Configure your computer->Hardware->Keyboarddrake
   

or

/usr/sbin/keyboarddrake
   

Either method only allows you to choose one keyboard layout, and when you choose a non-Latin one, it adds the US qwerty ("us") layout as the first one. In fact you can use any comma-separated list of layouts. People from Morocco, Algeria, Tunisia, etc. may want "fr,ar(digits)" instead; someone from Egypt who does Greek translations may want "us,ar(digits),el", and so on. To change the layouts, edit the value of the XkbLayout option in /etc/X11/XF86Config-4.

Section "InputDevice"
   Identifier "Keyboard1"
   Driver "Keyboard"
   Option "XkbModel" "pc105"
   Option "XkbLayout" "us,ar(digits)"
   Option "XkbOptions" "grp:alt_shift_toggle,grp_led:scroll"
EndSection 
   

Warning: always put in the list at least one Latin layout, or you won't be able to type shell commands, email addresses, URLs, etc.
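The warning above can be turned into a small check. The following is only a sketch: the helper name is made up, and the set of layouts it treats as Latin ("us" and "fr", the two mentioned in this section) is an assumption you would extend for your own setup.

```shell
# Hypothetical helper: print an XkbLayout option line, prepending "us"
# when the list contains none of the Latin layouts named below.
xkb_layout_line() {
    layouts=$1
    case ",$layouts," in
        *,us,* | *,fr,*) ;;              # a Latin layout is already present
        *) layouts="us,$layouts" ;;      # otherwise put "us" first
    esac
    printf '   Option "XkbLayout" "%s"\n' "$layouts"
}

xkb_layout_line 'ar(digits)'
xkb_layout_line 'fr,ar(digits)'
```

The first call adds "us" in front of the Arabic layout; the second leaves the list alone because "fr" is already there.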


3.3. Installing Fonts

3.3.1. Installing fonts from the CD

If it's not installed already, go ahead and install fonts-ttf-arabic-1.1-3mdk (Free Arabic TrueType fonts) from CD1.


3.3.2. Installing fonts from other resources

For installing more fonts:

Main Menu->Configuration->Configure your computer->System->DrakFont->Advanced Options->Add
    

3.3.3. Installing Windows fonts

You can also install windows fonts in case of dual boot systems:

Main Menu->Configuration->Configure your computer->System->DrakFont->Get Windows Fonts
    

We look forward to the day when XFree86 solves the problem of memory optimization and includes the bitmap font they have.


3.3.4. Changing gtk2 programs font

MCC, rpmdrake, and any other gtk2 program use the gtk2 default font. By default it is set to "Sans", which is an alias for a list of several fonts. Interestingly, if you install an "Arial" TTF file that has Arabic glyphs, it will be used. You can also change the default gtk font to whatever you like by launching gnome-font-properties:

Main Menu->Configuration->Font
    

from Gnome desktop or

$ gnome-font-properties
    

from a terminal.


3.4. Arabic Filenames

3.4.1. Linux

You can already use any Unicode characters in file names. No kernel or file utilities need modification. This holds in general, as long as your files stay inside Linux. On filesystems that are also used from other operating systems, you have mount options to control the conversion of filenames to/from UTF-8. For more info take a look at the Unicode-HOWTO.

It is possible to read/write filenames in Arabic via many applications, shells and GUI file managers. Shells allow for Arabic input and display with some shaping/joining problems.


3.4.2. Windows

Mandrake supports reading/writing Arabic filenames on Windows partitions (FAT, FAT32, NTFS) out of the box. In case you haven't enabled Arabic during the installation, just make sure that the entries for those partitions in /etc/fstab contain the iocharset=utf8 mount option.
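One way to check this is sketched below. The helper name is hypothetical, and it assumes the usual fstab column order (device, mount point, type, options); it lists the vfat and ntfs mount points whose options lack iocharset=utf8.

```shell
# Hypothetical helper: list mount points of vfat/ntfs entries in an
# fstab-style file whose options lack iocharset=utf8.
missing_utf8() {
    awk '$1 !~ /^#/ && ($3 == "vfat" || $3 == "ntfs") &&
         $4 !~ /(^|,)iocharset=utf8(,|$)/ { print $2 }' "${1:-/etc/fstab}"
}
```

Called without an argument, missing_utf8 reads /etc/fstab; no output means every Windows partition listed there already carries the option.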


3.5. Installing Libraries

3.5.1. Fribidi

Fribidi is a library that handles bidirectional scripts (e.g. Hebrew, Arabic) so that text is displayed in the proper visual order, while the text data itself is always stored in logical order. The library uses Unicode internally. Install fribidi-0.10.4-3mdk.rpm.


4. Desktop Environments

4.1. KDE

4.1.1. Enable Keyboard Layout

Start Applications->Configuration->Configure your desktop->
Accessibility->Keyboard Layout
    

Please note that currently the keyboard switcher will only work correctly if you have chosen Arabic as your language during installation or during login. If your default language is English and you haven't installed the Arabic language locale, you won't be able to use Arabic. Thus, if you need to type in Arabic, you will need not only to select the Arabic keyboard but also to make sure locales-ar-2.3.2-5mdk.i586 is installed.


4.1.2. Localize Interface

Install kde-i18n-ar-3.1.3-1mdk from CD3 (Arabic language support for KDE).

Start Applications->Configuration->Configure your desktop->
Accessibility->Country - Region & Language->Add Language->Arabic
    

or

Start Applications->Configuration->Other->Localedrake
    

4.2. GNOME

4.2.1. Enable Keyboard Layout

Main Menu->Configuration->Configure your computer->
System->KeyboardDrake
    

4.2.2. Localize Interface

Main Menu->Configuration->Other->Localedrake
    

5. Console Setup

5.1. acon

The function of acon is to display Arabic text from right to left, and process it to change the letter shape according to its position in the word. Install acon-1.0.5-1mdk. Akka is the replacement for acon but not yet complete.


5.2. Akka

http://www.arabeyes.org/project.php?proj=akka

Akka intercepts all input and output to and from the terminal to give the user the ability to read Arabic text. This means that any application that can support the Arabic character set (or UTF-8) can and should be able to work under Akka. Akka is still under development and is not stable yet. It needs more work.

Pablo: I was unable to compile it last time I checked. It would be nice if it used configure (autoconf). Also, it would be very nice if endianness and word size were thought about (that is: don't presuppose 32-bit x86; Mandrake Linux also has official PPC and IA64 versions, and GNU/Linux runs on a much wider set of architectures); acon doesn't compile on IA64.


5.3. Terminals Setup

5.3.1. Mlterm

mlterm-2.7.0-1mdk.rpm was the first X terminal to support Arabic and bidi in a satisfactory fashion. This binary distribution has UTF-8 and bidi support. mlterm is not included on the CDs but can be downloaded from Mandrake contribs.

Requires:

libmkf13-2.7.0-1mdk from the contribs

libkik9-2.7.0-1mdk from the contribs


5.3.2. Konsole

kdebase-konsole-3.1.3-79mdk


5.3.3. XTerm

xterm-179-1mdk


6. X Graphical Applications Setup

6.1. Editors

6.1.1. Katoob

We need someone to make an rpm for Mandrake 9.2.

The SPEC file is at http://cvs.mandrakesoft.com/cgi-bin/cvsweb.cgi/contrib-SPECS/katoob/


6.1.2. Vim

vim-6.2-11mdk.rpm


6.1.3. Emacs

Emacs-bidi


6.1.4. Yudit

A Unicode text editor for the X Window System. It does not need localized environment or Unicode fonts. yudit-2.7.5-1mdk.rpm


6.1.5. GEdit

Install gedit-2.4.0-1mdk.rpm.


6.1.6. KEdit

A simple text editor, without formatting like bold, italics etc. that supports Arabic very well. Install kdeutils-kedit-3.1.3-20mdk.


6.2. Browsers

6.2.1. Mozilla

To install the Arabic interface for Mozilla, there is an installable language pack (*.xpi): one small compressed archive containing just the localized resources and an installer script. A user with Mozilla installed can make Arabic available with just one click at:

http://www.4-sms.com/users/mozilla/index.php?action=get&file=langar-1.4-0.1.xpi
    

I am not sure why I can't see mozilla-l10n rpm packages.


6.2.2. Konqueror

kdebase-progs-3.1.3-79mdk


6.3. Mailers

6.3.1. KMail

kdenetwork-kmail-3.1.3-37mdk


6.3.2. Mutt

mutt-1.4.1i-2mdk.i586.rpm


6.4. IRC

6.4.1. Xchat

xchat-2.0.4-6mdk.rpm supports different character sets including UTF-8 and CP1256 (ARABIC).


6.6. Word Processors

6.6.1. OpenOffice.org

OpenOffice.org-l10n-ar-1.1-0.rc4.2mdk.i586 should be installed. This package contains the localization of OpenOffice.org in Arabic. It contains the user interface, the templates and the autotext features. You can switch user interface language using the standard locales system.


6.6.3. KOffice 1.3.0

koffice-1.3-0.beta3.6mdk is available but you need to download and install koffice-i18n-ar-1.3-0.beta3.1mdk for the Arabic support.


6.7. Databases

6.7.1. MySQL

MySQL-4.0.15-1mdk and MySQL-client-4.0.15-1mdk support Unicode and hence, Arabic.


6.7.2. phpMyAdmin

phpMyAdmin is intended to handle the administration of MySQL over the web. You have to install php-iconv-4.3.3-1mdk.i586.rpm, which is in the contrib area, to add iconv support to PHP.


6.8. DTP


6.8.2. ArabTeX

License: Non commercial use (see readme.txt)

Don't install tetex-latex-arab-3.09f-4mdk.noarch from CD3. (Files for processing Arabic LaTeX documents). This version is now old and contains some problems w.r.t. LyX. There is a newer version (3.11d) of ArabTeX in CTAN. I am about to cook an rpm for Mandrake.


6.8.3. Scribus

scribus-1.0.1-3mdk.rpm has some issues with Arabic. Test the latest version.


6.9. Graphics

6.9.1. The Gimp

gimp-1.2.5-6mdk which is included in Mandrake 9.2 doesn't support Arabic. gimp-1.3.21 does. Download gimp-1.3.21-1mdk from the mirrors.


6.10. Emulators

6.10.1. Wine

BiDi in Wine is still being actively worked on, and is nowhere near complete.


7. Arabic Issues (Testing, and Quality Assurance)

This part serves as a place where we test bug-like behavior and decide whether to file a bug report or not. If the behavior is confirmed as a bug, it will be filed and people will be asked to vote for it. This will make the job of the developers easier.

Pablo: During install the old xmodmap system is used; I will look at the xmodmap.ar file to see what the problem is. I don't see absolutely anything wrong! I attach the xmodmap file; you can load it with the xmodmap command (e.g. xmodmap filename). I load it and the backspace key works fine for me... After install, the Arabic layout is taken from

/usr/X11R6/lib/X11/xkb/symbols/pc/ar
  


8. Known Bugs (with Fixes when available)

These are recent bugs. Please feel free to hunt for more bugs, vote for existing bugs, report problems, ask distribution maintainers to provide the required application versions for download, help in any possible way, or check whether a bug also affects your favourite distro.

The problem with DrakX is not with DrakX itself (it can use and display the fonts) but with the way the installation stage is done: at boot it creates a virtual system in RAM to run a small Linux system from which X and DrakX are launched, so there are severe size limitations, and the Arabic fonts were a bit too big. To fix it, the installation stage (not DrakX, but the way the running Linux is put in memory) has to be completely modified. It's way too late for this version. However, while the installation itself will display in English, the system after installation should be fully localized. For the version after 9.2 the install method will be rethought to overcome this problem (which also affects several other languages).

I'm still in search of a freely distributable font covering all the Arabic-script letters (or at least those used by Arabic and Farsi, as we have translations for those), or someone able to make the needed changes with pfaedit to one of the available GPL'ed fonts. The available fonts cover Arabic only. They don't have the supplementary glyphs needed for other languages like Farsi and Urdu (for Farsi in particular this is a big penalty, as it has a very good translation level but no suitable font). The Arabic fonts are under the GPL, so anyone with font knowledge could add the missing glyphs (most of them are easy, as they are just variations in the number of dots over or under base letters; only 3-4 letters need specific drawing), but I can't do it myself, as I don't know enough about scalable font editing. If more glyphs could be added it would be perfect. The charts at www.unicode.org show the presentation forms of the various characters. Particularly interesting and easy (no new form, just dots, strokes or a small tah, whose glyph is already present) are: U0679, U067E, U0686, U0688, U0691, U0698, U06A9, U06AF, U06C7, plus the digits U06F0-U06F9 (only 4, 5 and 6 differ from the standard Arabic ones). These need a bit more design: U06BA, U06CC, U06D2, U06D5. With those 23 extra characters the fonts would also cover Farsi and Urdu.

Mr. Nadim: I think it's more than the 23 characters: you need to account for their variations as well (in Presentation Forms-A, I'm guessing). In other words, look for the above glyph encodings in http://www.unicode.org/charts/PDF/U0600.pdf and http://www.unicode.org/charts/PDF/UFB50.pdf. What is required here is someone who is aware of the requirements and is able to read/modify/write font files, or at a minimum someone who can mail me the appropriate glyphs to include (all of them). The Khotot project has all its files in CVS, so it should be painless to do this given the working examples out there to look at.

me: The Arabic font kacst-qr, which Mandrake uses by default for the Arabic interface, has some serious issues that need to be corrected. The mapping of characters doesn't seem to conform to Unicode, e.g. square brackets, asterisk, dashes, ... have been mapped to other symbols (sometimes Koranic annotation signs). The problem is serious because with this font, commands (like pstree) that use the chars \342 \224 \234 \342 \224 \200 display garbage.
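The six octal values quoted above are the UTF-8 encodings of two box-drawing characters, U+251C and U+2500. A small sketch for checking how your terminal font renders them (the function name is made up for illustration):

```shell
# Print the two characters whose UTF-8 bytes are quoted above:
# \342\224\234 is U+251C and \342\224\200 is U+2500.
show_tree_chars() {
    printf '\342\224\234\342\224\200\n'
}

show_tree_chars
```

With a font that maps these code points correctly, the output is a vertical line with a branch to the right followed by a horizontal line.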

When you type any interactive command in the console (e.g. rm), you will be faced with a question such as:

rm: remove regular file `filename'?
  

Now, if you switch to Arabic, type some Arabic letters and then press BACKSPACE to delete them, you will also delete letters from the question itself, as many as the number of Arabic letters typed. This looks related to Arabic letters being double-byte in the UTF-8 encoding. The problem occurs in konsole, xterm, mlterm, gnome-terminal, ... Bug 5645

In the Mandrake 9.2RC2 distro, ArabTeX is packaged as tetex-latex-arab. Listing the package files (rpm -ql tetex-latex-arab) gives something like this:
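The byte-versus-character mismatch suspected here is easy to observe. This sketch only shows that each Arabic letter occupies two bytes in UTF-8; it does not prove that this is the cause of the bug. The helper name is hypothetical.

```shell
# Count the bytes in a string; wc -c counts bytes, not characters.
utf8_bytes() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

utf8_bytes 'rm'      # two ASCII letters
utf8_bytes 'سلام'    # four Arabic letters
```

The two ASCII letters take two bytes, while the four Arabic letters take eight, so code that erases one screen cell per byte would erase twice as many cells as letters typed.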

/usr/share/texmf/tex/arabtex
/usr/share/texmf/tex/arabtex/abidir.sty
/usr/share/texmf/tex/arabtex/abjad.sty
...
/usr/share/texmf/tex/latex/arabtex
/usr/share/texmf/tex/latex/arabtex/abidir.sty
/usr/share/texmf/tex/latex/arabtex/abjad.sty
...
  

As you can see all the files are repeated in two places. This bug eats valuable space from Mandrake. Also I want to point out that this version is a very old one which has many problems. As a Mandrake Club member I voted for version 3.11b to be included in 9.2 since it solves all the problems I am aware of, but couldn't see any movement towards this. I hope someone will make the rpm package available. Bug 5712
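A rough way to spot such duplicates is sketched below, assuming that identical base names in different directories indicate duplicated files, which is a heuristic and not a proof. The helper name is made up.

```shell
# Read a file list on standard input and print every base name that
# occurs more than once.
dup_basenames() {
    sed 's|.*/||' | sort | uniq -d
}

# Usage on an RPM-based system (not run here):
#   rpm -ql tetex-latex-arab | dup_basenames
```

To confirm that two files with the same name really are identical, compare them with cmp or md5sum.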

Pablo: It is fixed in the upcoming locales package, locales-2.3.2-5mdk. To see what the problem is, you can issue the command:

locale -c yesexpr noexpr
  

me: It's not fixed yet. As you can see, the Arabic locales only define Arabic letters for those. The yY and nN were not added in the correct order (hint: ?|?yY??). It's not fixed yet in glibc!
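The effect can be sketched as follows. yesexpr and noexpr are extended regular expressions that programs match against the user's answer. The two expressions below are hypothetical stand-ins chosen for illustration, not the actual glibc values.

```shell
# Succeed if the answer matches the given extended regular expression,
# which is how a yesexpr or noexpr value is meant to be used.
accepts() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Hypothetical expressions, not the actual glibc values:
arabic_only='^[نأ]'
with_latin='^[نأyY]'

accepts "$arabic_only" 'y' || echo "Arabic-only expression rejects y"
accepts "$with_latin" 'y' && echo "expression with yY accepts y"
```

With an Arabic-only expression, a user on a Latin keyboard layout cannot answer a confirmation prompt with y, which is why the Latin letters need to be part of the expression.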


9. ToDo

The ToDo is to be found here.


10. Wishlist

The Arabic community suggests that Mandrake would be much better if the following were considered:


11. Distro Localization (l10n)

11.1. Arabic translation status for Mandrake Linux

This table shows the Arabic translation status for the Mandrake Linux specific tools. It doesn't show translation status for other programs; however, links are given to similar pages for Gnome, KDE and to "the translation project" (it does mainly GNU text utilities, and some other packages). All data is shown as percentage of translated messages, and you can download the latest .po or .pot files from this page


11.2. QA the translation

- Errors, like straws, upon the surface flow. One who would search for pearls must dive below.

Of course translation work needs revision. Quality assurance is vital to keep a large project like the Mandrake translation coherent, accurate and error-free.

There are several levels of testing, which can iteratively improve the overall quality of the translation:

  • Use Mandrake! Yes, this is of course the best practice.

  • Run duali on any translated file before submitting to cvs.

  • Feedback from users, since it is usually easier for a third person to see mistakes made by someone else. If you come across any errors, please contact the Mandrake Arabization team.


12. MkLiveCD - Mandrake LiveCD build scripts

12.1. What is MkLiveCD?

MkLiveCD (mklivecd) is a collection of scripts that allows you to generate a LiveCD, à la Knoppix, from an existing Mandrake Linux 9.2+ installation.


12.2. Why mklivecd?

  • Localization support.

  • Build your own LiveCD from a working installation using a systematic, easy way. This may prove much better than hacking another LiveCD distro.

  • The compression algorithm compresses the LiveCD root to approximately 40% of its original size, i.e. a 1GB initial filesystem should compress to about 400MB.

  • Creation script has two options that allow you to adjust the behaviour of the compression algorithm to either a faster LiveCD or a smaller one.


12.3. How?

$ su
# mkdir -p /tmp/aramix
# urpmi basesystem devfsd harddrake --root /tmp/aramix
... Adding additional packages ...
# chroot /tmp/aramix /usr/sbin/pwconv
# chroot /tmp/aramix
# echo 'root' | passwd --stdin root
# exit
# mklivecd --rootdir /tmp/aramix aramix.iso
   

Burn the resulting aramix.iso to CD-R and enjoy :)


13. Frequently Asked Questions (FAQs)

Pablo: The codepage is supposed to be used for storing old MS-DOS filesystem names (8.3 names); it is not very useful nowadays (vfat is used instead), and it also seems not to work very well with iocharset=utf8. For non-UTF-8 encodings, iocharset and codepage are supposed to cover the same character set, but with different encodings (for some languages, notably CJK, they are in fact identical), e.g. iocharset=iso-8859-1, codepage=437; iocharset=koi8-r, codepage=866; etc. The iocharset is what is used for display on the UNIX side; the codepage is what is used when writing to the MS-DOS formatted disk. There is no codepage corresponding to UTF-8. When doing experiments some time ago I noticed that when a codepage was defined, I was limited in what I could write, even on vfat, while with iocharset=utf8 and no codepage defined I could write any UTF-8 filename. Was it maybe a kernel bug back then? I don't know; if you find a difference that justifies defining a codepage value together with iocharset, tell me.

This list of choices is taken from the data provided by the zoneinfo package. They have the very same timezone, which is 7 mins and 4 secs different from Riyadh (or 3:07:04 from GMT) according to http://www.timezoneconverter.com/cgi-bin/tzc.tzc. In the late fifties Saudis reset their watches every day at sunset. Maghrib prayer is by definition at 12:00 o'clock. People living in Saudi Arabia had to adjust their clocks slightly each day.

Riyadh is defined in "glibc-2.2.3/timezone/asia" file as this:

# Saudi Arabia
# Zone  NAME         GMTOFF   RULES  FORMAT  [UNTIL]
Zone    Asia/Riyadh  3:06:52  -      LMT     1950
                     3:00     -      AST
  

That is, until 1950 it used a rather special offset of 3h 06min 52sec; after 1950 it uses 3:00. The other names come from glibc-2.2.3/timezone/solar8[789]. They define a very special, progressive "daylight saving" mode in which each day has a different offset: instead of 6 months at +3h and 6 months at +2h, there is something like +3h 0min 5sec on 1st January, +3h 1min 2sec on 2nd January, and so on. Each of the solar87, solar88 and solar89 files defines each and every one of the 365 days of the year with a unique rule for each day. The trigonometric and astronomic formulas for the calculations are given as comments. So, the timezone for those three years was different on each and every day of the year! Anyway, it is for past years.
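The offsets quoted in this answer can be compared with a little arithmetic. The sketch below uses a made-up helper that converts an h:mm:ss offset to seconds; the three offsets are the ones given above.

```shell
# Convert an h:mm:ss offset to seconds.
hms_to_sec() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

lmt=$(hms_to_sec 3:06:52)      # Riyadh local mean time, used until 1950
solar=$(hms_to_sec 3:07:04)    # offset quoted above for the other choices
ast=$(hms_to_sec 3:00:00)      # the 3:00 offset used after 1950

echo "LMT is $((lmt - ast)) seconds ahead of 3:00"
echo "3:07:04 is $((solar - ast)) seconds ahead of 3:00"
```

The differences are 412 seconds (6 min 52 sec) and 424 seconds (7 min 4 sec), which matches the "7 mins and 4 secs different from Riyadh" figure.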


14. Additional Resources

The following Web sites provide checklists and information that is related to Arabic support in GNU/Linux:
