PISIT'S THAI NATURAL LANGUAGE PROCESSING LABORATORY
This lab was founded on August 26, 1998.
e-mail: [email protected]
For C7 members, please check this C7 address list.

KEYWORDS
Thai Natural Language Processing Lab., word segmentation, dictionaries, algorithms, Thai text-to-speech.
Morphological Derivative for Unknown Words in Thai text-to-speech Synthesis

Pisit Promchan,
Wittaya Wongvachirapanich,
Saanti Chinnakarn

[Full paper in pdf format]


Abstract: This paper presents a morphological derivative method for handling unknown words in Thai text-to-speech synthesis. The research methodology is based on fuzzy logic theory. The paper covers the basic idea of fuzzy logic, the system architecture, the parsing algorithm, the approximate matching algorithm for Thai, and the fuzzy value calculation. Experimental results and a performance evaluation are also included. The system achieves up to 99.59% precision on Thai text containing both known and unknown words, and up to 96.69% precision on Thai text consisting purely of unknown words. The highest point at which precision and recall intersect is about 98% for text containing both known and unknown words and about 88% for text consisting purely of unknown words.

Keywords: Thai Text-to-Speech, Synthesis, Parsing, Unknown Word Identification, Approximate Matching, Algorithms, CTI, Fuzzy Logic
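
The abstract reports its results as precision and recall. As a minimal sketch of how those two figures are usually computed, the Python below assumes the standard information-retrieval definitions applied to sets of token positions. The paper may count a different unit, and the function name `precision_recall` and the sample positions are hypothetical, not taken from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of item identifiers.

    precision = correct predictions / all predictions
    recall    = correct predictions / all items that should have been found
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical token positions, for illustration only (not data from the paper).
flagged_as_unknown = {2, 5, 7, 9}    # positions a system marked as unknown words
truly_unknown = {2, 5, 7, 8, 11}     # positions that really are unknown words

p, r = precision_recall(flagged_as_unknown, truly_unknown)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

Tightening a system's decision threshold typically raises precision and lowers recall, which is why a point where the two meet is a convenient single summary of performance.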

