
Content Based Retrieval of Speech Documents using
Information Retrieval Techniques

This PhD thesis was supervised by
Dr. Ross Wilkinson, who is now the Research Leader of
Mathematical and Information Sciences at CSIRO, and
Dr. Justin Zobel, Senior Lecturer in the
Department of Computer Science, RMIT.


This is ALWAYS under construction.

A modified abstract of my thesis

Spoken documents occur in forms such as voice mail, radio broadcasts, dictation and court transcripts. Speech is also an element of multimedia documents such as news broadcasts, speech-annotated images, home videos, video conference transcripts, and films. Techniques for managing and retrieving written documents are well understood. This thesis addressed the problem of retrieving spoken documents.

The most common approach to retrieval of spoken documents is to perform speech recognition first, creating transcriptions of the speech data, and then to search these transcriptions with the techniques currently used in textual information retrieval. Three assumptions are made regarding the speech recognition process: that words useful in retrieving relevant documents can be recognised accurately; that the documents to be recognised use language that has been modelled by the recognition system, which improves accuracy; and that the computational resources required for recognition are readily available.

In this thesis, we considered the circumstances where the above assumptions are not necessarily valid. First, it may not be possible to use an accurate word recogniser because the resources may not be available. Second, the document collection could contain a high proportion of out-of-vocabulary (OOV) words, which can adversely affect retrieval when mis-recognised.

In the first part of this thesis, we investigated the effectiveness of phoneme n-gram retrieval, where spoken documents were either recognised directly as phoneme sequences or recognised as words and later translated to phoneme sequences using a pronouncing dictionary. We explored the feasibility of retrieval using phoneme n-grams as well as the effect of using IR techniques such as stopping, word boundary information, and combination of evidence. The standard document collection of TREC was used to evaluate our experiments. We found that phoneme n-grams had little impact on retrieval effectiveness because there was sufficient evidence from other words to allow the retrieval of relevant documents.
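As a rough illustration of the phoneme n-gram idea described above, the following Python sketch translates words to phonemes with a pronouncing dictionary, extracts overlapping phoneme n-grams, and ranks documents by how many n-grams they share with the query. It is not the system used in the thesis: the toy dictionary, the function names, the choice of trigrams, and the simple overlap score are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy pronouncing dictionary (ARPAbet-like symbols).
# A real system would use a full pronouncing dictionary or a phoneme recogniser.
PRONUNCIATIONS = {
    "speech": ["S", "P", "IY", "CH"],
    "retrieval": ["R", "IH", "T", "R", "IY", "V", "AH", "L"],
    "radio": ["R", "EY", "D", "IY", "OW"],
    "news": ["N", "UW", "Z"],
    "voice": ["V", "OY", "S"],
    "mail": ["M", "EY", "L"],
}


def to_phonemes(text):
    """Translate words to a single phoneme sequence; unknown words are skipped."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(PRONUNCIATIONS.get(word, []))
    return phonemes


def phoneme_ngrams(phonemes, n=3):
    """Count overlapping n-grams over a phoneme sequence (word boundaries ignored)."""
    return Counter(tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1))


def score(query, document, n=3):
    """Assumed scoring: number of n-gram occurrences shared by query and document."""
    q = phoneme_ngrams(to_phonemes(query), n)
    d = phoneme_ngrams(to_phonemes(document), n)
    return sum((q & d).values())  # multiset intersection


documents = {
    "doc1": "radio news speech",
    "doc2": "voice mail",
}
query = "speech retrieval"
ranking = sorted(documents, key=lambda name: score(query, documents[name]), reverse=True)
```

Because matching happens on phoneme n-grams rather than whole words, a document can still receive some evidence for a query term when only part of that term's phoneme sequence has been recognised correctly.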


Publications

  1. Experiments in Spoken Document Retrieval using Phoneme N-grams
    C. Ng, R. Wilkinson and J. Zobel,
    in Speech Communication, Vol 32, Issue 1-2, Sept 2000, Pg 61 -- 77.
  2. The RMIT/CSIRO Ad Hoc, Q & A, Web, Interactive, and Speech Experiments at TREC 8
    M. Fuller, M. Kaszkiel, S. Kimberley, C. Ng, R. Wilkinson, M. Wu and J. Zobel,
    in Proceedings of the Eighth Text REtrieval Conference (TREC-8), Gaithersburg, MD, USA, Nov 1999, Pg 549 -- 564.
  3. TREC 7 Ad Hoc, Speech, and Interactive tracks at MDS/CSIRO
    M. Fuller, M. Kaszkiel, D. Kim, C. Ng, J. Robertson, R. Wilkinson, M. Wu and J. Zobel,
    in Proceedings of the Seventh Text REtrieval Conference (TREC-7), Gaithersburg, MD, USA, Nov 1998, Pg 465 -- 474.
  4. Factors affecting Speech Retrieval
    C. Ng, R. Wilkinson and J. Zobel,
    Student Day paper, in Proceedings of the Seventh Australian Speech Science and Technology Conference (SST-98) which has been incorporated into the Fifth International Conference on Spoken Language Processing (ICSLP'98), Sydney, Australia, 30th Nov - 4th Dec 1998, Pg 45 -- 50.
  5. Speech Retrieval using Phonemes with Error Correction
    C. Ng and J. Zobel,
    extended abstract, Proceedings of the Twenty-First International ACM-SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia, Aug 1998, Pg 365 -- 366.
  6. MDS TREC 6 Report
    M. Fuller, M. Kaszkiel, C.L. Ng, P. Vines, R. Wilkinson and J. Zobel,
    in Proceedings of the Sixth Text REtrieval Conference (TREC-6), Gaithersburg, MD, USA, Nov 1997, Pg 241 -- 258.

The views expressed on this page are solely my own. Do email me at [email protected] if you would like to comment.


Corinna NG

[email protected]
last modified 19th April 2001
