Please use this identifier to cite or link to this item: http://hdl.handle.net/2440/128174
Type: Thesis
Title: The vocal tract areagram: a new technique for displaying speech
Author: Nichol, D. G.
Issue Date: 1983
School/Discipline: School of Electrical and Electronic Engineering
Abstract: The possibility of making speech visible has attracted research workers since the scientific investigation of speech began. The reasons for this are not hard to find and are basically related to the synoptic view which is obtained by having a visual 'hard copy' of an utterance. For many years now, speech spectrograms, also known as ‘Sonagrams’ or ‘Voice Prints’, have been extensively used as the primary method of visually displaying speech. These have found wide acceptance in fields as diverse as phonetics and linguistics, medical diagnosis and, more recently, jurisprudence. Over the last decade the application of the techniques of digital signal processing to the modelling of the voice production process has revolutionised speech research. From a synthesis viewpoint reasonably realistic speech can now be produced from a programmable micro ‘chip’ and linear prediction type analysis has led to a significant improvement in the comprehensibility of speech. The present study arises from the fact that, almost as a byproduct of linear prediction analysis, it is possible to obtain an estimate of vocal tract shape from the raw speech data. Efficient algorithms exist to do this in near real time on general purpose computers. Accordingly a new method to display speech is possible and is the subject of this study. The proposed method produces an ‘areagram’ which displays, in a format similar to the spectrogram, the varying shape of the vocal tract as a function of time. Instead of frequency as the vertical axis and spectral density as the grey level the areagram has distance along the vocal tract and cross-sectional areas respectively. The areagram is proposed as a complementary display to the spectrogram and not as a replacement. Features of the spectrogram such as formants and loci are obviously extremely significant in speech analysis but they are difficult to relate to articulatory processes.
The areagram will provide just this information, and as it can be plotted on the same scale as the spectrogram a direct comparison is very easy to make. In the following study various aspects of areagram production and display are examined. These include the choice of time series window length and shape, how to handle random noises and breakdown of the all-pole model, the use of picture processing techniques to interpolate data and to enhance structure, and how to assign grey levels or colour to the images. These latter picture processing techniques are shown to apply equally well to the spectrogram. Finally a comparison of the spectrogram and areagram from the viewpoint of their usefulness for visual recognition is made.
Advisor: Bogner, R. E.
Dissertation Note: Thesis (M.Eng.Sc.) - Dept. of Electrical and Electronic Engineering, University of Adelaide, 1983
Provenance: This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals
Appears in Collections: Research Theses

Files in This Item:
File: Nichol1982_MEngSc.pdf
Size: 4.53 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.