Please use this identifier to cite or link to this item:
https://repository.southwesthealthcare.com.au/swhealthcarejspui/handle/1/3658
Title: | Developing automatic articulation, phonation and accent assessment techniques for speakers treated for advanced head and neck cancer. |
Authors: | Clapham, Renee P.; Middag, Catherine; Hilgers, Frans; Martens, Jean-Pierre; Van Den Brekel, Michiel M.W.; Van Son, Rob |
SWH Author: | Clapham, Renee P. |
Issue Date: | 2014 |
Publisher: | Elsevier |
Date Accessioned: | 2023-04-03T04:47:22Z |
Date Available: | 2023-04-03T04:47:22Z |
Url: | https://doi.org/10.1016/j.specom.2014.01.003 |
Description Affiliation: | Amsterdam Center for Language and Communication, University of Amsterdam, Spuistraat 210, 1012 VT Amsterdam, Netherlands; Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, Netherlands; Multimedia Lab ELIS, University of Gent, Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium |
Format Startpage: | 44 |
Source Volume: | 59 |
Issue Number: | April 2014 |
DOI: | 10.1016/j.specom.2014.01.003 |
Date: | 2014-01-23 |
Abstract: | Purpose: To develop automatic assessment models for assessing the articulation, phonation and accent of speakers with head and neck cancer (Experiment 1) and to investigate whether the models can track changes over time (Experiment 2). Method: Several speech analysis methods for extracting a compact acoustic feature set that characterizes a speaker’s speech are investigated. The effectiveness of a feature set for assessing a variable is evaluated by feeding it to a linear regression model and by measuring the mean difference between the outputs of that model for a set of recordings and the corresponding perceptual scores for the assessed variable (Experiment 1). The models are trained and tested on recordings of 55 speakers treated non-surgically for advanced oral cavity, pharynx and larynx cancer. The perceptual scores are average unscaled ratings from a group of 13 raters. The ability of the models to track changes in perceptual scores over time is also investigated (Experiment 2). Results: Experiment 1 demonstrated that combinations of feature sets generally result in better models, that the best articulation model outperforms the average human rater, and that the best accent and phonation models are deemed competitive. Scatter plots of computed and observed scores show, however, that low perceptual scores in particular are difficult to assess automatically. Experiment 2 showed that the articulation and phonation models achieve only variable success in tracking trends over time, and that for only one of the time pairs are they deemed competitive with the average human rater. Nevertheless, there is a significant level of agreement between computed and observed trends when only a coarse classification of the trend into three classes is considered: clearly positive, clearly negative and minor differences. Conclusions: A baseline tool to support the multi-dimensional evaluation of speakers treated non-surgically for advanced head and neck cancer now exists. More work is required to further improve the models, particularly with respect to their ability to assess low-quality speech. |
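Note: The abstract describes mapping a compact acoustic feature set to perceptual ratings with a linear regression model and judging the model by the mean difference between computed and observed scores. The paper's own code and feature extractors are not part of this record; the sketch below is only a minimal, assumed illustration of that evaluation idea using scikit-learn and synthetic placeholder data (the speaker count, feature values and ratings here are invented, not the study's data, and the mean absolute difference is used as one plausible reading of "mean difference").

```python
# Minimal sketch (not the authors' implementation): predict a perceptual score
# from an acoustic feature vector with linear regression, then compare the
# computed scores with the observed (perceptual) scores.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)

# Placeholder data: 55 "speakers", each described by a compact feature vector
# and an average rating from a rater panel. All values are synthetic.
n_speakers, n_features = 55, 10
X = rng.normal(size=(n_speakers, n_features))             # acoustic feature set
true_w = rng.normal(size=n_features)
y = X @ true_w + rng.normal(scale=0.5, size=n_speakers)   # averaged perceptual scores

# Leave-one-speaker-out predictions from the linear regression model.
computed = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())

# Agreement between computed and observed scores.
mean_abs_diff = np.mean(np.abs(computed - y))
corr = np.corrcoef(computed, y)[0, 1]
print(f"mean |computed - observed| = {mean_abs_diff:.3f}, r = {corr:.3f}")
```

In this kind of setup, a model is judged useful when its mean deviation from the perceptual scores is no larger than that of an average individual rater; the same leave-one-speaker-out predictions can also be reused to compare score differences between recording times, as in Experiment 2.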
URI: | https://repository.southwesthealthcare.com.au/swhealthcarejspui/handle/1/3658 |
Journal Title: | Speech Communication |
ISSN: | 0167-6393 |
Type: | Journal Article |
Appears in Collections: | SWH Staff Publications |
Files in This Item:
There are no files associated with this item.