Publications


Modeling real-world affective and communicative nonverbal vocalizations from minimally speaking individuals

Published in:
IEEE Trans. on Affect. Comput., Vol. 13, No. 4, October 2022, pp. 2238-53.

Summary

Nonverbal vocalizations from non- and minimally speaking individuals (mv*) convey important communicative and affective information. While nonverbal vocalizations that occur amidst typical speech and infant vocalizations have been studied extensively in the literature, there is limited prior work on vocalizations by mv* individuals. Our work is among the first studies of the communicative and affective information expressed in nonverbal vocalizations by mv* children and adults. We collected labeled vocalizations in real-world settings with eight mv* communicators, with communicative and affective labels provided in-the-moment by a close family member. Using evaluation strategies suitable for messy, real-world data, we show that nonverbal vocalizations can be classified by function (with 4- and 5-way classifications) with F1 scores above chance for all participants. We analyze labeling and data collection practices for each participating family, and discuss the classification results in the context of our novel real-world data collection protocol. The presented work includes results from the largest classification experiments with nonverbal vocalizations from mv* communicators to date.
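The evaluation idea described above, scoring multi-way classifiers against chance on messy real-world data, can be sketched in a few lines. The simulated labels, agreement rate, and shuffling baseline below are illustrative assumptions, not the paper's actual data or protocol:

```python
import numpy as np

def macro_f1(y_true, y_pred, n_classes):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    f1s = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return float(np.mean(f1s))

rng = np.random.default_rng(0)
y_true = rng.integers(0, 4, size=200)              # simulated 4-way function labels
agree = rng.random(200) < 0.6                      # classifier correct ~60% outright
y_pred = np.where(agree, y_true, rng.integers(0, 4, size=200))

# Estimate the chance level empirically by repeatedly shuffling the predictions
chance = np.mean([macro_f1(y_true, rng.permutation(y_pred), 4)
                  for _ in range(100)])
score = macro_f1(y_true, y_pred, 4)                # should clear the shuffled baseline
```

Comparing against a shuffled (permutation) baseline, rather than a fixed 1/K value, is one common way to define "above chance" when class frequencies are imbalanced.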

Wearable technology in extreme environments

Published in:
Chapter 2 in: Cibis, T., McGregor AM, C. (eds) Engineering and Medicine in Extreme Environments. Springer, Cham. https://doi.org/10.1007/978-3-030-96921-9_2

Summary

Humans need to work in many types of extreme environments where there is a need to stay safe and even to improve performance. Examples include: medical providers treating infectious disease, people responding to other biological or chemical hazards, firefighters, astronauts, pilots, divers, and people working outdoors in extreme hot or cold temperatures. Wearable technology is ubiquitous in the consumer market, but suitable solutions are still needed for extreme environments. For these applications, it is particularly challenging to meet requirements to be actionable, accurate, acceptable, integratable, and affordable. To provide insight into these needs and possible solutions and the technology trade-offs involved, several examples are provided. A physiological monitoring example is described for predicting and avoiding heat injury. A cognitive monitoring example is described for estimating cognitive workload, with broader applicability to a variety of conditions, such as cognitive fatigue and depression. Finally, eye tracking is considered as a promising wearable sensing modality with applications for both physiological and cognitive monitoring. Concluding thoughts are offered on the compelling need for wearable technology in the face of pandemics, wildfires, and climate change, but also for global projects that can uplift mankind, such as long-duration spaceflight and missions to Mars.

Artificial intelligence for detecting COVID-19 with the aid of human cough, breathing and speech signals: scoping review

Summary

Background: Official tests for COVID-19 are time consuming, costly, can produce high false-negative rates, use up vital chemicals, and may violate social distancing laws. Therefore, a fast and reliable additional solution using recordings of cough, breathing, and speech data for preliminary screening may help alleviate these issues. Objective: This scoping review explores how artificial intelligence (AI) technology aims to detect COVID-19 disease by using cough, breathing, and speech recordings, as reported in the literature. Here, we describe and summarize attributes of the identified AI techniques and datasets used for their implementation. Methods: A scoping review was conducted following the guidelines of PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews). Electronic databases (Google Scholar, Science Direct, and IEEE Xplore) were searched between 1st April 2020 and 15th August 2021. Terms were selected based on the target intervention (i.e., AI), the target disease (i.e., COVID-19), and acoustic correlates of the disease (i.e., speech, breathing, and cough). A narrative approach was used to summarize the extracted data. Results: Of the 86 retrieved studies, 24 studies and 8 apps met the inclusion criteria. Half of the publications and apps were from the USA. The most prominent AI architecture used was a convolutional neural network, followed by a recurrent neural network. AI models were mainly trained, tested, and run on websites and personal computers, rather than on phone apps. More than half of the included studies reported area-under-the-curve performance greater than 0.90 on symptomatic and negative datasets, while one study achieved 100% sensitivity in predicting asymptomatic COVID-19 from cough-, breathing-, or speech-based acoustic features. Conclusions: The included studies show that AI has the potential to help detect COVID-19 using cough, breathing, and speech samples.
With further development and appropriate clinical testing, however, the proposed methods could prove effective in detecting various diseases related to respiratory and neurophysiological changes in the human body.

Speech as a biomarker: opportunities, interoperability, and challenges

Published in:
Perspectives of the ASHA Special Interest Groups, Vol. 7, February 2022, pp. 276-83.

Summary

Purpose: Over the past decade, the signal processing and machine learning literature has demonstrated notable advancements in automated speech processing with the use of artificial intelligence for medical assessment and monitoring (e.g., depression, dementia, and Parkinson's disease, among others). Meanwhile, the clinical speech literature has identified several interpretable, theoretically motivated measures that are sensitive to abnormalities in the cognitive, linguistic, affective, motoric, and anatomical domains. Both fields have, thus, independently demonstrated the potential for speech to serve as an informative biomarker for detecting different psychiatric and physiological conditions. However, despite these parallel advancements, automated speech biomarkers have not been integrated into routine clinical practice to date. Conclusions: In this article, we present opportunities and challenges for adoption of speech as a biomarker in clinical practice and research. Toward clinical acceptance and adoption of speech-based digital biomarkers, we argue for the importance of several factors such as robustness, specificity, diversity, and physiological interpretability of speech analytics in clinical applications.

EEG alpha and pupil diameter reflect endogenous auditory attention switching and listening effort

Published in:
Eur. J. Neurosci., 2022, pp. 1-16.

Summary

Everyday environments often contain distracting competing talkers and background noise, requiring listeners to focus their attention on one acoustic source and reject others. During this auditory attention task, listeners may naturally interrupt their sustained attention and switch attended sources. The effort required to perform this attention switch has not been well studied in the context of competing continuous speech. In this work, we developed two variants of an endogenous attention-switching task and a sustained-attention control. We characterized these three experimental conditions in the context of decoding auditory attention, while simultaneously evaluating listening effort and neural markers of spatial-audio cues. A least-squares, electroencephalography (EEG)-based attention decoding algorithm was implemented across all conditions. It achieved an accuracy of 69.4% and 64.0% when computed over non-overlapping 10- and 5-second correlation windows, respectively. Both decoders illustrated smooth transitions in the attended-talker prediction through switches at approximately half of the analysis window size (e.g., the mean lag taken across the two switch conditions was 2.2 seconds when the 5-second correlation window was used). Expended listening effort, as measured by simultaneous EEG and pupillometry, was also a strong indicator of whether the listeners sustained attention or performed an endogenous attention switch (peak pupil diameter measure, p = 0.034; minimum parietal alpha power measure, p = 0.016). We additionally found evidence of talker spatial cues in the form of centrotemporal alpha power lateralization (p = 0.0428). These results suggest that listener effort and spatial cues may be promising features to pursue in a decoding context, in addition to speech-based features.
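The correlation-window decoding described above can be sketched on simulated data. Everything below (the simulated envelopes, channel count, and noise level) is a toy assumption, not the study's stimuli or decoder: a linear least-squares model reconstructs the attended speech envelope from EEG, and each window is classified by which talker's envelope correlates better with the reconstruction.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 64                                  # feature rate in Hz
win = 5 * fs                             # 5-second correlation window
T, C = 60 * fs, 16                       # one simulated minute, 16 EEG channels

# Two competing speech envelopes; the EEG mixture is driven by talker A
env_a = rng.standard_normal(T)
env_b = rng.standard_normal(T)
mix = rng.standard_normal(C)             # per-channel weights for the attended talker
eeg = np.outer(env_a, mix) + 2.0 * rng.standard_normal((T, C))

# Least-squares decoder: linearly reconstruct the attended envelope from EEG
w, *_ = np.linalg.lstsq(eeg, env_a, rcond=None)
recon = eeg @ w

def corr(x, y):
    x, y = x - x.mean(), y - y.mean()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

# Decode attention per non-overlapping window: pick the better-correlated talker
decisions = [corr(recon[s], env_a[s]) > corr(recon[s], env_b[s])
             for s in (slice(i, i + win) for i in range(0, T - win + 1, win))]
accuracy = float(np.mean(decisions))
```

For brevity this sketch fits and evaluates on the same simulated data; a real decoder would be trained and tested on separate segments, typically with time-lagged EEG features.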

Using oculomotor features to predict changes in optic nerve sheath diameter and ImPACT scores from contact-sport athletes

Summary

There is mounting evidence linking the cumulative effects of repetitive head impacts to neuro-degenerative conditions. Robust clinical assessment tools to identify mild traumatic brain injuries are needed to assist with timely diagnosis for return-to-field decisions and appropriately guide rehabilitation. The focus of the present study is to investigate the potential for oculomotor features to complement existing diagnostic tools, such as measurements of Optic Nerve Sheath Diameter (ONSD) and Immediate Post-concussion Assessment and Cognitive Testing (ImPACT). Thirty-one high school American football and soccer athletes were tracked through the course of a sports season. Given the high risk of repetitive head impacts associated with both soccer and football, our hypotheses were that (1) ONSD and ImPACT scores would worsen through the season and (2) oculomotor features would effectively capture both neurophysiological changes reflected by ONSD and neuro-functional status assessed via ImPACT. Oculomotor features were used as input to Linear Mixed-Effects Regression models to predict ONSD and ImPACT scores as outcomes. Prediction accuracy was evaluated to identify explicit relationships between eye movements, ONSD, and ImPACT scores. Significant Pearson correlations were observed between predicted and actual outcomes for ONSD (Raw = 0.70; Normalized = 0.45) and for ImPACT (Raw = 0.86; Normalized = 0.71), demonstrating the capability of oculomotor features to capture neurological changes detected by both ONSD and ImPACT. The most predictive features were found to relate to motor control and visual-motor processing. In future work, oculomotor models, linking neural structures to oculomotor function, can be built to gain extended mechanistic insights into neurophysiological changes observed through seasons of participation in contact sports.
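The mixed-effects idea above, separating subject-level intercepts from a shared fixed effect of a feature, can be illustrated with a simplified within-subject-centering estimator on simulated data. The feature, outcome, effect sizes, and group sizes below are invented for illustration; the study itself fits full Linear Mixed-Effects Regression models:

```python
import numpy as np

rng = np.random.default_rng(2)
n_sub, n_obs = 31, 4                         # athletes x repeated sessions
subj = np.repeat(np.arange(n_sub), n_obs)
x = rng.standard_normal(n_sub * n_obs)       # a standardized oculomotor feature
intercepts = rng.standard_normal(n_sub)      # subject-level (random) intercepts
y = 0.7 * x + intercepts[subj] + 0.3 * rng.standard_normal(n_sub * n_obs)

def center(v):
    """Subtract each subject's own mean, removing the subject intercepts."""
    out = v.astype(float).copy()
    for s in range(n_sub):
        out[subj == s] -= v[subj == s].mean()
    return out

xc, yc = center(x), center(y)
slope = float(xc @ yc / (xc @ xc))           # estimate of the shared fixed effect
r = float(np.corrcoef(slope * xc, yc)[0, 1]) # predicted vs. actual correlation
```

Within-subject centering recovers the fixed slope without modeling the random-intercept variance explicitly; a full mixed model additionally estimates that variance and supports random slopes.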

Speaker separation in realistic noise environments with applications to a cognitively-controlled hearing aid

Summary

Future wearable technology may provide for enhanced communication in noisy environments and for the ability to pick out a single talker of interest in a crowded room simply by the listener shifting their attentional focus. Such a system relies on two components, speaker separation and decoding the listener's attention to acoustic streams in the environment. To address the former, we present a system for joint speaker separation and noise suppression, referred to as the Binaural Enhancement via Attention Masking Network (BEAMNET). The BEAMNET system is an end-to-end neural network architecture based on self-attention. Binaural input waveforms are mapped to a joint embedding space via a learned encoder, and separate multiplicative masking mechanisms are included for noise suppression and speaker separation. Pairs of output binaural waveforms are then synthesized using learned decoders, each capturing a separated speaker while maintaining spatial cues. A key contribution of BEAMNET is that the architecture contains a separation path, an enhancement path, and an autoencoder path. This paper proposes a novel loss function which simultaneously trains these paths, so that disabling the masking mechanisms during inference causes BEAMNET to reconstruct the input speech signals. This allows dynamic control of the level of suppression applied by BEAMNET via a minimum gain level, which is not possible in other state-of-the-art approaches to end-to-end speaker separation. This paper also proposes a perceptually-motivated waveform distance measure. Using objective speech quality metrics, the proposed system is demonstrated to perform well at separating two equal-energy talkers, even in high levels of background noise. Subjective testing shows an improvement in speech intelligibility across a range of noise levels, for signals with artificially added head-related transfer functions and background noise. 
Finally, when used as part of an auditory attention decoder (AAD) system using existing electroencephalogram (EEG) data, BEAMNET is found to maintain the decoding accuracy achieved with ideal speaker separation, even in severe acoustic conditions. These results suggest that this enhancement system is highly effective at decoding auditory attention in realistic noise environments, and could possibly lead to improved speech perception in a cognitively controlled hearing aid.
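The minimum-gain property described above can be illustrated in isolation. This toy sketch is not BEAMNET's actual architecture: it only shows a multiplicative mask whose gains are floored at a `min_gain` parameter, so that setting `min_gain = 1` disables suppression and passes the input through unchanged.

```python
import numpy as np

def apply_mask(x, logits, min_gain):
    """Multiplicative mask with gains floored at min_gain (gains lie in [min_gain, 1])."""
    gate = 1.0 / (1.0 + np.exp(-logits))     # sigmoid gate in (0, 1)
    gain = min_gain + (1.0 - min_gain) * gate
    return gain * x

x = np.array([1.0, -2.0, 0.5, 3.0])          # toy signal standing in for an embedding
logits = np.array([4.0, -4.0, 0.0, -4.0])    # per-element suppression decisions

suppressed = apply_mask(x, logits, min_gain=0.2)   # suppression limited to a 0.2 floor
passthrough = apply_mask(x, logits, min_gain=1.0)  # mask disabled: input reconstructed
```

Training the network so that a fully open mask reproduces the input (the autoencoder path) is what makes this floor a meaningful control knob at inference time.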

Human balance models optimized using a large-scale, parallel architecture with applications to mild traumatic brain injury

Published in:
2020 IEEE High Performance Extreme Computing Conf., HPEC, 22-24 September 2020.

Summary

Static and dynamic balance are frequently disrupted by brain injuries. The impairment can be complex and, for mild traumatic brain injury (mTBI), can be undetectable by standard clinical tests. Therefore, neurologically relevant modeling approaches are needed for detection and inference of mechanisms of injury. The current work presents models of static and dynamic balance that have a high degree of correspondence. Emphasizing structural similarity between the domains facilitates development of both. Furthermore, particular attention is paid to components of sensory feedback and sensory integration to ground mechanisms in neurobiology. Models are adapted to fit experimentally collected data from 10 healthy control volunteers and 11 mild traumatic brain injury volunteers. Through an analysis-by-synthesis approach whose implementation was made possible by a state-of-the-art high-performance computing system, we derived an interpretable, model-based feature set that could classify mTBI and controls in a static balance task with an ROC AUC of 0.72.

Sensorimotor conflict tests in an immersive virtual environment reveal subclinical impairments in mild traumatic brain injury

Summary

Current clinical tests lack the sensitivity needed for detecting subtle balance impairments associated with mild traumatic brain injury (mTBI). Patient-reported symptoms can be significant and have a huge impact on daily life, but impairments may remain undetected or poorly quantified using clinical measures. Our central hypothesis was that provocative sensorimotor perturbations, delivered in a highly instrumented, immersive virtual environment, would challenge sensory subsystems recruited for balance through conflicting multi-sensory evidence, and therefore reveal that not all subsystems are performing optimally. The results show that, as compared to standard clinical tests, the provocative perturbations illuminate balance impairments in subjects who have had mild traumatic brain injuries. Perturbations delivered while subjects were walking provided greater discriminability (average accuracy ≈ 0.90) than those delivered during standing (average accuracy ≈ 0.65) between mTBI subjects and healthy controls. Of the categories of features extracted to characterize balance, the lower limb accelerometry-based metrics proved to be most informative. Further, in response to perturbations, subjects with an mTBI utilized hip strategies more than ankle strategies to prevent loss of balance and also showed less variability in gait patterns. We have shown that sensorimotor conflicts illuminate otherwise-hidden balance impairments, which can be used to increase the sensitivity of current clinical procedures. This augmentation is vital in order to robustly detect the presence of balance impairments after mTBI and potentially define a phenotype of balance dysfunction that enhances risk of injury.

Predicting cognitive load and operational performance in a simulated marksmanship task

Summary

Modern operational environments can place significant demands on a service member's cognitive resources, increasing the risk of errors or mishaps due to overburden. The ability to monitor cognitive burden and associated performance within operational environments is critical to improving mission readiness. As a key step toward a field-ready system, we developed a simulated marksmanship scenario with an embedded working memory task in an immersive virtual reality environment. As participants performed the marksmanship task, they were instructed to remember numbered targets and recall the sequence of those targets at the end of the trial. Low and high cognitive load conditions were defined as the recall of three- and six-digit strings, respectively. Physiological and behavioral signals recorded included speech, heart rate, breathing rate, and body movement. These features were input into a random forest classifier that significantly discriminated between the low- and high-cognitive load conditions (AUC = 0.94). Behavioral features of gait were the most informative, followed by features of speech. We also showed the capability to predict performance on the digit recall (AUC = 0.71) and marksmanship (AUC = 0.58) tasks. The experimental framework can be leveraged in future studies to quantify the interaction of other types of stressors and their impact on operational cognitive and physical performance.
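The AUC figures quoted above can be made concrete with a small sketch. The simulated labels and scores below are illustrative assumptions, not the study's data or its random forest; the `roc_auc` helper computes the area under the ROC curve via the rank-sum statistic:

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank-sum statistic."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return float((ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))

rng = np.random.default_rng(3)
labels = rng.integers(0, 2, size=400)        # low (0) vs. high (1) cognitive load
# Simulated classifier scores: shifted upward for the high-load condition
scores = 1.5 * labels + rng.standard_normal(400)
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance ranking and 1.0 to perfect separation, which is why values like 0.94 (load discrimination) are strong while 0.58 (marksmanship prediction) is only marginally above chance.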