Summary
Variations in speech timing features have been reliably linked to symptoms of various health conditions, demonstrating clinical potential. However, replication challenges hinder their
translation; extracted speech features are susceptible to methodological variations in the recording and processing pipeline. Investigating this, we compared exemplar timing features extracted via three different techniques from recordings of healthy speech. Our results show that features extracted via an intensity-based method differ from those produced by forced alignment. Different extraction methods also led to differing estimates of within-speaker feature variability over time in an analysis of recordings repeated systematically over three sessions in one day (n=26) and in one week (n=28). Our findings highlight the importance of feature extraction in study design and interpretation, and the need for consistent, accurate extraction techniques for clinical research.