International Journal of Computer & Software Engineering Volume 4 (2019), Article ID 4:IJCSE-147, 2 pages
Alzheimer's Disease: New Challenges for Speech Analysis

Jesús B. Alonso¹,* and María L. Barragán Pulido²

¹Institute for Technological Development and Innovation in Communications (IDeTIC), University of Las Palmas de Gran Canaria, 35001 Las Palmas de Gran Canaria, Las Palmas, Spain
²Despacho D-102, Pabellón B, Ed. de Electrónica y Comunicaciones, Campus de Tafira, 35017 - Las Palmas, Spain
Corresponding author: Dr. Jesús B. Alonso, Institute for Technological Development and Innovation in Communications (IDeTIC), University of Las Palmas de Gran Canaria, 35001 Las Palmas de Gran Canaria, Las Palmas, Spain; E-mail:
Received: 10 June 2019; Accepted: 27 June 2019; Published: 29 June 2019
Citation: Alonso JB, Pulido MLB (2019) Alzheimer's Disease: New Challenges for Speech Analysis. Int J Comput Softw Eng 4: 147. doi:

The increase in life expectancy in developed societies has posed a great challenge for humanity while allowing people to live better and longer. However, it has also led to an inversion of the age pyramid, linked to a higher prevalence of age-associated diseases such as Alzheimer’s Disease (AD) and Parkinson’s Disease (PD). Both are incurable neurodegenerative diseases.

AD is currently the most common cause of neurodegenerative dementia worldwide. In addition to being incurable, it cannot be definitively diagnosed while the patient is alive. According to different studies, the number of people affected by AD alone is expected to triple by 2050, establishing a clear need for rationalization and efficiency in health systems around the world.

One of the first symptoms of AD is memory loss, along with others such as language difficulties or temporal and spatial disorientation. In more advanced stages, the skills needed for daily tasks, and even basic bodily functions such as walking or swallowing [1], decline or disappear. In any case, by the time the first symptoms are clear and the disease is diagnosed, the damage is already irreparable and chronic. Today, the diagnostic process is of limited use as a screening method because it is lengthy, expensive and highly invasive.

This situation sustains interest in the search for biomarkers that are located in more accessible parts of the body and, of course, are sensitive to AD before the clinical onset of dementia. Easily accessible biomarkers would offer an economical route to early diagnosis and to subsequent monitoring at specific stages of the disease. Many studies in this area point to clinical tests based on biomarkers from subjective memory assessment, late-life depression, or speech, olfactory, and gait analyses. More recently, various neurophysiological tests based on electroencephalography (EEG) and magnetoencephalography (MEG) have come under study. To date, there are no decisive results.

Among the many symptoms of AD, language problems are considered by many researchers to be one of the most characteristic; they appear as a direct and inevitable consequence of cognitive impairment. Years before clinical diagnosis, language already reflects significant cognitive impairment in preclinical patients. Specifically, some sources state that in the first year after the onset of the disease, changes in language skills are obscured by a loss of interest and spontaneity, spatial disorientation, and memory disorders [2-4]. Although verbal fluency is affected, this usually goes undetected. The capacity for emotional response is affected and there are often social and behavioral changes, which could be due to memory loss. Likewise, altered perception abilities could also magnify some emotional responses [5].

The specific communicative problems, such as aphasia and anomia, and the changes in emotional response capacity depend on the stage of the disease and increase with AD progression. For that reason, some researchers suggest that AD could be detected with greater sensitivity by linguistic analysis than by other cognitive exams.

In recent years, techniques based on automatic processing of recorded voice signals have found an important place in language assessment applied to the detection of neurodegenerative diseases. These techniques make it possible to quantify the signal properties that are relevant to describing a pathology. The samples are then classified from the resulting measurements using Machine Learning methods. In the same context, Deep Learning [6] appears as a more complex machine learning approach of growing importance. These methods have the advantage of being applied automatically, avoiding the possible influence of an interviewer or intermediary.
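As an illustration of what quantifying signal properties can involve, the following minimal Python sketch splits a signal into frames and computes two frame-level measures, short-time energy and spectral centroid. It is not the implementation used in any of the cited studies: the function names, the frame length, the hop size and the synthetic demo signal are all assumptions chosen for illustration, and a naive DFT is used only to keep the example free of external dependencies.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into frames of frame_len, advancing by hop.

    Trailing samples that do not fill a whole frame are dropped.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz), computed with a naive DFT.

    Returns 0.0 for a frame with no spectral content (e.g. digital silence).
    """
    n = len(frame)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitude = abs(coeff)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


# Synthetic demo signal (an assumption, not real speech):
# 64 samples of silence followed by 64 samples of a 1 kHz tone at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
samples = [0.0] * 64 + tone

for frame in frame_signal(samples, frame_len=64, hop=64):
    print(short_time_energy(frame), spectral_centroid(frame, sample_rate))
```

With a real recording, the synthetic signal would be replaced by the decoded audio samples, overlapping frames of a few tens of milliseconds would normally be used, and an FFT routine would replace the naive DFT for speed.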

The number of studies aiming to establish speech as a non-invasive AD biomarker has grown exponentially in recent years. Since the first work in this line appeared, almost 80% of the studies have focused on conventional parameters: mainly the duration of voiced or unvoiced segments, pitch, amplitude, and periodicity, as well as others obtained from the time, frequency, and cepstral domains. These variables have been shown to provide information about cognitive processes, and their values have been directly related to the specific stage of the disease. Likewise, concepts such as voice quality and emotional temperature have been defined. Other techniques, such as Automatic Spontaneous Speech Analysis (ASSA) [7], combine different voice measures (durations, short-time energy and spectral centroid, for example) and provide very relevant data. By classifying these data, in most cases with Support Vector Machine, k-Nearest Neighbors, Linear Discriminant Analysis [8] or Multi-Layer Perceptron [9] classifiers, the published papers in the field have achieved objective and promising assessments of the AD stage. The experimental and statistical evidence in this regard points to using Machine Learning algorithms with linguistic biomarkers drawn from the speech of elderly people. In this sense, future work should be clearly oriented towards new Deep Learning techniques, which also offer an interesting route to classifying complex signals such as speech or voice. The studies in this area present encouraging results, although there is a strong need for larger data sets to train models for monitoring the evolution of the disease.
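To make the classification step concrete, here is a minimal sketch of a k-Nearest Neighbors classifier, one of the methods named above, written in plain Python. The function name `knn_predict`, the two-dimensional feature vectors and their values are hypothetical placeholders invented for this example; they are not measurements from any study and carry no clinical meaning.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors.

    Distance is Euclidean; feature scaling is left to the caller.
    """
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training samples")
    # Pair each training vector's distance to the query with its label.
    neighbours = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, hand-made feature vectors (arbitrary values, not real data).
train_x = [(0.10, 0.80), (0.15, 0.75), (0.12, 0.85),
           (0.40, 0.50), (0.45, 0.55), (0.38, 0.45)]
train_y = ["control", "control", "control", "AD", "AD", "AD"]

print(knn_predict(train_x, train_y, (0.42, 0.50), k=3))
```

Because k-NN relies on raw distances, features measured on different scales (for instance, durations in seconds and pitch in hertz) would normally be standardized before being compared.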

Currently, some studies also include emotional analysis with classic features such as pitch and intensity or, more recently, Emotional Temperature (ET) [10]. Some of them introduce methods such as Emotional Speech Analysis or ASSA, which use different conventional features and, combined with ET, have discriminated AD participants from healthy controls with 94% accuracy using an SVM classifier [11]. Others have based their analysis on transcripts obtained with Voice Activity Detection [12]. Beyond acoustic analysis of voice or speech, they add lexical, semantic, punctuation or syntactic analysis of the communicative process.

Since approximately 2012, a growing number of studies in this line have pointed to the need to build on the non-linear and non-stationary aspects of speech. Many researchers have proposed that the subtle cognitive changes of the early and preclinical stages could be better detected using fractal measures combined with other non-conventional measures such as the Hurst exponent [13] or, simply, combined with the conventional features described above. Because the voice signal has both linear and non-linear aspects, combining both kinds of features offers more complete results. For their part, the disturbances in emotional response suffered by the patient, which can be analyzed and quantified from the voice, significantly improve detection results.
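For readers unfamiliar with the Hurst exponent, the sketch below estimates it with classical rescaled-range (R/S) analysis: the series is cut into windows of increasing size, the R/S statistic is averaged per window size, and the exponent is taken as the slope of log(R/S) against log(window size). This is one common estimator and is not necessarily the one used in the cited work; the function names, the doubling window schedule and the synthetic test series are assumptions made for illustration.

```python
import math
import random


def rescaled_range(chunk):
    """R/S statistic of one chunk; returns None if the chunk is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)
    if std == 0:
        return None
    cumulative = 0.0
    lowest = highest = 0.0
    for x in chunk:
        cumulative += x - mean  # running sum of mean-adjusted values
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    return (highest - lowest) / std


def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent as the slope of log(R/S) vs log(window)."""
    log_sizes, log_rs = [], []
    window = min_window
    while window <= len(series) // 2:
        values = []
        for start in range(0, len(series) - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None:
                values.append(rs)
        if values:
            log_sizes.append(math.log(window))
            log_rs.append(math.log(sum(values) / len(values)))
        window *= 2  # doubling schedule (an assumption of this sketch)
    if len(log_sizes) < 2:
        raise ValueError("series too short for R/S analysis")
    # Ordinary least-squares slope of log_rs against log_sizes.
    mean_x = sum(log_sizes) / len(log_sizes)
    mean_y = sum(log_rs) / len(log_rs)
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in zip(log_sizes, log_rs))
    denominator = sum((x - mean_x) ** 2 for x in log_sizes)
    return numerator / denominator


# Synthetic series (assumptions): white noise and its running sum.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(2048)]
walk, total = [], 0.0
for value in noise:
    total += value
    walk.append(total)

print(hurst_rs(noise), hurst_rs(walk))
```

Uncorrelated noise is expected to give an estimate near 0.5, while a strongly persistent series such as a random walk gives a value close to 1. On short series the R/S estimator is known to be biased, so estimates from brief speech segments should be interpreted with caution.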

It is important to highlight that, to date, one of the main limitations has been the scarcity and heterogeneity of the samples available to train models for monitoring the evolution of AD. Most of the databases identified lack the amount of data needed for a truly consistent analysis, and they have the drawback of having been collected under different guidelines and criteria.

In any case, developing eHealth 4.0 solutions to this end, such as speech-based web applications, would make it possible to democratize the monitoring of disease evolution and of pharmacological treatment in an easy, fast, non-invasive and scalable way, offering objective parameters and easing the work of the specialist doctor. None of the techniques described would require extensive infrastructure or medical equipment, and they could even be used remotely as a Telecare solution.

Interactional multimodal analysis could be another route to the early assessment of AD. It could draw either on other behavioral characteristics, such as writing, or on other kinds of biomarkers, such as blood. However, future studies would be needed to clearly define this point. A differential characterization with respect to other neurodegenerative pathologies such as Parkinson’s Disease or Amyotrophic Lateral Sclerosis (ALS), also well studied today, is still pending.

Competing Interests

The authors declare that they have no competing interests.


References

  1. Escobar LMV, Afanador NP (2010) Calidad de vida del cuidador familiar y dependencia del paciente con Alzheimer. Av en Enfermería 28: 116-128.
  2. Lenguaje EL, La EN, Alzheime NDA (1988) El lenguaje en la enfermedad de alzheimer. Foniatría y Audiol 120: 199-205.
  3. Sjogren T, Sjogren H, Lindgren AG (1952) Morbus Alzheimer and morbus Pick; a genetic, clinical and patho-anatomical study. Acta Psychiatr Neurol Scand Suppl 82: 1-152.
  4. Allison RS (1962) The Senile Brain: A Clinical Study. Postgraduate Med J.
  5. Laske C, Sohrabi HR, Frost SM, López-de-Ipiña K, Garrard P, et al. (2015) Innovative diagnostic tools for early detection of Alzheimer’s disease. Alzheimer’s Dement 11: 561-578.
  6. Kim Y, Lee H, Provost EM (2013) Deep learning for robust feature generation in audiovisual emotion recognition. Speech and Signal Processing (ICASSP), 2013 IEEE International Conference.
  7. Lopez-de-Ipiña K (2012) Alzheimer disease diagnosis based on automatic spontaneous speech analysis. IJCCI 2012: Proceedings of the 4th International Joint Conference on Computational Intelligence, Barcelona, Spain.
  8. Roy D, Pentland A (1996) Automatic spoken affect classification and analysis. Automatic Face and Gesture Recognition, Proceedings of the Second International Conference.
  9. Lopez-de-Ipina K, Alonso JB, Travieso CM, Egiraun H, Ecay M, et al. (2013) Automatic analysis of emotional response based on non-linear speech modeling oriented to Alzheimer disease diagnosis. IEEE International Conference on Intelligent Engineering Systems (INES).
  10. Alonso JB, Cabrera J, Medina M, Travieso CM (2015) New approach in quantification of emotional intensity from the speech signal: emotional temperature. Expert Syst Appl 42: 9554-9564.
  11. López-de-Ipiña K, Alonso JB, Solé-Casals J, Barroso N, Henriquez P, et al. (2015) On Automatic Diagnosis of Alzheimer’s Disease Based on Spontaneous Speech Analysis and Emotional Temperature. Cognit Comput 7: 44-55.
  12. Barrett PA (2000) Voice activity detector.
  13. Bhaduri S, Das R, Ghosh D (2016) Non-Invasive Detection of Alzheimer’s Disease-Multifractality of Emotional Speech. J Neurol Neurosci 7: 2.