Auditory Spectro-Temporal Modulation Training in Children With Auditory Processing Disorder

Jalilzadeh Afshari, Parisa; Lotfi, Younes; Moossavi, Abdollah; Bakhshi, Enayatollah; Sadjedi, Hamed

doi:10.32598/irj.20.2.1392.2

Volume 20, Issue 2 (June 2022) Iranian Rehabilitation Journal 2022, 20(2): 185-198



Jalilzadeh Afshari P, Lotfi Y, Moossavi A, Bakhshi E, Sadjedi H. Auditory Spectro-Temporal Modulation Training in Children With Auditory Processing Disorder. Iranian Rehabilitation Journal 2022; 20(2): 185-198
URL: http://irj.uswr.ac.ir/article-1-1733-en.html


Parisa Jalilzadeh Afshari*¹, Younes Lotfi¹, Abdollah Moossavi², Enayatollah Bakhshi³, Hamed Sadjedi⁴

1- Department of Audiology, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran.
2- Department of Otolaryngology and Head and Neck Surgery, School of Medicine, Iran University of Medical Sciences, Tehran, Iran.
3- Department of Biostatistics, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran.
4- Department of Electrical Engineering, Faculty of Engineering, Shahed University, Tehran, Iran.

Keywords: Auditory processing disorder, Auditory training, Spectro-temporal modulation, Speech perception in noise


1. Introduction
One of the most salient features of the speech signal is its energy fluctuations, with dynamic peaks and valleys [1] that occur over time and along the spectral axis. The spectrogram of natural sounds and speech signals shows modulation energy concentrated mostly at slower temporal modulations and lower frequencies [2]. These spectro-temporal fluctuations of speech reflect the nature and articulation rate of phonemes and syllables. For instance, temporal modulations with a slow onset convey vowels. Stop consonants are conveyed by temporal modulations with a rapid onset, while fricative consonants are carried by temporal modulations with an intermediate rate [3]. Reconstructing a degraded speech signal from its spectro-temporal modulation (STM) spectrogram leads to better speech perception, so it can be inferred that speech perception depends heavily on the accurate processing of STMs [1].
The ongoing and complicated patterns of STMs define different aspects of speech perception. The temporal modulation patterns contain the principal syllabic information, whereas the spectral modulation patterns provide critical cues about formants and pitch. Formant transitions are carried by mixed, simultaneous patterns of STMs [4]. Fundamental data for categorizing speech into phonemic groups are embedded in the dynamic aspects of STMs. Phonemic recognition studies show that about 80% of phonologic errors occur in phonemes with similar vocal characteristics and close STM features [5]. The temporal modulation rates necessary for syllabic perception are 2-5 Hz, and those for phonemic perception are 15-30 Hz [4]. Destruction of temporal modulations above 16 Hz can affect consonant identification [3]. The principal spectral modulation densities for vowel recognition and speech comprehension lie between 1 and 4 cycles per octave. The spectral modulations needed for gender identification are in the range of 3 to 7 cycles per octave [4]. Therefore, correct speech and melody perception, sound localization, complex environmental sound encoding, pitch perception, and timbre detection depend on the accurate processing of different rates and densities of STMs [6, 7].
Based on intracranial recordings, functional magnetic resonance imaging, and electrophysiological studies, the human auditory cortex has specialized neuronal groups in the superior temporal gyrus that represent speech sounds optimally and show increased sensitivity to STMs [2]. These higher-order neurons of the auditory cortex are preferentially tuned to STMs and construct a multi-dimensional representation of acoustic components in which separate neuronal clusters respond selectively to the STM features of a specific phoneme. Hence, they can decode the phoneme with high fidelity. As a result, the mean firing rates of neurons in response to different phonemes are classified according to their best frequency and STMs. For example, neurons tuned to rapid temporal modulations respond better to stop consonants, whereas neurons tuned to slow temporal modulations decode fricative consonants [5].
These response specializations to STMs and speech segments are located mostly in Heschl’s gyrus and adjacent cortical areas. Imaging and functional studies in humans show that these auditory cortical areas have a tonotopic organization for processing STMs [8]. The close relationship between the structures involved in processing STMs and speech signals demonstrates that the complex spectro-temporal nature of speech sounds is represented by STM-sensitive filters through spectro-temporal receptive fields. Given that the precise representation of STM information in the peripheral and central auditory systems may be the underlying mechanism of speech perception [4], both the strategy used to extract these modulations from speech signals and a decreased ability of the peripheral or central auditory system to decode and process STMs can affect speech comprehension.
Previous studies have shown that children with auditory processing disorder (APD) have a reduced ability to process temporal [9], spectral [10], and spectro-temporal modulations [11]. Lotfi et al. reported that children with APD perform poorly in STM detection tasks and that this performance correlates strongly with speech in noise test results. They asserted that the incomplete development of auditory processing circuits raises STM detection thresholds and is accompanied by poorer speech perception in noise in children with APD [11].
Auditory processing refers to the efficiency and capability of the central nervous system in decoding and processing auditory inputs [12, 13]. Overlapping information processing at different stages of the auditory system leads to intrinsic redundancy. Thus, despite a malfunction in one part of the auditory system, overlapping processing in other structures can compensate for the degraded information. Speech also has extrinsic redundancy due to context, grammar, and visual cues. Both intrinsic and extrinsic redundancy help recover degraded speech segments and can improve perception. Listeners with APD cannot use this redundant information in the presence of competing signals or reverberation. The result is difficulty with speech in noise, which is a common phenomenon in APD [14].
APD is a perceptual deficit in the auditory modality, not because of higher-order processing disorders, such as linguistic or cognitive domains [15, 16]. APD is considered a dysfunction in at least one of the following areas: temporal ordering, temporal integration, temporal and frequency resolution, sound localization, auditory pattern recognition, and auditory discrimination [17].
Individuals suspected of APD usually have problems in challenging situations like speech comprehension in background noise and rapid speech. The children with APD might have verbal education problems and, notwithstanding normal peripheral hearing, manifest behaviors consistent with hearing loss [12, 13]. It is of particular concern as it may negatively impact their academic performance, learning, and interpersonal interactions that reduce their quality of life. Auditory training is a helpful method for coping with the challenges children with APD face.
Auditory training commonly consists of strategies and practices that concentrate on improving auditory skills and speech comprehension. The success rate of a training program is variable and depends on the training protocol, duration, and the stimulus type of the training [18]. The crucial aspect of proper auditory training is the transfer of learning from trained stimuli to other untrained skills, especially practical abilities like speech understanding in noise.
Considering variable speech in noise perception as one of the most significant problems in children with APD on the one hand and the multiple STM interactions in speech signals, on the other hand, we suggested that auditory STMs training might improve the children’s ability to use spectro-temporal cues for better speech perception. The current study hypothesizes that auditory STMs training can alter modulation processing and consequently boost speech perception in noise. Thus, the present research aimed to investigate the effects of auditory STMs training on speech in noise perception of children with APD.
2. Materials and Methods
Case selection
A total of 35 children with APD (students aged 8-12 years) were recruited based on the inclusion criteria in this analytical interventional study (pretest-posttest with a control group). The parents were informed that all tests were non-invasive and that their children’s information would be kept private. Written consent was then obtained from the parents for the evaluation and auditory training sessions. The inclusion criteria were identical for the training and control groups. The children who met the inclusion criteria were randomly divided into two groups using random blocks of four: 18 children in the control group (Mean±SD age: 9.88±1.36 years; 11 females and 7 males) and 17 children in the training group (Mean±SD age: 10.11±1.45 years; 10 females and 7 males). There was no statistical difference between the groups in terms of sex and age.
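The allocation step can be illustrated with a minimal sketch of permuted-block randomization. It assumes that "random blocks of four" means blocks containing two training and two control slots in shuffled order; the function name, the seed, and the handling of the final incomplete block are hypothetical and are not taken from the study.

```python
import random

def block_randomize(n_participants, block_size=4,
                    arms=("training", "control"), seed=None):
    """Assign participants to arms using permuted blocks of a fixed size.

    Assumption: every block holds an equal number of slots per arm.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    slots_per_arm = block_size // len(arms)
    assignments = []
    while len(assignments) < n_participants:
        block = [arm for arm in arms for _ in range(slots_per_arm)]
        rng.shuffle(block)          # random order within the block
        assignments.extend(block)
    # The last block is truncated when n_participants is not a multiple of 4.
    return assignments[:n_participants]

allocation = block_randomize(35, seed=1)
print(allocation.count("training"), allocation.count("control"))
```

With 35 participants, eight complete blocks give 16 children per arm, and the three remaining slots split 2/1, so the group sizes always differ by exactly one, as in the 18/17 split reported above.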
The inclusion criteria were as follows: bilateral normal pure tone audiometry (auditory thresholds less than 20 dB HL in the 500 to 4000 Hz frequency range); symmetric hearing (difference in auditory thresholds between the two ears of less than 10 dB HL); type An tympanogram; Wechsler intelligence quotient score of 85 or higher; monolingual Persian-speaking children; no history of ADHD, seizures, or behavioral or developmental disorders; no consumption of any medications affecting the central nervous system; poor educational performance (based on the Persian-language auditory processing domain questionnaire completed by parents) [19]; abnormal results in all 3 subtests of the multiple auditory processing assessment (MAPA) [11, 20, 21]; and abnormal results in STM detection tasks and speech in noise tests (consonant-vowel [CV] in noise and word in noise tests) [11].
Since this research followed our first study [11], the case selection was based on the obtained results from the first study. Accordingly, MAPA subtests, STM detection, and speech in noise evaluations (before training assessments) were done in our first study. Finally, the APD children with desired features according to inclusion criteria were enrolled in training (the current study). Therefore, a summary of evaluations carried out in our first study is provided here, and the reader is referred to the main article for thorough and detailed information [11].
The MAPA test battery, which has three subtests with high degrees of sensitivity and specificity, was used for APD assessment: the dichotic digits test (DDT) [22], the pitch pattern sequence test (PPS) [23], and the monaural selective auditory attention test (mSAAT) [24]. Table 1 presents MAPA subtest details. Children with abnormal results in all three subtests of MAPA were identified as children with APD.

Before training evaluations
Spectro-temporal modulation detection test
To meet other inclusion criteria, the STM detection tasks and speech in noise tests were used. The STM detection tasks were done using MATLAB software and under headphones at 65 dB SPL monaurally. Six stimuli were created by mixing three temporal modulation rates (4, 12, and 32 Hz) and two spectral modulation densities (0.5 and 2 cycles per octave).
We applied an adaptive 3-interval, 3-alternative forced-choice (3I-3AFC) method with a 2-down/1-up rule, which targets the 70.7% correct point on the psychometric function. The evaluation started from the highest modulation depth, which then changed adaptively (decreasing after correct answers and increasing after an incorrect response) in a 6-dB step for the first trial, 4-dB steps for the second and third trials, and 2-dB steps for the six remaining trials. For each ear and each of the six stimuli, the task was performed twice. In each run (a total of 9 reversals), the modulation depths (in dB) of the last six reversals were averaged, and the final threshold was the mean of the averaged modulation depths of the two or three runs [11]. In STM detection tasks, more negative thresholds in dB mean better performance. In the 3I-3AFC method, three stimuli were presented to the child, one of which, at a random position, differed from the other two. The child was instructed to choose the different sound. Before the preliminary test, the children were given the necessary training [11].
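The adaptive procedure can be sketched as a small simulation. A 2-down/1-up rule settles where the probability of two consecutive correct answers is 0.5, that is, at p = √0.5 ≈ 0.707. The sketch below rests on several assumptions that are not stated in the text: the step size is tied to the number of reversals so far (6 dB before the first reversal, 4 dB for the next two, 2 dB for the last six), the track starts at 0 dB, and the listener is a hypothetical logistic psychometric function with a 1/3 guessing rate.

```python
import math
import random

TARGET_P = math.sqrt(0.5)  # about 0.707, the 2-down/1-up convergence point

def p_correct(depth_db, threshold_db, slope=0.5, chance=1 / 3):
    """Hypothetical 3AFC listener: logistic function rising from chance."""
    return chance + (1 - chance) / (1 + math.exp(-slope * (depth_db - threshold_db)))

def run_staircase(threshold_db, rng, start_db=0.0,
                  steps_db=(6, 4, 4, 2, 2, 2, 2, 2, 2)):
    """Simulate one run; steps_db[k] is the step used after k reversals."""
    depth = start_db
    streak = 0            # consecutive correct answers at the current depth
    last_direction = 0    # -1 = depth decreased (harder), +1 = increased
    reversals = []
    while True:
        if rng.random() < p_correct(depth, threshold_db):
            streak += 1
            if streak < 2:
                continue          # need two correct in a row to go down
            streak, direction = 0, -1
        else:
            streak, direction = 0, +1
        if last_direction and direction != last_direction:
            reversals.append(depth)
            if len(reversals) == len(steps_db):
                break
        last_direction = direction
        step = steps_db[len(reversals)]
        depth = min(0.0, depth + direction * step)  # depth cannot exceed 0 dB
    return sum(reversals[-6:]) / 6   # mean of the last six reversals

rng = random.Random(0)
runs = [run_staircase(-15.0, rng) for _ in range(2)]
print("threshold estimate (dB):", sum(runs) / len(runs))
```

Averaging the last six reversals, and then averaging across runs, mirrors the threshold definition described above; the simulated listener only serves to exercise the tracking logic.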
Speech in noise tests
For speech in noise assessment, the Persian versions of the consonant-vowel (CV) in noise and word in noise tests were administered [11]. The CV in noise test has the lowest cognitive demands and depends mainly on the acoustical processing of the speech signal in the ascending auditory system. A total of 25 meaningless CV syllables (mainly stop consonants combined with vowels) in white noise were presented monaurally at different signal-to-noise ratios (+12 to −12 dB). The child was asked to repeat the syllables, and the score was the percentage of correct responses. A higher percentage means better performance [11, 25].
In the word in noise test, monosyllabic words for children were presented monaurally in competing babble noise at signal-to-noise ratios from +24 to 0 dB. This test imposes a low working memory load, so it assesses bottom-up processing [26]. The children were asked to repeat each word, and the score was the signal-to-noise ratio at which 50% correct responses were obtained (Spearman-Kärber equation) [11, 27]. Lower scores indicate better speech in noise perception.
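The Spearman-Kärber estimate of the 50% point can be written as T50 = i + d/2 − d·Σ(c_k/n), where i is the starting signal-to-noise ratio, d the step between levels, n the number of words per level, and c_k the number of correct words at level k. The sketch below uses the +24 to 0 dB range given above; the 4-dB step and five words per level are assumptions borrowed from the common words-in-noise layout and may differ from the Persian version used here.

```python
def spearman_karber_snr50(correct_per_level, start_snr_db=24.0,
                          step_db=4.0, words_per_level=5):
    """Return the SNR (dB) at which 50% of words are repeated correctly.

    correct_per_level lists the number of correct words at each SNR,
    ordered from the easiest level (start_snr_db) to the hardest.
    step_db and words_per_level are assumed values, not from the study.
    """
    proportion_sum = sum(c / words_per_level for c in correct_per_level)
    return start_snr_db + step_db / 2 - step_db * proportion_sum

# Hypothetical score sheet for the levels +24, +20, ..., 0 dB.
scores = [5, 5, 5, 4, 3, 1, 0]
print(round(spearman_karber_snr50(scores), 1))
```

Under these assumptions the formula reduces to 26 − 0.8 × (total correct words), so each additional correct word lowers the score by 0.8 dB, which is why lower scores indicate better performance.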
The tests were randomly administered across the subjects (inter-subjects), although in each case, the pre- and post-training assessments were in the same order.
After completion of the test battery, the children with APD who met the inclusion criteria were enrolled in the intervention phase. Seventeen children were randomly selected for ten sessions of formal auditory training.
Auditory spectro-temporal modulation training
From a psychoacoustic viewpoint, there are two principal types of training program: discrimination and identification. These fundamental auditory processes are crucial elements of auditory perception [18]. As there is no clear gold standard for optimal training, the training in the current study was based on “identification.” Identification tasks require deciding which stimulus differs from the rest.
As shorter training sessions may be more practical and more effective [18], our modulation identification games were administered in 1-hour sessions twice a week for five weeks. Our formal training consisted of purposeful, structured exercises performed with each subject in consecutive sessions and at specific times. The difficulty level of the training sessions was gradually increased according to the child’s performance. As motivation throughout training is a paramount factor in achieving successful outcomes, reinforcement was provided after each successful response to keep the child motivated. The intervention was carried out in a quiet room using varied and engaging methods.
The training focused on the improvement of STM detection. The STM stimuli were created in MATLAB software by applying spectral and temporal modulation to 4000 broadband carrier tones. To boost the modulation identification ability of the auditory system in the range required for processing speech modulations, the training STM stimuli had a wider range of temporal modulation rates (4, 8, 12, 16, 32, and 64 Hz) and spectral modulation densities (0.5, 1, 2, and 4 c/o) than the STM stimuli used for the evaluations. Thus, the training covered both the stimuli used in the evaluations and additional ones. The modulation depth of the target signal was adjusted by altering the amplitude relative to the flat spectrum from 0 (0% modulation) to 1 (100% modulation). For each combination of temporal and spectral modulations, 15 stimuli were created by varying the modulation depth from −30 dB to 0 dB (in 2-dB steps). The training was performed binaurally with headphones at 65±8 dB SPL. As intensity cues may ease modulation detection, the stimulus intensity levels were randomized during training to reduce these cues [28].
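A stimulus of this kind can be sketched as a sum of log-spaced tones whose amplitudes follow a drifting spectro-temporal ripple. The study generated its stimuli in MATLAB; the Python sketch below is only an illustration, and the envelope form 1 + m·sin(2π(ωt + Ωx) + φ), the carrier range, the sampling rate, the random phases, and the reduced carrier count (32 instead of 4000, to keep pure standard-library code fast) are all assumptions. A modulation depth in dB is mapped to the linear depth m by m = 10^(dB/20), so that 0 dB corresponds to 100% modulation.

```python
import math
import random

def depth_db_to_linear(depth_db):
    """Convert modulation depth in dB to linear depth m (0 dB -> 1.0)."""
    return 10 ** (depth_db / 20)

def stm_stimulus(rate_hz, density_cpo, depth_db=None, *, n_carriers=32,
                 f_low=354.0, n_octaves=4.0, duration_s=0.25, fs=16000,
                 rng=None):
    """Return a peak-normalized STM stimulus as a list of samples.

    depth_db=None yields the unmodulated reference (m = 0).
    All keyword defaults are illustrative assumptions.
    """
    rng = rng or random.Random()
    m = 0.0 if depth_db is None else depth_db_to_linear(depth_db)
    n_samples = int(duration_s * fs)
    ripple_phase = rng.uniform(0, 2 * math.pi)
    signal = [0.0] * n_samples
    for k in range(n_carriers):
        x = n_octaves * k / (n_carriers - 1)   # octaves above f_low
        freq = f_low * 2 ** x                  # log-spaced carrier tone
        carrier_phase = rng.uniform(0, 2 * math.pi)
        for n in range(n_samples):
            t = n / fs
            envelope = 1 + m * math.sin(
                2 * math.pi * (rate_hz * t + density_cpo * x) + ripple_phase)
            signal[n] += envelope * math.sin(2 * math.pi * freq * t + carrier_phase)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]

def roved_level_db(rng, base_db=65.0, rove_db=8.0):
    """Draw a presentation level uniformly within base_db +/- rove_db."""
    return base_db + rng.uniform(-rove_db, rove_db)

rng = random.Random(0)
target = stm_stimulus(4, 0.5, depth_db=-6.0, rng=rng)   # modulated target
reference = stm_stimulus(4, 0.5, rng=rng)               # flat reference
print(len(target), len(reference), round(roved_level_db(rng), 1))
```

Drawing the presentation level at random on every interval is one simple way to realize the level randomization described above, since it prevents overall loudness from serving as a cue to the modulated interval.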
The 3I-3AFC paradigm was used due to less cognitive demand for kids. In each training trial, three stimuli were presented to the child. Two of them were the reference signal (without modulation), and one was the target (modulated signal) (Figure 1).

The position of the target signal was randomized to minimize prediction effects.
To secure the best cooperation, the importance and rationale of the training were explained to the children. The participants were also trained to identify and choose the odd sound and to point to its corresponding card (or training toy). When the children identified the correct answer, they were encouraged through visual and verbal feedback (Figure 2). Otherwise, they were shown the correct answer.
Studies have shown that implementing an auditory rehabilitation program for children with APD requires a person-centered approach [29]. Thus, the characteristics of the training stimuli were set for each subject in line with his or her results in the STM detection tasks. Generally, a larger difference between the target stimulus and the reference stimuli (modulation depth above the threshold) makes target identification easier and leads to better performance. Lower spectral modulation densities and slower temporal modulation rates also make for the simplest stimuli. Therefore, the starting trial used the highest modulation depth and the simplest modulation features (0.5 c/o and 4 Hz) to make the task easier. Each training block comprised 30 trials, and 120 trials were completed in a single session. The difficulty level within each training block (30 trials) changed adaptively. After each correct response, the modulation depth was reduced to make the task harder. If the child could not detect the target signal, the modulation depth was increased. The modulation depth changed in 2-dB steps with a 1-down/1-up adaptive procedure. Studies have shown that adaptive training methods whose difficulty varies according to the subjects’ abilities are more effective [30]. The training must be challenging enough for optimal performance and motivation, so the training difficulty was adaptively modified to keep the training close to the subjects’ performance level (30%-70%) [29]. Accordingly, the training difficulty was increased only after the child reached at least 70% of his or her border of competency.
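The within-block adaptation can be sketched as follows. The block length (30 trials), the 2-dB step, and the 1-down/1-up rule come from the description above; the limits of −30 and 0 dB reuse the range of modulation depths given for the training stimuli, while the `respond` callback, which stands in for the child's choice among the three intervals, is hypothetical.

```python
import random

def training_block(respond, start_db=0.0, n_trials=30, step_db=2.0,
                   floor_db=-30.0, ceil_db=0.0, rng=None):
    """Run one 1-down/1-up training block and return (depth, correct) pairs.

    respond(depth_db, target_interval) must return the chosen interval
    (0, 1, or 2); it is a stand-in for the child's answer.
    """
    rng = rng or random.Random()
    depth = start_db
    history = []
    for _ in range(n_trials):
        target_interval = rng.randrange(3)     # random target position
        correct = respond(depth, target_interval) == target_interval
        history.append((depth, correct))
        # Correct answer: smaller depth (harder). Error: larger depth (easier).
        depth += -step_db if correct else step_db
        depth = max(floor_db, min(ceil_db, depth))
    return history

# Hypothetical listener that guesses at random on every trial.
guess_rng = random.Random(1)
block = training_block(lambda depth, target: guess_rng.randrange(3),
                       rng=random.Random(2))
print(sum(correct for _, correct in block), "correct out of", len(block))
```

At 30 trials per block, the 120 trials of a session correspond to four such blocks. A 1-down/1-up rule on its own tends to hover around the 50% correct point, which lies inside the 30%-70% band mentioned above.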
After the last training session, all children in both groups (training and control) went under evaluation of the STM detection and speech in noise. Also, the evaluations were repeated 1-month post-training to assess the reliability of training effects.
During the training sessions, three children were excluded from our study. One of them was no longer willing to participate due to personal problems, and the two others did not finish their sessions. We therefore replaced them with new cases, and all the procedures mentioned above (sections 2.2 and 2.3) were repeated. The children in the control group were able to participate in the training sessions after the research.
3. Results
Data analysis
The data were statistically analyzed using SPSS v21 (IBM, Armonk, NY, USA) software at a significance level of 0.05. The paired t test and the Wilcoxon test were applied to show the training effects (comparing pre- and post-training results within groups). Analysis of covariance and the Mann-Whitney test were used for between-group comparisons after training. Finally, repeated-measures analysis and the Friedman test were applied to analyze the reliability of the training outcomes.
We assessed our assumption regarding the efficacy of auditory modulation training by performing the training just for the intervention group. The STM detection thresholds were obtained in both groups before and after the training. The paired t test showed significant differences in STM thresholds (in both ears and all six stimulus conditions) of the intervention group pre- and post-training (P<0.001). In contrast, the control group who had not received the training showed no difference before and after training in STM thresholds (P>0.05). The results indicated improvement in STM detection thresholds of the training group exclusively (Figure 3).

The Kolmogorov-Smirnov test of normality showed that speech tests (CV in noise and word in noise) had non-normal distribution in both intervened and control groups (P<0.05). So, the Wilcoxon test was used to compare speech in noise test results before and after training in both groups. Table 2 displays that in the training group, the speech in noise scores follow similar trends as STM thresholds.

In other words, the training had successful effects on the improvement of speech in noise perception (Table 2).
The post-training STM detection thresholds were compared between the intervention and control groups by analysis of covariance. This test controls for the interfering effect of the pre-training thresholds as a confounding variable. Significant differences between the training and control groups in post-training STM thresholds were observed (Table 3).

The Mann-Whitney test was also used to compare post-training scores of the speech in noise tests between groups. Similar to the STM thresholds, the speech in noise scores showed significant differences after training between the intervention and control groups (Table 4).

The enhancement of STM detection thresholds and speech in noise scores showed that the auditory modulation training was effective. The next question is how long the training effects persist after the end of the intervention. Therefore, the reliability of the training effects was also examined. For this purpose, the STM detection thresholds and speech test scores were retested one month after the training finished (in the training group only). The STM detection thresholds were compared before, immediately after, and one month after training using repeated-measures analysis. At first, Mauchly’s test of sphericity was done (P>0.05). By the Greenhouse-Geisser test, no statistically significant difference was found between the post-training and one-month-later STM thresholds (P<0.05). This lack of a significant difference indicates that the training effects remained reliable for one month after the end of the intervention (Figure 4).

The non-parametric Friedman test was used to compare speech comprehension scores between three assessment stages (before, after, and one month after training finished). Table 5 indicates no significant difference in the CV in noise test scores between immediately after training and one month later (P>0.05).

It means that the CV in noise scores improvement due to training was reliable. However, the auditory modulation training efficacy for the word in noise test was not well maintained, which was statistically significant only in the right ear (Table 5).
4. Discussion
Effects of auditory STM detection training
One of the most salient problems of children with APD is the variable performance in speech understanding tasks. Given that speech contains spectro-temporal acoustic cues, it is likely that speech comprehension problems of these children can be explained by changes in their coding and accurate processing of STM information. The present study determined whether auditory STM training would enhance the speech comprehension skills of children with APD. The lack of published data regarding our hypothesis demanded the current study. Because there are few studies about STM training and its relation to speech in noise, especially in children with APD, we discussed our findings with somewhat similar studies.
Based on our assumption about the role and importance of STM processing in speech understanding, notably in adverse situations such as background noise or competing signals, we trained children with APD by an auditory STM identification practice. Subsequently, the STM detection thresholds’ modifications and changes in the speech in noise scores were tracked. Only the training group received the STM detection practice, but the evaluations (before and after training) were performed for both training and control groups.
Improvements were observed in the STM detection thresholds of the children with APD who received the intervention. These children had better performance and lower thresholds in STM detection tasks post-training. Sabin et al. [28] studied perceptual learning over seven sessions of STM depth discrimination training in healthy young adults. Their study demonstrated that the auditory system contains modulation-sensitive filters that can be modified through specific training and that affect perception. Although they used a contrasting training method (discrimination rather than identification) and had a different sample, our findings are broadly consistent with theirs. They also stated that the discrimination training led to degradation of the modulation detection thresholds, which means that the effects of their discrimination training did not generalize to the identification task [28]. Fitzgerald and Wright [31] assessed the improvement of sinusoidal amplitude modulation detection with training in 9 healthy adults and investigated whether the learning effects would generalize to unpracticed stimuli. They used 720 trials every day for 6 to 7 days and recorded the scores before and after training. Their evaluations consisted of amplitude modulation detection and discrimination tasks. They concluded that practice enhanced the amplitude modulation detection ability of the trained group. Although our study differs from theirs in sample and stimuli, the improved modulation detection ability in both studies is congruent. Unlike Sabin et al. [28], Fitzgerald and Wright [31] noted some degree of learning generalization to untrained stimulus conditions.
These studies indicate that the improvements caused by training are due to perceptual learning at various processing stages of the auditory system. The changes in auditory modulation perception can occur anywhere from the primary sensory representation to the interpretation of this sensory information. Alterations in the weighting of modulation cues in the central auditory system can explain the learning effects of modulation practice. In a study by Faulkner, the effects of spectral ripple discrimination training were investigated in adults with cochlear implants [32]. They showed that 23 sessions of training led to better performance in the subjects, which is consistent with our results. Although modulation training with spectral or temporal cues can improve modulation processing, we should bear in mind that spectro-temporally interacting modulation cues (STM stimuli) are more complicated and have different neural pathways with specially tuned neurons. Barlow et al. [33] reported that 7 hours of temporal modulation detection training did not significantly change the thresholds of the temporal modulation transfer functions in 10 adults with cochlear implants. The inconsistency between our outcome and that of their study may be explained by the different samples, methods, and number of training sessions.
Auditory training can change auditory perception and its neurophysiological substrate. These induced plastic changes in the neural connections and activity of the auditory pathways lead to perceptual learning. STM training can alter the neural circuits tuned to STM and affect perception. Physiologic studies have revealed that the function of STM-tuned neurons in the inferior colliculus, mid-brain, and auditory cortex can be modified by identification and discrimination practices [34, 35]. The behavioral data from our study (decreased STM detection thresholds post-training) also confirm the plasticity of the STM processing pathways in the auditory system.
The optimal goal of successful auditory training is the learning generalization from trained stimuli to untrained conditions. Auditory training based on a specific auditory task can improve performance, not only in the trained task but also in the skills that are not explicitly taught. It can be inferred that the auditory training has some degree of generalization within or between tasks. This inference was our rationale for evaluating the generalization of the STM identification training effects on speech in noise performance. To investigate the learning transfer, the speech tests were done before and after training in intervened and control groups, and the results were compared.
Our results demonstrate that the intervened group had significantly better speech in noise scores post-training than control peers. Post-training speech in noise outcomes revealed that the STM detection training was beneficial, and generalization occurred through procedural learning. The enhanced speech scores are consistent with other studies using computerized auditory training software in APD [18].
Wilson et al. [36] stated that improving modulation detection ability can generalize to speech comprehension in cochlear implant users. In a study by Barlow et al. [33], adults with cochlear implants showed better speech in noise results after psychoacoustic training such as temporal modulation detection practice. However, in the study by Faulkner et al., word detection in noise and musical perception improved in two of eight adults with cochlear implants after spectral ripple discrimination rehabilitation [32]. Similarly, auditory training studies in children have shown measurable improvements in the trained tasks and in the children’s general language skills after the training programs [37, 38, 39]. Millward et al. [40] stated that learning transfer depends on the extent to which the trained and untrained stimuli are similar. If the training stimuli and the evaluations have similarities, learning generalization can be expected.
Accurate speech processing depends on the integration of neural mechanisms in the peripheral and central auditory systems. The spectral and temporal processing of the acoustic signal starts at the periphery and then extends into the central auditory nervous system, which responds selectively to these cues. Consequently, training is expected to enhance the peripheral representation and promote the central processing of spectro-temporal acoustic information. The generalization of learning to speech in noise outcomes in this study suggests that training changed the neural pathways in which STM and speech are processed similarly, or it may reflect parallel plasticity in STM pathways and speech processing networks. This result agrees with the STM processing model, in which sensitive STM filters improve the auditory system's processing of speech features. In this study, the generalization of training to speech perception was quantitatively assessed by speech in noise tests. To qualitatively validate the training effects on an individual’s abilities in real life (listening effort, social behavior, education, and academic achievement), we suggest subjective evaluations with questionnaires.
Auditory STM detection training reliability
To evaluate the reliability of our training in the intervention group, the STM detection and speech in noise tests were repeated one month after the end of the training sessions. The STM detection thresholds did not change significantly during this month, which shows that the training effects on STM detection thresholds in children with APD were retained. In Fitzgerald and Wright’s study [31], the effect of modulation rate discrimination training persisted until 15 months post-training. Faulkner [32] also reported that the effects of ripple discrimination training were retained for 3 months after rehabilitation in adult cochlear implant users.
The improved CV in noise test results in the training group of our study were retained for one month after training. However, the same trend was not seen for the word in noise test, which may be due to the short duration of our training program. Even so, the word in noise test scores one month after training were still better than before training. Additional training sessions may help maintain the performance improvement.
5. Conclusion
This study presents an auditory training program that emphasizes the importance of the STM features of acoustic cues. The training not only improved STM detection thresholds but also generalized to higher processing levels such as speech in noise perception. These results show that modulation training may help manage children with APD. However, because similar studies are lacking, the clinical application of this method needs further investigation. We suggest that future studies investigate the training effects in a larger population and include electrophysiologic evaluations. Furthermore, comparing spectral, temporal, and spectro-temporal modulation training separately may help in designing a successful training protocol. Finally, long-term monitoring of the training outcomes would help evaluate the success of rehabilitation after diagnosis.

Ethical Considerations
Compliance with ethical guidelines
All ethical principles were considered in this article. The participants were informed of the purpose of the research and its implementation stages. They were also assured of the confidentiality of their information and were free to leave the study whenever they wished, and the research results would be made available to them on request. Written consent was obtained from the subjects. The principles of the Declaration of Helsinki were also observed. This study was approved by the Ethics Committee of the University of Social Welfare and Rehabilitation Sciences of Tehran under code IR.USWR.REC.1396.253.

Funding
This research did not receive any grant from funding agencies in the public, commercial, or non-profit sectors.

Authors' contributions
Conceptualization and Supervision: Parisa Jalilzadeh Afshari, Younes Lotfi, and Abdollah Moossavi; Methodology: Parisa Jalilzadeh Afshari, Enayatollah Bakhshi, and Hamed Sadjedi; Data collection: Parisa Jalilzadeh Afshari; Data analysis: Parisa Jalilzadeh Afshari and Enayatollah Bakhshi; Investigation and Writing–original draft: Parisa Jalilzadeh Afshari; Writing–review & editing: All authors.

Conflict of interest
The authors declared no conflict of interest.

Acknowledgments
We wish to thank all participants for their collaboration in this study.

References

Chi T, Gao Y, Guyton MC, Ru P, Shamma S. Spectro-temporal modulation transfer functions and speech intelligibility. The Journal of the Acoustical Society of America. 1999; 106(5):2719-32. [DOI:10.1121/1.428100] [PMID]
Santoro R, Moerel M, De Martino F, Valente G, Ugurbil K, Yacoub E, et al. Reconstructing the spectrotemporal modulations of real-life sounds from fMRI response patterns. Proceedings of the National Academy of Sciences of the United States of America. 2017; 114(18):4799-804. [DOI:10.1073/pnas.1617622114] [PMID] [PMCID]
Winn MB, Won JH, Moon IJ. Assessment of spectral and temporal resolution in cochlear implant users using psychoacoustic discrimination and speech cue categorization. Ear and Hearing. 2016; 37(6):e377-90. [DOI:10.1097/AUD.0000000000000328] [PMID] [PMCID]
Zheng Y, Escabí M, Litovsky RY. Spectro-temporal cues enhance modulation sensitivity in cochlear implant users. Hearing Research. 2017; 351:45-54. [DOI:10.1016/j.heares.2017.05.009] [PMID] [PMCID]
Kafi AK, Ahmadalinezhad A, Wang J, Thomas DF, Chen A. Direct growth of nanoporous Au and its application in electrochemical biosensing. Biosensors and Bioelectronics. 2010; 25(11):2458-63. [DOI:10.1016/j.bios.2010.04.006] [PMID]
Theunissen FE, Elliott T. Spectral and temporal modulations essential to spoken word, gender and timbre identification. The Journal of the Acoustical Society of America. 2008; 123(5):3933. [DOI:10.1121/1.2935997]
Mohan A. Neural correlates of spectral, temporal and spectro-temporal modulation [MS Thesis]. Tampa: University of South Florida; 2014. [Link]
Hullett PW, Hamilton LS, Mesgarani N, Schreiner CE, Chang EF. Human superior temporal gyrus organization of spectrotemporal modulation tuning derived from speech stimuli. Journal of Neuroscience. 2016; 36(6):2014-26. [DOI:10.1523/JNEUROSCI.1779-15.2016] [PMID] [PMCID]
Ly W. Amplitude modulation perception in children with auditory processing disorder [MS Thesis]. London: The University of Western Ontario; 2019. [Link]
Ankmnal Veeranna S, Allan C, Macpherson E, Allen P. Spectral ripple discrimination in children with auditory processing disorder. International Journal of Audiology. 2019; 58(11):733-7. [DOI:10.1080/14992027.2019.1627007] [PMID]
Lotfi Y, Moosavi A, Afshari PJ, Bakhshi E, Sadjedi H. Spectro-temporal modulation detection and its relation to speech perception in children with auditory processing disorder. International Journal of Pediatric Otorhinolaryngology. 2020; 131:109860. [DOI:10.1016/j.ijporl.2020.109860] [PMID]
Alles R, Bamiou D, Batchelor L, Campbell NG, Canning D, Grant P, et al. BSA position statement: Auditory processing disorder (APD). Bathgate: British Society of Audiology. 2011. [Link]
Ludwig AA, Fuchs M, Kruse E, Uhlig B, Kotz SA, Rübsamen R. Auditory processing disorders with and without central auditory discrimination deficits. Journal of the Association for Research in Otolaryngology. 2014; 15(3):441-64. [DOI:10.1007/s10162-014-0450-3] [PMID] [PMCID]
Stern CS. The reliability and validity of the SCAN and SCAN-C for use with children with auditory processing disorders: A systematic review. New York: CUNY Graduate Center; 2016. [Link]

Musiek FE, Chermak GD. Handbook of central auditory processing disorder, Volume I, second edition: Auditory neuroscience and diagnosis. San Diego: Plural Publishing; 2013. [Link]
Rosen S, Cohen M, Vanniasegaram I. Auditory and cognitive abilities of children suspected of auditory processing disorder (APD). International Journal of Pediatric Otorhinolaryngology. 2010; 74(6):594-600. [DOI:10.1016/j.ijporl.2010.02.021] [PMID]
Glennon J, Kirby B. Differential diagnosis of auditory processing disorder in children: A literature review. Normal: Illinois State University; 2018. [Link]
Weihing J, Chermak GD, Musiek FE. Auditory training for central auditory processing disorder. Seminars in Hearing. 2015; 36(4):199-215. [DOI:10.1055/s-0035-1564458] [PMID] [PMCID]
Ahmadi Z, Jarollahi F, Ahadi M, Hosseini AF. Normalization and validation of auditory processing domain questionnaire in normal 8-12 year-old children. Auditory and Vestibular Research. 2017; 26(2):93-8. [Link]
Domitz DM, Schow RL. A new CAPD battery--multiple auditory processing assessment: Factor analysis and comparisons with scan. American Journal of Audiology. 2000; 9(2):101-11. [DOI:10.1044/1059-0889(2000/012)] [PMID]
Lotfi Y, Moosavi A, Abdollahi FZ, Bakhshi E, Sadjedi H. Effects of an auditory lateralization training in children suspected to central auditory processing disorder. Journal of Audiology & Otology. 2016; 20(2):102-8. [DOI:10.7874/jao.2016.20.2.102] [PMID] [PMCID]
Nejati V, Alipour F, Jalilvand Karimi L, Esfandiari L. Persian version of the dichotic digit test for children: Design and evaluation of the psychometric properties. Auditory and Vestibular Research. 2016; 2(1):-62. [Link]
Sanayi R, Mohamadkhani G, Pourbakht A, Jalilvand L, Jalayi S, Shokri S. Auditory temporal processing abilities in early Azari-Persian bilinguals. Iranian Journal of Otorhinolaryngology. 2013; 25(73):227-32. [PMID] [PMCID]
Aarabi S, Jarollahi F, Jalaie S. Development and determination of the validity of Persian version of monaural selective auditory attention test in learning disabled children. Auditory and Vestibular Research Journal. 2016; 25(1):49-54. [Link]
Mehrkian S, Fadaie E, Jalilzadeh Afshari P, Bakhshi E. Preliminary standardization of consonant-vowel in noise test in normal Persian speaking children. Auditory and Vestibular Research Journal. 2019; 28(1):28-35. [DOI:10.18502/avr.v28i1.413]
Lotfi Y, Salim S, Mehrkian S, Ahmadi T, Biglarian A. The Persian version of words-in-noise test for young population: Development and validation. Auditory and Vestibular Research Journal. 2016; 25(4):194-200. [Link]
Wilson RH, Farmer NM, Gandhi A, Shelburne E, Weaver J. Normative data for the words-in-noise test for 6- to 12-year-old children. Journal of Speech, Language, and Hearing Research. 2010; 53(5):1111-21. [DOI:10.1044/1092-4388(2010/09-0270)] [PMID]

Sabin AT, Eddins DA, Wright BA. Perceptual learning evidence for tuning to spectrotemporal modulation in the human auditory system. Journal of Neuroscience. 2012; 32(19):6542-9. [DOI:10.1523/JNEUROSCI.5732-11.2012] [PMID] [PMCID]
Bellis TJ, Anzalone AM. Intervention approaches for individuals with (central) auditory processing disorder. Contemporary Issues in Communication Science and Disorders. 2008; 35:143-53. [DOI:10.1044/cicsd_35_F_143]
Moore DR. Auditory processing disorder (APD): Definition, diagnosis, neural basis, and intervention. Audiological Medicine. 2006; 4(1):4-11. [DOI:10.1080/16513860600568573]
Fitzgerald MB, Wright BA. Perceptual learning and generalization resulting from training on an auditory amplitude-modulation detection task. The Journal of the Acoustical Society of America. 2011; 129(2):898-906. [DOI:10.1121/1.3531841] [PMID] [PMCID]
Faulkner KF. Understanding frequency encoding and perception in adult users of cochlear implants [PhD Dissertation]. Seattle: University of Washington; 2012. [Link]
Barlow N, Purdy SC, Sharma M, Giles E, Narne V. The effect of short-term auditory training on speech in noise perception and cortical auditory evoked potentials in adults with cochlear implants. Seminars in Hearing. 2016; 37(1):84-98. [DOI:10.1055/s-0035-1570335] [PMID] [PMCID]
Fritz J, Shamma S, Elhilali M, Klein D. Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nature Neuroscience. 2003; 6(11):1216-23. [DOI:10.1038/nn1141] [PMID]
Slee SJ, David SV. Rapid task-related plasticity of spectrotemporal receptive fields in the auditory midbrain. The Journal of Neuroscience. 2015; 35(38):13090-102. [DOI:10.1523/JNEUROSCI.1671-15.2015] [PMID] [PMCID]
Wilson BS, Dorman MF, Woldorff MG, Tucci DL. Cochlear implants: Matching the prosthesis to the brain and facilitating desired plastic changes in brain function. Progress in Brain Research. 2011; 194:117-29. [DOI:10.1016/B978-0-444-53815-4.00012-1] [PMID] [PMCID]
Moore DR, Halliday LF, Amitay S. Use of auditory learning to manage listening problems in children. Philosophical Transactions of the Royal Society B: Biological Sciences. 2009; 364(1515):409-20. [DOI:10.1098/rstb.2008.0187] [PMID] [PMCID]
Gaab N, Gabrieli JD, Deutsch GK, Tallal P, Temple E. Neural correlates of rapid auditory processing are disrupted in children with developmental dyslexia and ameliorated with training: An fMRI study. Restorative Neurology and Neuroscience. 2007; 25(3-4):295-310. [PMID]
Russo NM, Nicol TG, Zecker SG, Hayes EA, Kraus N. Auditory training improves neural timing in the human brainstem. Behavioural Brain Research. 2005; 156(1):95-103. [DOI:10.1016/j.bbr.2004.05.012] [PMID]
Millward KE, Hall RL, Ferguson MA, Moore DR. Training speech-in-noise perception in mainstream school children. International Journal of Pediatric Otorhinolaryngology. 2011; 75(11):1408-17. [DOI:10.1016/j.ijporl.2011.08.003] [PMID]

Article type: Original Research Articles | Subject: Audiology
Received: 2021/07/24 | Accepted: 2021/08/16 | Published: 2022/06/1

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
