1. Introduction
Developmental Language Disorder (DLD), mainly referred to as Specific Language Impairment (SLI) or primary language impairment (PLI) in previous literature, is a condition in which children fail to develop language and communication skills at the expected rate despite age-appropriate development in other areas. Moreover, their language problems are not associated with known biomedical causes such as intellectual disability, sensorineural hearing loss, or autism [1].
DLD is a very common disorder that affects two children in each classroom on average [1]. Affected children experience difficulties in different aspects of language and in both the comprehension and production modalities [1, 2]. Although for most children the prominent language deficits are centered on grammar, the symptoms of this condition vary significantly from child to child and among age groups [2, 3]. Because of these heterogeneous symptoms, differentiating young children with DLD from Typically-Developing (TD) children is both a challenge and a necessity [4].
DLD is commonly diagnosed by specifying inclusionary as well as exclusionary criteria. Exclusionary criteria rule out problems in developmental domains other than language, whereas inclusionary criteria confirm the presence of a language problem [5]. While there seems to be general agreement about exclusionary criteria, there is no consensus among specialists about inclusionary criteria [6], and the criteria vary among studies from scores on standardized language tests to assessment of the child’s language in real-life situations [5]. Although many norm-referenced tests and language measures permit Speech and Language Pathologists (SLPs) to evaluate language skills, the selected tools should be valid for diagnostic purposes [7]. Accurate diagnosis is critical because it prevents misidentification or over-identification errors. As a first step, the psychometric properties of a test/measure, including validity and reliability, are important criteria that affect its applicability as a diagnostic tool [8]. However, although most commercially available tests have acceptable validity and reliability and demonstrate group differences between language-impaired and TD children, these properties are not enough to conclude that the tests can be used in clinical situations with certainty. Empirical examinations have revealed that some widely used language tests/measures may not be accurate tools for identifying children with DLD [5, 7]. For example, while the Structured Photographic Expressive Language Test-2 (SPELT-2) can accurately differentiate preschool children with DLD from children with typical language development, the Peabody Picture Vocabulary Test-IV, the third most extensively used norm-referenced test for the diagnosis of children with language disorders in the United States, cannot (for more information about accurate standardized tests or measures for the diagnosis of DLD and factors influencing the selection of tests, see Maleki Shahmahmood et al. and Betz et al.) [5, 9].
To support the accurate identification of impaired individuals within a normally developing population, the diagnostic accuracy of a test must be further explored in terms of sensitivity and specificity [5]. Plante and Vance (1994) suggested that sensitivity and specificity rates of ≥90% are considered good, 80%-89% acceptable, and below 80% unacceptable [10]. To estimate the probability that a test result gives the correct diagnosis, predictive values, including the Positive Predictive Value (PPV) and Negative Predictive Value (NPV), are also calculated. These values define the probability that a patient with a known test result has the disease. In other words, the PPV shows the proportion of subjects with positive results who truly have the disease of interest, and the NPV shows the proportion of subjects with negative results who truly do not have it [11].
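The four indices described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example counts are hypothetical and are not taken from the study data. The rating bands follow the Plante and Vance thresholds quoted in the text.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute accuracy indices from a 2x2 table.

    tp: impaired children with a positive test result
    fp: typically developing children with a positive test result
    fn: impaired children with a negative test result
    tn: typically developing children with a negative test result
    """
    return {
        "sensitivity": tp / (tp + fn),  # impaired children correctly identified
        "specificity": tn / (tn + fp),  # unimpaired children correctly identified
        "ppv": tp / (tp + fp),          # positive results that are truly impaired
        "npv": tn / (tn + fn),          # negative results that are truly unimpaired
    }


def plante_vance_rating(rate):
    """Label a sensitivity or specificity rate (0-1) using the bands cited in the text."""
    if rate >= 0.90:
        return "good"
    if rate >= 0.80:
        return "acceptable"
    return "unacceptable"


# Hypothetical counts, for illustration only.
result = diagnostic_accuracy(tp=18, fp=2, fn=2, tn=18)
for name, value in result.items():
    print(name, round(value, 2))
print(plante_vance_rating(result["sensitivity"]))
```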
The diagnostic power of many language tests/measures in different languages for differentiating children suspected of having DLD has already been investigated and confirmed. However, due to linguistic and cultural differences, these tests/measures cannot be used in other languages such as Persian. Not only is there a very limited number of tests/measures for evaluating the language performance of Persian children [12], but there are also no language tests/measures with established accuracy for the differential diagnosis of Persian preschool children with DLD, except clinical assessment measures from language samples [13]. Iranian speech therapists mainly decide about the presence of language impairment based on clinical judgment, after gathering informal information from different developmental areas, including language. Arbitrary cut-off scores of 1.5 to 2 SD below the mean on the Persian version of the Test of Language Development-Primary (TOLD-P:3) are also used by some clinicians and researchers as a criterion to confirm the presence of language disorder. However, substantial data indicate that arbitrary cut-off scores do not lead to accurate diagnosis, because cut-off scores derived from different tests can differ significantly even when the tests are validated on the same sample [10]; furthermore, language-impaired children do not necessarily obtain scores that fall below these commonly applied cut-off scores [7].
The “clinical assessment measures from language samples” are the only language measures whose sensitivity and specificity for the differential diagnosis of preschool Persian children with DLD have been investigated and confirmed [13]. Although language assessment in naturalistic contexts has some advantages, the intra- and inter-rater reliability of standardized tests is higher, because the reliability of language sample analysis depends on the examiner’s expertise and on the type, structure, and context of language sampling [14]. Furthermore, advantages of standardized tests such as a uniform assessment protocol, equivalent content for all subjects, objectivity, easy scoring, an understandable protocol, and interpretable results increase researchers’ interest in using these tests for diagnostic purposes [14]. Therefore, the lack of a formal linguistic test that can reliably distinguish language-impaired children from TD children in Persian is one of the concerns of SLPs in Iran.
The Photographic Expressive Persian Grammar Test (PEGT) is a grammar production screening test that was developed, validated, and standardized for evaluating basic and important grammatical structures in 4-6-year-old Persian-speaking children [15]. The test contains 40 colored pictures of everyday life and elicits 20 frequent and important morphosyntactic structures of the Persian language. Learning these syntactic structures is particularly problematic for children with DLD, even at older ages [2, 16]. The content validity and reliability of PEGT have been reported as 86% and 90%, respectively [15].
This study is an empirical examination of PEGT [15]. The main aim was to assess the diagnostic accuracy (including sensitivity, specificity, optimal cut-off point, PPV, and NPV) of PEGT for differentiating 4-6-year-old Persian-speaking children with DLD from TD children. As a result, speech therapists who work with Persian-speaking children in Iran or elsewhere in the world will be able to diagnose language-impaired children with greater reliability and certainty.
2. Methods
Ethical consideration
This article was derived from a research project approved by the Ethics Committee of Mashhad University of Medical Sciences, Mashhad, Iran (Ethical code: IR.MUMS.REC.1395.153). Parents signed informed consent at the time of their children’s participation.
Study participants
The participants of this methodological study were forty 4-6-year-old Persian-speaking children in two groups: children with DLD and TD children. The children suspected of DLD (13 boys and 7 girls) were referred by SLPs who worked in university-affiliated or private clinics in Mashhad City, Iran. The TD group contained 20 typically developing Persian-speaking children who were selected from local preschools and individually matched to the children with DLD in terms of gender, age, and socioeconomic status (mother’s education and metropolitan area of residence).
Materials and procedures
All participants were assessed by a certified SLP with good experience in working with DLD children. The parents of the children suspected of having DLD were interviewed about the medical and developmental history of the child and they confirmed their child’s normal growth in all developmental areas (including cognitive, perceptual, motor, social, and emotional domains) except language.
The physical and developmental health of the TD children was appraised according to the teachers’ reports and the preschool records. None of these children had a history of speech or language problems, intellectual deficits, neurological or psychiatric illnesses, or sensory problems in hearing or vision.
To rule out intellectual disability, all 40 participants were evaluated by the Persian version of the Wechsler Preschool and Primary Scale of Intelligence (WPPSI) [17]. All children obtained Intelligence Quotient (IQ) scores above 75 (scores below this value are considered to fall in the range of intellectual disability) and showed no signs of limitation in adaptive behaviors. The mean standard score was 107.8 (SD=13.2) for the TD group and 97.5 (SD=13.1) for the DLD group. All participants had normal hearing (according to the results of the whisper test) [18], efficient oral structures and functions for speech (based on oral examination), and no history of seizure or frank neurological disorders.
Children’s language status (DLD or TD) was initially established through both objective (analysis of the child’s language sample) and subjective (clinical judgment) criteria. Twenty minutes of conversation between the child and her/his mother in a free-play setting with a set of toys was recorded and transcribed by one of the authors. Among the language measures introduced by Kazemi et al. [13] as clinically usable for the differential diagnosis of Persian-speaking preschool children with DLD, Mean Length of Utterance in words (MLU-w) was selected as the reference standard. It is traditionally calculated by collecting 50 utterances spoken by a child and dividing the total number of words by the number of utterances. MLU-w values lower than 2.96 were interpreted as signaling the presence of language impairment [13]. Based on the Spearman correlation coefficient, the intra-transcriber reliability ranged from 0.85 to 0.90 across measures (P < 0.001).
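The MLU-w calculation described above amounts to a simple ratio, sketched here in Python. The function names are hypothetical, the sample utterances are invented English placeholders, and splitting on whitespace is a simplifying assumption; real transcripts follow explicit word-segmentation conventions that this sketch does not model. The 2.96 threshold is the value quoted in the text.

```python
MLU_W_CUTOFF = 2.96  # threshold quoted in the text for signaling language impairment


def mlu_w(utterances):
    """Mean length of utterance in words: total words divided by number of utterances."""
    if not utterances:
        raise ValueError("at least one utterance is required")
    # Assumption: words are separated by whitespace in the transcript.
    total_words = sum(len(utterance.split()) for utterance in utterances)
    return total_words / len(utterances)


def below_reference_cutoff(utterances, cutoff=MLU_W_CUTOFF):
    """Return True when the sample's MLU-w falls below the reference cut-off."""
    return mlu_w(utterances) < cutoff


# Invented three-utterance sample; an actual sample would contain 50 utterances.
sample = ["want ball", "mommy look at this", "no"]
print(round(mlu_w(sample), 2), below_reference_cutoff(sample))
```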
Clinical judgment was made by an experienced SLP who was associated with this study and confirmed the classification. This judgment drew on several sources of information, including face-to-face interaction between the clinician and the child, the child’s history of enrollment in therapy, information reported by the child’s speech and language therapist, and parents’ reports about their child’s speech, language, and communication.
After the primary classification of participants in TD or DLD groups, PEGT was administered to all participants by a single examiner who was trained for its administration and also was blind to the primary classification status of participants. All participants were tested individually in a room with proper light and ventilation and free from auditory and visual distractors. The test was executed and scored by the examiner based on the instructions which have been described in detail in the design and validation study of PEGT [
15].
Data analysis
Data analysis was done in SPSS version 21. Descriptive statistics (mean, standard deviation, minimum, and maximum) were calculated. The Kolmogorov-Smirnov test was applied to assess the normality of the distribution of the variables. Then, the Mann-Whitney U test and the Receiver Operating Characteristic (ROC) curve were used, respectively, to examine the ability of the PEGT to distinguish children with DLD from TD children and to determine the optimal cut-off point, sensitivity, and specificity.
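As a rough illustration of how an ROC analysis yields candidate cut-off points, the following Python sketch computes sensitivity and specificity at each candidate and picks the one with the highest Youden index (sensitivity + specificity - 1). Several things here are assumptions rather than details from the study: the text does not state which rule was used to select the optimal cut-off, the sketch treats DLD as the positive condition with scores below the cut-off counted as positive, the function names are hypothetical, and the scores are invented.

```python
def roc_points(dld_scores, td_scores):
    """Return (cutoff, sensitivity, specificity) for each candidate cut-off.

    Assumption: a score below the cut-off counts as a positive (DLD) result.
    Candidates are midpoints between adjacent observed scores, plus one
    value below the minimum and one above the maximum.
    """
    scores = sorted(set(dld_scores) | set(td_scores))
    cutoffs = [scores[0] - 1]
    cutoffs += [(low + high) / 2 for low, high in zip(scores, scores[1:])]
    cutoffs.append(scores[-1] + 1)
    points = []
    for cutoff in cutoffs:
        sensitivity = sum(s < cutoff for s in dld_scores) / len(dld_scores)
        specificity = sum(s >= cutoff for s in td_scores) / len(td_scores)
        points.append((cutoff, sensitivity, specificity))
    return points


def best_cutoff(points):
    """Pick the candidate with the highest Youden index (an assumed criterion)."""
    return max(points, key=lambda point: point[1] + point[2] - 1)


# Invented scores, for illustration only.
dld = [4, 6, 9, 11]
td = [16, 19, 23, 28]
print(best_cutoff(roc_points(dld, td)))
```

Plotting sensitivity against 1 - specificity for the returned points gives the ROC curve itself.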
3. Results
The Mean±SD age of participants was 62.52±9.48 months at the sampling stage (Table 1).
To evaluate the differential validity of PEGT, the mean total scores of the test were compared between the TD and DLD groups using the Mann-Whitney test (Table 2).
The mean total score of the PEGT differed significantly between the DLD and TD groups (P<0.001).
To calculate the sensitivity and specificity of the PEGT, the ROC curve was drawn (Figure 1).
4. Discussion
Assessment with an instrument for which the identification accuracy has not been specified is a “poor use of time” [7] and can lead to misdiagnosis, which in turn affects the overall service planning.
As suggested by van der Lely et al., the diagnostic power of a language test heavily depends on the content of that test and the characteristics of language disorder in the target group [20]. Although children with DLD show dissimilar profiles of weaknesses and strengths in different language domains, their most prominent problems lie in the domain of morphosyntax. Therefore, grammar tests appear to be more accurate for the diagnosis of children with DLD than tests that target other language domains.
PEGT as a grammatical screening test for young children evaluates 20 high-frequency grammatical structures of the Persian language in about 15 minutes (see reference 21 for more information).
The study results provide evidence that PEGT can usefully differentiate between Persian preschool children with DLD and their TD peers.
Comparison of the participants’ performances shows that children with DLD scored significantly lower on PEGT than their age-matched control peers; thus, PEGT reveals group differences between children with DLD and TD children. The diagnostic power of PEGT was evaluated in terms of sensitivity and specificity. The estimated values suggest that PEGT is a suitable instrument for the differential diagnosis of preschool Persian children with DLD. All children except one who had been classified as TD by the reference standard were also identified as TD by PEGT (Table 4). The PPV of PEGT was therefore calculated as 95%, meaning that a child with a positive test result has a 95% probability of having language impairment. The NPV of 100% means that a child with a negative test result has a 0% chance of being language-impaired. However, unlike sensitivity and specificity, predictive values are not intrinsic characteristics of the test and are affected by the prevalence of the disorder in the population [21, 22, 23]. As the prevalence of DLD in the selected sample is deliberately higher than the actual prevalence in the community, the obtained PPV and NPV do not indicate the true predictive power of PEGT in the community. However, these values are useful, as they allow comparisons among diagnostic tests under similar conditions.
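The dependence of predictive values on prevalence follows from Bayes' rule and can be sketched in Python as below. The function name is hypothetical. The sensitivity of 100% and specificity of 95% are the values the text reports for the 9.5 cut-off, and the 50% prevalence reflects the 20 DLD and 20 TD children in the sample; the 7% community prevalence is purely an assumed figure for illustration and does not come from this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence (all values 0-1)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Balanced sample, as in a matched case-control design (prevalence = 0.5).
ppv_sample, npv_sample = predictive_values(1.0, 0.95, 0.5)

# Assumed community prevalence of 7% (hypothetical, for illustration only).
ppv_community, npv_community = predictive_values(1.0, 0.95, 0.07)

print(round(ppv_sample, 3), round(ppv_community, 3))
```

With the same sensitivity and specificity, the PPV at the lower assumed prevalence comes out well below the PPV in the balanced sample, which is the point the paragraph above makes.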
The choice of cut-off point is a matter for clinicians and researchers. Our data provide a basis for these decisions (Table 3). Following the recommendation of Plante and Vance [10], both sensitivity and specificity were good at three cut-off points: 9.5 (sensitivity=100% and specificity=95%), 12.5 (sensitivity=100% and specificity=100%), and 17 (sensitivity=95% and specificity=100%). At the cut-off point of 21 (sensitivity=81% and specificity=100%), sensitivity fell within the fair range. Although at cut-off points of -1, 1, and 5 the test identified all children with DLD, the low specificity values (0, 0.21, and 0.48, respectively) mean that many children with typical language development would be over-identified as language-impaired. Conversely, at cut-off points of 26, 30, 33, and 35, the PEGT accurately identified TD children, but the low sensitivity did not guarantee identification of the affected children.
The data indicate that at the cut-off score of 12.5, the PEGT has the optimal sensitivity and specificity for differentiating 4-6-year-old children with DLD from their TD peers. Given the normative distribution of PEGT (Mean: 25.46; SD: 6.24), a cut-off point between -0.5 SD and -2.5 SD yields approximately precise identification, with the optimal cut-off point lying at about -2 SD. Although a cut-off score slightly lower or higher than 12.5 (-2 SD) would not cause a noteworthy decrease in the PEGT’s sensitivity or specificity, it may result in the misclassification of a few children.
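Expressing a raw cut-off score in SD units relative to the normative distribution is a one-line standardization, sketched here in Python. The function name is hypothetical; the default mean and SD are the normative values quoted in the text.

```python
def cutoff_in_sd_units(cutoff, norm_mean=25.46, norm_sd=6.24):
    """Convert a raw cut-off score to its distance from the normative mean in SD units."""
    return (cutoff - norm_mean) / norm_sd


# The raw cut-off of 12.5 lies roughly 2 SD below the normative mean.
print(round(cutoff_in_sd_units(12.5), 2))
```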
The diagnostic accuracy of PEGT is very close to, and even surpasses, that reported for SPELT-2 and SPELT-3 [4, 24], English grammar tests with a similar structure that have good sensitivity and specificity for the diagnosis of English-speaking children with DLD. The sensitivity and specificity values of SPELT-2 and SPELT-3 have been estimated to be above 90%, but not as high as the 100% obtained for PEGT. Moreover, the sensitivity and specificity of PEGT exceeded those of the “clinical assessment measures from language samples” [13], the only measures with previously established identification accuracy for the differential diagnosis of preschool Persian children with DLD. As a standardized language test, PEGT also has advantages over language sample analysis, such as results that do not depend on the examiner’s degree of proficiency, as well as ease of administration, scoring, and interpretation. However, a sound clinical decision requires incorporating data from different sources, including case histories, observations in the natural environment, informal assessments, and especially spontaneous speech sampling, alongside the results of standardized language tests [25].
5. Conclusion
Identification measures must be valid and reliable for their intended purpose. The results of the present study, in combination with the results of previous studies on PEGT, indicate that the PEGT is a valid and reliable test with excellent power to determine the presence or absence of language impairment in preschool Persian children. These data therefore provide the evidence that Persian clinicians and researchers need to be confident in their diagnoses when identifying preschool Persian children with DLD.
Ethical Considerations
Compliance with ethical guidelines
This article was derived from a research project approved by the Ethics Committee of Mashhad University of Medical Sciences, Mashhad, Iran (Code: IR.MUMS.REC.1395.153). Parents signed informed consent at the time of their children’s participation.
Funding
This article was extracted from a research project and was supported by Mashhad University of Medical Sciences (Grant No.: 941621).
Authors' contributions
All authors equally contributed to preparing this article.
Conflict of interest
The authors declared no conflict of interest.
References