Introduction
The development of teaching and assessment methods in the field of physical education has received increasing attention in recent years, given the opportunities they offer to enhance cognitive achievement levels and change students’ attitudes toward the subject [1, 2].
Recent studies confirm that the use of innovative educational models, such as the Kemp and generative learning models, can effectively contribute to improving student performance and enhancing their active participation. Scientific evidence indicates that adopting these models can bring about a qualitative shift in the learning process by activating the learner’s role and enhancing their ability to understand and apply what they learn [3]. The importance of this research stems from the urgent need to evaluate the effectiveness of these two models in raising cognitive achievement levels, in addition to their impact on students’ attitudes toward physical education teaching methods [4].
This is because positive attitudes are an important indicator of the success and sustainability of the educational process. Accordingly, this research highlights two key aspects: the impact of the Kemp and generative learning models on improving student performance and changing their attitudes, with the aim of facilitating the development of effective teaching strategies that keep pace with developments in modern education [5].
As many researchers have emphasized, applying the Kemp and generative learning models is an effective method for improving cognitive achievement and fostering positive attitudes toward the subject [6]. Recent studies highlight the importance of adopting these models to achieve a qualitative shift in the teaching and learning process in physical education [6, 7]. The research is expected to contribute to practical recommendations for developing physical education teaching methods, by integrating the Kemp model and generative learning into the curriculum. It can also help improve the teaching strategies used, leading to higher academic achievement and promoting positive attitudes among students.
The goal of teaching physical education teaching methods is to educate learners about the foundations and principles of rhetoric and the quality of speech, in terms of clarity, strength, beauty, brilliant imagery, precise thinking, and brilliant imagination [6]. It also aims to cultivate the senses of the value of words and their importance in conveying meaning appropriately, and to enhance emotions by focusing on masterpieces of speech and literary styles [5]. It also aims to instill in students the value of literary expression and its impact on the soul by understanding the artistic touches highlighted in literary texts and depicting the feelings, emotions, and psychological states of writers [3, 4]. However, poor achievement and weaknesses in understanding and comprehension do not contribute to creating a positive attitude toward the subject and may lead to weak motivation and enthusiasm for learning [8].
Among the difficulties facing the Teaching Methods subject is the lack of an approved textbook that fully achieves the course’s objectives. In addition, most teachers focus on the theoretical aspect and neglect the practical aspect, which is evident in their distance from the learners’ reality and environment. The content does not develop students’ personal, professional, and performance talents, and does not meet their needs, interests, and developmental requirements [9]. Fluctuating achievement levels lead to a lack of interest in and commitment to the subject, which exacerbates the problem of discontinuity and loss of interest in the subject as an essential part of the teaching and learning process [10].
Some believe that the difficulties facing the teaching methods of physical education may be due to the assessment methods used, particularly in the area of cognitive achievement [11-13]. Most assessment methods focus on memorization and rote learning, and their results are primarily adopted at the end of the academic year, ignoring ongoing formative assessment throughout the academic year [14]. Furthermore, assessment methods fail to achieve a balanced measurement of learners’ knowledge, attitudes, and practical skills, nor do they contribute to the development of their strengths and weaknesses, which reduces the effectiveness of the process of developing learning methods in physical education. This approach leads to weak motivation and reduces students’ desire to improve their skills and develop positive attitudes toward the subject [15, 16].
Some believe that learners’ weakness in the subject of teaching methods of physical education is due to the teacher, who should be thoroughly familiar with the subject, its contents, and its evolving terminology [17]. However, the subject is sometimes taught by teachers who are unable to convey information to learners because of their weak subject knowledge and their lack of additional information that would develop their field of specialization; such teachers are satisfied with what is contained in some auxiliary curriculum books and add nothing that gives the lesson vitality and novelty [18].
Another difficulty facing teaching methods is the way in which the content is presented, which is often dry. Further difficulties include applying the concepts of instruction and motor skills, among others [15], as well as the lack of diversity in teaching methods and approaches for physical education, the use of outdated teaching methods, and the focus on providing learners with cognitive information and memorizing rules while neglecting skill-related aspects [14].
Thus, the problem of the current research can be defined by answering the following question: What is the effect of the Kemp and generative learning models on cognitive achievement in the subject of physical education teaching methods and students’ attitudes toward them?
Materials and Methods
Experimental design
The researchers adopted a three-arm controlled experimental design appropriate to the research conditions. The design was as follows. The first experimental group studied physical education teaching methods using the Kemp model. The second experimental group studied physical education teaching methods using the generative learning model. Finally, the control group studied physical education teaching methods using the established method.
Research community and samples
The first step to consider when selecting a sample is to identify the original community, as results related to a specific community cannot be derived until the components that comprise that community are sufficiently identified. The current research community comprises all fourth-year students (who previously studied the teaching methods course in the third year) at the College of Basic Education, Mustansiriyah University, for the academic year 2024-2025. The researchers began implementing the experiment on students in the three research groups on October 1, 2024, with one class per week for each group. The experiment was concluded on January 15, 2025.
Using the task facilitation book provided by the College Deanship, the researchers were able to identify the community. For the academic year 2024–2025, it comprised three groups of fourth-year students: A, B, and C. Group B, whose students would be exposed to the first independent variable (the Kemp model), was chosen at random to represent the first experimental group. Group C, whose students would be exposed to the second independent variable (the generative learning model), represented the second experimental group. Group A, whose students followed the conventional approach in their studies, represented the control group.
The three research groups had 98 students in total: 31 in group A, 33 in group B, and 34 in group C. After the exclusion of students due to academic failure or excessive absenteeism, the sample consisted of 93 students: 30 in the control group, 32 in the first experimental group, and 31 in the second experimental group.
Matching the three research groups
Even though everyone in the research sample came from the same department, various factors that might affect the findings needed to be controlled. Therefore, before conducting the experiment, the researchers confirmed that the three study groups were matched on characteristics suggested by earlier research [19] as likely to affect the precision of the experiment’s findings. These variables were the previous grades in the physical education teaching methods course (third year) and scores on the Raven intelligence test. The researchers obtained the course grades from the department’s third-year student records, while the data for the second variable were obtained by administering the Raven intelligence test.
The following subsections describe the matching process for the three research groups.
Final grades in physical education teaching methods (third year)
In the third-year physical education teaching methods course, the three groups had average scores of 64.34%, 64.87%, and 62.23%, respectively. A one-way analysis of variance (ANOVA), a statistical test used to determine whether all groups are drawn from similar populations, showed that the difference was not statistically significant at the 0.05 level: the calculated F=0.526 was smaller than the tabular F=3.10, with 2 degrees of freedom between groups and 90 within groups.
Raven intelligence test scores
To verify the equivalence of the three research groups in terms of intelligence, the researchers relied on the Raven progressive matrices test. This test, designed to measure mental ability, is among the most widely used measures of intelligence. It occupies an important position among intelligence tests that do not depend on the verbal ability of the examinees. Psychologists consider it a good test because of its validity and reliability, and its norms are suitable for the Iraqi educational environment. In addition, this test can be easily administered to large groups of individuals simultaneously and is suitable for the age group to which the current study sample belongs.
The test consists of 60 items distributed across 5 groups. The average intelligence test scores for the three research groups were 37.16%, 37.1%, and 38.27%, respectively. According to a one-way ANOVA, the difference between the three groups’ scores was not statistically significant at the 0.05 level: the calculated F=0.256 was smaller than the tabular F=3.10, with 2 degrees of freedom between groups and 90 within groups. The three study groups were therefore statistically equivalent on the intelligence test.
Controlling internal variables in the experiment
Although experimental research, especially on human phenomena, cannot be tightly controlled, there are factors, or variables other than the independent factor, that affect the validity of the experiment in one way or another. Therefore, researchers must identify and control these factors to prevent their influence on the dependent variables. Experimental research is subject to internal (non-experimental) factors that affect the internal and external validity of the experimental design. These factors include concomitant incidents, maturation, and experimental attrition. These variables were controlled with the statistical tests.
Measurement tools
The researchers used two standardized tools for the three research groups. The first was a scientifically standardized cognitive achievement test adopted for the purposes of the current research [5-9]. The second was a standardized attitude scale, used to measure the attitudes of the students in the three research groups toward the subject of physical education teaching methods.
Preparing instructional plans
Instructional plans are advance conceptions of the teaching situations and procedures that the teacher and students will carry out to achieve specific educational goals. This process includes defining objectives, selecting methods that will help achieve them, selecting implementation methods, and evaluating the students’ cognitive achievement of those objectives.
Since developing instructional plans is one of the requirements for successful teaching, the researchers prepared instructional plans for the topics to be taught during the experimental period, based on the contents of the prescribed textbook and the behavioral objectives of the subject. They followed the steps of the Kemp model for teaching students in the first experimental group [11], the generative learning model for teaching students in the second experimental group, and the traditional method for teaching students in the control group.
Preparing research tools
Cognitive achievement test
A cognitive achievement test is defined as a systematic procedure for determining the extent to which students have learned a subject according to specific objectives. It can be used to improve learning methods and contribute to improved planning, implementation control, and assessment of achievement.
One of the requirements of this research was a cognitive achievement test in the physical education teaching methods course to measure the cognitive achievement of students in the three research groups during the experiment period. An existing test was adopted. It had previously been applied in the Iraqi environment at the same age level in recent years, with a content validity of 0.89 as measured by expert opinion, a reliability of 0.88 by the split-half method, and an objectivity of 0.89. It is therefore considered valid and reliable for measuring the cognitive achievement of fourth-year students in the physical education teaching methods course, according to the behavioral objectives and their levels and the content of the scientific material specified for the experiment. It consists of 30 items divided into three questions: the first comprises 18 multiple-choice items, the second 5 matching (pairing) items, and the third 7 short-answer items [20].
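The split-half reliability mentioned above can be illustrated with a minimal sketch. The paper does not state how the halves were formed or whether a correction was applied, so the odd/even split and the Spearman-Brown step-up below are assumptions made only for illustration, and all function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores per student.

    Assumption: halves are formed from odd- and even-numbered items,
    and the half-test correlation is stepped up with Spearman-Brown.
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(first_half, second_half)
    return 2 * r_half / (1 + r_half)


# Illustrative data only (not from the study): 3 students, 4 items each.
demo_scores = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
demo_reliability = split_half_reliability(demo_scores)
```

In this toy data the two halves rank the students identically, so the estimate reaches its maximum; real test data would normally give a lower value.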
A scale of attitudes toward the physical education teaching methods course
One of the requirements of the current research was a scale to detect the attitudes of fourth-year students toward the subject of physical education teaching methods. The researchers selected an existing standardized scale [10]. It had previously been applied in the Iraqi environment at the same age level in recent years, with a content validity of 0.87 as measured by expert opinion, a reliability of 0.89 by the split-half method, and an objectivity of 0.86. It is therefore considered valid and reliable. It consists of 30 items, divided into positive and negative items.
Scale correction and calculation of the total score
To calculate the total score for the items on the attitude scale, the researchers identified the response alternatives for each item (agree, disagree, do not know). For the 15 positive items supporting the attitude, these alternatives were weighted 3, 2, and 1, respectively; for the 15 negative items opposing the attitude, the weights were reversed to 1, 2, and 3. The researchers corrected the answer sheets and calculated each student’s attitude score by summing the scores received on all items of the scale. Thus, the highest possible score was 90, the lowest was 30, and the midpoint was 60.
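The scoring rule described above can be sketched in a few lines. The weights follow the text; the response labels, the function name, and the assumption that the last 15 items are the negative ones are hypothetical, since the paper does not list which items are negatively worded.

```python
# Weights as stated in the text: (agree, disagree, do not know).
POSITIVE_WEIGHTS = {"agree": 3, "disagree": 2, "dont_know": 1}
NEGATIVE_WEIGHTS = {"agree": 1, "disagree": 2, "dont_know": 3}

# Assumption for illustration: items 16-30 (0-based 15..29) are negative.
NEGATIVE_ITEMS = set(range(15, 30))


def total_attitude_score(responses, negative_items=NEGATIVE_ITEMS):
    """Sum item scores, using reversed weights for negative items."""
    total = 0
    for index, answer in enumerate(responses):
        if index in negative_items:
            total += NEGATIVE_WEIGHTS[answer]
        else:
            total += POSITIVE_WEIGHTS[answer]
    return total


# Most favorable pattern: top weight on every item.
best = ["agree"] * 15 + ["dont_know"] * 15
# Least favorable pattern: bottom weight on every item.
worst = ["dont_know"] * 15 + ["agree"] * 15

highest = total_attitude_score(best)
lowest = total_attitude_score(worst)
```

With 30 items weighted from 1 to 3, the two extreme patterns give the maximum and minimum totals reported in the text.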
Pre-test
The researchers adopted the students’ third-year grades in the physical education teaching methods course as an indicator of the equivalence of the research sample and treated the point at which the students began the course as the zero point for their attitudes. This is because the physical education teaching methods course encourages students to form a more accurate and effective attitude toward teaching than in the third year.
Statistical methods
SPSS software, version 26, was used to perform statistical procedures. The main statistical method for testing the educational achievement and attitude score differences among the 3 independent groups (Kemp model, generative learning model, and traditional instruction) was one-way ANOVA. The study’s quasi-experimental design favored this choice of the primary statistical method, since the said groups were formed based on the different teaching models employed (Kemp model, generative learning model, and traditional instruction). Before conducting the ANOVA, the normality assumption (evaluated through the Shapiro-Wilk tests and Q-Q plots) and variance assumption (checked through Levene’s test) were confirmed, thus supporting the use of parametric testing. After the statistically significant F values were obtained, post hoc pairwise comparisons were performed utilizing Scheffé’s method, which is a conservative approach suitable for unequal sample sizes and is thus robust against the danger of type I error inflation in exploratory comparisons. This method made it possible to accurately pinpoint which specific group pairs were different and the direction of the difference. No analyses of subgroups were planned or executed, for instance, according to gender or prior achievement level, since the focus of the research design was group-level instructional effects over that of individual moderators. Also, analysis of covariance (ANCOVA) controlling for baseline covariates as an adjusted analysis, was not performed, because the pre-intervention equivalence on key confounders, prior course grades and nonverbal intelligence, was already statistically confirmed via 1-way ANOVA (P>0.05 for both), which rendered adjustment unnecessary. With respect to missing data, 5 participants were excluded from the initial sample of 98 due to academic failure or excessive absenteeism (>3 unexcused absences), resulting in a complete-case analytical sample of 93. 
No imputation methods were applied, for instance, mean substitution or multiple imputation, since the attrition rate was low (5.1%) and considered non-systematic. Thus, complete-case analysis was regarded as suitable and non-biased under the assumption of missing completely at random. The entire process of statistical testing was 2-tailed with the significance level set at α=0.05 [16].
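The analysis pipeline described above (a one-way ANOVA followed by Scheffé pairwise comparisons) can be sketched as follows. This is not the authors' SPSS procedure, only a plain-Python illustration; the function names are hypothetical, the data are invented, and the critical F value is passed in from an F table rather than computed.

```python
import math


def one_way_anova(groups):
    """Return (F, df_between, df_within, MS_within, group_means)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    ms_within = ss_within / df_within
    f_value = (ss_between / df_between) / ms_within
    return f_value, df_between, df_within, ms_within, means


def scheffe_pairs(groups, f_critical):
    """Compare each pair's mean difference with its Scheffé critical difference.

    f_critical is the tabular F for (df_between, df_within) at the chosen alpha.
    Returns {(i, j): (abs_difference, critical_difference, is_significant)}.
    """
    _, df_between, _, ms_within, means = one_way_anova(groups)
    results = {}
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            diff = abs(means[i] - means[j])
            critical = math.sqrt(
                df_between * f_critical * ms_within
                * (1 / len(groups[i]) + 1 / len(groups[j]))
            )
            results[(i, j)] = (diff, critical, diff > critical)
    return results


# Invented scores for three small groups (not the study data).
demo_groups = [[1, 2, 3], [2, 3, 4], [7, 8, 9]]
demo_f = one_way_anova(demo_groups)[0]
# 5.14 is the tabular F for df = (2, 6) at alpha = 0.05.
demo_pairs = scheffe_pairs(demo_groups, f_critical=5.14)
```

For the study's own degrees of freedom (2 and 90), the tabular value of 3.10 quoted in the matching section would be the corresponding `f_critical`. Because the critical difference depends on both group sizes, each pair gets its own threshold when sample sizes are unequal.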
Results
Baseline data
One-way ANOVA confirmed no significant between-group differences in prior achievement (M=64.34, 64.87, and 62.23 for the control, Kemp, and generative learning groups, respectively; F(2, 90)=0.526, P>0.05) or intelligence scores (M=37.16, 37.10, and 38.27, respectively; F(2, 90)=0.256, P>0.05), indicating that the three groups were statistically equivalent on variables known to influence learning outcomes.
Following the exclusion of five participants due to academic failure or excessive absenteeism (>3 unexcused absences), 1 from the control group, 1 from the Kemp group, and 3 from the generative learning group, the final analytical sample consisted of 93 students (30, 32, and 31, respectively). A post hoc comparison of baseline characteristics between those retained and those excluded revealed no systematic differences: Excluded students had marginally lower prior course grades (mean ≈58.4 vs 63.9 for completers) but comparable intelligence scores, suggesting that attrition was likely related to academic performance rather than cognitive ability or instructional condition. However, given the low attrition rate (5.1%) and the absence of differential dropout across groups (χ²=1.82, P>0.05), the risk of attrition bias was deemed minimal.
Cognitive achievement
A one-way ANOVA revealed a statistically significant difference in cognitive achievement scores across the three groups (F(2, 90)=34.894, P<0.001). Descriptive statistics indicated that the generative learning group achieved the highest mean score (Mean±SD; 23.39±2.25), followed by the Kemp group (Mean±SD; 17.09±2.1), and the control group (Mean±SD; 16.07±2.05) (Table 1).

Post hoc comparisons using Scheffé’s method confirmed that the generative learning group significantly outperformed both the Kemp group (Scheffé=6.29>critical=2.338, P<0.05) and the control group (Scheffé=7.32>critical=2.376, P<0.05). However, no significant difference was observed between the Kemp group and the control group (Scheffé=1.03, below the critical value; P>0.05) (Table 2).

Attitudes toward the course
A second one-way ANOVA also revealed a statistically significant difference in attitude scores across groups (F(2, 90)=25.085, P<0.001). The generative learning group reported the most positive attitudes (Mean±SD; 59.03±6.8), while the Kemp group (Mean±SD; 43.78±5.9) and control group (Mean±SD; 43.27±6.1) showed comparable, less favorable attitudes (Table 3).

Post hoc Scheffé tests showed that the generative learning group had significantly more positive attitudes than both the Kemp group (Scheffé=15.25>critical=6.245, P<0.05) and the control group (Scheffé=15.77>critical=6.346, P<0.05). No significant difference was found between the Kemp and control groups (Scheffé=0.51, below the critical value; P>0.05) (Table 4).

Discussion
The results of the current study can be interpreted in accordance with its hypotheses. The findings show that students in the generative learning experimental group outperformed both the students in the Kemp experimental group and the students in the control group, who were taught conventionally. This result can be attributed to one or more of the following reasons. First, the nature of the generative learning model helps the learner understand the educational situation in its separate parts and then deal with them in an integrated manner by linking new concepts with relevant previously learned concepts. This leads to meaningful learning that results in a correct understanding of the material being learned [13]. Second, the use of the generative learning model in teaching physical education teaching methods facilitates learning [20]. This is achieved through the stages the learner passes through: new information is connected and coordinated with existing information (the invitation stage) [19]. The model also encourages students to research, experiment, and analyze (the exploration stage). Furthermore, the stage of proposing solutions and explanations encourages students to discuss and engage in dialogue with each other and with the teacher, strengthening their confidence in their ability to provide logical solutions and explanations. This confidence is demonstrated in the taking action, or application, stage [19]. Third, the nature of the generative learning model, in which the student both teaches and learns, requires several skills, including observation, inference, summarizing ideas, discussion, and exchange of opinions [2, 6]. These skills are central to the educational process, and all of them contribute to increased cognitive achievement [10].
The use of the generative learning model allows students to interact with each other and helps them increase their motivation and connect learning experiences, leading to a deeper understanding of the facts and information of the material being learned [9].
Several methodological limitations must be pointed out. First, the use of intact class sections, though strengthening ecological validity, ruled out random assignment of individuals and hence might have introduced unmeasured confounding factors despite the baseline equivalence on the key covariates. Second, the sample was taken from one institution and consisted of male students, which limits the applicability of the findings to more diverse populations or educational systems. Third, the outcome measures were assessed immediately after the intervention; therefore, the durability of the observed gains remains to be determined. Finally, even though instructional fidelity was supported through the use of standardized lesson plans and a single teacher, the lack of direct classroom observation means that unmeasured differences in model implementation might exist.
Future research should address these limitations through cross-institutional replications with gender-inclusive samples, longitudinal designs that measure retention and transfer of learning, and mixed-methods approaches that combine observational and qualitative data to shed light on how pedagogical models are experienced. Further, the role of generative learning in teacher education would be clarified by comparative studies involving other active learning strategies, such as problem-based learning or scenario-based instruction, which would show the relative strengths of generative learning among different teaching methods. Moreover, incorporating digital technologies such as video annotation or virtual teaching simulations into generative frameworks could boost both scalability and engagement, especially in settings with limited resources.
Conclusion
Within the study’s limitations, the generative learning model outperformed the Kemp model and the traditional method in terms of fourth-year students’ cognitive achievement in the physical education teaching methods course. Students in the second experimental group, who used the generative learning model, also demonstrated more positive attitudes toward the physical education teaching methods course than students taught with the Kemp model or the traditional method. Teaching according to the generative learning model relies on learner activity and self-construction of knowledge, as students work in groups, which fosters a spirit of cooperation and positive teamwork. Applying the steps of the generative learning model in teaching physical education teaching methods inspires enthusiasm, vitality, and activity in students.
Ethical Considerations
Compliance with ethical guidelines
This study was approved by the Ethics Committee of Mustansiriyah University, Baghdad, Iraq (Code: EU174, 15-12-2024).
Funding
This research did not receive any grant from funding agencies in the public, commercial, or non-profit sectors.
Authors' contributions
Study design and data analysis: Ali Ahmed Hameed; Data collection: Hamid Kadum Hassan and Abdullah Hameed Mohsin; Writing: Ahmed Naji Mahmood and Ghanim Hadi Sbahi.
Conflict of interest
The authors declared no conflict of interest.
Acknowledgments
The authors would like to thank the College of Physical Education and Sports Sciences, Al-Esraa University, Baghdad, Iraq, for supporting their scientific work.