Research on the Impact of Social Media Short Videos on College English Listening and Speaking Teaching

Tingting Wu

Department of Social Management and Service, Hefei Preschool Education College, Hefei, Anhui 230001, China
E-mail: wutingt_wtt@outlook.com

Received 24 October 2025; Accepted 22 January 2026

Abstract

This study examines how social media short videos can be integrated into college English listening and speaking teaching and what effect that integration has, aiming to provide an empirical reference for the reform of such teaching. A mixed-methods design (quantitative $+$ qualitative) was adopted. The research subjects were 90 second-year non-English majors at one university; all had passed College English Test Band 4 (425–500 points) and had not previously received short video-assisted listening and speaking teaching. They were randomly allocated to an experimental group (45 students), which was taught with short videos from platforms such as YouTube and TikTok as auxiliary material, and a control group (45 students), which was taught with the traditional textbook listening and speaking course “New Horizons College English (Third Edition)”. The teaching experiment lasted 16 weeks. Both groups covered the same three units (campus life, cross-cultural communication, and career planning) and were taught by the same teacher to control for teaching variables. Data were collected through tests, a learning motivation questionnaire, classroom observations, and interviews, and were analyzed with independent-samples t-tests and ANOVA in SPSS 24.0 and with qualitative coding in NVivo 12. In listening, the post-test mean of the experimental group (78.6 $\pm$ 7.2) was significantly higher than that of the control group (68.5 $\pm$ 8.1) ( $t = 5.82$ , $p < 0.01$ ). The gain of the experimental group (16.3 points) was 2.4 times that of the control group (6.7 points), and the between-group difference was largest for the long dialogue question type (relative difference $+$ 22.95%).
In speaking, the post-test mean of the experimental group (76.3 $\pm$ 7.8) was significantly higher than that of the control group (66.2 $\pm$ 8.4) ( $t = 4.95$ , $p < 0.01$ ). Its pronunciation score (78.5 $\pm$ 6.9) and fluency score (77.2 $\pm$ 7.1) were 12.3 and 11.5 points higher than those of the control group, respectively. In learning motivation, the experimental group scored significantly higher than the control group ( $p < 0.05$ ) on all four dimensions of the ARCS model: attention (4.2 $\pm$ 0.5), relevance (4.1 $\pm$ 0.6), confidence (4.0 $\pm$ 0.5), and satisfaction (3.9 $\pm$ 0.6). The difference in the confidence dimension was the most significant, at 0.9 points. Motivation in the experimental group rose rapidly from week 1 to week 4 and improved steadily after week 8, forming a positive cycle from ability improvement to motivation enhancement, whereas the motivation of the control group showed no notable fluctuation throughout (the change in each dimension was $\leq$ 0.1). In terms of teaching adaptability, the authentic context of short videos (multiple accents, lifelike scenarios) addresses the disconnect between teaching materials and real language use; the duration of 15 seconds to 5 minutes suits fragmented learning; and the high interactivity (imitation, recording, and peer evaluation) enhances classroom participation. The average number of voluntary contributions per class in the experimental group (18.5) was about three times that of the control group (6.2), and participation in group discussions (92%) was significantly higher than in the control group (65%).
Further analysis by video type shows that educational videos performed best in capturing listening details (4.5 points) and imitating oral pronunciation (4.4 points), cultural videos scored highest in stimulating learning motivation (4.6 points), and daily-life videos had a clear advantage in improving oral fluency (4.5 points). The results indicate that social media short videos can effectively support English listening and speaking teaching for non-English majors and stimulate their learning motivation. In teaching, the three types should complement one another: educational videos in the basic stage, daily-life videos in the application stage, and cultural videos interspersed throughout the process.

Keywords: Social media short videos, English listening and speaking, teaching effect, learning motivation.

1 Introduction

Listening and speaking teaching occupies an increasingly prominent position in college English. Because listening and speaking are the core carriers of communicative ability, the quality of their teaching directly determines students’ comprehensive language application level [1–5]. However, traditional college English listening and speaking teaching still faces three core predicaments. First, teaching materials are not updated regularly. The dialogues and language materials in most textbooks remain confined to standardized but unrealistic scenarios (such as classroom dialogues with a single accent and outdated business scenarios), which makes it difficult to meet students’ needs for the multi-accented, lifelike language of future cross-cultural communication. Second, the teaching form is rigid. Classroom activity consists mainly of the teacher’s explanation, reading the textbook aloud, and after-class exercises. Students are mostly in a state of passive input, with limited opportunities for oral output and little immediate feedback. Third, learning motivation is weak. Students generally find listening and speaking learning boring and have difficulty connecting classroom content with practical applications, so they do not sustain their learning [6–8].

Although scholars at home and abroad have carried out some research on English teaching, existing studies have two deficiencies. First, the samples are mostly primary and secondary school students or learners in language training institutions, and there are relatively few empirical studies of non-English major college students. Second, most studies focus on a single skill (only listening or only speaking), lack a systematic exploration of the synergistic improvement of listening and speaking skills, and do not analyze in depth the mechanism by which short videos influence learning motivation [9, 10].

Based on this, this study takes non-English major college students from one university as its sample and conducts a teaching experiment using a mixed research method. The core objectives are: first, to empirically test the synergistic improvement effect of social media short videos on students’ English listening comprehension and oral expression abilities; second, to analyze the paths through which short videos influence students’ learning motivation; third, to provide operational strategies for integrating short videos into college English listening and speaking teaching, filling the gap in practical guidance in existing research.

2 Social Media Short Video English Teaching

2.1 Core Characteristics of Short Videos on Social Media

Social media short videos here refer to audio-visual digital content posted on platforms such as YouTube, TikTok, and Instagram Reels, with a duration of 15 seconds to 5 minutes, high interactivity (likes, comments, shares, secondary creations), and diverse content (daily conversations, cultural science popularization, disciplinary connections, entertainment creativity). Their core features can be summarized in three points. The first is authenticity of context. The creators of short videos are mostly native speakers or high-level users; the language materials cover multiple accents such as American, British, and Australian, and the scenes are close to reality (such as shopping in a supermarket, asking for directions on the street, and clips of academic lectures), so they reproduce how language is used in real communication. The second is learning adaptability. The duration of 15 seconds to 5 minutes fits the fragmented time that college students have during breaks and commutes, enabling high-frequency, low-burden listening and speaking practice. The third is interactive creativity. Students are not only recipients of content but can also become producers through imitation recording, commenting and asking questions, and creative adaptation, forming a closed learning loop of input, internalization, and output.

2.2 The Application Value of Social Media Short Videos in Listening and Speaking Teaching

2.2.1 Solving the problem of disconnection of language materials in listening teaching

Traditional listening teaching relies on textbook materials, which typically feature a single accent and standardized speech flow. As a result, students can understand the textbook recordings but struggle to follow real speakers in authentic English communication. Take short videos from the YouTube channel “English Speeches” as an example. The channel includes TED-Ed student speech clips (3–5 minutes long) that cover the English of students from different countries (such as Indian and African accents) and contain features of real speech flow such as pauses, repetitions, and self-corrections. Teachers can design stratified tasks around such videos: a basic level (listening for and identifying key words, such as “climate change” and “sustainable development”), an advanced level (summarizing the main idea of the speech), and a high level (analyzing the speaker’s tone and emotions). These tasks help students gradually adapt to real listening scenarios.

2.2.2 Breaking through the output anxiety predicament in oral English teaching

In traditional oral English classes, students are reluctant to speak because they fear making mistakes and lack topics. Social media short videos can stimulate oral output through low-pressure imitation and contextualized expression. Take the TikTok account “Learn English with Emma” as an example. Its short videos of daily conversations (such as ordering coffee or making an appointment with a doctor) are 2 minutes long, use concise language and specific scenes, and are accompanied by subtitles and action demonstrations. Teachers can guide students through a three-step exercise. The first step is imitation: students imitate the pronunciation and intonation of the short video sentence by sentence and correct their own problems by comparing recordings. The second step is character replacement: students replace the characters in the dialogue with themselves and their classmates and adapt the details (such as changing an order for an Americano to an order for a latte). The third step is creative expansion: students add subsequent plot to the scene (such as meeting a friend while ordering coffee and inviting them to sit down and chat) and record and submit a 1-minute oral video, on which the teacher gives personalized feedback through comments. This progression from imitation to creation can help students express themselves orally with more confidence.

2.2.3 Strengthening the relevance and satisfaction dimensions of learning motivation

According to Keller’s ARCS motivation model, relevance (the connection between learning content and students’ needs) and satisfaction (the sense of achievement brought by learning outcomes) are key to maintaining learning motivation. Social media short videos can serve both dimensions through content relevance and outcome visualization. At the relevance level, teachers can select short videos according to students’ majors (for example, interviews with Amazon sellers for business students and introductions to technological products for engineering students), so that students see how listening and speaking learning relates to their future professional development. At the satisfaction level, students can post their oral English videos on a TikTok account reserved for the class and receive immediate feedback through classmates’ likes and comments, or take part in short video oral English competitions, such as introducing hometown delicacies in English, with the winning works displayed on the school’s English platform. This visualization of achievements enables students to perceive their own progress directly, enhances their sense of accomplishment, and thereby strengthens their motivation to learn.

3 Research Design

3.1 Research Subjects

This study selected two intact second-year classes of non-English majors at one university as the research subjects, totaling 90 students: 20 from business, 45 from engineering, and 25 from liberal arts. All students had passed the College English Test Band 4 (with scores ranging from 425 to 500) and had not previously received short video-assisted English listening and speaking teaching. By random assignment, one class served as the experimental group (45 students: 23 male and 22 female; average age 20.3 years) and the other as the control group (45 students: 24 male and 21 female; average age 20.5 years). To verify that the two groups started at the same level, listening and speaking pre-tests were administered to both groups before the experiment. Independent-samples t-tests showed no statistically significant difference between the groups in either listening ( $t = 0.28$ , $p = 0.78 > 0.05$ ) or speaking ( $t = 0.35$ , $p = 0.73 > 0.05$ ), so the two groups were comparable.

3.2 Testing

3.2.1 Teaching intervention plan

The experiment lasted 16 weeks, with 2 class hours per week (each class hour lasted 45 minutes). The teaching content of both groups centered on three units: campus life, cross-cultural communication, and career planning. Both groups were taught by the same teacher to control for teaching variables. The specific intervention plan is shown in Table 1.

Table 1 Intervention plan

Group	Teaching Process	Core Task
Experimental group	1. Introduction: Play short videos to arouse students’ interest. 2. Listening training: Design the tasks of “keyword extraction, main idea summary, detail analysis” around short videos. 3. Oral training: Imitate short video conversations, role-play in groups, and record a 1-minute oral video. 4. Feedback: The teacher comments on the videos and students evaluate each other.	1. Submit one oral imitation video every week. 2. After each unit, complete a task of adapting the short video content (such as adapting a campus club recruitment short video into a group dialogue).
Control group	1. Introduction: The teacher explains the unit theme. 2. Listening practice: Listen to the textbook recordings and complete the multiple-choice/fill-in-the-blank questions at the end of the class. 3. Oral practice: Read the dialogues in the textbook aloud and discuss the topics in the textbook in groups. 4. Feedback: The teacher corrects the after-class exercises.	1. Complete the listening and speaking exercises at the end of the textbook. 2. Submit a written oral dialogue script after each unit is completed.

3.2.2 Data collection tools

The listening test: The test uses adapted College English Test Band 4 listening questions in three parts: short dialogues (20 points), long dialogues (30 points), and news reports (50 points), for a full score of 100. The pre-test and post-test were of equal difficulty (as evaluated by three experienced English teachers, with a difficulty coefficient of 0.65 for both), and the test passed the Cronbach’s $\alpha$ reliability check ( $\alpha = 0.82 > 0.8$ ), indicating good reliability.
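As an illustration of the reliability check mentioned above, the following is a minimal sketch of how Cronbach's $\alpha$ is computed from an item-score matrix. The function name `cronbach_alpha` and the small response matrix are hypothetical and are not the data of this study, which was analyzed in SPSS.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(item_scores[0])  # number of items
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items (illustration only)
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [3, 3, 4],
    [5, 5, 5],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together, which is what the thresholds of 0.7 and 0.8 used in this section are checking.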

The oral test: The test takes the form of a topic discussion. The pre-test and post-test topics are “My College Life” and “Interesting Stories in Cross-cultural Communication”, respectively, and each student speaks for 3 minutes. The scoring criteria follow the “College English Teaching Guide” and cover four dimensions: pronunciation (25 points), fluency (25 points), vocabulary and grammar (25 points), and content logic (25 points), for a full score of 100 points. The responses were scored blind by two experienced English teachers, with an inter-rater consistency coefficient (Pearson correlation coefficient) of $r = 0.85 > 0.8$ , indicating good reliability.
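The inter-rater consistency described above can be sketched as a Pearson correlation between the two raters' scores. The function name `pearson_r` and the two score lists are hypothetical values chosen for illustration; they are not the ratings collected in this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical oral scores given by two raters to the same six students
rater_a = [72, 65, 80, 58, 77, 69]
rater_b = [70, 68, 83, 60, 74, 66]
r = pearson_r(rater_a, rater_b)
print(f"r = {r:.2f}")
```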

The learning motivation questionnaire: Adapted from Keller’s ARCS learning motivation scale, the questionnaire covers 4 dimensions (attention, relevance, confidence, satisfaction) with 20 items in total (5 items per dimension), rated on a 5-point Likert scale (1 $=$ strongly disagree, 5 $=$ strongly agree). It was revised after a pilot test ( $n = 30$ ), which yielded Cronbach’s $\alpha = 0.78 > 0.7$ . Structural validity was verified by exploratory factor analysis (KMO $=$ 0.75; Bartlett’s test of sphericity, $p < 0.001$ ), so the questionnaire met the measurement requirements.
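A minimal sketch of how one respondent's 20 Likert answers could be reduced to four dimension means is given below. The item order (items 1–5 for attention, 6–10 for relevance, and so on) is an assumption made for illustration, as are the function name `score_arcs` and the sample answers; the actual item layout of the questionnaire is not specified here.

```python
DIMENSIONS = ["attention", "relevance", "confidence", "satisfaction"]
ITEMS_PER_DIMENSION = 5


def score_arcs(responses):
    """Return the mean score per ARCS dimension for one respondent."""
    expected = len(DIMENSIONS) * ITEMS_PER_DIMENSION
    if len(responses) != expected:
        raise ValueError(f"expected {expected} responses, got {len(responses)}")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("Likert responses must be between 1 and 5")
    scores = {}
    for i, name in enumerate(DIMENSIONS):
        # Assumed layout: consecutive blocks of five items per dimension
        start = i * ITEMS_PER_DIMENSION
        block = responses[start:start + ITEMS_PER_DIMENSION]
        scores[name] = sum(block) / ITEMS_PER_DIMENSION
    return scores


# Hypothetical answers of one student (illustration only)
example = [4, 5, 4, 4, 4,  4, 4, 4, 5, 4,  4, 4, 4, 4, 4,  4, 4, 3, 4, 4]
dimension_means = score_arcs(example)
print(dimension_means)
```

Group-level values such as those reported per dimension would then be the mean and standard deviation of these per-student scores.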

Classroom observation and interviews: During the experiment, the number of times students in each group spoke voluntarily and their participation in group discussions were recorded every week. After the experiment, 10 members of the experimental group were randomly selected for interviews. The interview outline included questions such as “What do you think are the benefits of short videos for listening and speaking learning?” and “Do short videos make you more willing to learn listening and speaking?” The interview recordings were transcribed and then analyzed through qualitative coding.

3.2.3 Data processing methods

Quantitative data were processed with SPSS 24.0. First, independent-samples t-tests were conducted on the pre-test and post-test scores of the experimental and control groups to identify differences in teaching effect between them. Second, one-way analysis of variance (ANOVA) was applied to the questionnaire data to examine differences in motivation. Qualitative data were processed with NVivo: the interview transcripts were coded to extract the main viewpoints.
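The two quantitative tests named above can be sketched in a few lines. This is not the SPSS procedure used in the study: it is a minimal illustration of the pooled-variance t statistic and the one-way ANOVA F statistic, with hypothetical function names and hypothetical scores. With exactly two groups, the F statistic equals the square of the t statistic, which the last line checks.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic with pooled (equal) variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def one_way_f(*groups):
    """One-way ANOVA F statistic for two or more groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical post-test scores (illustration only, not the study data)
experimental = [78, 82, 75, 80, 85]
control = [68, 72, 65, 70, 66]

t_stat = pooled_t(experimental, control)
f_stat = one_way_f(experimental, control)
print(f"t = {t_stat:.2f}, F = {f_stat:.2f}, t^2 = {t_stat ** 2:.2f}")
```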

4 Research Results

4.1 Comparison of Listening and Speaking Scores Between the Experimental Group and the Control Group

4.1.1 Comparison of listening test scores

Table 2 gives the mean (M), standard deviation (SD), t value, and p value of the listening scores of the experimental and control groups before and after the experiment. Before the experiment, the two groups did not differ significantly in listening score ( $p > 0.05$ ). After the experiment, the difference was significant ( $t = 5.82$ , $p < 0.01$ ), and both groups scored higher than before: the experimental group improved by 16.3 points and the control group by 6.7 points, confirming the larger improvement of the experimental group.

Table 2 Comparison of English listening scores before and after the experiment between the experimental group and the control group (M $\pm$ SD, $n = 45$ )

Test Time	Experimental Group	Control Group	t	p
Pre-test	62.3 $\pm$ 8.5	61.8 $\pm$ 8.9	0.28	0.78
Post-test	78.6 $\pm$ 7.2	68.5 $\pm$ 8.1	5.82	$<$ 0.01
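The gains quoted for the two groups follow directly from the means in Table 2. A short sketch of the arithmetic (the variable names are illustrative):

```python
# Mean listening scores from Table 2
table2 = {
    "experimental": {"pre": 62.3, "post": 78.6},
    "control": {"pre": 61.8, "post": 68.5},
}

# Gain = post-test mean minus pre-test mean, rounded to one decimal place
gains = {group: round(s["post"] - s["pre"], 1) for group, s in table2.items()}

# How many times larger the experimental gain is than the control gain
gain_ratio = round(gains["experimental"] / gains["control"], 1)

print(gains)       # {'experimental': 16.3, 'control': 6.7}
print(gain_ratio)  # 2.4
```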

As shown in Table 2, compared with the control group, the advantages of the experimental group were more obvious. To further break down the impact of short videos on different dimensions of listening ability, the listening scores were then subdivided by question types (short dialogues, long dialogues, news reports), and the results are shown in Figure 1.

Figure 1 Impact of short videos on listening scores by question type.

As can be seen from Figure 1, the experimental group performed better on all three question types, with the largest advantage on long dialogues (22.5 $\pm$ 3.2 in the experimental group vs. 18.3 $\pm$ 3.5 in the control group, a relative difference of $+$ 22.95%). This is likely related to the content of the short videos used, such as Vlogs of daily interactions and street interviews. Such videos contain multi-turn dialogues, topic switches, and logical transitions, which match what long dialogue questions test (coherent information and strong interactivity) and help students adapt to the logical flow of language in real dialogues. The advantage on news reports was comparatively small ( $+$ 9.36%): because short video news (such as BBC Short News) has a faster speaking speed and more complex accents (including Australian and Indian English), some students with weak foundations needed longer to adapt. This result also provides empirical support for grading short videos by speaking speed in subsequent teaching. In terms of the size of the advantage, long dialogues $>$ short dialogues $>$ news reports, indicating that short videos are better suited to interactive listening scenarios than to one-way information transmission.
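The relative difference reported for long dialogues is the between-group gap expressed as a percentage of the control group mean. A minimal sketch, with a hypothetical helper name:

```python
def relative_difference(experimental_mean, control_mean):
    """Between-group gap as a percentage of the control group mean."""
    return (experimental_mean - control_mean) / control_mean * 100


# Long dialogue means reported for Figure 1
long_dialogue = round(relative_difference(22.5, 18.3), 2)
print(f"+{long_dialogue}%")  # +22.95%
```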

4.1.2 Comparison of oral English scores

Table 3 compares the oral English scores of the two groups. Before the experiment, the groups did not differ significantly ( $p = 0.73 > 0.05$ ). After the experiment, the mean score of the control group (66.2 $\pm$ 8.4) was significantly lower than that of the experimental group (76.3 $\pm$ 7.8) ( $t = 4.95$ , $p < 0.01$ ). Among the scoring dimensions, the largest improvements in the experimental group were in pronunciation and fluency, with mean scores of 78.5 $\pm$ 6.9 and 77.2 $\pm$ 7.1, which were 12.3 and 11.5 points higher than those of the control group, respectively. This is consistent with the pronunciation imitation and low-anxiety output that short videos afford.

Table 3 Comparison of oral English scores before and after tests between the experimental group and the control group (M $\pm$ SD, $n = 45$ )

Test Time	Experimental Group	Control Group	t	p
Pre-test	60.5 $\pm$ 9.2	59.9 $\pm$ 9.5	0.35	0.73
Post-test	76.3 $\pm$ 7.8	66.2 $\pm$ 8.4	4.95	$<$ 0.01

Figure 2 presents the score trends of the two groups.

Figure 2 Trends in the listening and speaking scores of the two groups before and after the experiment.

As can be seen from Figure 2, the pre-test scores of the two groups were roughly the same, while the post-test scores showed a clear gap. Moreover, the scores of the experimental group rose more steeply than those of the control group, further supporting the effectiveness of short video-assisted teaching.

4.2 Comparison of Learning Motivation Between the Experimental Group and the Control Group

Table 4 compares the post-test means of the motivation dimensions. The mean of every dimension of learning motivation was higher in the experimental group than in the control group ( $p < 0.05$ ). The most significant difference was in the confidence dimension, which was 0.9 points higher, indicating that short video-assisted teaching can enhance students’ learning confidence.

Table 4 Comparison of the post-test mean values of each dimension of learning motivation between the experimental group and the control group (M $\pm$ SD, $n = 45$ )

Motivation Dimension	Experimental Group	Control Group	t	p
Attention	4.2 $\pm$ 0.5	3.1 $\pm$ 0.7	10.25	$<$ 0.05
Relevance	4.1 $\pm$ 0.6	3.2 $\pm$ 0.6	9.87	$<$ 0.05
Confidence	4.0 $\pm$ 0.5	3.1 $\pm$ 0.7	12.36	$<$ 0.05
Satisfaction	3.9 $\pm$ 0.6	3.0 $\pm$ 0.8	11.52	$<$ 0.05

Table 4 shows that the learning motivation of the experimental group was significantly higher in all dimensions than that of the control group. To further explore the changing trend of motivation during the 16-week teaching (such as whether there were initial fluctuations and later stability characteristics), the scores of each dimension of the two groups were tracked according to the experimental period (weeks 1, 4, 8, 12, and 16), and the results are shown in Figure 3.

Figure 3 The changing trend of motivation in the 16-week teaching.

Two core features can be observed from Figure 3. First, the motivation of the experimental group increased in phases. Weeks 1 to 4 were a period of rapid growth (for example, the confidence dimension rose from 2.9 to 3.5, $+$ 20.7%); during this stage, the visual appeal and freshness of short videos quickly aroused students’ interest. After week 8, motivation entered a period of steady improvement in which the growth of every dimension slowed (for example, the attention dimension rose from 4.0 to 4.2), indicating that motivation was no longer driven by novelty but by progress in ability (for instance, improved listening accuracy and favorable comments on oral videos created a positive cycle from ability improvement to motivation enhancement). Second, the motivation of the control group showed no notable fluctuation throughout the process (the change in each dimension was $\leq$ 0.1), which further confirms that traditional teaching struggles to activate learning motivation continuously and highlights the advantage of short videos in maintaining motivation. Furthermore, the satisfaction dimension consistently lagged behind the other dimensions (for instance, in week 16, satisfaction was 3.9 vs. 4.2 for attention), because satisfaction depends on visible learning outcomes (such as academic improvement and recognition from others) and takes longer to accumulate. This suggests that teachers should display students’ achievements regularly, for example by selecting excellent oral English videos every month, to accelerate the formation of satisfaction.

4.3 Qualitative Research Results

The interview texts of 10 students in the experimental group were encoded, and three core themes were extracted.

Real context helps with understanding: Eight students mentioned that “the English in the short videos is different from that in the textbook. There are many everyday expressions, such as ‘break a leg’. I’m not so nervous when listening to foreigners now.”

Interactivity enhances the sense of participation: Seven students said, “Recording oral English videos is very interesting. Seeing the likes from classmates and the comments from teachers, I am more willing to take the initiative to practice oral English.”

Fragmented learning is adaptable: Nine students said that “watching a 2-minute short video during the break doesn’t require special time and gradually accumulates a lot of expressions.”

Classroom observation data show that the average number of voluntary contributions per class in the experimental group (18.5) was about three times that of the control group (6.2), and the participation rate in group discussions (92%) was significantly higher than that of the control group (65%). This further confirms that short videos enhance teaching interactivity. To capture the impact of short videos on classroom participation more precisely, this study subdivided classroom interaction behaviors into four categories: active questioning (students voluntarily asking questions of teachers or classmates), group discussion (speaking during group discussion), class-wide sharing (oral presentation to the whole class), and submission of oral videos (an asynchronous interaction task after class). The cumulative frequency over 16 weeks was calculated, and the results are shown in Figure 4.

Figure 4 Comparison of the cumulative frequency of classroom interaction behaviors between the two groups over 16 weeks.

As can be seen from Figure 4, the experimental group was more active in all types of interaction. The frequency of active questioning differed the most (8.2 $\pm$ 2.3 in the experimental group vs. 2.1 $\pm$ 1.5 in the control group, a ratio of 3.9). Because the content of short videos often involves cultural differences (such as foreign festival customs) and slang (such as “hit the hay” for going to bed), it stimulates students’ desire to explore and prompts them to seek clarification. The submission of oral videos was a task exclusive to the experimental group (14.5 $\pm$ 2.1 submissions per person in total). This asynchronous output task allowed students to record and revise repeatedly, which reduced the anxiety of oral expression and in turn increased the frequency of class-wide sharing (6.8 $\pm$ 1.8 in the experimental group vs. 2.5 $\pm$ 1.2 in the control group, a ratio of 2.7). The difference in group discussion was relatively small (a ratio of 1.9), as both groups had group discussion tasks. However, the experimental group’s discussions revolved around short video content (such as how to adapt a supermarket shopping conversation), so the topics were more specific and students avoided the problem of having nothing to say that arises in traditional discussions. This result suggests that short videos not only increase the frequency of interaction but also improve its quality.

4.4 Analysis of the Effect Differences of Different Types of Short Videos

To further optimize the selection of short video teaching resources, this study classified the short videos used by the experimental group into three categories by content: educational (such as BBC Learning English, focusing on grammar explanations and standard conversations), cultural (such as videos on cross-cultural misunderstandings in various countries, emphasizing cultural backgrounds and communication scenarios), and daily-life (such as life Vlogs and shopping conversations, emphasizing genuine everyday expressions). The effect of each category was obtained by a weighted calculation combining a student effect rating questionnaire ( $n = 45$ ; 1–5 points, 1 $=$ no effect, 5 $=$ excellent effect) and a teacher assessment (scored by three experienced teachers based on students’ listening and speaking performance) on five dimensions: listening detail capture, listening main idea extraction, oral pronunciation imitation, oral fluency improvement, and learning motivation stimulation. The comparison of the effects of the three types of short videos is shown in Figure 5.

Figure 5 Comparison of the effects of different types of short videos on listening and speaking skills and motivation.
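The paper does not report the weights used to combine student ratings with teacher assessments. As a sketch only, assuming equal weights of 0.5 and a shared 1–5 scale (both are assumptions, as are the sample numbers and function name), the composite score for one video type on one dimension could be computed as follows.

```python
from statistics import mean

# Hypothetical weights: the paper does not state the actual weighting.
STUDENT_WEIGHT = 0.5
TEACHER_WEIGHT = 0.5


def composite_score(student_ratings, teacher_scores,
                    w_student=STUDENT_WEIGHT, w_teacher=TEACHER_WEIGHT):
    """Weighted mean of the average student rating and the average teacher score.

    Both inputs are assumed to be on the same 1-5 scale.
    """
    if not student_ratings or not teacher_scores:
        raise ValueError("both rating lists must be non-empty")
    if abs(w_student + w_teacher - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_student * mean(student_ratings) + w_teacher * mean(teacher_scores)


# Made-up example data for one video type on one dimension.
example_students = [4, 5, 4, 5]   # student questionnaire ratings
example_teachers = [4, 4, 5]      # scores from three teachers
example = round(composite_score(example_students, example_teachers), 2)
print(example)
```

Other weightings (for instance, giving teacher assessments more weight) can be passed through `w_student` and `w_teacher`, provided they sum to 1.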

As can be seen from Figure 5, each of the three types of short video has its own advantages. Educational short videos perform best in capturing listening details (4.5 points) and imitating oral pronunciation (4.4 points). This is because these videos come with precise subtitles and pronunciation annotations (such as stress and linking prompts), and their dialogue scripts are logically clear, making them suitable for helping students with weak foundations consolidate language details. Cultural short videos scored the highest in stimulating learning motivation (4.6 points). In the interviews, 8 students mentioned that when watching cultural videos, they would think about the scenarios they might encounter when going abroad in the future and were more willing to learn related expressions, indicating that cultural relevance plays an important role in reinforcing learning. Daily-life short videos have a clear advantage in improving oral fluency (4.5 points). As the content is close to everyday life (such as ordering coffee or making an appointment with a doctor), students can easily transfer the expressions in the videos to their own experiences, making their output more natural and fluent (for example, one student adapted a vlog conversation into a scene of shopping in a supermarket with classmates). The differences in effect among the three types of short videos suggest that teaching should avoid relying on a single type and should combine them according to the teaching objectives. The basic training stage should rely mainly on educational videos, emphasizing attention to detail. The scene application stage should rely mainly on daily-life videos to enhance fluency of expression. Cultural videos should be interspersed throughout the process to maintain learning motivation, forming a complementary resource combination model.
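The staged combination described above can be expressed as a simple lookup. The sketch below is illustrative only: the stage labels and function name are assumptions, and the scores are limited to the per-type highs quoted in the text for Figure 5.

```python
# Highest-scoring video type per dimension, as quoted in the text for Figure 5.
# Main-idea extraction is omitted because no score is quoted for it.
BEST_TYPE_BY_DIMENSION = {
    "listening_detail": ("educational", 4.5),
    "pronunciation_imitation": ("educational", 4.4),
    "motivation": ("cultural", 4.6),
    "oral_fluency": ("daily", 4.5),
}

# Stage labels are hypothetical identifiers for the two stages named in the text.
PRIMARY_TYPE_BY_STAGE = {
    "basic_training": "educational",
    "scene_application": "daily",
}


def video_mix(stage: str) -> list:
    """Return the primary video type for a stage plus cultural videos,
    which the text recommends interspersing throughout the course."""
    if stage not in PRIMARY_TYPE_BY_STAGE:
        raise KeyError(f"unknown stage: {stage}")
    return [PRIMARY_TYPE_BY_STAGE[stage], "cultural"]


for stage in PRIMARY_TYPE_BY_STAGE:
    print(stage, video_mix(stage))
```

A teacher could extend the lookup with per-unit objectives, but the proportions of each video type within a stage are not specified in the study and would have to be chosen locally.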

5 Conclusion

Through a 16-week teaching experiment and a mixed research method, this study systematically explored the effects, mechanisms, and adaptation strategies of integrating social media short videos into college English listening and speaking teaching. The main conclusions are as follows.

In terms of teaching effect, social media short videos can significantly and jointly improve the English listening and speaking skills of non-English-major college students and help overcome the limitations of traditional teaching. From the perspective of listening ability, the authentic multi-accent speech provided by short videos (such as American and Australian English, with features such as linking and weak forms) and their interactive scenarios (daily vlog conversations, street interviews) closely match the examination requirements for information coherence and logical transitions in long dialogues. The improvement in listening ability of the experimental group (16.3 points) was 2.4 times that of the control group (6.7 points). The improvement in long dialogue questions was the most significant (a difference ratio of +22.95%), followed by short dialogues (+18.75%), while the improvement in news reports was relatively weak (+9.36%). This difference not only supports the advantage of short videos in interactive listening scenarios but also provides empirical evidence for adjusting the speaking speed of short videos by level for students with weak foundations in subsequent teaching. From the perspective of speaking ability, the training path of low-pressure imitation, role replacement, and creative expansion built around short videos effectively alleviated students’ anxiety about oral expression. The improvement of the experimental group was most prominent in the pronunciation and fluency dimensions, with scores 12.3 and 11.5 points higher than those of the control group, respectively.
At the same time, the closed-loop design of oral video submission, teacher comments, and peer evaluation encouraged students to shift from passive reading aloud in traditional classrooms to active creation, addressing the common problem of having nothing to say in traditional oral teaching.

At the level of learning motivation, social media short videos activated all four dimensions of the ARCS learning motivation model and established a sustainable positive cycle of learning motivation. In the attention dimension, the visual presentation of short videos and their multicultural content (such as explainers on cross-cultural misunderstandings) quickly attracted students’ interest, making the attention score of the experimental group (4.2 $\pm$ 0.5) significantly higher than that of the control group (3.1 $\pm$ 0.7). In the relevance dimension, the study strengthened the connection between listening and speaking learning and professional development by screening short videos according to students’ majors (such as selecting interviews with Amazon sellers for business students and introductions to technological products for engineering students), enabling students to clearly perceive the practical significance of learning. In the confidence dimension, the short task duration of 15 seconds to 1.5 minutes and the asynchronous output mode supporting repeated recording significantly reduced students’ risk of failure, making confidence the dimension with the greatest difference from the control group (0.9 points higher). In the satisfaction dimension, making results visible, for example by displaying work on class-exclusive short video accounts and selecting outstanding oral English videos each month, allowed students to directly feel their own progress.
Although the satisfaction dimension improved more slowly than the other dimensions, it was still significantly higher than that of the control group (3.9 $\pm$ 0.6 vs. 3.0 $\pm$ 0.8). Ultimately, the experimental group formed a positive cycle in which short videos stimulated interest $\to$ listening and speaking skills improved $\to$ students received teacher comments and peer recognition $\to$ learning motivation was further enhanced. In contrast, the learning motivation of the control group showed no significant fluctuations throughout the process (the difference in scores for each dimension was $\leq$ 0.1), highlighting the value of short videos in maintaining long-term learning motivation.

In terms of teaching adaptability, social media short videos are highly consistent with the learning characteristics of contemporary college students and help fill a core gap in traditional teaching. On the one hand, the duration of short videos, ranging from 15 seconds to 5 minutes, is highly compatible with fragmented learning scenarios such as breaks and commutes, enabling high-frequency, low-burden listening and speaking practice. All 9 interviewed students in the experimental group acknowledged the efficiency of accumulating English expressions by watching 2-minute short videos during breaks. On the other hand, the new interaction models brought about by short videos not only increased the frequency of classroom interaction but also improved its quality. The frequency of active questions in the experimental group (8.2 $\pm$ 2.3 times per person) was 3.9 times that of the control group (2.1 $\pm$ 1.5 times per person). The frequency of whole-class sharing (6.8 $\pm$ 1.8 times per person) was 2.7 times that of the control group (2.5 $\pm$ 1.2 times per person).
Moreover, discussion topics based on the short video content, such as adapting supermarket shopping conversations and verifying the meanings of local slang, made group interaction more targeted, avoiding the vague and shallow group discussions common in traditional classrooms. The participation rate in group discussions in the experimental group reached 92%, significantly higher than the 65% in the control group.

Based on the differences in the effects of different types of short videos, this study also distills practical strategies for teaching integration. Educational short videos (such as BBC Learning English), equipped with precise subtitles and pronunciation annotations, are suitable for the basic training stage (weeks 1–4) to consolidate students’ ability to capture listening details and their foundation in oral pronunciation. Daily-life short videos (such as life vlogs and shopping conversations) are suitable for improving oral fluency during the scene application stage (weeks 5–12), because their content is close to students’ lives and can easily guide students to transfer the expressions to their own experiences. Cultural short videos (such as explainers on cross-cultural misunderstandings and festival customs in various countries) can be interspersed throughout the 16-week teaching cycle, as they enhance the sense of meaning in learning and help maintain students’ learning motivation. The three types of short videos complement one another, providing a specific and feasible practical solution for college English listening and speaking teaching.

This study also has certain limitations. The sample consisted only of non-English-major students from one university, so its representativeness needs to be further expanded. The 16-week experimental period is relatively short, making it difficult to fully verify the long-term sustainability of the teaching effect of social media short videos.
Moreover, no in-depth comparison was made of how differences in content characteristics and interaction modes between short video platforms (such as YouTube and TikTok) affect teaching effectiveness. Future research can expand the sample to multiple majors at universities of different levels in different regions, extend the experimental period to one academic year to track the stability of the effect, and explore platform-specific differences in the teaching adaptation of short videos. It can also draw on artificial intelligence technologies (such as speech recognition and intelligent feedback) to optimize personalized guidance for short video learning. This would further enhance the value of the research results for the practical teaching of college English listening and speaking.

References

[1] X.F. Cao, ‘New Perspectives on College English Teaching: A Review of “Research on Applied Linguistics and College English Teaching”’, Chin. J. Educ., (12), p. 125, 2024.

[2] C. Chen, ‘The Application of Interactive Teaching Method in University Teaching – Review of “Research on the Application of Interactive Teaching Method in College English Teaching”’, Univ. Educ. Sci., (04), p. 2, 2024.

[3] H.J. Fu, ‘Selection and Integration of Teaching Methods and Strategies for College English in the New Era: A Review of “Contemporary College English Teaching Theory and Research”’, Res. Educ. Dev., 44(06), p. 85, 2024.

[4] J.Y. Zhang, H.Y. Zhao, ‘Innovative Paths of College English Teaching under the Background of Digital Transformation’, J. Foreign Lang., (02), pp. 84–91, 2024.

[5] M.F. Yang, ‘The Inheritance and Development of Traditional Culture in College English Teaching: A Review of “Research on the Integration and Penetration of Chinese Traditional Culture and College English Teaching”’, Chin. J. Educ., (12), p. 140, 2023.

[6] L. Chuan, ‘Research on the Construction and Practice of Listening and Speaking Teaching Based on Smart Teaching Platform’, Mod. English, (19), pp. 37–40, 2023.

[7] Y.N. Song, ‘Immersive Online Teaching of College English Listening and Speaking Based on Flow Theory: Concepts and Practices’, Univ. Educ., (18), pp. 52–55, 2023.

[8] A.X. Ye, ‘Integration of college English listening and writing teaching design’, Journal of Zhejiang Wanli College, 36(02), pp. 112–116, 2023.

[9] X.Q. Dong, Y. Yuan, Q. Xu, ‘Based on the mobile platform in college English listening continued action research’, Foreign Language and Foreign Language Teaching, (01), pp. 84–95, 147, 2023.

[10] C.L. Liu, C.Z. Li, ‘Based on output orientation method of college English listening and speaking teaching design and practice’, Journal of Quality Education in the West, 8(21), pp. 182–185, 2022.

Biography

Tingting Wu, born in Wuhu, Anhui Province in 1992, holds a master’s degree and works as a lecturer at the Department of Social Management and Service, Hefei Preschool Education College. Her research interests are second language acquisition and English education.