STUDENT MOTIVATION 2 Student motivation in low-stakes assessments
Introduction
The overall goal of this paper is to examine how students' motivation can be affected by taking low-stakes tests and how the amount of effort students put into a low-stakes assessment may affect the validity of test results. It will also describe how teachers can build motivation in their students for required low-stakes exams and why student effort on assessments is important to teachers, administrators and test creators. This paper will review the project that my group mates and I completed in Assessment in TESOL at Hawaii Pacific University (HPU) during the spring of 2014. Included are the project's objectives and specifications, a description of the students and their institution, the results of the students who took our quiz, a reflection and discussion on this particular assignment and future inquiries based on my previously mentioned area of study.
Product Description - Background Information
Host class
The name of the host class is WR1100: Analyzing and Writing Arguments, and it is taught by Professor Brian Rugen. The host class student teacher is Kristen Coulter. There are thirteen students in the class, and their countries of origin include Vietnam, Japan, Norway, Sweden and the United States. Their proficiency levels range from intermediate to advanced (nearly fluent). The class meets every Monday, Wednesday and Friday from 0940-1035. According to the professor's syllabus (Rugen, 2014), the class objectives state that passing students should be able to:
1. read, annotate, and summarize texts by finding the thesis (major argument), topic sentences (supporting arguments), and supporting details
2. become able to examine your own cultural assumptions and personal motivations in any argument
3. make the appropriate choices according to a writer's purpose, topic, and audience
4. recognize and use a number of models, patterns, and techniques for organizing academic arguments
5. understand and use appropriately these useful strategies: types of appeals, types of claims, logical fallacies, and mediation/negotiation
6. understand and use the writing process, including brainstorming, outlining, drafting, peer review, revising, and editing
7. use sources appropriately and properly
The overwhelming majority of the students are strong communicators, so they currently need to develop their confidence and vary their vocabulary when speaking publicly. They also need to become more proficient at reading and researching academic English texts. From observing the class, my overall impression is that the students are eager to learn and enjoy this class because of the easygoing nature of the professor. Many of the students seem ready for more advanced writing and academic studies. In general, the host teacher uses a genre approach for learning. First, the students are exposed to a genre and/or text and given discovery activities in order to notice the text's features. Next, students perform guided practice and then application activities to practice what they've learned before incorporating it into an argumentative essay. The host teacher has more of a process approach for assessment, especially since this is a writing course. The students are encouraged to brainstorm, write outlines and revise their essays with help from the teacher and their fellow students.
Host institution
WR1100 takes place at HPU, located in Honolulu, Hawaii. It is a private school that offers undergraduate and graduate degrees. Students at this university come from all 50 states of the United States and nearly 80 countries around the world. The mission of HPU is that "Students from around the world join us for an American education built on a liberal arts foundation. Our innovative undergraduate and graduate programs anticipate the changing needs of the community and prepare our graduates to live, work, and learn as active members of a global society" (HPU, 2014).
Group members
This assessment group consisted of Nick Bayani, Kristen Coulter, Megan Hanlon and Chrysa Staiano (myself). Nick is from Japan and has a lot of experience taking language tests (TOEFL, TOEFL-iBT, STEP Eiken) and teaching communication at his mother's cram school. Kristen is from Michigan and is WR1100's student teacher. She has no additional language teaching experience. Megan is from Minnesota and has taught English for over three years in the Czech Republic and the United States. Chrysa is from New York and has experience teaching English in Korea and participating in community literacy programs in Hawaii. Kristen, Chrysa and Megan plan to graduate from the MATESOL program in May 2014.
Language Assessment Instrument
Our assessment project, a paraphrasing quiz, was given on Friday, March 7, 2014. The students had fifty minutes (approximately the entire class period) to complete it. Our assessment approach was to make this quiz communicative and authentic, so we added context to every item. The context for each item included writing for academic purposes, which is a real task that university students perform. The context also helped the assessment seem purposeful because it allowed the students to understand why they were performing each function (paraphrasing, identifying synonyms, etc.). The version of the quiz given to the students can be found in Appendix A. This quiz is considered an achievement test given at the end of the class unit on paraphrasing. The unit was taught by the student teacher and directly tied to the professor's objective of using sources appropriately. The item-design approach included making the quiz integrative and direct. In some items, the students were asked to identify synonyms while maintaining the same part of speech, while in other items they had to change word order and class. In the final sections, they had to paraphrase short paragraphs in complete sentences. The goal of the quiz was to measure how well the students could paraphrase academic sources, so the tasks asked them to paraphrase and cite no more than three sentences of academic text. The quiz was criterion-referenced even though the scores did not count towards the students' final grades in WR1100. As noted in our specifications, a satisfactory score is considered to be 80% or above.
Objectives
The objectives for this quiz were to measure students' ability to: a) paraphrase by utilizing synonyms, b) paraphrase by changing sentence structure, c) paraphrase by adjusting the word class, and d) retain the main idea of the original text or quote within the paraphrase.
Specifications
Content
Operations: Producing synonyms, using paraphrasing techniques and paraphrasing sentences as part of academic writing skills
Types of text: Academic texts from college textbooks and research journals
Addressees: Native and non-native speaking university students
Topics of texts: General Academic English
Dialect and style: General Academic English, formal style
Length of texts: Total two pages
Speed of processing: Careful: 100 words/minute for model sentences

Structure, timing, medium and techniques
Test structure: Five sections:
1. vocabulary recognition
2. producing synonyms in sentences
3. producing paraphrases by changing word order/sentence structure
4. paraphrasing whole sentences
5. paraphrasing 2-3 sentence clusters
Number of items:
1. vocabulary recognition (3 items)
2. producing synonyms in sentences (5 items)
3. producing paraphrases by changing word order/sentence structure (4 items)
4. paraphrasing whole sentences (2 items)
5. paraphrasing 2-3 sentence clusters (2 items)
Total = 16 items
Criterial levels of performance: A satisfactory score will be considered 80% or above.
Scoring procedures:
Section 1: objective scoring with a key (no credit)
Section 2: subjective scoring based on whether students used a synonym and kept the same part of speech (1pt/item)
Section 3: subjective scoring based on whether students paraphrased and retained the phrase's original meaning (2pts/item)
Sections 4 & 5: subjective scoring using an analytical rubric that measures whether the original meaning of the passage has been retained, paraphrasing techniques have been used and citations attempted/written correctly (6pts/item)
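The item counts and point values in the specifications determine the maximum quiz score and the raw score that corresponds to the 80% criterion. The short Python sketch below works through that arithmetic. The item counts and point values are taken from the specifications; the names SECTIONS, section_maximums, quiz_maximum and cutoff_points are hypothetical and were chosen only for this illustration.

```python
# Items per section and points per item, copied from the specifications.
# Section 1 is a recognition section scored for no credit.
# All names here are hypothetical, for illustration only.
SECTIONS = {
    "Section 1": {"items": 3, "points_per_item": 0},
    "Section 2": {"items": 5, "points_per_item": 1},
    "Section 3": {"items": 4, "points_per_item": 2},
    "Section 4": {"items": 2, "points_per_item": 6},
    "Section 5": {"items": 2, "points_per_item": 6},
}


def section_maximums(sections):
    """Return the maximum points available in each section."""
    return {
        name: spec["items"] * spec["points_per_item"]
        for name, spec in sections.items()
    }


def quiz_maximum(sections):
    """Return the maximum total score across all sections."""
    return sum(section_maximums(sections).values())


def cutoff_points(sections, criterion=0.80):
    """Return the raw score that corresponds to the criterion level."""
    return criterion * quiz_maximum(sections)


if __name__ == "__main__":
    print(section_maximums(SECTIONS))
    print(quiz_maximum(SECTIONS))              # 37
    print(round(cutoff_points(SECTIONS), 1))   # 29.6
```

Under these values, the section maximums are 0, 5, 8, 12 and 12 points, which sum to 37, and a satisfactory score of 80% corresponds to 29.6 raw points.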
Each rater graded every item independently according to the rubric for each section. Before we discussed the grades as a group, our initial inter-rater reliability was 33%. We then entered each of our grades into a spreadsheet and held a round-table discussion to decide the final grade for each section. During the discussion, we talked about which items we had removed points for and why. If we continued to disagree on the scores, we compared the students' given answers to the rubric to ensure that all the raters were following the same guidelines for the section or item in question. After the group discussions, our inter-rater reliability was 100%. We did not split the difference (e.g., give a 4.5 if one scorer had a 4 and one had a 5) because our rubric did not allow for half points.
Student Results
The total number of points a student could earn on the quiz was 37. The mean was 28.8/37, or 78%. The highest score was 37 (100%), which one student achieved. The lowest score was 16/37 (43%). The mode was 35, which was earned by two students. A complete list of all student scores can be found in Appendix B, and a descriptive statistical analysis of the quiz can be found in Appendix C.
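The percentages reported for the quiz come from dividing a raw score by the 37 available points, and the criterion check compares that ratio with 80%. A minimal Python sketch of the conversion follows. The raw scores used in the example calls are the ones quoted in this paper; the names MAX_POINTS, CRITERION, percent and is_satisfactory are hypothetical.

```python
# Hypothetical helper names; the figures come from the quiz results in this paper.
MAX_POINTS = 37      # maximum quiz score
CRITERION = 0.80     # satisfactory level from the specifications


def percent(raw, max_points=MAX_POINTS):
    """Convert a raw score to a percentage of the maximum."""
    return 100 * raw / max_points


def is_satisfactory(raw, max_points=MAX_POINTS, criterion=CRITERION):
    """Return True when the raw score meets or exceeds the criterion."""
    return raw / max_points >= criterion


if __name__ == "__main__":
    # Highest score, mode, mean, Student #4's score and lowest score.
    for raw in (37, 35, 28.8, 22, 16):
        print(raw, round(percent(raw), 1), is_satisfactory(raw))
```

Because the criterion is applied to the ratio, a student needs at least 29.6 of the 37 points to be counted as satisfactory.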
Reflection and Discussion
While we were administering the quiz to WR1100 on March 7, 2014, one of the students asked if the quiz factored into their course grade. Our group member administering the test at the time replied "no" but then encouraged all of the students to do their best regardless. Though the process of making, calibrating, administering and analyzing the quiz was important to my assessment group, this was not the case for the WR1100 students. Therefore, the quiz was considered to be low-stakes, meaning there was no academic or meaningful consequence to the students with regard to test performance, regardless of its level of importance for teachers and other stakeholders (Abdelfattah, 2010).
In low-stakes assessments, the lack of consequences and personal benefits for the students may lead to decreased student motivation to perform well on the exam. This is especially true of achievement tests where students receive neither grades nor academic credit (Wise & DeMars, 2005). As this was the case with our paraphrasing quiz, it is questionable whether all of the students performed their best, especially since they knew the quiz held no academic weight for them.
There is no definitive proof that the WR1100 students did not try their best, but some of the quiz scores leave much to be desired. The mean score was 28.8/37 (78%), and only 7 of the 13 students achieved a satisfactory score of 80% or above. During the test administration, it was not overtly noticeable that students were not putting forth the greatest amount of effort possible. On Section 5.2 of the quiz, the last section (and presumably the section that the students would spend the least amount of effort on), only two students (#4 & #5) received no credit or half of a point (out of a total of 6). Student #4's paper has markings on it in Section 5.2 (underlines and circles), which indicates that the student did attempt to read and dissect the question.
Student #4 received an overall score of 22/37 (60%) and clearly attempted to answer the other items on the quiz, so the low overall score more likely indicates a lack of proficiency with paraphrasing techniques rather than low motivation. A similar case can be made for Student #5. In Section 4.1 this student received a 6/6, in Section 4.2 a 5/6 and in Section 5.1 a 5/6. (Sections 4 & 5 asked the students to paraphrase a short academic paragraph.) In 5.2, however, the student correctly began his/her paraphrase with "Meyerhoff (2006) also writes" with nothing following. Since the student's previous answers clearly showed he/she was rather comfortable paraphrasing, it can be inferred that he/she simply ran out of time.
One positive aspect of low-stakes tests is that they do not cause a great amount of stress for students, unlike high-stakes tests such as the Scholastic Aptitude Test (SAT) and the Test of English as a Foreign Language (TOEFL), but there are many drawbacks too. First, giving students an abundance of low-stakes tests may cause them to believe all tests are low-stakes, which is not true. In addition, low-stakes tests are linked to providing invalid results because they do not measure what students know. Instead, they reveal how students perform with minimal effort (O'Neil, Sugrue, & Baker, 1995/1996, in Barry, Horst, Finney, Brown, & Kopp, 2010).
The issue of invalid test results as a result of low student motivation has been examined by many researchers. Huffman, Adamopoulos, Murdock, Cole, and McDermid (2011) caution that low motivation on low-stakes exams severely limits the validity of standardized tests as a means of evaluating student outcomes. The quiz for WR1100 was not a standardized test, but it is arguable that any known low-stakes assessment begins with questionable validity before adding all the other factors that normally contribute to decreased validity (e.g., scoring procedures, language and appearance of the test).
Wise and DeMars (2005) echo O'Neil et al. (1995/1996) by claiming that low motivation leads to reduced test scores, so the scores are not an accurate reflection of students' true academic capabilities. Because motivated students routinely outperform their less motivated peers, high ability can be obscured by low motivation, which causes test scores to reflect variation in levels of effort and not ability. This is a major problem because differences in students' efforts are not what teachers aim to infer from assessment scores.
Test-taking effort
According to Wise and DeMars (2005), test-taking effort is students' engagement and expenditure of energy toward attaining the highest score possible on a test. In low-stakes tests, students have generally been found to give 60-70% of their total effort, which indicates they are not trying their hardest to attain the highest score possible (Cole & Bergin, 2005, in Huffman et al., 2011). Factors that influence students' test-taking effort include their beliefs about their competence and the perceived difficulty of the task. These beliefs in turn are influenced by attainment value (the importance of doing well/scoring high), intrinsic value (enjoyment gained by completing the task), utility value (how the task fits into one's future plans) and perceived costs (sacrifices made in order to do the task) (Wise & DeMars, 2005).
While some students will not try their hardest on low-stakes tests because of weak beliefs and no perceived personal benefit from the testing experience, many students will give considerable effort because they have been conditioned to do so in the past. Other students may also apply a decent amount of effort, just not as much as if they were working towards a certain assessment or course grade. These are the two most likely scenarios for the students in WR1100.
Despite knowing the quiz was low-stakes, they appear to have applied a considerable or decent amount of effort because of their self-pride and the perceived value of their academic work. They may have also been influenced by a desire to help my assessment group with our project. These efforts can be seen in the 7 of 13 students who scored 80% or above on the quiz, and also in the low-scoring quizzes of students #4, #6 and #8. Both #4 and #6 earned no credit on Section 2 of the quiz, yet markings and attempted (yet incorrect) answers indicate they at least tried to answer correctly. Wise and DeMars (2005) state that very few students will honestly attempt a few test questions and either guess on the rest or leave them blank. Only one student (#8) appeared to skip an entire section (no markings or attempted answers), but there is no evidence to suggest he/she did this out of defiance.
Raising motivation
Teachers have many options for helping to build motivation in their students before giving a low-stakes test. First, Wise (2009) suggests that teachers appeal to the academic citizenship of their students, meaning they should be encouraged to give a significant amount of effort on tests despite no direct personal benefits (p. 155). With an appeal like this, students will gain a more accurate picture of why the test is important for their teachers, administrators and institution as a whole. If the students are tuned into how the results of their assessments may affect future students, they will be more likely to take them seriously.
Before distributing our test, an academic citizenship appeal certainly could have been given to the students of WR1100. My group could have been upfront and admitted the test did not count towards the students' grades (instead of being prompted by the student), but that the results were still very important for our own learning. Another option would have been to discuss our own insights into the quiz results and our reflection on the project when we returned the quizzes to the students. Instead, we just told them the high score, low score and average for the class and remarked that everyone did really well (despite a 78% class average). Giving substantive feedback aligns with Wise and DeMars's (2005) suggestion that providing feedback is a way to raise students' motivation in future assessments.
Wise and DeMars (2005) also suggest that in order to raise motivation in low-stakes tests, teachers can change their assessment methods to a format that students prefer, such as multiple-choice as opposed to an essay exam. While our assessment group did change the format of our quiz during the drafting process, we did not do so out of consideration for the WR1100 students. We made changes based on the feedback from our Assessment professor and host teacher, which included adding a recognition section (Section 1) for no credit.
We also made aesthetic changes such as adding white space and changing the font of the examples to make students feel more comfortable. My group and I never considered making the quiz purely multiple-choice because those items only test recognition and would not be direct testing for paraphrasing purposes.
Deci, Eghrari, Patrick, and Leone (1994) (in Huffman et al., 2011) mention that providing a rationale for the tests is a way to raise motivation in low-stakes testing. My group did not do this with regard to the paraphrasing quiz as a whole, but we did provide context for each item. Providing context helped the students visualize a real-world task associated with each item. The contexts were helpful because they involved academic writing, which the students will do throughout their studies. One example from Section 3 is "Imagine you are writing a research paper (about Lou Gehrig's disease)..." To view all the specific contexts, please see Appendix A, which holds our quiz as given to the students.
Student motivation in low-stakes exams can be a real challenge because low effort leads to lower test scores, which do not reflect the students' true potential on exams. This lack of validity is troublesome for teachers and researchers who make important decisions for students' educational programs and assessments based on these results. Employing potential motivators on students' low-stakes tests, however, may help the students enjoy the process much more and provide more valid results for the teachers, test creators and other stakeholders who value them so much.
Future Inquiries
In order to detect students' motivation levels, students have been asked to self-report their levels of motivation prior to taking assessments (Swerdzewski, Harmes, & Finney, 2011). The positive aspects of the self-report include that it collects information in minimal time and only adds a few items to the overall test.
However, self-reporting methods like this can be unreliable because the items require that all the students know and report their level of motivation on the same scale. Even so, because students are the best source of information about their own levels of motivation, self-reporting items can be extremely useful. Therefore, my first question for future research is "Which self-reporting method is more accurate in informing teachers about motivation in tests?"
Secondly, attempts to raise students' motivation prior to low-stakes tests have included showing them a motivational presentation, appealing to their academic citizenship, publicly recognizing good grades, providing material incentives (like a new binder) and financial incentives ($20 to the highest score) and withholding next steps (like class registration) (Huffman et al., 2011). Teachers do not have time to trial all of these methods, but still want to provide motivation for their students in order for them to receive higher grades. So, my second question for future research is "Which is the most effective way to raise students' motivation prior to their taking a low-stakes exam?"
Finally, assessment scores on low-stakes tests may not measure what students know and can do because the students' low effort levels are likely to interfere with the results (Wise & DeMars, 2005). Try as they might, teachers cannot be responsible for consistently raising the motivation levels of their students prior to every exam. As a result, teachers must accept the students' scores despite the questionable validity due to test-taking effort. It is my hope, however, that a method exists to eliminate or decrease this questionable validity, so the scores of highly motivated learners and low-motivated learners can be accepted with the same validity. My question for future inquiry is "How can the levels of motivation be factored out of test scores so they can all be examined on the same scoring basis?"
References
Abdelfattah, F. (2010). The relationship between motivation and achievement in low-stakes examinations. Social Behavior and Personality, 38(2), 159-168. DOI: 10.2224/sbp.2010.38.2.159
Barry, C.L., Horst, S.J., Finney, S.J., Brown, A.R., & Kopp, J.P. (2010). Do examinees have similar test-taking effort? A high-stakes question for low-stakes testing. International Journal of Testing, 10, 342-363. DOI: 10.1080/15305058.2010.508569
Cole, J.S., & Bergin, D.A. (2005, May). Association between motivation and general education standardized test scores. In Huffman et al. (2011). Strategies to motivate students for program assessment. Educational Assessment, 16, 90-103. DOI: 10.1080/10627197.2011.582771
Deci, E.L., Eghrari, H., Patrick, B.C., & Leone, D.R. (1994). Facilitating internalization: The self-determination theory perspective. In Huffman et al. (2011). Strategies to motivate students for program assessment. Educational Assessment, 16, 90-103. DOI: 10.1080/10627197.2011.582771
HPU is America's No.1 University for Diversity! (2014). Hawaii Pacific University. Retrieved April 14, 2014, from http://www.hpu.edu/HPU/diversity/index.html
Huffman, L., Adamopoulos, A., Murdock, G., Cole, A., & McDermid, R. (2011). Strategies to motivate students for program assessment. Educational Assessment, 16, 90-103. DOI: 10.1080/10627197.2011.582771
O'Neil, H.F., Sugrue, B., & Baker, E.L. (1995/1996). Effects of motivational interventions on the National Assessment of Educational Progress mathematics performance. In Barry et al. (2010). Do examinees have similar test-taking effort? A high-stakes question for low-stakes testing. International Journal of Testing, 10, 342-363. DOI: 10.1080/15305058.2010.508569
Rugen, B.R. (2014). WR1100: Analyzing and Writing Arguments syllabus. Personal Collection of B. Rugen, Hawaii Pacific University, Honolulu, HI.
Swerdzewski, P.J., Harmes, J.C., & Finney, S.J. (2011). Two approaches for identifying low-motivated students in a low-stakes assessment context. Applied Measurement in Education, 24, 162-188. DOI: 10.1080/08957347.2011.555217
Wise, S.L. (2009). Strategies for managing the problem of unmotivated examinees in low-stakes testing programs. The Journal of General Education, 58(3), 152-166.
Wise, S.L., & DeMars, C.E. (2005). Low examinee effort in low-stakes assessment: Problems and potential solutions. Educational Assessment, 10(1), 1-17.
Appendices
Appendix A WR1100 quiz
Paraphrasing Quiz 50 mins March 7, 2014
Section 1 Please match the word from the word bank to the best definition.
summary quote paraphrase
1. A repeat or direct copy of words from a text or speech.
________________________________
2. A statement that expresses the same meaning but in different words or structure. Can be as long as the original text.
________________________________
3. A brief statement about the main points of something. Not as long as the original text.
________________________________
Section 2 5 points Imagine you are writing a research paper about jetlag. Your first step is to identify synonyms for paraphrasing purposes. Read the passage and then write synonyms for the underlined phrases. Remember to keep the same part of speech. The first one has been done for you as an example.
(1) People who travel and people whose work schedules are altered (2) drastically often (3) suffer from jetlag, which is a (4) disturbance of the body's time clock. Jetlag sufferers are (5) troubled by both night-time sleeplessness and extreme daytime sleepiness, which (6) inhibits their ability to function normally.
(1) travelers
(2) _______________________
(3) _______________________
(4) _______________________
(5) _______________________
(6) _______________________
Section 3 8 points Imagine you are writing a research paper about Lou Gehrig's disease. Identify synonyms and change word order and class for paraphrasing purposes. Read the passage and then paraphrase the original phrases in the space below. An example has been provided.
It's hard to imagine a more terrible illness than ALS, also called Lou Gehrig's disease. For most people, it means their nervous system shuts down until their body can't move. That also means they'll lose their ability to speak. So Carl Moore, an ALS patient from Kent, Washington recorded his voice to use later when he can no longer talk on his own.
Original                                     Paraphrase
Ex. Terrible illness                         very bad sickness
1. until their body can't move               ________________________________
2. lose their ability to speak               ________________________________
3. recorded his voice to use later           ________________________________
4. when he can no longer talk on his own     ________________________________
Section 4 12 points Imagine you are writing a concept essay about blogging. Read the original excerpt and provide a paraphrase on the lines below. Don't forget to include the author's last name and page number for MLA, or the author's last name, year, and page number for APA, in the parenthetical citation.
1. Blog contents represent both a blogger's personal identity and social identities. Bloggers express their identities by showing their feelings, personal values, and thoughts to readers (Colter & Gretzel, 2014, p. 38).
2. Blogging has become an important aspect of tourism consumption and marketing processes (Smith, 2011, p. 130). Bloggers express their personal travel experiences and provide travel information in their blogs, thus constructing meaning for their own travel experiences and projecting meaning to their readers (Lee & Gardener, 2008, pp. 113-8).
Section 5 12 points Imagine you are writing a research paper for your Introduction to Sociolinguistics class. Read the original excerpt and provide a paraphrase on the lines below. Don't forget to include the author's last name and page number for MLA, or the author's last name, year, and page number for APA, in the parenthetical citation.
1. Sociolinguistics is a very broad field, and it can be used to describe many different ways of studying language. A lot of linguists may describe themselves as sociolinguists, but the people who call themselves sociolinguists may have rather different interests from each other and they may use very different methods for collecting and analyzing data. This can be confusing if you are new to the field (Meyerhoff, 2006, pp. 1-2).
2. Labov conducted these sociolinguistic interviews in a number of different parts of the island. In some places, the inhabitants were mainly of Anglo-British descent, in some they were mainly of Portuguese descent, and in some they were mainly of Native American descent. He also sampled speakers from different walks of life. Some of the people he talked to worked on farms, some worked in the fishing industry, and some worked in service occupations (Meyerhoff, 2006, p. 30).
Notes: We accidentally skipped test no. 10 when numbering.
* denotes a satisfactory score on the quiz (80% or above). 7 of the 13 students scored 80% or above.
Appendix C Descriptive statistics
Alpha value (for confidence interval): 0.02
Variable #1 (Var1)
Statistic                            Value
Count                                13
Mean                                 28.80769
Mean LCL                             23.33189
Mean UCL                             34.28349
Variance                             54.23077
Standard Deviation                   7.36415
Mean Standard Error                  2.04245
Minimum                              16
Maximum                              37
Range                                21
Sum                                  374.5
Sum Standard Error                   26.55184
Total Sum Squares                    11,439.25
Adjusted Sum Squares                 650.76923
Geometric Mean                       27.79807
Harmonic Mean                        26.65874
Mode                                 35
Skewness                             -0.60466
Skewness Standard Error              0.56695
Kurtosis                             1.96977
Kurtosis Standard Error              0.9097
Alternative Skewness (Fisher's)      -0.68657
Alternative Kurtosis (Fisher's)      -0.91889
Coefficient of Variation             0.25563
Mean Deviation                       6.28402
Second Moment                        50.05917
Third Moment                         -214.16067
Fourth Moment                        4,936.09896
Median                               32.5
Median Error                         0.70997
Percentile 25% (Q1)                  24.25
Percentile 75% (Q3)                  35
IQR                                  10.75
MAD                                  4.5
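Several of the statistics in this appendix are related to one another by standard formulas, so they can be cross-checked without the individual student scores. The Python sketch below recomputes a few of them from the other reported values. It assumes the sample (n - 1) definition of variance and treats the 75th percentile as the upper quartile; the names REPORTED, derived and mismatches are hypothetical and used only for this illustration.

```python
import math

# Values copied from the descriptive statistics in this appendix.
# Dictionary keys and function names are hypothetical.
REPORTED = {
    "count": 13,
    "sum": 374.5,
    "mean": 28.80769,
    "variance": 54.23077,
    "standard_deviation": 7.36415,
    "mean_standard_error": 2.04245,
    "coefficient_of_variation": 0.25563,
    "minimum": 16,
    "maximum": 37,
    "range": 21,
    "adjusted_sum_squares": 650.76923,
    "second_moment": 50.05917,
    "q1": 24.25,
    "q3": 35,
    "iqr": 10.75,
}


def derived(r):
    """Recompute related statistics from the other reported values."""
    n = r["count"]
    mean = r["sum"] / n
    # Assumption: sample variance, i.e. adjusted sum of squares over n - 1.
    variance = r["adjusted_sum_squares"] / (n - 1)
    sd = math.sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "standard_deviation": sd,
        "mean_standard_error": sd / math.sqrt(n),
        "coefficient_of_variation": sd / mean,
        "range": r["maximum"] - r["minimum"],
        "second_moment": r["adjusted_sum_squares"] / n,
        "iqr": r["q3"] - r["q1"],
    }


def mismatches(r, tolerance=1e-4):
    """List the statistics whose recomputed value differs from the report."""
    return [
        name
        for name, value in derived(r).items()
        if abs(value - r[name]) > tolerance
    ]


if __name__ == "__main__":
    print(mismatches(REPORTED))  # an empty list means the values agree
```

With the figures reported here, each recomputed value agrees with the table to within rounding, which suggests the reported mean, spread and quartile figures are internally consistent.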