
METHOD

Study 1: Development and Validation of the PTE

Participants

The participants were undergraduate students in courses at Bakersfield College (a community college in Bakersfield, California), recruited through in-class requests for participation. For Study 1 there were 213 students (66% Female, 34% Male) who volunteered. [See Results section for reporting of complete demographic information.]

Procedure

Two sets of measures were developed, the first set relating to the persuasion constructs (the PTE), and the second set relating to teaching effectiveness (the Teaching Effectiveness Scale [TES]). Despite the plethora of models of persuasion, no persuasion assessment instruments could be found in the research literature; therefore, items for the PTE were drawn from the concepts in the models of persuasion. Initially, items for each of the three PTE scales (Interest/Involvement, Likeability, and Credibility) were generated from suggestions by students and faculty, previous research, vernacular descriptors, and related synonyms. The Teaching Effectiveness Scale (TES) items were generated in the same way. Taken as a group, these items served as a master list that was used as part of a class project in an introductory statistics course taught by the principal researcher. The class members, facilitated by the principal researcher, eliminated awkwardly worded items, combined alternatively worded but similar items, and ultimately compiled a revised master list of descriptors for each of the three persuasion themes and the teaching effectiveness theme. After discussing the remaining list of descriptors, class members individually ranked the descriptors listed under each of the four themes (interest/involvement, likeability, credibility, and teaching effectiveness) in order of clarity.
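One plausible way to compile such individual clarity rankings into a consensus ordering is to average each descriptor's rank position across raters. The sketch below illustrates this with hypothetical descriptors and rankings; the original compilation was done by hand as a class exercise, so neither the data nor the averaging rule here come from the study itself.

```python
from collections import defaultdict

# Hypothetical clarity rankings of likeability descriptors by three raters
# (position 1 = clearest). The real descriptors and rankings differed.
rankings = [
    ["friendly", "likeable", "humorous", "personable", "sincere", "courteous", "amiable"],
    ["likeable", "friendly", "personable", "humorous", "courteous", "sincere", "amiable"],
    ["friendly", "personable", "likeable", "sincere", "humorous", "amiable", "courteous"],
]

# Sum each descriptor's rank position across raters, then average.
rank_sums = defaultdict(float)
for ranking in rankings:
    for position, descriptor in enumerate(ranking, start=1):
        rank_sums[descriptor] += position

mean_ranks = {d: total / len(rankings) for d, total in rank_sums.items()}

# Order descriptors from clearest (lowest mean rank) downward and keep the top six.
consensus = sorted(mean_ranks, key=mean_ranks.get)
retained = consensus[:6]
print(retained)
```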

After compiling the results of the ranking for each theme, the class and the principal researcher decided that the six most highly ranked descriptors for each theme would be retained. Six items seemed appropriate because the items beyond the top six were becoming repetitive, and the ultimate size of the instrument was a concern. For the assessment instrument to be practical in a typical classroom setting, a length of about 20 items seemed appropriate. The proposed list of 24 descriptors allowed for items to be eliminated in the validation process while keeping the instrument at about the right length. Further, the SEEQ typically contained no more than four items for each construct measured. The preliminary assessment instrument was then administered to volunteers in several undergraduate psychology classes. Subjects were instructed to rate their most recently seen instructor, in order to reduce the evaluation of particularly salient (either favorably or unfavorably) instructors and to collect data with adequate variability. Following data collection, participants were debriefed regarding the nature of the study and were informed about how to obtain a copy of the final results of the study.

Questionnaire

The questionnaire consisted of the three PTE persuasion constructs, the TES teaching effectiveness construct, and demographic indicators. Each of the core themes of the three PTE predictors was assessed with several questions that addressed different manifestations of the underlying theme. Similarly, the TES teaching effectiveness measure was assessed with several questions that addressed different manifestations of its underlying theme. For example, for the first theme of interest/involvement, questions assessed the interest level of the instructor/course, the involvement level of the student, and the enthusiasm of the instructor. Responses to both the PTE and the TES items were recorded on a 13-point scale (high ordinal or interval level), ranging from A+ (13) to F (1), mirroring a course grading scale familiar to most students.
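Only the endpoints of this 13-point scale (A+ = 13, F = 1) are specified above; assuming a standard letter-grade ladder with plus/minus steps, one plausible numeric coding would look like the following sketch. The intermediate anchors are an assumption, not taken from the instrument.

```python
# Assumed mapping of the 13-point grade-style response scale to numbers.
# Only the endpoints (A+ = 13, F = 1) are given in the text; the
# intermediate grades are a plausible reconstruction.
GRADE_POINTS = {
    "A+": 13, "A": 12, "A-": 11,
    "B+": 10, "B": 9,  "B-": 8,
    "C+": 7,  "C": 6,  "C-": 5,
    "D+": 4,  "D": 3,  "D-": 2,
    "F": 1,
}

def score(response: str) -> int:
    """Convert a letter-grade response to its numeric value."""
    return GRADE_POINTS[response.strip().upper()]

print(score("B+"))  # 10
```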

Because the goal of this project was to uncover aspects of effective teaching, rather than to have individual students grade individual instructors, the questionnaires were completely anonymous; there was no indication of either the student's or the instructor's identity. The initial questionnaire can be found in Appendix A.

PTE Theme 1: Interest/Involvement

This construct was measured by six items covering how interesting, how motivating, how absorbing/fascinating, how exciting, how entertaining, and how involving/engaging the instructor was.

PTE Theme 2: Likeability

This construct was measured by six items covering how humorous, how friendly, how likeable, how personable, how sincere, and how courteous the instructor was.

PTE Theme 3: Credibility

This construct was measured by six items covering how much you trust what the instructor says, and how confident, how credible/believable, how ethical/fair, how prepared, and how knowledgeable the instructor was.

Teaching Effectiveness Scale (TES)

This construct was measured by six items: how effective a teacher the instructor was, how effective the instructor has been in helping you learn the material, how you would rate the instructor's teaching ability, how you would rate how much you have learned from this instructor, how you would rate the instructor's teaching effectiveness, and what grade you received or are receiving in the course.

Study 2: Test of the PTE Model in Predicting Teaching Effectiveness (TES)

Participants

The participants were undergraduate students in courses at Bakersfield College (Bakersfield, California), recruited through in-class requests for participation. For Study 2 there were 652 students (67% Female, 32% Male, 1% Not Reported) who volunteered. In Study 2 each student was paired with another student so that both students in the pair were rating the same instructor for the same class.

Although each student rated the instructor on all items (i.e., the PTE, the Teaching Effectiveness Scale [TES], the SEEQ, and the demographic items), each student was randomly assigned to either the predictor group or the criterion group, and only the corresponding portion of their responses was used. Thus, of the 652 participants, half assessed instructors on the finalized predictor scale items, and the other half assessed those same instructors on the finalized criterion items.

Procedure

Once the scale development phase of the project was completed, a new group of participants was recruited. Participants were randomly assigned to one of the two questionnaire section groups and, although they completed the entire questionnaire, only their responses to the assigned section were used. Thus, one group of subjects contributed the ratings on the predictor side of the multiple regression equation, and another group contributed the ratings on the criterion side of the equation. This was done to eliminate any participant-level link between the predictor scale items and the criterion scale items. By assigning predictor and criterion variables to different groups of subjects, it was hoped that any halo effects across the predictors and the criterion variables might be reduced or eliminated. Halo effects, in this case, would be students thinking highly of an instructor on one item and thus rating that instructor highly on all items. Although, as mentioned previously, the student-evaluation-of-teaching method has demonstrated very good validity, precautions against introducing any potential biases seemed prudent. Another benefit of assigning predictor and criterion variables to different groups of students is that the total number of instructors assessed becomes less critical. In most teaching evaluation studies, one group of students provides the entire set of data used in the analysis. This necessitates that a large number of instructors be used, because the instructor is the unit of analysis, and too few instructors result in highly dependent and potentially collinear data. Using one group of students to provide the predictors and another group to provide the criterion variables shifts the unit of analysis to the agreement between the student pairs. Regardless of the number of instructors assessed, the question becomes how well the predictor student's scores in a pair predict the criterion student's scores.
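As a concrete illustration of this design, the sketch below pairs a predictor student's PTE construct scores (means of the six items per construct) with the criterion student's TES score for the same instructor, and fits an ordinary least-squares multiple regression. The simulated data, variable names, and the use of pandas/statsmodels are assumptions for illustration only, not the study's actual analysis code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_pairs = 326  # one predictor student and one criterion student per pair

# Simulated pair-level data: each row is one instructor rated by one pair.
# In the study, each construct score would be the mean of six items rated
# on the 13-point (1-13) grade-style scale.
pairs = pd.DataFrame({
    "interest":    rng.uniform(1, 13, n_pairs),  # predictor student's ratings
    "likeability": rng.uniform(1, 13, n_pairs),
    "credibility": rng.uniform(1, 13, n_pairs),
})
# Criterion student's TES score, here arbitrarily related to the predictors.
pairs["tes"] = (0.5 * pairs["interest"] + 0.2 * pairs["likeability"]
                + 0.3 * pairs["credibility"] + rng.normal(0, 1.5, n_pairs))

# Multiple regression: predictor student's PTE constructs -> criterion student's TES.
X = sm.add_constant(pairs[["interest", "likeability", "credibility"]])
model = sm.OLS(pairs["tes"], X).fit()
print(model.summary())
```

Because the predictor and criterion scores come from different raters, a single student's favorable impression of an instructor cannot by itself inflate both sides of the equation.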

In addition to the benefit highlighted above, that fewer instructors are needed to provide adequate data, estimates indicate that the number of instructors was high regardless. Based on random sampling of registration data, only approximately 10.53% of the students shared the same instructors. That relatively small percentage represents the estimated maximum amount of overlap, because students did not necessarily select the same instructors; they simply may have. Based on these estimates, of the 326 pairs of students that rated instructors, approximately 34 pairs assessed the same instructors, resulting in an estimated 292 different instructors in the data set.
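These figures can be reproduced with a quick back-of-the-envelope check that simply applies the 10.53% overlap estimate to the 326 pairs, a simplification that mirrors the arithmetic in the text:

```python
n_pairs = 326
overlap_rate = 0.1053  # estimated share of students sharing an instructor

# Estimated number of pairs whose instructor was also rated by another pair.
overlapping_pairs = round(n_pairs * overlap_rate)   # about 34
distinct_instructors = n_pairs - overlapping_pairs  # about 292

print(overlapping_pairs, distinct_instructors)  # 34 292
```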

Subjects were instructed to rate the instructor whose class they had attended most recently (either earlier in the day or, if they had no earlier classes on the day of testing, the latest class from the previous class day). This was done to collect data with adequate variability that would, ideally, exhibit the characteristics of a normal distribution, maximizing power for the inferential tests to be conducted later. Thus, students did not always pick their most memorable instructor (memorably good or memorably bad). After completing their questionnaire, they were asked to recruit a student from the class that they had rated. The original student was instructed to recruit a student with whom they had had very little interaction, to reduce the possibility that like-minded acquaintances rated instructors similarly because of cross talk either before or during the data collection process. Debriefing indicated that students had little difficulty engaging an acquaintance in the class; many indicated that they felt very little awkwardness in asking, because the project was introduced in another class and they viewed it as a class assignment. These newly recruited subjects were then asked to complete the entire questionnaire, although only the responses to the complementarily assigned section were used. For example, if I were an original subject, I completed the entire questionnaire, but only the responses for my randomly assigned section (e.g., the predictor section) were used. I then recruited a partner who completed the entire questionnaire, although only that partner's responses to the criterion section were actually used. This collection of responses via partners provided data for the predictor items from one set of subjects and data for the criterion items from a different set of subjects, thus reducing some of the inherent circularity in teaching effectiveness ratings and prediction. Despite having ratings from different groups of subjects for the predictors as compared to the criteria, correlational analysis procedures are still appropriate because the unit of analysis was the instructor being rated. Following data collection, participants were debriefed regarding the nature of the study. Further, they were asked whether it was difficult or awkward to recruit a partner, and the vast majority indicated no difficulty at all. They stated that they simply arrived early or stayed late for the class and asked another student to participate. Finally, participants were informed about how to obtain a copy of the final results and were thanked for their participation.

Questionnaire

The resulting omnibus indices of the three core concepts are high ordinal or interval in measurement level, which is an appropriate level of measurement for predictor variables in a multiple regression. In addition to the PTE and the TES items, the following items were also included in the final questionnaire. As mentioned previously, the initial overall questionnaire can be found in Appendix A.

The Students' Evaluation of Educational Quality (SEEQ)

Marsh's SEEQ teaching effectiveness questionnaire was administered in order to assess hypothesis three, i.e., to place the SEEQ head-to-head against the PTE model and see which is the better predictor of teaching effectiveness. The complete SEEQ can be found in Appendix B.

Demographic Variables

Gender was noted for both instructor and respondent, because Basow (1995) found that female professors received significantly higher ratings from female students than from male students; conversely, the gender of the student did not influence the ratings for male professors.

Ages for both student and instructor were collected, based on the assumption that Basow's (1995) finding of an interaction between teacher and student gender in teaching effectiveness ratings may apply to age as well. Does age of the instructor exhibit a main effect on teaching effectiveness, or is there a more complex relationship between age and effectiveness? Other variables of interest included class level; number of units completed; years in higher education; full-time/part-time student status; racial/ethnic background; socioeconomic status; and native language.
