YouTube debate question scores » Method: How We Generated the Data

Method: How We Generated the Data

On the night of the debate, participants in the Wake Forest Policy Project and the Ben Franklin Transatlantic Fellows Initiative (BFTF) gathered to view the CNN-YouTube Democratic presidential primary debate. The group was a mix of United States and international students, aged 15 to 18, who were studying argumentation, debate, and public advocacy over the course of several weeks.

The students were given an evaluation rubric developed by Gordon Mitchell, Tim O’Donnell, and Ross Smith. The rubric asked them to rate each question on a five-point Likert scale (1 being “strongly disagree” and 5 being “strongly agree”) in four categories:

* Clarity. The question made sense.
* Interesting. The question made me especially interested to hear the answer.
* Demanding. The question demanded that the candidates explain or justify their answer.
* Audio/visual. The video was visually powerful with good sound quality.

The participants viewed two “practice videos” to prepare themselves for critically evaluating the chosen videos. Each video was assigned a number based on the order presented by CNN.

After the evaluation rubrics were gathered, Megan Foelsch, Damien Pfister, Sean Ridley, Ron Von Burg, and Kurt Zemlicka entered the data (.xls). Kurt then processed the data to produce the final rankings (see the correlation scatterplots for more results).
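For readers who want to work with the spreadsheet themselves, here is a minimal sketch of one way to turn rubric scores into a ranking and a correlation. It is not a record of how the final rankings were actually computed, which this post does not spell out. The field names (`video`, `clarity`, `interesting`, `demanding`, `audio_visual`) and the choice to rank by the unweighted mean of the four category means are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean

# Hypothetical column names for the four rubric categories.
CATEGORIES = ("clarity", "interesting", "demanding", "audio_visual")


def rank_questions(ratings):
    """Rank videos by the unweighted mean of their four category means.

    `ratings` is a list of dicts, one per completed rubric row, each with a
    'video' number and a 1-5 score for every category (assumed layout).
    Returns (video, category_means, overall) tuples, highest overall first.
    """
    by_video = {}
    for row in ratings:
        by_video.setdefault(row["video"], []).append(row)

    summary = []
    for video, rows in by_video.items():
        cat_means = {c: mean(r[c] for r in rows) for c in CATEGORIES}
        overall = mean(cat_means.values())
        summary.append((video, cat_means, overall))

    summary.sort(key=lambda item: item[2], reverse=True)
    return summary


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)
```

Feeding the per-video means for two categories (for example, interesting and clarity) into `pearson` gives the kind of figure that a correlation scatterplot summarizes.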

Delphine Masse, the webmaster of the BFTF Initiative, organized the final presentation of the YouTube videos.

We believe that these results are significant and that this is one of the few data sets to apply rigorous criteria to the chosen videos. At the same time, we recognize that the method has limitations that rule out absolute conclusions about “what makes a good question,” including:

* The students were trained in debate and argumentation, perhaps making them more attuned to the quality of the question. If this is the case, then it is perhaps a strong argument for enhanced pedagogical focus on question-asking within the context of public deliberation.
* Ideally, our participants would have been more numerous and more representative demographically (especially age).
* In a more perfect world, we would also have asked the students to evaluate 40 videos drawn at random from the almost 3,000 entries to provide a point of comparison. Such a data set would allow us to examine whether CNN chose the “strongest” questions (based, of course, on the criteria that we lay out). It would also provide data that go beyond correlations (interesting vs. clarity, demanding vs. interesting).
* Additional criteria could complement the four we have established and might alter the rank ordering of the questions. We hope that the evaluation rubric we have generated sparks a broader conversation about what makes a good debate question, and we look forward to a robust exchange in the comments suggesting additional criteria for evaluating submissions to YouTube for presidential (primary) debates. Additional categories might include “uniqueness” and “speaker credibility” for the content criteria, and “novelty” for the form criteria. We also probably should have split audio and visual into two criteria.
* “Evaluation fatigue” might have set in among the participants. It’s possible that students were tired, careless, or both towards the end of the debate, meaning that their assessments of the final videos were not as rigorous as their assessments of the early videos. However, from my own observation, I must point out that the students were remarkably diligent in filling out their rubrics and seemed to be incredibly engaged in viewing the debate through the critical lens mandated by the exercise.
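The comparison sample mentioned in the third limitation above (40 videos drawn at random from the full pool of entries) could be drawn along these lines. This is only a sketch: the entry IDs, the pool size of 3,000, and the seed are hypothetical stand-ins, since no such sample was taken for this study.

```python
import random


def draw_comparison_sample(entry_ids, k=40, seed=None):
    """Draw k distinct entries uniformly at random, without replacement.

    `entry_ids` is any iterable of identifiers for the submitted videos
    (hypothetical here). Passing a seed makes the draw reproducible.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(list(entry_ids), k))


# Illustrative pool: entries numbered 1 to 3000 (an assumed numbering).
comparison_sample = draw_comparison_sample(range(1, 3001), k=40, seed=2007)
```

Scoring such a sample with the same rubric would make it possible to compare the chosen videos against the pool they were chosen from.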

We would welcome additional comments on how the method might be improved. As one of the first studies to rigorously evaluate the quality of questions during a public political debate, this one certainly has elements that could be improved. At the same time, we believe it is a provocative and timely contribution to the dialogue that has followed the first YouTube debate, and we look forward to continuing the conversation.