Does PeerWise aid pupil progress in the secondary school?

June 9, 2015 in Talking point

As part of my PGCE in UK secondary schools, I have had to produce some research into educational methods. Having first seen reference to PeerWise in the Royal Society of Chemistry’s Education in Chemistry (EIC) magazine, I felt it would be interesting to investigate the usefulness of PeerWise in secondary education. Below I have reproduced the research report and findings, which I hope will be of interest to many, and I look forward to expanding on this research in the near future.


1.1 The Research Premise


Does self-generated content enhance pupil progress by giving them a greater understanding of the subject content? This question is the focus of this research, but before detailing how it is to be answered, it requires dissection to ascertain exactly what is needed to provide a suitable answer. This dissection will occur through the following questions:

1. What is self-generated content?

2. Why is self-generated content of any use in education?

3. What methods will be used to enable pupils to self-generate content?

4. How can progress be measured against the use (or lack thereof) of self-generated content?

The following sections will answer each of these questions in turn and end with a summary of the aims and objectives of this research.

1.1.1 What is self-generated content?


By far one of the most widely used models for cognitive development in education is Bloom’s Taxonomy1,2. In summary, objectives for learning are developed such that learners progress from simple factual recall (lower-order thinking/cognitive skills) to application and evaluation (higher-order thinking/metacognitive skills). As metacognitive skills are more highly valued, efforts to push learners towards fulfilling these learning objectives are increased2-4.

Self-generated content has been used for some time in education. One simply has to consider assignments such as essays and lab reports to recognise that these must, by definition, be produced by the student as part of their studies5. Much research has been performed on self-regulated learning, of which self-generated content is an integral part, much of it framed around Bandura’s theory6. This theory suggests that self-regulated learning is a consequence of personal, environmental and behavioural influences7.

The value of self-generated content varies from subject to subject. For example, artistic work tends to place more value on self-generated content compared to mathematical work5. This tendency feeds into the notion of so-called ‘academic’ subjects, where textbooks and the like are more highly valued and students are expected to regurgitate information; in such subjects, self-generated content is used mainly as a means to assess the learning of a particular student5.

1.1.2 Why is self-generated content of any use?


So if self-generated content is considered mostly of use as a form of assessment, why bother with it as a mode of imparting knowledge and information? In short: previous research on student-generated content has shown a significant correlation between summative assessment scores and levels of participation in generating self-assessment content3,8-10.

Clearly the ability to produce one’s own resources based on one’s level of understanding reinforces learning and allows for greater levels of metacognition. If such content is to be used by others, then the process of developing it requires yet more metacognitive work, as the content needs to be accessible to others, not just its author. Additionally, this leads to greater engagement and achievement in the overall learning process5,10.

1.1.3 Methods for pupils to self-generate content


Consequently there is a challenge for educators to enhance metacognitive skills through application. There is a range of methods available, but this research will discuss two in detail. The first of these involves multiple choice questions (MCQs). While answering these types of questions can be relatively easy – tending to rely on recall of factual information – writing MCQs requires a much broader skill set. A good understanding of the subject content is a prerequisite: good MCQs have an optimal number of answer options (between three and five)11, and the incorrect answers need to be plausible – often drawing on common misconceptions or mistakes3. Writing MCQs is therefore more time-consuming than answering them. When learners produce their own multiple choice questions, they are challenged to work at higher cognitive levels than would be required to simply answer them.

Science is a highly conceptual subject and some concepts can be more easily explained using analogies. In the context of a lesson, this involves explaining a new concept by describing it in a more familiar context12. These comparisons allow for the development of new understanding or alter that which is already understood13,14. Indeed, within the National Curriculum there are requirements for pupils to learn about several different models, such as the heliocentric model of the Solar System and the Bohr model of the atom, to give two examples. It has therefore been argued that analogies are akin to using models, and therefore inseparable from understanding scientific concepts15.

However, the issues surrounding models – recognising that they provide a simplified means of understanding ‘real-life’ systems – are not necessarily appreciated by pupils. One of the main issues identified is the creation of misconceptions14,16, which is ironically what the model is attempting to avoid. It is therefore crucial that any analogy is presented in a manner that makes explicitly clear that it does not necessarily describe the ‘full picture’.

Another issue with analogies concerns those categorised as Model I analogies17. These require low levels of input from pupils and present low levels of monitoring of pupil understanding; they usually arise when a teacher provides a model to the pupils. In Oliva et al.’s17 teaching model constructs, matching analogies to descriptions of ‘real-life’ processes corresponds to Model IIA. The subsequent level, whereby pupils construct their own analogies for the effects of different parameters on reaction rate, matches Model IIB. The final approach – requiring high input from pupils and high levels of monitoring of progress, and focusing on pupils sharing their analogies and creating discussion – results in Model III analogies, which demand the greatest cognitive effort on the part of the pupil but result in the maximum development of understanding of the concepts being taught.

1.1.4 How to measure progress?


By far the easiest method of determining pupil progress is through end-of-topic or end-of-year tests. Comparisons can be made between differing groups, such as those given the opportunity to self-generate content and those not given such an opportunity. This method is similar to that used by other research groups2,9,18-20. Consequently, if a cohort of pupils were split such that one group could use PeerWise, and thus generate their own learning repository, whereas the other group could not, then comparisons could be made to determine how much progress was made by each group over a period of time. Ideally, such a comparison would occur over a period of 2-3 years, but in the context of this research, comparisons will be attempted over one topic (covering approximately 6-7 weeks).

1.2 Aims and objectives


This research will aim to address the following question:

Does the ability to self-generate content on PeerWise improve pupil progress?

This will be answered using the following criteria:

1. What level of participation is achieved when pupils are given the opportunity to generate their own content?

2. How effective were pupils at generating content over an entire topic?

3. What impact did self-generated content have on pupils’ attainment?

4. Did pupils believe that the option to produce their own learning resources was beneficial to them?


2 Research methodology

2.1 The method

This research was performed at a co-educational independent boarding school and focused on the progress of pupils in one topic within the school’s year 9 specification. The topic chosen was Oxygen, Oxides, Hydrogen and Water – a unit within the Edexcel IGCSE specification.

The year is split into four pairs of sets – sets 1 and 2 being the ‘top’ sets and sets 7 and 8 containing the weakest pupils. Two sets – sets 4 and 5 – were selected to be given the task of using PeerWise to aid their studies. A specific course was set up on the PeerWise software for these pupils to use.

The first lesson of the topic was used to introduce PeerWise to the pupils, including a short demonstration of how to produce and answer MCQs. Upon completion of the demonstration, pupils were given the task of logging in and completing the following tasks as homework: (1) write three MCQs, (2) answer 1 MCQ and (3) comment on 1 MCQ. Upon completion of this task, pupils were informed that they would be free to use the software as much as they liked. Pupils were also informed that their activity on PeerWise would be monitored – unsuitable questions or comments would be removed and sanctions applied accordingly. Reference to previous research was also provided, stating that greater activity on PeerWise coincided with higher attainment; the desire of pupils to be successful was thus used as motivation to increase their activity on PeerWise. Pupils were also informed that this would be a ‘use it or lose it’ process: the more activity observed on PeerWise, the more likely it would be opened up to the rest of the year’s cohort for their revision and use throughout their IGCSE studies, whereas minimal activity, or failure to use PeerWise, would result in the courses being closed down and the opportunity to use it denied to their peers. This too was aimed at ensuring motivation towards using PeerWise.

The data collected would be both quantitative and qualitative. The quantitative data would focus on the number of questions and answers uploaded per day, as well as comparisons between mean end-of-topic test results for each set in the cohort. The results would be used to determine the effectiveness of pupil-generated content in developing pupil understanding of the subject content and, thus, their overall progress.
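To make this concrete, a minimal sketch of how the quantitative comparison could be computed is given below. The file and column names (“set”, “score”) are illustrative assumptions rather than the actual data format used in this research, and the Welch t-test is an optional extra: the report itself only compares mean scores ± 1 standard deviation per set.

```python
# Minimal sketch of the quantitative comparison (illustrative only).
# Assumed input: a CSV with one row per pupil and columns "set" and "score".
import pandas as pd
from scipy import stats

results = pd.read_csv("end_of_topic_results.csv")  # hypothetical export

# Mean and standard deviation of the end-of-topic test, set by set
summary = results.groupby("set")["score"].agg(["mean", "std", "count"])
print(summary)

# Optional: Welch's t-test between the PeerWise sets (4 and 5) and the rest;
# this goes beyond the mean +/- SD comparison actually reported.
peerwise = results.loc[results["set"].isin([4, 5]), "score"]
others = results.loc[~results["set"].isin([4, 5]), "score"]
t, p = stats.ttest_ind(peerwise, others, equal_var=False)
print(f"Welch t = {t:.2f}, p = {p:.3f}")
```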

The qualitative data would take the form of a questionnaire given to the pupils in sets 4 and 5 after completing the test, asking for feedback on their experience with PeerWise and focusing on its ease of use and whether they would continue to use it throughout their studies. This would be used to determine whether pupils felt they had benefitted from its use and what improvements they felt could be made to aid their use of PeerWise.

Both the test results and the questionnaire answers would be anonymised. The results for the end-of-topic test would be averaged, with no names attributed to any particular score, while the questionnaire would be completed without pupils adding their names; this, together with an explanation that only the author would see their answers, was intended to help the pupils be honest about their experience of using PeerWise.


3 Results and discussion

3.1 Use of PeerWise


PeerWise was introduced to the pupils in the two sets during their first lesson on Oxygen, Oxides, Hydrogen and Water. They were given the task of producing three MCQs for their peers, as well as answering and commenting on a minimum of one other question. To aid the pupils, two questions had previously been uploaded for them. Figure 1 below shows the number of questions uploaded each day during the topic.


Figure 1 Number of questions uploaded per day during the IGCSE topic. Note that the topic took longer than the two-week period shown in this plot. After 24th March, no further questions were uploaded. 30 questions were uploaded in total, out of a minimum expectation of 111.


A total of 30 questions were uploaded over the entire topic, which was considerably lower than expected. The two sets combined had a total of 37 pupils, so 111 questions would have been expected if all pupils had completed the minimum requirements. The reasons for this greatly reduced number of questions stem from feedback gained after the research had been completed, once the opportunity to use PeerWise for the other topics had been offered to the pupils: namely, that writing questions was ‘hard’ and pupils would prefer the teacher to write the questions. It was explained to them that this would defeat the point of PeerWise, as it is based on pupils providing assistance to one another by producing MCQs on areas they are confident with, and answering MCQs on areas they feel less confident with.

In looking through the questions, one had been deleted shortly after uploading because the pupil recognised it as unacceptable. This supports reports in the literature where teacher/instructor input is minimal because the pupils/students self-regulate their activity on PeerWise. In reviewing the questions, the vast majority (25 out of 30) focused on air composition and acid rain – two areas covered in the very first few lessons of the topic. The remaining five were spread over tests for gases and identifying the elements present in specific compounds, e.g. water. These results were anticipated – activity would decrease as more pupils completed their minimum requirements – although it was hoped that pupils would see the benefits of using the program and continue to use it throughout the topic, resulting in questions covering every aspect of the topic.

The number of answers and comments uploaded, however, showed a marked difference (Figure 2).


Figure 2 Number of answers uploaded per day during the IGCSE topic. 214 answers were uploaded over the entire topic. The green-ringed column comprises 11 answers given by one pupil during the Easter holidays, when usually no homework is set.


The 30 questions were answered a total of 214 times over the course of the topic, with the majority answered shortly after the MCQs were uploaded. Of particular interest are the green-ringed results on 11th April: these answers were all given by one pupil during the Easter holidays, when no homework had been set. Thus, at least one pupil had recognised that the program is useful for revision and can be accessed at any time, even outside normal school time. The comments uploaded with the answers were usually along the lines of ‘this is a useful question for revision’. Several different pupils gave this kind of comment, so the link between the subject matter of the MCQs, the process of writing or answering them, and revision was clearly recognised. Within individual questions, the number of answers and the feedback given were useful in assessing the learning of individual pupils: more answers to particular questions coincided with areas of lower understanding, and higher-rated questions were generally very well written. The consequences of these questions are discussed below.

3.2 Comparisons of End-of-Topic Results


Upon completion of the topic, the pupils were given a week to revise for their end-of-topic test. The mean mark for this test would be compared, on a set-by-set basis, with the remainder of the Year 9 cohort. It was hoped that the process of self-generating content in the form of MCQs would have a positive effect on the overall mark achieved by the pupils in the two sets that had been given the opportunity to use PeerWise. The average marks for each set are shown below in Figure 3.


Figure 3 Comparative average scores for the end-of-topic test for Oxygen, Oxides, Hydrogen and Water. The results for set 6 had not been received at the time of submission. The green circle highlights the average score for the pupils in the sets who had access to PeerWise. All scores are ± 1 standard deviation.


From the results shown in Figure 3, the average scores were not greatly affected by the use of PeerWise. As an argument for the use of PeerWise, this does not provide much supporting evidence. It does, however, demonstrate that the setting of pupils by ability in this school is effective. Despite this, the section of the test in which the PeerWise-using sets scored highest covered the composition of the air and acid rain – the two main areas on which pupils had produced their MCQs. So as a tool to aid pupil understanding, self-generated content is beneficial. In areas where MCQs were not produced, or where the test did not address the areas covered by the MCQs produced, pupils were relying on their ‘normal’ revision techniques. This could pose an issue – why give pupils the task of generating their own content if it is not assessed in a test or exam? The answer to this question derives from the holistic understanding of the subject in question. Science can be a very difficult subject – some topics are abstract with little or no obvious link to previous study, whereas other topics can quickly become tedious for pupils because they cover the same ground as other topics, albeit in a different context. An example of the latter comes from the topic in which this research was performed. In this topic, pupils learnt about testing for gases. In the subsequent topic (called Salts and Chemical Tests), many of these tests were covered again. In particular, the chemical reaction between calcium carbonate and acid was covered three times in three different topics over the Year 9 scheme of work, and on each occasion was addressed in a different manner. By self-generating content, pupils need to develop their understanding of the subject content to a level at which it can be communicated clearly and concisely to their peers, enabling their peers to develop their understanding while demonstrating their own.

3.3 Results from Questionnaire


Having analysed the results from the end-of-topic test, the opinions of the pupils were considered. How effective did they find PeerWise? Would they use it of their own accord? How difficult did they find using PeerWise? All of these questions were addressed in some form using the questionnaire in Appendix A. Pupils were asked to grade several aspects of PeerWise, and their experience of using it, on a Likert scale. The results are summarised below in Figure 4.


Figure 4 Percentage scores for pupil responses to the questionnaire (see Appendix A).


From the results of the questionnaire, it can be seen that there was a generally positive response to the use of PeerWise, with the majority of pupils describing the program as easy to use and useful for revision, and saying they would like PeerWise to be available throughout their studies. A sixth question was also provided (but not included in Figure 4) enquiring how pupils would prefer the use of PeerWise to be regulated. Pupils were given options: for it to be set purely as homework, for it to be used as and when the pupils wanted, or for its use to count towards their end of topic/year mark. The summary can be seen below in Figure 5.


Figure 5 Percentage of pupils preferring PeerWise to be (a) given purely as homework, (b) used as and when needed and (c) counted towards end of topic or end of year exam marks


From Figure 5, it can clearly be seen that the majority of pupils would not want activity on PeerWise to be included in their end of topic or end of year exam marks. This idea was introduced because, before the reform of the UK education system, pupils would have had to complete some form of coursework which drew on several areas of the subject. Additionally, as PeerWise could be used to link topics together, having it contribute to pupils’ end of year marks would, in a sense, push them towards developing this holistic approach to the subject and developing their understanding of each topic. In later years, ‘gaps’ in pupil knowledge would be filled which, when compared against the course list on PeerWise, would enable them to make more links between topics and thus progress and develop their knowledge and understanding further.

Two pupils did note that they would prefer both options (a) and (b) – that PeerWise be set as homework and also used as and when needed – but the majority opted for one or the other. Among the additional comments provided, several pupils stated that they would prefer PeerWise to have several questions uploaded by their teacher, or even to be set as revision homework – a method which would enable the teacher to gauge more accurately how much revision is being done by individual pupils. This last point was surprising and had not been considered until after the questionnaires had been collected and read.

3.4 Providing pupils further opportunities


After the questionnaires had been collected, reviewed and analysed, the two sets were given a chance to vote on whether, for the next topic and for all other topics covered in Year 9, they would like additional courses on PeerWise to be provided for them to use of their own accord. The response was heavily in favour and subsequent courses were set up. Surprisingly, within six hours of access being given for the next topic, four questions had already been uploaded, answered and commented on. Use of PeerWise is now being monitored on an occasional basis to ensure no unsuitable activity is taking place. The questions uploaded have been well written and will, if this use continues, result in a strong repository of questions for pupils to use throughout their studies.


4 Conclusions and Future Work

4.1 Conclusions


37 pupils were given the opportunity to enhance their learning through the use of an MCQ forum, PeerWise. Their activity was monitored and comparisons were made between their end of topic test results and those of their peers in the rest of the cohort. Uptake of the activity, insofar as writing MCQs was concerned, was lower than expected and heavily skewed towards content covered in the first few lessons. Pupils were observed to prefer answering questions, describing the writing of MCQs as difficult and wanting their teacher to produce them instead.

The results from the end-of-topic test did not show a marked difference in overall pupil attainment; rather, the results were as expected for each set’s ability. For the pupils using PeerWise, the majority of the marks they gained tied very closely to the content they had produced on PeerWise. This was repeated across all 37 pupils’ tests and demonstrates that self-generated content can be used to reinforce learning in lessons.

The overall lack of questions arose mainly from the fact that activity on PeerWise was described as voluntary. As a result, a good number of pupils opted not to participate in writing questions, preferring instead to answer questions produced by their peers. Feedback gained from the questionnaire showed that pupils recognised the usefulness of PeerWise, especially for revision purposes. One pupil in particular remarked that PeerWise could be used to assess whether or not pupils are actually revising for their tests, be they end-of-topic tests or end-of-year exams. A small minority of pupils described how they would prefer PeerWise to contribute to their results but, overall, the ability to use PeerWise on an ad hoc basis was more important.

Comparisons between the results from this research and those of research at the tertiary level demonstrate a clear difference. At the tertiary level, students have chosen to study the subject further and so have a vested interest in high attainment, whereas at the secondary level, especially at Key Stage 4, most pupils do not have a choice in certain subjects and their interest may not be as high. Consequently, they may not feel as strong a vested interest in high attainment.

This last statement obviously comes with a caveat. The pupils selected are a small group compared to the rest of their year group, and indeed, the whole school population. This generalisation may be unfounded and requires further research into determining the true extent of pupils’ vested interests.

4.2 Future work


Further research could therefore follow one of the routes detailed below (or a combination of them):

1. Expand the pupil numbers to include an entire year group – this will enable more informed discussion about the effects of PeerWise on pupil progress as a result of having a larger sample group and being able to observe more closely the effects of the pupils’ personal vested interests in their education.

2. Apply the use of PeerWise throughout the entire content of the academic year – this would enable greater Assessment for Learning (AfL) by monitoring which topics generate more answers (an indication of where pupils have difficulties), and would ensure pupils are able to use it continuously throughout their studies rather than it being introduced at a comparatively odd time of year (as was the case with this research).

3. Perform comparative studies between KS4 and KS5 students – this would enable a detailed review of pupils’ vested interests, as at KS4 pupils do not have the option regarding studying the sciences whereas at KS5 they do. It would therefore be of interest to observe whether the decision to study the subject further influences the motivation of the student to self-generate content.

Undoubtedly, the short timescale of this research will have influenced the results obtained. Subsequent research would therefore need to be extended over a period of at least three years in order to be able to generate data comparable to that from other tertiary level research groups.


4.3 References

1. Krathwohl DR. A revision of Bloom’s taxonomy: An overview. Theory into Practice. 2002;41(4):212-218.

2. Bates SP, Galloway RK, Riise J, Homer D. Assessing the quality of a student-generated question repository. Physical Review Special Topics – Physics Education Research. 2014;10(2):020105.

3. Galloway KW, Burns S. Doing it for themselves: Students creating a high quality peer-learning environment. Chem Educ Res Pract. 2015;16(1):82-92.

4. Draper SW. Catalytic assessment: Understanding how MCQs and EVS can foster deep learning. British Journal of Educational Technology. 2009;40(2):285-293.

5. Sener J. In search of student-generated content in online education. E-edukacja na świecie. 2007;21(4).

6. Schraw G, Crippen KJ, Hartley K. Promoting self-regulation in science education: Metacognition as part of a broader perspective on learning. Research in Science Education. 2006;36(1-2):111-139.

7. Bandura A. Self-efficacy: The exercise of control. New York: Freeman; 1997.

8. Frase L, Schwartz B. Effect of question production and answering on prose recall. Journal of Educational Psychology. 1975;67(5):628.

9. Denner PR, Rickards JP. A developmental comparison of the effects of provided and generated questions on text recall. Contemporary Educational Psychology. 1987;12(2):135.

10. Sanchez-Elez M, Pardines I, Garcia P, et al. Enhancing students’ learning process through self-generated tests. J Sci Educ Technol. 2014;23(1):15-25.

11. Vyas R, Supe A. Multiple choice questions: A literature review on the optimal number of options. Natl Med J India. 2008;21(3):130-133.

12. Gershon M. Analogies. In: How to Use Differentiation in the Classroom: The Complete Guide; 2013:185.

13. Mozzer NB, Justi R. Students’ pre- and post-teaching analogical reasoning when they draw their analogies. Int J Sci Educ. 2012;34(3):429-458.

14. Haglund J. Collaborative and self-generated analogies in science education. Studies in Science Education. 2013;49(1):35-68.

15. Coll RK, France B, Taylor I. The role of models and analogies in science education: Implications from research. International Journal of Science Education. 2005;27(2):183-198.

16. Gilbert J, Osborne R. The use of models in science and science teaching. European Journal of Science Education. 1980;2(1):1-11.

17. Oliva JM, Azcárate P, Navarrete A. Teaching models in the use of analogies as a resource in the science classroom. International Journal of Science Education. 2007;29(1):45.

18. Denny P, Luxton-Reilly A, Hamer J. Student use of the PeerWise system. New York, NY: Association for Computing Machinery; 2008:77.

19. Hardy J, Bates SP, Casey MM, et al. Student-generated content: Enhancing learning through sharing multiple-choice questions. International Journal of Science Education. 2014;36(13):2180-2194.

20. Bates SP, Galloway RK, McBride KL. Student-generated content: Using PeerWise to enhance engagement and outcomes in introductory physics courses. 2011 Physics Education Research Conference. 2012;1413:123-126.


Students learn by generating study questions – but the content matters!

November 3, 2014 in Publications

About a year ago, in a large first-year science course at the University of Auckland, students were asked what they felt was most useful about creating, sharing and practicing with their own study questions (the class was using PeerWise, and this was one of several questions students were asked about their experience).

I remember reading through some of the students’ responses, and this one always stuck out to me:

“You don’t really understand how much or how little you know about a concept until you try to devise a good, original question about it”

It seemed to almost perfectly reflect the old adage that in order to teach something (which is, in essence, what students do when they explain the answers to the study questions they create), you must understand it well yourself. And this seemed to be a common perception in this class – overall, students felt that authoring their own questions was more helpful than answering the questions created by their peers. This is illustrated in the chart below, which compares the responses of the class when asked to what degree they felt authoring and answering was helpful to their learning in the course (rated on a typical 5-point scale from “strongly agree” to “strongly disagree”):

So, at least in this class, students felt that question authoring was most helpful to their learning. The instructors felt this was a positive sign, particularly because PeerWise meant they were able to run this activity in their large class with little moderation. But of course student perceptions of learning, and actual learning, are not the same thing! The central question remains – does this activity actually help students learn?

A good starting point is to look at the relationship between student engagement with the activity and their performance in the course – both of which can be measured in various ways. One measure of student engagement is the PeerWise reputation score (an approximate measure of the value of a student’s contributions), and the most obvious measure of course performance is the final mark or grade in the course. The chart below shows this relationship for the surveyed course described above:

To make the relationship clearer in the chart, students have been binned according to the final grade they achieved in the course. At the University of Auckland, there are 12 possible grades: 9 passing (from C- to A+) and 3 failing (from D- to D+). The chart plots the average final course mark and the average PeerWise reputation score for all students who earned a particular final grade in the course. In this case, students who engaged most actively with PeerWise tended to perform better in the course.
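For anyone curious how such a binned chart might be produced from raw course data, a short sketch is given below. The file and column names are assumptions for illustration, not an actual PeerWise export format.

```python
# Sketch of the grade-binning described above: group students by final letter
# grade, then compute the average course mark and average PeerWise reputation
# score within each grade bin. Column names are assumed.
import pandas as pd

grade_order = ["D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]

df = pd.read_csv("course_data.csv")  # assumed columns: grade, mark, reputation
df["grade"] = pd.Categorical(df["grade"], categories=grade_order, ordered=True)

binned = df.groupby("grade", observed=True)[["mark", "reputation"]].mean()
print(binned)
```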

This relationship between student engagement with PeerWise and exam scores or overall course performance appears to be quite robust. A number of excellent studies have emerged this year, across various academic disciplines, that highlight this link. Hardy et al. report a significant positive correlation between students’ use of PeerWise and their exam performance in 5 large science courses, taught across 3 research-intensive universities in the UK, spanning the subjects of physics, chemistry and biology. Galloway and Burns examined the use of PeerWise by first year chemistry students and report a significant correlation between student activity with PeerWise (as measured by the reputation score) and their exam performance. Similar positive correlations have been reported recently by McQueen et al. (in biology), Singh (in computer science), and by Kadir et al. (in medicine):

So the evidence is clear – students who engage more with PeerWise also tend to perform better in their respective courses. While establishing this relationship is a useful first step, it does not answer our central question regarding student learning. The correlations do not imply that the use of PeerWise by the more successful students has caused their superior performance. In fact, it would be quite a surprise if we didn’t see this positive link. I think most instructors would probably agree that for any kind of course activity, the better students (who tend to earn the better grades) are the ones who are more likely to participate to a greater extent.

To further explore the impact on learning, we recently conducted a randomised, controlled experiment in a first-year programming course (in which engineering students were learning MATLAB programming) at the University of Auckland. Students were randomly assigned to one of two groups (called “Authoring” and “Non-authoring”) to control for their ability. Students in the “Authoring” group were asked to publish 3 study questions on PeerWise prior to a summative mid-semester exam. Students in the “Non-authoring” group could access all of the created questions on PeerWise, but did not author any of their own (NB: for fairness, at a later point in the course we switched these conditions for all students):

This was an “out of class” activity – students participated in their own time, and typically the hours between 8pm and 10pm were when most questions and answers were submitted. This took place over an 11 day period prior to a mid-semester exam that consisted of 10 questions. The activity was almost entirely student-driven – other than setting up the practice repository on PeerWise for their students and setting the exam questions, the instructors were not involved in the activity.

A total of 1,133 questions were authored by students in the “Authoring” group, and a total of 34,602 answers were submitted to these questions by students in both groups as they practiced prior to the exam. So, what was the impact on exam performance?

As a group, the “Authoring” students performed better than the “Non-authoring” students on 9 of the 10 exam questions – a binomial test reveals that this is statistically unlikely to have happened by chance (p = 0.0107). In terms of the average exam scores achieved by each group, there was a difference – but it wasn’t particularly large. As shown in the chart below, the “Authoring” students performed about 5% better than the “Non-authoring” students, again a statistically significant result (Wilcoxon test, p = 0.0197):
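As a small illustration of the statistics reported above, the sketch below reproduces the one-sided binomial test for the “9 out of 10 questions” result; the rank-sum comparison of the two groups’ exam scores is included only as a commented-out placeholder, since the per-student score arrays (and the exact Wilcoxon variant used) are not reproduced here.

```python
# One-sided binomial test: probability of the "Authoring" group winning 9 or
# more of the 10 question-level comparisons if each were a 50/50 coin flip.
from scipy import stats

result = stats.binomtest(9, n=10, p=0.5, alternative="greater")
print(result.pvalue)  # ~0.0107, matching the value reported above

# With (hypothetical) per-student exam scores for each group available, a
# Wilcoxon rank-sum / Mann-Whitney comparison could look like:
# u, p = stats.mannwhitneyu(authoring_scores, non_authoring_scores,
#                           alternative="greater")
```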

While the superior performance of the “Authoring” students on this mid-semester exam can be attributed to their use of PeerWise (more specifically, to their authoring of the study questions), this doesn’t necessarily mean that there aren’t more effective ways for students to study. For one thing, we don’t know how the “Non-authoring” students spent their time while the “Authoring” students were creating questions – we certainly can’t assume that they spent the same amount of time preparing for the exam.

What happens if we take a closer look at the content of the questions? This is where things get more interesting.

In an article published in 1994, entitled “Student Study Techniques and the Generation Effect”, Paul Foos suggests a reason for why we may have seen such a small difference between the average exam scores of each group. He argues that students who prepare for an exam by generating study questions may benefit only if they create questions on topics that are targeted by the exam questions. This certainly makes intuitive sense – some of the students in our “Authoring” group probably created perfectly “good” questions, but these questions did not target the concepts that were examined by any of the 10 exam questions, and thus they didn’t benefit as a result.

To explore this, we classified all 1,139 student-authored questions according to the main topics that they targeted, and we did the same for the 10 exam questions. For simplicity, we focused on questions that targeted a single topic, and discovered that there were 3 core topics that were each targeted by 2 exam questions. For each of these three topics, the students can be classified into three groups:

  • the “Authoring” students who created at least one question on the topic
  • the “Authoring” students who did not create any questions on the topic
  • and the “Non-authoring” students

The chart below plots, for each of the three topics, the proportion of students in each group that correctly answered both exam questions on the topic:

We see virtually no difference between the performance of the “Non-authoring” students and the “Authoring” students who did not create questions on a topic, when answering exam questions on that topic – precisely as described by Foos’ earlier work. Students who did author questions on a particular topic performed far better on the corresponding exam questions. The impact of question authoring on learning also becomes clearer – with effect sizes of between 10% and 20% across the question pairs.
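A rough sketch of how the per-topic proportions plotted above could be computed is shown below; the table layout and column names are assumptions made purely for illustration.

```python
# For each core topic, compute the proportion of students in each of the three
# groups who answered both exam questions on that topic correctly.
import pandas as pd

df = pd.read_csv("exam_and_authoring.csv")
# assumed columns: student_id, group ("authoring" / "non-authoring"),
#                  topic, authored_on_topic (bool), both_correct (bool)

def subgroup(row):
    if row["group"] == "non-authoring":
        return "non-authoring"
    return "authored on topic" if row["authored_on_topic"] else "no question on topic"

df["subgroup"] = df.apply(subgroup, axis=1)
proportions = df.groupby(["topic", "subgroup"])["both_correct"].mean()
print(proportions)
```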

Of course, the story doesn’t end here. Although the question authoring activity did have a significant positive impact overall, some of the differences observed between the on-topic and off-topic “Authoring” students may be a result of students choosing to author questions on topics they already knew well, rather than learning much new from the process.

It is hard to say a lot more about this without more data – but like the correlation studies mentioned earlier, this helps to paint a picture of an activity which, with very little instructor involvement, can have a measurable positive effect on student learning!

A student’s point of view

July 8, 2014 in Uncategorized

Amongst the numerous posts on the PeerWise Community blog are accounts by instructors of their experiences, descriptions of features within PeerWise, ideas for helping students use PeerWise effectively, and even a few curiosities.

However, one thing missing from this blog has been the student voice – that is, until now!

The idea for this began after reading an excellent post on the Education in Chemistry blog written by Michael Seery (Chemistry lecturer at the Dublin Institute of Technology).  In the post, Seery expressed three main reservations for not (yet) using PeerWise (although, as indicated in his post’s title, it appears he is slowly warming to the idea – and even has an account!).  The post itself is very thoughtfully written and worth a read if you haven’t yet seen it.  And, I should add, Seery maintains a fantastic blog of his own (the wonderfully titled “Is this going to be on the exam?“) which covers all kinds of teaching and education-related topics, which I thoroughly recommend.

A few days after Seery’s post was published, a student named Matt Bird wrote a comment on the post describing his experiences using PeerWise in a first year Chemistry course at the University of Nottingham.  Incidentally, this course was taught by Kyle Galloway who has previously spoken about PeerWise and who I see gave a talk yesterday entitled “PeerWise: Student Generated, Peer Reviewed Course Content” at the European Conference on Research in Chemistry Education 2014 – congratulations Kyle!)

Matt’s comment briefly touched on the reservations expressed by Seery – question quality, plagiarism and assessment – but also discussed motivation and question volume.  I thought it would be interesting to hear more from Matt to include a student perspective on the PeerWise Community blog and so I contacted him by email.  He was good enough to agree to expand a little on the points originally outlined in his comment.

Included below are Matt’s responses to 5 questions I sent him – and, in the interests of trying to be balanced, the last question specifically asks Matt to comment on what he liked least about using PeerWise.

Tell us a little about the course in which you used PeerWise.  How was your participation with PeerWise assessed in this course?  Do you think this worked well?
We used PeerWise for the Foundation Chemistry module of our course. It was worth 5% of the module mark, and was primarily intended as a revision resource. To get 2% we were required to write 1 question, have an answer score of 50 or more, and comment/rate at least 3 questions. Exceeding these criteria would get 3%, and being above the median reputation score would get the full 5%. Despite only being worth a small amount of the module, I think this system worked well to encourage participation as it was easy marks, and good revision.

What did you think, generally, about the quality of the questions created by your classmates?  How did you feel about the fact that some of the content, given that it was student-authored, may be incorrect?
In general the questions were good quality. Obviously some were better than others, but there were very few bad questions. There were cases where the answer given to the question was incorrect, or the wording of the question itself unclear, but other students would identify this and suggest corrections in a comment. In almost all cases the question author would redo the question.

Were you concerned that some of your fellow students might copy their questions from a text book?
I wasn’t concerned about questions being copied from textbooks. At the end of the day it is a revision resource, and textbook questions are a valid way of revising. The author still had to put the question into multiple choice format, thinking about potential trick answers they could put (we all enjoyed making the answers mistakes people commonly made!) so they had to put some effort in. Obviously lecturers may have a different opinion on this!

How did you feel about the competitive aspects of PeerWise (points, badges, etc.)? 
The competitive aspects were what kept me coming back. It was an achievement to earn the badges (especially the harder ones), and always nice to be in the top 5 on one or more of the leader-boards. If you knew your friends’ scores then you could work out if you were beating them on the leader boards or not, which is kind of ‘fun’.  I fulfilled the minimum requirements fairly quickly, so most of my contributions were done to earn badges, and work my way up the leader-boards (and to revise, of course!).

Do you feel that using PeerWise in this course helped you learn? What did you personally find most useful / the best part about using PeerWise? What did you personally find least useful / the worst part about using PeerWise?
I got 79 % for the first year of the course, so something went right! PeerWise probably contributed somewhat to that, as it did help me with areas I was less strong on.  It’s hard to say what the most useful part of PeerWise was, but the number of questions was certainly useful. I guess that’s more to do with the users rather than the system though. As previously mentioned the competitive aspect was fun.  The worst part of PeerWise would be the rating system. No matter how good the question, and how good the comments about the question were hardly anybody rated questions above 3/5 with most coming in at around 2. I guess nobody wanted to rate question too highly and be beaten in that leader-board! It would also have been nice to create questions where multiple answers were correct so you need to select 2 answers.  Overall, I enjoyed using PeerWise and hope it is used again later on in my course.

Many sincere thanks to Matt Bird for taking the time to respond to these questions – particularly during his summer break – enjoy the rest of your holiday Matt!

Although his feedback represents the opinion of just one student, several interesting points are highlighted.  For one thing, one of the most common instructor concerns regarding PeerWise (the lack of expert quality control) did not seem to trouble him.  In fact, Matt appears fairly confident in the ability of his classmates to detect and suggest corrections for errors.

When commenting on the aspects of PeerWise that did concern him, Matt mentioned that the student-assigned ratings did a poor job of differentiating between questions.  Indeed, this does appear to be somewhat of an issue in this course.  The histogram below illustrates the average ratings of all questions available in the course.

Of the 363 questions in the repository, 73% were rated in a narrow band between 2.5 and 3.5, and 96% of all questions had average ratings between 2.0 and 4.0.  While there are some techniques that students can use to find questions of interest to them (such as searching by topic or “following” good question authors), it seems like this is worth investigating further.
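For completeness, here is a tiny sketch of how those percentages can be checked from a list of per-question average ratings (the export format is assumed).

```python
# Proportion of questions whose average rating falls within each band.
import numpy as np

ratings = np.loadtxt("average_ratings.txt")  # hypothetical: one value per question

narrow = np.mean((ratings >= 2.5) & (ratings <= 3.5)) * 100
wide = np.mean((ratings >= 2.0) & (ratings <= 4.0)) * 100
print(f"{narrow:.0f}% rated between 2.5 and 3.5")
print(f"{wide:.0f}% rated between 2.0 and 4.0")
```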

Below are two example questions pulled out of the repository from Matt’s course – only the question, student answers and the explanation are shown, but for space reasons none of the comments that support student discussion around the questions are included.  I selected these questions more or less at random, given that I am completely unfamiliar with the subject area!  It is, of course, difficult to pick just one or two questions that are representative of the entire repository – but these examples go a small way towards illustrating the kind of effort that students put into generating their questions.

And finally, one other thing Matt mentioned in his feedback was that he would have liked to see other question formats (in addition to single-answer multiple choice).  Watch this space…

Correlation between authoring questions and understanding of threshold concepts in PeerWise

March 17, 2014 in Use cases

PeerWise is used in the Chemistry Department at the University of Liverpool as a student contributed assessment system in the “Chemical Engineering for Chemists” module. The aim of this module is to give chemistry students an insight into the world of chemical engineering and to enhance their understanding of the fundamental/threshold concepts in chemical engineering. The continuous assessments in this module play an important role in enhancing student understanding of chemical engineering concepts which are entirely foreign to most chemists.

PeerWise was used to enhance chemistry students’ understanding of threshold concepts in chemical engineering. Our theory was that students have to understand the fundamental/threshold concepts to be able to author good quality questions. Although answering peers’ questions in PeerWise provides good revision material for learners, which strongly supports learning, this research focused on the importance of authoring questions for students’ understanding of challenging concepts.

The PeerWise scores for authoring questions on mass and energy balances, as fundamental operations in a process analysis procedure, were compared with exam marks related to these topics. The data were evaluated to find the correlation between PeerWise scores and exam marks. The positive correlation between PeerWise scores on authoring questions and exam marks demonstrates the value of using a student-contributed assessment system to enhance understanding of threshold concepts.

Do badges work?

November 19, 2013 in Publications, Talking point

Badges everywhere

Have you ever wondered whether some of the “game-like” rewards that are becoming more and more common online actually have a measurable impact on user participation?  Does the promise of earning a “Hotel specialist” badge on Trip Advisor motivate travellers to write more reviews?  On Stack Overflow, a popular question and answer forum for programmers, do people answer more questions than they otherwise would so that they can increase their reputation score and earn a higher spot on the global leaderboard?

Of course, if you play games these kinds of rewards are nothing new – performance in many games is measured by points, leaderboards have been around since the earliest arcade games, and the Xbox Live platform has been rewarding players with achievements for nearly a decade.  Now, in an attempt to motivate users across a broad range of applications, we see these game-like elements appearing more frequently.  But do they work?

Badges in PeerWise

PeerWise includes several game-like elements (points have been discussed on this blog before), including badges (or “virtual achievements”).  For example, regular practice is rewarded with the “Obsessed” badge, which is earned for returning to PeerWise on 10 consecutive days and correctly answering a handful of questions each time.

Other badges include the “Insight” badge, for writing at least 2 comments that receive an agreement, the “Helper” badge for improving the explanation of an existing question, and the “Good question author” badge, awarded for authoring a question that receives at least 5 “excellent” ratings from other students.  A complete list of the available badges can be seen by clicking the “View my badges” link on the Main menu.

As you would expect, some badges are much harder to earn than others.  Almost every student earns the “Question answerer” badge – awarded when they answer their very first question.  The following chart shows the percentage of students with the “Question answerer” badge that earn each of the other available badges.  Only about 1 in 200 students earn the “Obsessed” badge.

The badges in PeerWise can be classified according to the roles that they play (there is a nice article by Antin and Churchill that explores this further):

  • “Goal setting”: helping the student set personal targets to achieve
  • “Instruction”: helping the student discover features of the application
  • “Reputation”: awarded when the quality of the student’s contributions are endorsed by others

It is interesting to note that most of the badges awarded for answering questions are of the “Goal setting” variety, whereas those awarded for authoring questions are mainly in the “Reputation” category.

And now back to our original question – do these badges have any influence over the way that students use PeerWise?  When considering this question, we must keep in mind that observed effects may not necessarily be positive ones.  One of the criticisms levelled at extrinsic rewards, such as game-like elements, is that they have the potential to undermine intrinsic motivation in a task, which is clearly of concern in an educational context.  However, this is a somewhat contentious claim, and very recent work by Mekler et al. showed no negative impact on intrinsic motivation in an experiment measuring the effect of using game elements to reward user participation in an online image-tagging activity (although it must be noted that this was a short-term study and motivation was self-reported).

Anecdotal support

There is certainly some anecdotal evidence that the PeerWise badges are being noticed by students in a positive way.  Examples of this include public tweets:

as well as responses to a survey conducted in 2012 at the University of Auckland:

“I didn’t think I was “badge” type of person, but I did enjoy getting badges (I was the first one to get the obsessed badge – yay!). It did help motivate me to do extra and in doing so, I believe I have learnt more effectively.”

“The badges did make me feel as if I was achieving something pretty important, and helped keep Peerwise interesting.”

Another example was nicely illustrated in a talk given by James Gaynor and Gita Sedhi from the University of Liverpool in June this year, in which they presented their experiences using PeerWise at a local teaching and learning conference.  On one of their slides, they displayed a summary of student responses to the question: “Was there any particular aspect of PeerWise you liked?”

Across the two courses examined, “badges” and “rewards” emerged quite strongly (points, rewards, achievements and rankings were coded as “Other rewards”).

However, it should be noted that not all students are so positive about the badges.  Other responses to the previously mentioned survey indicate that the effect on some students is fleeting:

“well, it kinda increase my motivation a bit at the beginning. but then i get bored already”

“They don’t really affect my motivation now, but they did when I first started.”

and others pay no attention to the badges at all:

“I never cared about the badges -> simply because they dont mean anything -> i.e. does not contribute to our grade”

“They did nothing for my motivation.”

Controlled experiment

To understand the impact of the badges more clearly, we conducted a randomised, controlled experiment in a very large class (n > 1000).  All students in the class had identical participation requirements (author 1 question and answer 20 questions), however only half of the students were able to see the badges in the interface and earn them for their participation.  This group was referred to as the “badges on” group, whereas the control group who were not able to see the badges were referred to as the “badges off” group.  The experiment ran over a period of 4 weeks in March 2012, and the class generated approximately 2600 questions and submitted almost 100,000 answers.

Students in the “badges on” group, who were able to earn the badges, submitted 22% more answers than students in the control group.  The chart below plots the day to day differences over the course of the study – on all but one day, the “badges on” students submitted more answers than the “badges off” students.

The table below summarises the number of questions authored, answers submitted, and distinct days of activity for students in each group.


The presence of the badges in the interface had a significant positive effect on the number of questions answered and the number of distinct days that students were active with PeerWise.  Interestingly, although there was no effect on the number of questions authored by students, no negative effects were observed – for example, the increase in the number of answers submitted did not lead to a reduction in the accuracy of those answers.
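For instructors wanting to run a similar comparison on their own activity logs, a hedged sketch is shown below. The column names are assumptions, and the specific statistical tests used in the published paper are not implied here.

```python
# Compare per-student answer counts and distinct active days between the
# "badges on" and "badges off" groups (illustrative column names).
import pandas as pd
from scipy import stats

activity = pd.read_csv("activity_log.csv")  # assumed: group, answers, active_days

print(activity.groupby("group")[["answers", "active_days"]].mean())

on = activity.loc[activity["group"] == "badges on", "answers"]
off = activity.loc[activity["group"] == "badges off", "answers"]
u, p = stats.mannwhitneyu(on, off, alternative="two-sided")
print(f"Mann-Whitney U = {u:.0f}, p = {p:.4f}")
```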

If you would like to see additional data from this experiment, as well as a more complete discussion and an acknowledgment of the threats to the validity of the results, the full paper is available online (and on the PeerWise Community Resources page).  Of course, no experiment is perfect, and this work probably raises more questions than it answers, but it does provide some empirical evidence that the badges in the PeerWise environment do cause a change in the way that students engage with the activity.  Perhaps we could see similar effects in other, similar, educational tools?

TLDR: And for those who prefer movies to reading, the conference in which this work was published required a brief accompanying video.  If you are really keen, see if you can last the full 40 seconds!


PW-C is one year old

October 10, 2013 in Announcements, Talking point

It was a year ago today that we launched the PeerWise-Community site: we are officially one! :)

We now have over 300 users registered on the site, and have had nearly 5,000 unique visitors request 26,000 individual pages.

In terms of location of visitors to the site, we’ve welcomed viewers from all inhabited continents, with the UK topping the country list with nearly 1/3 of all visits. Not bad for an (allegedly) small and irrelevant island (not my words, you understand….)  The USA, Canada, New Zealand and Australia follow for the rest of the top 5 visitor countries. Approximately half of all traffic over the last year was from returning users, who had previously visited the site.

There’s a huge amount of data that Google Analytics gives you, and you can consume vast amounts of time drilling into it! But one of the more surprising things to me was that the vast majority of all accesses (over 80%) still come from desktop / laptop devices rather than tablet or mobile. And also, after the home page and the registration page, it turns out that the publications page was the one most frequently consulted.

If any members have suggestions for things they would like to see in year 2 on the site, please add comments below!


PeerWise – Experiences at University College London

September 13, 2013 in Uncategorized, Use cases

PeerWise – Experiences at University College London

by Sam Green and Kevin Tang

Department of Linguistics, UCL


In February 2012, as part of a small interdisciplinary team, we secured a small grant of £2500 from the Teaching Innovation Grant fund to develop and implement the use of PeerWise within a single module in the Department of Linguistics at University College London (UCL). The team was made up of advisory staff from the Centre for Applied Learning and Teaching and from the Division of Psychology and Language Sciences (PALS), together with lecturers and Post-Graduate Teaching Assistants (PGTAs) in the Department of Linguistics. The use of the system was monitored, and students' participation made up 10% of their grade for the term.

In the subsequent academic year, we extended its use across several further modules in the department after obtaining an e-Learning Development Grant at UCL.

Overall aims and objectives

The PGTAs adapted the material developed in the second half of the 2011/12 term to provide guidelines, training, and further support to new PGTAs and academic staff running modules using PeerWise. The experienced PGTAs were also involved in disseminating the project outcomes and sharing good practice.

Methodology – Explanation of what was done and why

Introductory session with PGTAs:

A session run by the experienced PGTAs was held prior to the start of term for PGTAs teaching on modules utilising PeerWise. This delivered information on the structure and technical aspects of the system, the implementation of the system in their modules, and, importantly, marks and grading. It also highlighted the importance of teamwork and the necessity of participation. An introductory pack was provided so that new PGTAs could quickly adapt the system for their respective modules.

Introductory session with students:

Students taking modules with a PeerWise component were required to attend a two-hour training and practice workshop, run by the PGTAs teaching on their module. After being given log-in instructions, students participated in the test environment set up by the PGTAs. These test environments contained a range of sample questions (written by the PGTAs) relating to the students' modules, which demonstrated the quality of questions and level of difficulty required. More generally, students were given instructions on how to provide useful feedback and how to create educational questions.

Our PGTA – Thanasis Soultatis giving an introductory session to PeerWise for students

Our PGTA – Kevin Tang giving an introductory session to PeerWise for students

Course integration

In the pilot implementation of PeerWise, BA but not MA students were required to participate. BA students showed more participation than MA students, but the latter nevertheless showed engagement with the system. Therefore, it was decided to make PeerWise a compulsory element of the module to maximise the efficacy of peer-learning.

It was decided that students should work in ‘mixed ability’ groups, due to the difficult nature of creating questions. However, to effectively monitor individual performance, questions were required to be answered individually. Deadlines situated throughout the course ensured that students engaged with that week’s material, and spread out the workload.

Technical improvement

The restriction on image size and the lack of an ability to upload or embed audio files (useful for phonetic/phonological questions in Linguistics) were circumvented by using a UCL-wide system which allows students to host these sorts of files. This system (MyPortfolio) allows users to create links to stored media. It also allows students to effectively anonymise the files, thus keeping them secret for the purpose of questioning.

Project outcomes

Using the PeerWise administration tools, we observed student participation over time. Students met question creation deadlines as required, mostly by working throughout the week to complete the weekly task. In addition, questions were answered throughout the week, suggesting that students did not see the task purely as a chore. Further, most students answered more than the required number of questions, again showing their willing engagement. A final point on deadlines was that MA students used PeerWise as a revision tool entirely of their own accord; their regular creation of questions built up a repository of revision topics with questions, answers, and explanations.

Active Engagement

The Statistics

PeerWise provides a set of scores; to increase the total score, a student needs to achieve good scores for each component.

The students were required to:

  • write relevant, high-quality questions with well thought-out alternatives and clear explanations
  • answer questions
  • rate questions and leave constructive feedback
  • use PeerWise early (after questions are made available) as the score increases over time based on the contribution history

Correlations between the PeerWise scores and the module scores were computed to test the effect of PeerWise on students' learning. A nested model comparison was performed to test whether the PeerWise grouping helped predict the students' performance. Performance in Term 1 differed somewhat between the BA students and MA students, but not in Term 2, after the PeerWise grouping of the BAs was changed.

Term 1:

The BA students showed no correlation at all, while the MAs showed a strong correlation (r = 0.49, p < 0.001***).

MA Students – Term 1 – Correlation between PeerWise Scores and Exam Scores

In light of this finding, we attempted to identify the reasons behind this divergence in correlations. One potential reason was that the grouping of the BAs was done randomly, rather than by mixed ability, while the grouping of the MAs was done by mixed ability. We hypothesized that mixed-ability grouping is essential to the successful use of the system. To test this hypothesis, we asked the PGTA for the BAs to regroup the PeerWise groups in the second term based on mixed ability. This PGTA did not have any knowledge of the students' PeerWise scores in Term 1, while the PeerWise grouping for the MAs largely remained the same.

Term 2:

Assessment in Term 2 was based on three assignments spread out over the term. The final PeerWise score (taken at the end of Term 2) was tested for correlation with each of the three assignments.

With the BAs, the PeerWise score correlated with all three assignments with increasing levels of statistical significance – Assignment 1 (r = 0.44, p = 0.0069**), Assignment 2 (r = 0.47, p = 0.0040**) and Assignment 3 (r = 0.47, p = 0.0035**).

With the MAs, the findings were similar, with the difference that Assignment 1 was not significant, with a borderline p-value of 0.0513 – Assignment 1 (r = 0.28, p = 0.0513), Assignment 2 (r = 0.46, p = 0.0026**) and Assignment 3 (r = 0.33, p = 0.0251*).
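
As an illustration of the kind of correlation test reported above, the short R sketch below computes a Pearson correlation for a small, made-up data frame; the column names (pw_score, assignment1) and the values are assumptions used purely for demonstration.

# Made-up example: one row per student
scores <- data.frame(pw_score    = c(120, 85, 210, 150, 95, 180, 135, 60),
                     assignment1 = c( 62, 55,  74,  68, 58,  71,  66, 52))

# Pearson correlation between PeerWise score and assignment mark
cor.test(scores$pw_score, scores$assignment1)   # reports r, the p-value and a 95% CI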

A further analysis was performed to test whether PeerWise grouping had an effect on assignment performance. This consisted of a nested-model comparison with PeerWise score and PeerWise group as predictors, and the mean assignment score as the outcome. The lm function in the R statistical package was used to build two models: the superset model with both PeerWise score and PeerWise group as predictors, and the subset model with only the PeerWise score as a predictor. An ANOVA was used to compare the two models; while PeerWise score and PeerWise grouping were each significant predictors separately, adding PeerWise grouping made a significant improvement in prediction, with p < 0.05 * (see Table 1 for the nested-model output).

Table 1: ANOVA results

Analysis of Variance Table

Model 1: Assignment_Mean ~ PW_score + group
Model 2: Assignment_Mean ~ PW_score
  Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
1     28 2102.1                              
2     29 2460.3 -1   -358.21 4.7713 0.03747 *
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
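
For readers who wish to reproduce this style of analysis on their own data, a minimal sketch is given below. The model formulas follow Table 1; the input file, and the assumption that the data arrive as one row per student with PW_score, group and Assignment_Mean columns, are ours.

# Assumed input: one row per student with PeerWise score, PeerWise group and mean assignment mark
dat <- read.csv("peerwise_term2.csv")   # assumed columns: PW_score, group, Assignment_Mean
dat$group <- factor(dat$group)

# Superset model: PeerWise score plus PeerWise group as predictors
m_full    <- lm(Assignment_Mean ~ PW_score + group, data = dat)

# Subset model: PeerWise score only
m_reduced <- lm(Assignment_Mean ~ PW_score, data = dat)

# Nested-model comparison, producing output of the form shown in Table 1
anova(m_full, m_reduced)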

The strong correlation found with the BA group in Term 2 (but not in Term 1) is likely to be due to the introduction of mixed-ability grouping. The group effect suggests that the students performed at a similar level as a group, which implies group learning. This effect was only found with the BAs and not with the MAs; this difference could be attributed to the quality of the mixed-ability grouping, since the BA (re)grouping was based on Term 1 performance, while the MA grouping was based on the impression of the students that the TA had formed in the first two weeks of Term 1. With both the BAs and the MAs, there was a small increase in correlation and significance level over the term; this might suggest that increasing use of the system assists with improving assignment grades over the term.

Together these findings suggest that mixed-ability grouping is key to peer learning.


A questionnaire was completed by the students about their experience with our implementation of PeerWise. The feedback was, on the whole, positive, with a majority of students agreeing that:

  1. Developing original questions on course topics improved their understanding of those topics
  2. Answering questions written by other students improved their understanding of those topics
  3. Their groups worked well together

These responses highlighted the key concept of PeerWise: peer learning.


Our objective statistical analyses together with the subjective feedback from the students themselves strongly indicated that the project enhanced student learning and benefitted their learning experience.

E-learning awareness

One important lesson was the recognition that peer learning – using e-learning – can be a highly effective method of learning for students, even with little regular, direct contact between PGTAs and students regarding their participation.

It was necessary to be considerate of the aims of the modules, to understand the capabilities of PeerWise and its potential for integration with each module, and, importantly, to plan the whole module's use of PeerWise in detail from the beginning. Initiating this type of e-learning system required this investigation and planning so that students understood the requirements and the relationship of the system to their module. Without explicit prior planning, with teams working in groups and remotely from PGTAs and staff (at least with regard to their PeerWise interaction), any serious issues with the system and its use may not have been spotted and/or may have been difficult to counteract.

As mentioned, the remote nature of the work meant that students might not readily inform PGTAs of issues they may have been having, so any small comment was dealt with immediately. One issue that arose was group members’ cooperation; this required swift and definitive action, which was then communicated to all relevant parties. In particular, any misunderstandings with the requirements were dealt with quickly, with e-mails sent out to all students, even if only one individual or group expressed concern or misunderstanding.

Dissemination and continuation

A division-wide talk (video recorded as a 'lecturecast') was given by Kevin Tang and Sam Green (the original PGTAs working with PeerWise), introducing the system to staff within the Division of Psychology and Language Sciences. This advertised the use and success of PeerWise to several interested parties, as did a subsequent lunchtime talk to staff at the Centre for the Advancement of Teaching and Learning. The experienced PGTAs documented their experiences in detail, created a comprehensive user guide (including presentations for students and new administrators of PeerWise), and made all of this readily available to UCL staff and PGTAs, so the system can capably be taken up by any other department. Further, within the Department of Linguistics there are several 'second-generation' PGTAs who have learned the details of, and used, PeerWise for their modules. These PGTAs will in turn pass on use of the system to the subsequent year, should PeerWise be used again; they will also be available to assist any new users of the system. In sum, given the detailed information available, the current use of the system by the Department of Linguistics, and the keen use by staff in the department (especially given the positive results of its uptake), it seems highly likely that PeerWise will continue to be used by several modules, and will likely be taken up by others.



Introducing the “answer score”

July 31, 2013 in Features

If you have viewed one of your courses in the last day or so you may have noticed a small addition to the main menu.  A new score, called the "answer score", now appears (see the screenshot on the right).

The original score and its associated algorithm remain unchanged, except that the score has now been renamed the "reputation score", which more accurately reflects its purpose.  A number of instructors have been using this original score to assign extra credit to their students, as outlined in a previous blog post which also describes the algorithm in detail.

However, this “reputation score” was the source of some confusion for those students who did not read the description of how it was calculated (this description appears if the mouse is hovered over the score area).  This is exemplified by the following comment submitted via the PeerWise feedback form:

This is a great tool. I love it. The only criticism is the slow update on the score. You need to wait 30min+ to see what score you have after a session.

This confusion is related to the fact that the “reputation score” for an individual student only increases when other students participate as well (and indicate that the student’s contributions have been valued).  On the other hand, the new “answer score” updates immediately when a student answers a question and this immediate feedback may alleviate some of the concerns and provide students with another form of friendly competition.  As soon as an answer is selected, the number of points earned (or in some cases lost) is displayed immediately as shown in the screenshots below.

More importantly, the new "answer score" now provides another measure of student participation within PeerWise.  Instructors may like to use this to set targets for students to reach.  A detailed description of how this works follows but, very basically, for most students the "answer score" will be close to 10 multiplied by the number of questions they have answered "correctly" (where a correct answer is one that either matches the question author's suggested answer or is the most popular answer selected by peers).  For example, the chart below plots the number of answers submitted that agreed with the author's answer (note that this is a lower bound on the number of "correct" answers as just defined) against the new "answer score" for all students in one course who were active in the 24 hours after the new score was released.  The line is almost perfectly straight and has a slope close to 10.

A few students fall marginally below the imaginary line with a slope of 10 – this is because every "correct" answer submitted earns a maximum of 10 points; however, a small number of points are lost if an "incorrect" answer is submitted.  The number of points deducted for an incorrect answer depends on the number of alternative answers associated with the multiple-choice question – for example, questions with 5 options have a lower associated penalty than questions with 2 options.  If a large number of questions are answered by randomly selecting answers (which is obviously a behaviour that we would want to discourage students from adopting), the "answer score" should generally not increase.
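
To make the arithmetic concrete, here is a small R sketch of an approximate "answer score".  The 10 points per correct answer comes from the description above; the exact penalty formula is not spelled out here, so the deduction used below (the 10 points spread over the distractors) is an assumption for illustration only.

# Approximate "answer score" for one student, under assumed rules:
#   - each "correct" answer earns 10 points
#   - each incorrect answer loses a small penalty, assumed here to be 10 / (options - 1),
#     so questions with more options carry a smaller penalty
answer_score <- function(correct, incorrect, options) {
  penalty <- 10 / (options - 1)   # assumed penalty per wrong answer
  correct * 10 - incorrect * penalty
}

# Example: 48 correct and 4 incorrect answers on 5-option questions
answer_score(correct = 48, incorrect = 4, options = 5)   # 470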

So, how can you now begin to make use of this?

From hearing a range of instructors describe how they implement PeerWise in their own classrooms, it appears to be quite common to require students to answer a minimum number of questions per term.  To make things a little more interesting, you can now set an "answer score" target instead.  This is practically the same thing but a bit more fun (just remember to multiply by 10) – if you normally ask your students to answer 50 questions, try setting an "answer score" target of 500!

There is a new student leaderboard for the top “answer scores”, and as an instructor you can download a complete list of answer scores at any time (from the Administration section).

And if in doubt….. guess ‘D’

July 13, 2013 in Talking point

By now, hopefully you’re enjoying a well-deserved summer break (at least those in the Northern Hemisphere…..) In the summer spirit, here’s an interesting question that we were asked this week.

This week, we gave a virtual workshop on PeerWise, as part of the Western Conference on Science Education, held in London, Ontario. (Slides here, if you’re interested) One of the participants asked a seemingly innocent question that got us thinking.

What is the most common choice of correct answer chosen by authors of an MCQ?

While we didn't know the answer there and then, we realized that we were sitting on a goldmine of data! The PeerWise system now contains over 600,000 student-authored questions. Granted, not all of these are 5-item MCQs, but a substantial fraction are. Could we mine this data to see if there really was a preferred answer choice and, if so, which option it was?

It turns out that there are nearly a quarter of a million 5-item ‘active’ MCQs in the PeerWise system, and that the most commonly chosen answer by authors of questions is ‘D’. The percentage of ‘D’ correct answers (24.98%) may not look all that much more than the 20% that might be expected if the choice of answers was totally random, but the sheer number of questions analyzed here (223,435) makes this a highly significant result (in the sense of ‘statistical significance’, possibly less so in the sense of ‘change-the-world-significant’…..)
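
To see why such a small-looking difference is nevertheless highly significant, here is a quick check in R using the figures quoted above; the choice of a simple one-sample proportion test is ours, not necessarily the test used for the original analysis.

# Is 24.98% of 223,435 questions meaningfully different from the 20% expected by chance?
n <- 223435
d_answers <- round(0.2498 * n)       # questions whose correct answer is 'D'
prop.test(d_answers, n, p = 0.20)    # the p-value is vanishingly small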

It’s interesting to note that the extremities of the answer range are both below the ‘random choice’ value of 20%. There’s a certain logic in thinking that authors may want to conceal the right answer somewhere other than the last, or perhaps even more so, the first answer choice.

You might be wondering whether this is just a peculiarity of student-generated questions, and what about questions with fewer than 5 answer choices? The collective wisdom of the internet is not a great deal of help here. Various sites (Yahoo Answers included) offer commentary that lacks a definitive answer, but is not short of 'definite' proclamations that it is answer B. Or C. Or, if you're not sure, pick the longest answer. Clearly, some people would be better served by adopting a different strategy when preparing to answer MCQs, such as actually learning the material. Learning Solutions Magazine claims the most common answer on a 4-item question is C or D. When I got down the Google search page to a link to 'Wild guessing strategies on tests', I stopped reading. Feel free to supplement this with your own research….

(Just for the record, from our analysis of another 220,000 4-item MCQs, the most popular answer chosen by authors is C, by a nose from B, both of which are well above the 25% expected value if truly random.)

If at first you don’t succeed, answer again!

March 28, 2013 in Features

A small milestone was reached at the start of this week when a student (from Central Queensland University in Australia) submitted the 10 millionth answer to a PeerWise question. That’s quite a few answers, but of course, many of them are incorrect.  In fact, around 3.5 million of them do not match the answer as specified by the question author.  So what happens to all these wrong answers?

Well, up until now, students have only been able to attempt a question once.  While they have been able to review their answers, there has not been a way for them to hide earlier attempts and submit new answers.  In other words, wrong answers have stayed wrong!  On the one hand, it is valuable to preserve the originally submitted answers as these can provide useful feedback to the instructor about how well their students are coping with course concepts.  However, the ability to re-answer questions has been a very common request – not only from students:

“It would be better if you could answer questions you’ve already answered a second time. Some questions I answered a long time ago and would not remember the answer from the first time around but it would be helpful to re-test myself.”

but also from instructors:

“Some of my students requested the option of redoing questions after they’ve answered them.  This would be particularly useful for questions they missed.”

and from members of the PeerWise Community:

As a result of this feedback, a new feature has been introduced which allows students to submit new answers to any question and, having reviewed all of the feedback, to indicate which answer they believe is the correct one.  This blog post briefly describes this new feature and presents some very early data showing how students are making use of it.

Colour-coded answers

The first thing that students will notice is that the answers they submit are now colour-coded.  The screenshot below shows a typical view of the “Answered questions” page.

Answers that appear to be incorrect are highlighted in red, answers that appear to be inconclusive are highlighted in orange, and answers that appear to be correct are not coloured.  This colour coding helps students to locate questions that they have answered incorrectly, and therefore might like to attempt again.  The column that displays the answer is now sortable – so that all incorrect answers appear at the top of the table.

A column labelled “Answer again?” has also been added to the table.  Clicking the link in this column will present the question to the student again, without showing any information about their previous attempts.

Confirming and changing answers

You may have noticed in the earlier screenshot that below the correctly answered questions (which are not coloured) some of the entries are coloured in green.  These are “confirmed” answers – questions for which the student has indicated they are sure their answer is actually correct.  Let’s take a look at how this happens.

As soon as a student submits an answer to a question, they are shown a variety of feedback.  This includes the explanation for the answer as written by the question author, any improvements to the explanation written by their classmates, and any comments written about the question.  In addition, they are shown each of the question options and the number of times each was selected by their peers.  Immediately beneath these options, the student now has a choice:

If, after reviewing all of this feedback, the student believes their submitted answer is not correct, they can simply change their answer by immediately attempting the question again.  Note that the original, or “first attempt”, answers to a question are preserved and always appear in the column labelled “First answers”.

However, if the student is certain that their answer is correct after reviewing the feedback, they can "confirm" their answer.  This does two things: their confirmed choice will be displayed in the column labelled "Confirmed answers", and in the "Answered questions" table the corresponding result will be highlighted in green (and will appear at the bottom of the list when sorted).

The screenshot below shows a set of options for a question where option D has been confirmed by four students.

Note that in this case, although there was considerable disagreement over the correct answer following students’ first attempts, only option D appears as a confirmed answer.  In a small way, this “confirmation” process mirrors an important element of the Peer Instruction pedagogy – after individually committing to an answer, students have an opportunity to reach consensus by reflecting on feedback from their peers.

Reaching consensus

The ability to re-answer questions and to “confirm” answers is only a few days old, yet we can take a peek at how students are beginning to make use of it.  In particular, to investigate whether students are effectively reaching consensus, we can compare the typical spread of “first” answers with the typical spread of “confirmed” answers.

Students have submitted 74,357 new answers since the feature was released.  Of these, 13,904 (almost 20%) have been “confirmed” as correct (in some cases this is after changing the original answer).  The chart below illustrates the typical spread seen across the “first” answers submitted to a question.  The options are ordered by popularity – so the most popular option, on average, is selected around 71% of the time.  The second most popular option is selected around 18% of the time, and so on.  Only questions that had received 10 or more responses following the release of this feature were considered – in this case a total of 1351 questions.

As a comparison, the chart below illustrates the typical spread seen across the “confirmed” answers submitted to a question.  Once again, the options are ordered by popularity – in this case, the most popular “confirmed” answer is selected 99% of the time.

Only questions with 10 or more confirmed answers were included in this analysis – giving a total of 56 questions.  This is a small data set, as the feature is very new; still, there is virtually no disagreement about which answer is correct.  In fact, 45 of the 56 questions had perfect agreement and, at least so far, no question has more than two different confirmed answers.
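
For anyone curious how a "typical spread" like this can be computed, the sketch below shows one way to do it in R from a table of answers with one row per response; the input file and column names (question_id, option) are assumptions for illustration.

# Hypothetical input: one row per answer, recording the question it belongs to and the option chosen
ans <- read.csv("confirmed_answers.csv")   # assumed columns: question_id, option

# For each question, compute the share of answers going to each option,
# ordered from most to least popular
spread <- function(options) sort(table(options) / length(options), decreasing = TRUE)
by_question <- tapply(ans$option, ans$question_id, spread)

# Average share captured by the most popular option across all questions
mean(sapply(by_question, function(s) s[[1]]))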

It will be interesting to see how this progresses, but if this early data is any indication, “confirmed” answers may provide an effective new way for students to verify whether they are right or wrong.