An Empirical Study to Determine if Mutants Can Effectively Simulate Students’ Programming Mistakes to Increase Tutors’ Confidence in Autograding

by Ben Clegg, Phil McMinn, and Gordon Fraser

ACM SIGCSE Technical Symposium on Computer Science Education (SIGCSE 2021)



Automated grading is used to assess large numbers of students’ programs in software engineering courses, often utilising test suites to evaluate the correctness of these programs. However, test suites can vary in how they evaluate a program. In this paper, we investigate how much different suites influence generated grades, and how their properties contribute to this influence. We conduct a modified replication study of existing work, using students’ faulty solution programs and test suites that we constructed using a sampling approach. We find that differing test suites generate greatly varying grades, with the standard deviation of grades for each solution …


Reference

Ben Clegg, Phil McMinn, and Gordon Fraser. An Empirical Study to Determine if Mutants Can Effectively Simulate Students’ Programming Mistakes to Increase Tutors’ Confidence in Autograding. ACM SIGCSE Technical Symposium on Computer Science Education (SIGCSE 2021), pp. 1055–1061, 2021.


BibTeX Entry
@inproceedings{Clegg2021,
  author    = "Clegg, Ben and McMinn, Phil and Fraser, Gordon",
  title     = "An Empirical Study to Determine if Mutants Can Effectively Simulate Students' Programming Mistakes to Increase Tutors' Confidence in Autograding",
  booktitle = "ACM SIGCSE Technical Symposium on Computer Science Education (SIGCSE 2021)",
  pages     = "1055--1061",
  year      = "2021"
}