Bibliographic record overview, number: 1104062
Utjecaj tehnika za predobradu izvornog koda na točnost otkrivanja plagijata u studentskim zadacima iz programiranja
Utjecaj tehnika za predobradu izvornog koda na točnost otkrivanja plagijata u studentskim zadacima iz programiranja, 2020, doctoral dissertation, Fakultet Organizacije i Informatike, Varaždin
CROSBI ID: 1104062
Title
Utjecaj tehnika za predobradu izvornog koda na točnost otkrivanja plagijata u studentskim zadacima iz programiranja
(Effect of source-code preprocessing techniques on plagiarism detection accuracy in student programming assignments)
Authors
Novak, Matija
Type, subtype and category of work
Theses, doctoral dissertation
Faculty
Fakultet Organizacije i Informatike
Place
Varaždin
Date
03.02
Year
2020
Pages
232
Mentor
Kermek, Dragutin ; Joy, Mike
Keywords
detekcija plagijata ; visoko školstvo ; izvorni kod ; programski zadaci ; tehnike predobrade ; usporedba ; sličnost programa
(plagiarism detection ; higher education ; source code ; programming assignments ; preprocessing techniques ; comparison ; program similarity)
Abstract
Plagiarism is a serious problem in academia, and students cheat for various reasons, but whatever the reason, such behaviour should not be accepted. While it is easy to control plagiarism in classrooms with few students, it can be a challenge in a classroom with one hundred students or more. To help teachers detect plagiarism, similarity detection tools, usually called plagiarism detection tools, have been built. While plagiarism in academia can occur in many areas, the two most common are textual and programming assignments. In this thesis, the focus is on detecting plagiarism in student programming assignments. Since the tools are not perfect, there is always room for improvement, and one way to improve plagiarism detection quality is the use of preprocessing techniques. Preprocessing techniques have been used in many plagiarism detection tools, but there is not much research focusing on the effects of such techniques. To investigate the effect of preprocessing techniques on plagiarism detection tools, an experiment was conducted on six tools using five techniques on two different datasets, one of which is publicly available. To be more precise, the six tools were actually three tools, each with two modes of operation: a specialized mode developed specifically for source-code comparison and a textual mode developed for normal text comparison. In this experiment two hypotheses were stated, one focusing on the differences between using preprocessing techniques and using no preprocessing technique, and the other focusing on the differences between two different techniques. In addition to the hypotheses, one research question was stated to give more insight into the effects of the preprocessing techniques. The results of the experiment were analysed quantitatively using multifactor analysis of variance and qualitatively by analysing the most interesting cases.
The whole process of detection and statistical analysis was automated using a newly developed system called Multiple Plagiarism Checker and the R system. The experimental results confirmed both hypotheses, showing that preprocessing has a positive effect on the quality of plagiarism detection and that some techniques gave better results than others. The most interesting result of this research is that, with preprocessing techniques, the textual versions of the tools in some cases outperformed the specialized versions developed specifically for source-code similarity detection.
Original language
English
Scientific fields
Information and communication sciences
WORK AFFILIATIONS
Institutions:
Fakultet organizacije i informatike, Varaždin,
Sveučilište u Zagrebu