WO2024118753A1 - Cheating detection and advanced outcome analysis for online exams - Google Patents
- Publication number
- WO2024118753A1 (PCT/US2023/081567)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- exam
- takers
- answer
- questions
- taker
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
Definitions
- the exemplary system may include several components, including an exam analysis engine back-end and an interactive web user interface front-end.
- the back-end may retrieve students’ submission data and activity logs from online exam platforms.
- the system may calculate detailed statistics and response distributions for every question. It may also perform pairwise comparisons on all submissions using a wide variety of quantitative strategies, including but not limited to considering the similarity, rarity, entropy, and diversity of responses; the nature and difficulty of questions; as well as the timestamps and logs of students’ activities.
- the system may generate reports containing evidence for suspicious cases.
- the interactive web user interface of the front-end may display reports for suspicious plagiarism cases and visualizations of exam outcomes calculated by the exam analysis engine.
- the system may further use a novel question design approach where questions are designed such that there are a large number of possible correct and incorrect answers. This may allow for greater evidence of plagiarism when multiple matching rare correct or rare incorrect answers are detected for pairs of exam takers.
- the system may further incorporate unique watermarks into answers as another tool for plagiarism detection.
- the techniques described herein relate to a computer-implemented method including: receiving, by one or more processors, an identifier of an exam, wherein the exam includes a plurality of questions; retrieving, by the one or more processors, a plurality of submissions for the identified exam, wherein each submission is associated with a different exam taker of a plurality of exam takers and includes a plurality of answers for the plurality of questions of the exam, and further where each answer of the plurality of answers is associated with a timestamp; based on the retrieved submissions, generating, by the one or more processors, a suspicion score for each pair of exam takers from the plurality of exam takers; and generating, by the one or more processors, a report including some or all of the generated suspicion scores and identifiers of the associated pairs of exam takers.
- the techniques described herein relate to a computer-implemented method, further including: identifying pairs of exam takers with scores that satisfy a requirement; and for each identified pair of exam takers, generating a report for the pair of exam takers.
- the techniques described herein relate to a computer-implemented method, wherein generating the report for a pair of exam takers includes: determining, based on the timestamps associated with the answers of the plurality of answers, questions where each exam taker provided an answer within a time duration; and providing identifiers of the determined questions or a total number of the determined questions in the generated report.
- the techniques described herein relate to a computer-implemented method, wherein generating the report for a pair of exam takers includes: identifying questions of the plurality of questions where each exam taker of the pair provided the same answer; and providing identifiers of the identified questions in the generated report.
- the techniques described herein relate to a method, wherein each exam taker is associated with a unique watermark, and further including: for each exam taker: extracting one or more unique watermarks from the plurality of answers of the submission associated with the exam taker; determining at least one extracted unique watermark that does not match the unique watermark associated with the exam taker; and in response to the determination, recording the exam taker as a plagiarist.
- the techniques described herein relate to a method, wherein the unique watermark includes a unique number in an invisible Unicode format.
- FIG.1 is an example environment for detecting cheating and plagiarism in exams
- FIG.2 is an illustration of an example back-end for detecting cheating and plagiarism in exams
- FIG.3 is an illustration of an example front-end for detecting cheating and plagiarism in exams
- FIG.4 is an illustration of an example report
- FIG.5 is an illustration of a method for detecting plagiarism in online exams
- FIG.6 is an illustration of a method for detecting plagiarism in online exams using watermarks
- FIG.7 is an illustration of a method for detecting plagiarism in online exams using watermarks
- FIG.8 is an illustration of an example exam activities timeline chart
- FIG.9 shows an exemplary computing environment in which example embodiments and aspects may be implemented.
- FIG.1 is an example environment for detecting cheating and plagiarism in exams.
- the environment 100 includes an exam analysis system 101 in communication with a testing system 150, one or more exam takers 140, and one or more users 130 through a network 190.
- the network 190 may include a combination of private and public networks (e.g., the internet).
- the exam analysis system 101 and testing system 150 may each be executed by one or more general purpose computing devices such as the computing system 900 illustrated with respect to FIG.9.
- the user 130 and exam takers 140 may connect and interface with the testing system 150 and/or the exam analysis system 101 also using a general-purpose computing device.
- the user 130 may be a teaching professional (e.g., a professor, instructor, teaching assistant, or administrator) who creates and/or administers one or more exams 160 to one or more students or exam takers 140.
- Each exam 160 may include a plurality of questions.
- the types of questions may include multiple choice questions, true or false questions, multi-part questions, and questions where the exam taker is expected to enter or type an answer to the question. Any type of question may be supported. In some embodiments, some of the questions may include multiple correct and incorrect answers.
- the exams 160 may be administered to exam takers 140 electronically via one or more testing systems 150.
- a user 130 (i.e., a teaching professional) may use tools provided by the testing system 150 to create an exam 160, or may electronically upload an already created exam 160 to the testing system 150.
- One or more exam takers 140 may then connect to the testing system 150, and using a web browser or software provided by the testing system 150, electronically take the exam 160.
- the testing system 150 may enforce various rules about the exam 160 such as time limits associated with the exam 160.
- Example testing systems 150 include Canvas and Gradescope.
- Each exam taker 140 may complete the exam 160 by providing one or more answers for each question of the exam 160.
- the set of answers received from an exam taker 140 for the questions in an exam 160 is referred to herein as a submission 170.
- the testing system 150 may record the submission 170 provided by each exam taker 140 and may further maintain an activity log 165 for each exam taker 140.
- the activity log 165 may include information about the activity of an exam taker 140 such as a timestamp associated with each answer of the submission 170, and an IP address associated with each answer and/or the exam taker 140.
- the activity log 165 may include the various changes made to the answer and timestamps of when the changes were made.
- to help detect cheating and plagiarism among exam takers 140, the environment 100 may include the exam analysis system 101.
- the system 101 can use the answers provided by the exam takers 140 in their submissions 170 and information from the activity logs 165, including timestamps, to identify pairs of exam takers 140 who may have plagiarized one or more questions of the exam 160. These exam taker 140 pairs may then be identified to the teaching professional associated with the exam 160.
- the exam analysis system 101 may use the answers and other information from the activity logs 165 to generate statistics regarding the exam 160, including the questions that the exam takers 140 most often answered incorrectly or that showed the largest amount of variation in the incorrect answers. These statistics may be used by the teaching professional to identify concepts that the exam takers 140 are struggling with and to identify questions that may be improved or reworded.
- the exam analysis system 101 may support regrading.
- the exam analysis system 101 may directly interface with a testing system 150 API to post grades for individual exam taker 140 submissions 170. With API level access, the system 101 may programmatically perform regrading using arbitrary regrading logic that isn’t supported by the native testing system 150. For example, the exam analysis system 101 may account for cascading errors in arithmetic-based questions. Assume there are three questions, and the second question depends on the exam taker 140 getting the first question right, and the third question depends on the exam taker 140 getting the second question right.
- if the exam taker 140 gets the first question wrong but uses the correct formula to calculate the answer for the second and third questions, with the exam analysis system 101, the first question can be marked incorrect, while the second and third questions can be marked correct for the exam taker 140. This is not possible in current online testing systems 150.
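- A minimal sketch of this cascading regrading idea is shown below, assuming a three-question arithmetic chain; the helper names and formulas are illustrative assumptions rather than the patent's implementation.

```python
# Hedged sketch: regrade a chain of dependent arithmetic questions so that an
# exam taker who applied the correct formula to their own (wrong) earlier
# result still receives credit. All identifiers here are illustrative.

def regrade_cascading(submitted, q1_correct, formula_q2, formula_q3):
    """Return per-question marks (True = credit) for a three-question chain."""
    marks = {}
    # Question 1 is graded against the true answer.
    marks["q1"] = submitted["q1"] == q1_correct
    # Question 2 is credited if it applies the correct formula to the
    # exam taker's own (possibly wrong) answer to question 1.
    marks["q2"] = submitted["q2"] == formula_q2(submitted["q1"])
    # Question 3 is likewise graded against the exam taker's own q2 value.
    marks["q3"] = submitted["q3"] == formula_q3(submitted["q2"])
    return marks

# Example: the true answer to q1 is 10, q2 doubles q1, q3 adds 5 to q2.
marks = regrade_cascading(
    {"q1": 12, "q2": 24, "q3": 29},  # q1 wrong, later formulas applied correctly
    q1_correct=10,
    formula_q2=lambda x: 2 * x,
    formula_q3=lambda x: x + 5,
)
print(marks)  # {'q1': False, 'q2': True, 'q3': True}
```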
- the system 101 may allow for complex answer to partial credit mapping, as well as customized feedback per answer. For a hypothetical question, exam takers 140 submit one of three answers, A, B and C. Answer A is accepted for full credit. Answer B is accepted for half credit. Answer C is accepted for a quarter credit.
- the analysis system 101 allows us to give specific feedback for students who submitted B as to why their question only received half credit, and specific feedback for students who submitted answer C as to why their question only received a quarter credit.
- the analysis system 101 supports the streamlining of exam taker 140 regrade submissions as well.
- Exam takers 140 can log into a student-facing-portal provided by the analysis system 101 to submit a regrade request for their specific answers as well as an explanation as to why they believe their request is valid.
- the analysis system 101 may aggregate regrade requests from all exam takers 140 and group them based on unique (answer, question) pairs so that duplicate regrade requests are only shown to a teaching professional once.
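- A minimal sketch of grouping regrade requests by unique (question, answer) pairs so that a duplicate request surfaces only once; the field names and sample data are assumptions, not taken from the patent.

```python
from collections import defaultdict

# Hedged sketch: each request records which question and answer it concerns,
# plus the exam taker's explanation. Field names are illustrative assumptions.
requests = [
    {"exam_taker": "s1", "question": "Q3", "answer": "x=4", "reason": "rounding"},
    {"exam_taker": "s2", "question": "Q3", "answer": "x=4", "reason": "rounding rule unclear"},
    {"exam_taker": "s3", "question": "Q7", "answer": "B",   "reason": "ambiguous wording"},
]

grouped = defaultdict(list)
for req in requests:
    # Duplicate (question, answer) pairs collapse into a single review item.
    grouped[(req["question"], req["answer"])].append(req)

for (question, answer), items in grouped.items():
    print(f"{question} / {answer!r}: {len(items)} request(s)")
```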
- the analysis system 101 includes a front-end 110 and a back-end 120.
- the front-end 110 may provide an interface through which users 130 (i.e., teaching professionals) may interact with the exam analysis system 101.
- the users 130 may use the interface provided by the front-end 110 to select an exam 160, view the results of the exam 160, and to view the results of any analysis performed on the exam 160 by the exam analysis system 101.
- the back-end 120, for one or more exams 160, may perform one or more analyses on the answers and other information from the activity logs 165 associated with the exams 160.
- the back-end 120 includes several components including, but not limited to, a data retrieval engine 210, a statistics engine 220, and a report engine 230. More or fewer components may be supported. Each component may be implemented together or separately using one or more general purpose computing devices such as the computing device 900.
- the back-end 120 may receive an exam identifier 221.
- the exam identifier 221 may be a number or other identifier and may identify a unique exam 160 stored by the testing system 150.
- the identified exam 160 may be an exam that has been completed by a plurality of exam takers 140.
- the back-end 120 may receive an exam identifier 221 from the front-end 110.
- the exam identifier 221 may have been provided in response to the exam 160 being selected by a user 130.
- a teaching professional may have selected the exam 160 in an interface provided by the front-end 110 to receive an analysis.
- the exam 160 may not have been selected, but all completed exams may be automatically processed by the back-end 120.
- the data retrieval engine 210 may retrieve exam data associated with the exam from the testing system 150. Depending on the embodiment, the data retrieval engine 210 may retrieve the exam data using credentials for the testing system 150. The credentials may be associated with the exam analysis system 101 and/or provided by the user 130.
- the data retrieval engine 210 may retrieve the exam data from the testing system 150 using an API provided by the testing system 150. Other methods may be used.
- the exam data received by the data retrieval engine 210 for the identified exam 160 may include the submission 170 provided by each exam taker 140 for the exam 160 and activity logs 165 for each exam taker 140.
- the activity log for an exam taker 140 may include timestamps for each answer provided by the exam taker 140.
- the statistics engine 220 may generate a variety of statistics 225 about the identified exam 160.
- the generated statistics 225 may be used for a variety of purposes including evaluating the performance of exam takers 140 both individually and as a group, evaluating the quality of the questions of the exam 160, and detecting cheating or plagiarism.
- One example statistic 225 that may be generated is, for each question of the exam 160, the number of exam takers 140 that answered the question correctly. This statistic may be useful for determining if a question should be reworked (e.g., a low percentage may indicate that the question is too difficult or may be poorly worded) or if a topic associated with the question should be revisited (e.g., a low percentage may indicate the exam takers need more instruction on the topic associated with the question).
- Another example statistic 225 may include the average time spent answering each question.
- the statistics engine 220 may calculate this statistic using the timestamps from the activity logs 165. The average time spent answering each question may be used to determine if a particular question is taking too long to complete or if further instruction on the topic associated with a question is needed.
- the statistics engine 220 may generate statistics 225 for unique pairs of exam takers 140. These statistics may be used to determine if the exam takers 140 in the pair plagiarized each other or were otherwise cheating on one or more questions of the exam 160. One example of such a statistic is referred to herein as similarity.
- the similarity of the exam takers 140 is a measure of how similar the answers given by each of the exam takers 140 are. In some embodiments, the similarity of the pair may be calculated by determining the total number of questions for which the exam takers 140 provided matching answers, divided by the total number of questions in the exam 160. Other methods may be used. As may be appreciated, a pair of exam takers 140 having a high similarity may not necessarily indicate that the exam takers 140 cheated or plagiarized each other. For example, an exam 160 with a high number of correct answers may likely have many pairs of exam takers 140 with high similarity.
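- A minimal sketch of the pairwise similarity measure described above, where similarity is the fraction of questions on which two submissions 170 match; the data shapes are assumptions.

```python
# Hedged sketch: each submission maps question ids to the answer given.
def similarity(sub_a: dict, sub_b: dict, question_ids: list) -> float:
    """Fraction of exam questions on which two exam takers gave matching answers."""
    matches = sum(1 for q in question_ids if sub_a.get(q) == sub_b.get(q))
    return matches / len(question_ids)

questions = ["q1", "q2", "q3", "q4"]
print(similarity({"q1": "A", "q2": "B", "q3": "7", "q4": "x"},
                 {"q1": "A", "q2": "C", "q3": "7", "q4": "x"},
                 questions))  # 0.75
```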
- the statistics engine 220 may generate a statistic 225 referred to herein as a suspicion score for each pair of exam takers.
- the suspicion score may be calculated similarly as similarity but may also consider the rarity of each answer provided by the exam takers 140.
- the rarity of an answer is a measure of how rarely or infrequently the answer was given for a particular question by any of the exam takers 140 that completed the exam 160.
- Rarity of an answer may be calculated as (how many exam takers 140 submitted the answer) / (the total number of exam takers 140 in the class). The lower this number is, the rarer the answer is.
- if two exam takers 140 provide an answer to a question that was provided by very few exam takers 140, it is more indicative of plagiarism than if the two exam takers 140 provide an answer to a question that was provided by many exam takers 140. There may be both rare correct answers and rare incorrect answers.
- the suspicion score for a pair of exam takers 140 may be calculated using a formula of the following form: $\text{score} = \sum_{i=1}^{n} \frac{m_i}{\frac{c_i}{N} + \lambda}$, where $n$ is the number of questions in the exam 160, $\lambda$ is a constant, $c_i$ is the number of exam takers 140 that had a same answer for a question $i$ in a submission 170, $N$ is the total number of exam takers 140, and $m_i$ is either 1 or 0 depending on whether the exam takers 140 of the pair had a same answer or not for the question $i$.
- the constant $\lambda$ may be a small smoothing value. Other constants may be used.
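- A minimal sketch of a rarity-weighted suspicion score along the lines described above: matching answers on rare responses contribute more than matches on common responses. The exact weighting and the value of the constant used here are assumptions, not the patent's specification.

```python
from collections import Counter

def suspicion_score(sub_a, sub_b, all_submissions, question_ids, lam=1e-3):
    """Rarity-weighted count of matching answers for one pair of exam takers.

    A match on question i contributes 1 / (c_i / N + lam), where c_i is how
    many exam takers gave that same answer and N is the class size, so rare
    shared answers dominate the score. The weighting and lam are assumptions.
    """
    n_takers = len(all_submissions)
    score = 0.0
    for q in question_ids:
        if sub_a.get(q) is None or sub_a.get(q) != sub_b.get(q):
            continue  # m_i = 0: no shared answer on this question
        counts = Counter(sub.get(q) for sub in all_submissions)
        rarity_ratio = counts[sub_a[q]] / n_takers  # c_i / N
        score += 1.0 / (rarity_ratio + lam)
    return score
```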
- Another statistic 225 generated by the statistics engine 220 for each pair of exam takers 140 is known as suspicious activities.
- a suspicious activity is said to occur when the exam takers 140 in the exam taker 140 pair provide an identical answer to the same question at around the same time.
- Two answers may be said to be received at the same time from two exam takers 140 if they have timestamps that are within some window or duration of time of one another (i.e., close).
- the window may be one minute, five minutes, or ten minutes, for example.
- the statistics engine 220 may identify the questions of the exam that are associated with suspicious activities for the exam takers 140 of the pair using the answers in the submissions 170 provided by the exam takers 140 and the timestamps in the activity logs 165 associated with the exam takers 140.
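- A minimal sketch of flagging suspicious activities: questions where both exam takers of a pair submitted an identical answer within a configurable time window. Timestamps are treated as epoch seconds and all names are assumptions.

```python
def suspicious_activities(sub_a, sub_b, times_a, times_b, question_ids,
                          window_seconds=300):
    """Return question ids where the pair gave identical answers within the window."""
    flagged = []
    for q in question_ids:
        same_answer = sub_a.get(q) is not None and sub_a.get(q) == sub_b.get(q)
        have_times = q in times_a and q in times_b
        if same_answer and have_times and abs(times_a[q] - times_b[q]) <= window_seconds:
            flagged.append(q)
    return flagged
```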
- Another statistic 225 generated by the statistics engine 220 for each pair of exam takers 140 may be, for a pair of exam takers having a high similarity score indicating a large number of similar answers, an indication of whether one of the exam takers 140 in the pair submitted a large number of answers in a short period of time. This may indicate that one of the exam takers 140 received some or all of the answers from the other exam taker 140.
- the statistics engine 220 may determine that the other exam taker 140 submitted the answers in a short period of time when the average time spent per question (as determined based on the timestamps) was below the average time for the other exam takers 140.
- the report engine 230 may generate one or more reports 240 for an exam 160 based on the generated statistics 225.
- the report 240 may include an identifier of each pair of exam takers 140 along with some of the generated statistics 225.
- the report 240 may include the suspicion score generated for each exam taker 140 pair along with other statistics 225 such as the number of suspicious activities associated with each pair and/or the number of correct or incorrect answers the exam takers 140 have in common.
- the report 240 may further include statistics about the exam such as the average score, highest score, lowest score, and the correct or incorrect percentage for each question. Where there are topics or subjects associated with each question, the report 240 may further show the number of incorrect and correct answers by subject. The users 130 may use these statistics 225 to determine the overall mastery of each topic or subject by their students.
- the reports 240 may be stored by the report engine 230. The users 130 may then access the report for an exam 160 using the associated exam identifier 221. The report engine 230 may allow users 130 to leave comments on the reports 240.
- the report engine 230 may allow a user 130 to drill down on particular exam takers 140 or exam taker 140 pairs to receive more detailed reports 240 about selected exam takers 140 or exam taker 140 pairs. For example, a report 240 may show a list of exam taker 140 pairs ranked by suspicion scores. A user 130 may select a particular exam taker 140 pair and may be presented with a report 240 showing each question of the exam 160 and the answer and associated timestamp.
- the user 130 may use the information provided in the report 240 to determine if the exam takers 140 of the pair likely plagiarized each other.
- the report engine 230 may further use machine learning models to reduce the number of exam taker 140 pairs that are provided in the reports 240.
- the machine learning models may have been trained on previous exam taker 140 pairs that were determined to be either plagiarism or not plagiarism, including various statistics 225 generated for those exam taker 140 pairs.
- only exam taker 140 pairs flagged as likely to be plagiarism by the machine learning models may be included in the reports 240 by the report engine 230.
- the back-end 120 may provide for real-time cheating detection during an exam 160.
- FIG.4 is an illustration of an example report 240. As shown, the report 240 is for a selected pair of exam takers 140 and includes several windows 401, 403, and 405.
- In the window 401 are displayed various statistics 225 related to the pair of exam takers 140 (i.e., student 1 and student 2).
- the displayed statistics 225 include suspicion score, percentage of correct responses, percentage of incorrect responses, timestamp similarity, and suspicious activities. Other statistics 225 may be displayed.
- In the window 405 are displayed some or all of the questions of the exam 160 and statistics 225 related to those questions. For example, the window 405 includes statistics 225 about the questions such as whether or not the answer provided for each question was identical for the pair of exam takers 140, whether the provided answer was correct, the rarity of the answer provided to the question, and the suspicion score generated for the answer. Other statistics 225 may be displayed.
- a chat interface (e.g., in the window 403) through which the user 130, and other users 130, may read and provide comments on the pair of exam takers 140.
- a group of teaching assistants for a college course may have a discussion related to a suspicious exam taker 140 pair and may decide whether or not to formally accuse the exam takers 140 of plagiarism.
- the questions in the exams 160 may be crafted such that there are multiple correct answers, rather than just one correct answer.
- This type of question design may be adapted to various question formats, including but not limited to: Ordering Questions: Where exam takers 140 arrange a list of options; Matching Questions: Where exam takers 140 associate each prompt with the correct description or label; and Categorizing Questions: Where exam takers 140 classify items into specific categories. For example, consider a question where students are required to arrange a list of 9 options in a specific order.
- 1 can be A or B or C
- 2 can be B or C or D
- 3 can only be E
- 4 can be C or D or E
- 5 can be A or B
- exam takers 140 are asked to categorize items. Assume there are two categories, vegetables and fruits, and exam takers 140 are asked to classify tomato, carrot, and potato each into a category.
- Tomato can be classified as either a fruit or a vegetable (scientifically a fruit, commonly treated as a vegetable). Carrot should be classified into vegetables, and potato should be classified into vegetables as well. With traditional plagiarism detection methods that compare students' answers for questions that only have one or a few correct answers, it is difficult to catch competent cheaters who get most of the questions right and get high scores, as they will have identical correct answers for most of the questions. However, by carefully designing questions such that they have a large answer space with multiple correct answers, two competent exam takers 140 are unlikely to answer the same question with the exact same answer, given that there are multiple possible ways to answer the question correctly.
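- A minimal sketch of grading an ordering question whose slots each accept several options, which is what gives such a question a large space of distinct correct answers; the allowed sets below are illustrative assumptions.

```python
# Hedged sketch: each slot of the ordering question lists the options accepted
# as correct for that position; many distinct fully correct orderings exist.
ALLOWED = {
    1: {"A", "B", "C"},
    2: {"B", "C", "D"},
    3: {"E"},
    4: {"C", "D", "E"},
    5: {"A", "B"},
}

def grade_ordering(answer: dict) -> bool:
    """True if every slot holds one of its accepted options (a real grader
    might additionally require that each option be used only once)."""
    return all(answer.get(slot) in opts for slot, opts in ALLOWED.items())

print(grade_ordering({1: "A", 2: "D", 3: "E", 4: "C", 5: "B"}))  # True
print(grade_ordering({1: "E", 2: "D", 3: "E", 4: "C", 5: "B"}))  # False (slot 1)
```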
- a system for embedding watermarks into exam taker 140 answers is provided.
- a plug-in or application associated with the testing system 150 and/or the exam analysis system 101 generates a unique watermark for the exam taker 140.
- the watermark may be a sequence of invisible Unicode characters.
- the unique watermark is associated with the exam taker 140 by the exam analysis system 101.
- the watermark is inserted into or appended to the answer provided by the exam taker 140.
- the plug-in or application will paste the watermark into the answer box. Because the watermark is invisible, the exam taker 140 should not notice the watermark in the answer box.
- the question may direct the exam taker 140 to use another webpage under the control of the exam analysis system 101 to format their answer before submitting it.
- the webpage may insert the unique watermark into the answer as part of the formatting.
- when the exam taker 140 cuts and pastes the formatted answer back into their exam 160, the unique watermark is preserved and saved by the testing system 150 as part of the submission.
- the watermarks help detect plagiarism in the following way.
- the exam taker 140 may allow a second exam taker 140 to view their answers. If the exam taker 140 shares their answers with the second exam taker 140 by “copying and pasting” their answers into a text document, the invisible watermark will likely be preserved in the text document. When the second exam taker 140 then copies and pastes the answer into the answer box of their exam 160, the invisible watermark will also be copied and pasted. When the second exam taker 140 submits the answer, the watermark of the first exam taker 140 will be stored with the submission 170 of the second exam taker 140.
- An advantage of the method for detecting plagiarism described above is that it may detect plagiarism even where an exam taker 140 has taken steps to modify an answer to conceal the plagiarism. For example, an exam taker 140 may copy and paste the answer to a question from a second exam taker 140 and then rewrite or modify the answer to obscure the plagiarism before submitting it. However, because the Unicode watermark is not affected by the modification, the plagiarism will still be detected via the unique watermark.
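- A minimal sketch of embedding a per-exam-taker identifier as invisible Unicode characters, here using zero-width characters as the encoding alphabet; the specific characters and encoding are assumptions, not the patent's scheme.

```python
# Hedged sketch: encode an integer id in binary using zero-width characters,
# which render invisibly in most text fields, and append it to an answer.
ZERO = "\u200b"   # ZERO WIDTH SPACE      -> bit 0
ONE  = "\u200c"   # ZERO WIDTH NON-JOINER -> bit 1

def make_watermark(taker_id: int, bits: int = 32) -> str:
    return "".join(ONE if (taker_id >> i) & 1 else ZERO for i in range(bits))

def watermark_answer(answer: str, taker_id: int) -> str:
    # The watermark is appended here; it could equally be interleaved into the text.
    return answer + make_watermark(taker_id)

marked = watermark_answer("The derivative is 2x + 3", taker_id=1042)
print(marked)  # looks identical on screen, but carries 32 invisible characters
```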
- FIG.3 is an illustration of an example front-end 110.
- the front-end includes an interface 310.
- the user 130 may use the interface 310 to select an exam 160 for analysis.
- the interface 310 may be a web-interface that the user 130 connects with using a web browser associated with their computer device.
- the interface 310 may be an API through which the user 130 connects to using an app or other specialized application.
- the user 130 may connect to the interface 310 using credentials provided by their school or associated academic institution.
- the user 130 may be presented with exam identifiers 221 associated with various exams that have been completed on the testing system 150.
- the user 130 may be presented with all of the available exam identifiers 221 or only those exam identifiers 221 that the user 130 is associated with or has permission to access.
- the user 130 may select an exam identifier 221 for a desired exam 160 through the interface 310 for analysis.
- the interface 310 may place the exam identifier 221 into a queue for the back-end 120 for processing.
- the back-end 120 may process the identified exams from the queue in order as described with respect to FIG.2.
- FIG.5 is an illustration of a method 500 for detecting plagiarism in online exams.
- the method 500 may be implemented by the exam analysis system 101.
- an identifier of an exam is received.
- the identifier of an exam 160 may be received by the exam analysis system 101 from a user 130.
- the exam 160 may have been electronically administered by a testing system 150.
- the user 130 may be a teaching professional and may have used a webpage or other user interface to select an exam 160 that they would like to receive a report for.
- a plurality of submissions is retrieved.
- the plurality of submissions 170 may be retrieved by the exam analysis system 101 from the testing system 150.
- Each submission may be associated with an exam taker 140 and may include a plurality of answers that were provided by the exam taker 140 to the questions associated with the identified exam 160.
- the exam analysis system 101 may normalize some or all of the answers in the plurality of submissions.
- one or more activity logs 165 may be retrieved.
- Each activity log 165 may be associated with an exam taker 140 and may include timestamps for each answer provided by the exam taker 140 in their associated submission 170.
- a suspicion score is calculated for each pair of exam takers.
- the suspicion score may be computed by the exam analysis system 101 for each unique pair of exam takers that are associated with a retrieved submission.
- the suspicion score for a pair of exam takers 140 may represent the likelihood that the exam takers 140 of the pair plagiarized each other or cheated together on the identified exam 160.
- the suspicion score may be based on one or more of the similarity, rarity, entropy, and diversity of the answers provided by the exam takers 140 in their submissions 170.
- the suspicion score may be calculated using the formula described above. Other methods may be used.
- the exam analysis system 101 may use machine learning to reduce the number of pairs of exam takers that may be associated with plagiarism. In a roughly 1000 person class, the exam analysis system 101 achieved an 80% down sampling of reports through machine learning methods alone.
- large language models may further be used, with variable-length context windows allowing question context to be included in the natural language reports fed to the models.
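- A minimal sketch of using a simple classifier to down-sample which exam taker pairs appear in reports, trained on previously labeled pairs; the features, sample values, and scikit-learn model here are assumptions about one possible implementation, not the system's actual models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hedged sketch: each row describes one exam taker pair with statistics such as
# [suspicion score, similarity, number of suspicious activities]; labels mark
# pairs previously judged to be plagiarism (1) or not (0). Values are made up.
X_train = np.array([[42.0, 0.95, 6], [3.1, 0.40, 0], [28.5, 0.88, 4], [1.2, 0.35, 0]])
y_train = np.array([1, 0, 1, 0])

model = LogisticRegression().fit(X_train, y_train)

# Only pairs the model flags as likely plagiarism are kept for the report.
new_pairs = np.array([[35.0, 0.90, 5], [2.0, 0.30, 0]])
keep = model.predict(new_pairs) == 1
print(keep)  # e.g. [ True False]
```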
- exam statistics are generated.
- the exam statistics may be generated by the exam analysis system 101 based on the submissions 170, and may include statistics for each question such as the percentage of exam takers 140 that answered the question incorrectly, the number of exam takers 140 that answered the question correctly, and the average amount of time that each exam taker spent on the question. Other statistics may be generated.
- a report is generated using the suspicion scores and exam statistics.
- the report 240 may be generated by the exam analysis system 101.
- the report 240 may include identifiers of each exam taker pair and the suspicion scores generated for each exam taker pair.
- the report 240 may include other information about each exam taker pair such as the number of questions that each exam taker pair answered the same, and the number of questions where each exam taker provided a same answer within a selected window of time. In some embodiments, the report 240 may be generated automatically or only in response to a user or teaching professional requesting that the report 240 be generated.
- FIG.6 is an illustration of a method 600 for detecting plagiarism in online exams using watermarks. The method 600 may be implemented by the exam analysis system 101. At 610, the exam taker is directed to a webpage to format their answer.
- the question displayed to the exam taker 140 may have a link to a webpage that the exam taker 140 is instructed to use to format their answer a specific way.
- the exam taker 140 may be directed by the testing system 150.
- the webpage may be associated with the exam analysis system 101.
- the answer is received from the exam taker at the webpage.
- the exam taker 140 may type their answer into a field or other graphical user-interface provided by the webpage.
- the answer is formatted, and a watermark is added.
- the answer is formatted into a particular format by the webpage and/or the exam analysis system 101. As part of the formatting a unique watermark is generated and added to the answer.
- the watermark may be a Unicode watermark that is invisible to the exam taker 140.
- the watermark may be unique to the exam taker 140 or the particular answer formatting session.
- the webpage may generate a new unique watermark for each session or use of the form.
- the formatted answer is displayed.
- the formatted answer is displayed to the user in the webpage. Because the watermark is invisible, the exam taker 140 cannot see the watermark.
- the exam taker is instructed to cut and paste the formatted answer into an answer field associated with the question. When the exam taker 140 pastes the answer into the answer field of the exam, the unique watermark is pasted with the formatted answer.
- FIG.7 is an illustration of a method 700 for detecting plagiarism in online exams using watermarks.
- the method 700 may be implemented by the exam analysis system 101.
- an answer associated with a question is received.
- the answer may be received by the exam analysis system 101.
- the answer may be part of a submission 170 associated with an exam taker 140 for an online exam 160.
- the exam taker 140 may be associated with a unique watermark that was assigned to the exam taker 140 by the exam analysis system 101.
- the watermark may be a unique sequence of invisible Unicode characters.
- one or more watermarks are extracted from the answer.
- the one or more watermarks may be extracted from the answer by the exam analysis system 101.
- the exam analysis system 101 may scan the answer for strings of invisible Unicode text.
- whether the extracted watermarks match the watermark associated with the exam taker 140 is determined. If an extracted watermark does not match the watermark associated with the exam taker 140, the exam taker 140 may be recorded as a plagiarist.
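- A minimal sketch of the extraction and matching step, continuing the zero-width encoding assumed in the earlier watermark sketch: pull any invisible characters out of a submitted answer, decode them, and compare against the watermark assigned to the exam taker.

```python
ZERO, ONE = "\u200b", "\u200c"   # same assumed alphabet as the embedding sketch

def extract_watermark_id(answer: str, bits: int = 32):
    """Decode the first embedded id found in the answer, or None if absent."""
    hidden = [ch for ch in answer if ch in (ZERO, ONE)]
    if len(hidden) < bits:
        return None
    return sum((1 << i) for i, ch in enumerate(hidden[:bits]) if ch == ONE)

def flag_if_mismatched(answer: str, expected_id: int) -> bool:
    """True if the answer carries a watermark belonging to a different exam taker."""
    found = extract_watermark_id(answer)
    return found is not None and found != expected_id
```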
- FIG.8 is an illustration of an example exam activities timeline chart 240.
- the example exam activities timeline chart may be a type of report 240.
- the chart 240 of FIG.8 shows the information about the questions of the exam 160 answered by a pair of exam takers 140.
- the user 130 may use the chart 240 to quickly see, for a pair of exam takers 140, which questions each exam taker 140 had the same answer and when the answers were submitted.
- On the vertical axis is listed each question.
- On the horizontal axis is listed the timestamp associated with each question.
- An answer from a first exam taker 140 of the pair for a question is represented by a first shade of grey and an answer from the second exam taker 140 for the question is represented by a second shade of grey. Answers that are the same are shown as linked, with the length of the link proportional to the difference between the timestamps associated with each answer.
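- A minimal sketch of rendering such a timeline chart with matplotlib: each exam taker's answer timestamps are plotted per question, and matching answers are joined by a segment whose length reflects the gap between timestamps. The data values and styling are assumptions.

```python
import matplotlib.pyplot as plt

# Hedged sketch: per-question answer timestamps (minutes into the exam) for a
# pair of exam takers, plus which questions they answered identically.
questions = ["Q1", "Q2", "Q3", "Q4"]
times_a   = [5, 12, 30, 44]
times_b   = [6, 25, 31, 60]
same      = [True, False, True, False]

fig, ax = plt.subplots()
ys = range(len(questions))
ax.scatter(times_a, ys, color="0.3", label="exam taker 1")   # darker grey
ax.scatter(times_b, ys, color="0.7", label="exam taker 2")   # lighter grey
for y, ta, tb, match in zip(ys, times_a, times_b, same):
    if match:
        ax.plot([ta, tb], [y, y], color="0.5")   # link identical answers
ax.set_yticks(list(ys))
ax.set_yticklabels(questions)
ax.set_xlabel("minutes into exam")
ax.legend()
plt.show()
```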
- FIG.9 shows an exemplary computing environment in which example embodiments and aspects may be implemented.
- the computing device environment is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality.
- Numerous other general purpose or special purpose computing device environments or configurations may be used. Examples of well-known computing devices, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, server computers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, network personal computers (PCs), minicomputers, mainframe computers, embedded systems, distributed computing environments that include any of the above systems or devices, and the like.
- Computer-executable instructions, such as program modules being executed by a computer, may be used.
- program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
- Distributed computing environments may be used where tasks are performed by remote processing devices that are linked through a communications network or other data transmission medium.
- program modules and other data may be located in both local and remote computer storage media including memory storage devices.
- an exemplary system for implementing aspects described herein includes a computing device, such as computing device 900. In its most basic configuration, computing device 900 typically includes at least one processing unit 902 and memory 904.
- memory 904 may be volatile (such as random access memory (RAM)), non-volatile (such as read-only memory (ROM), flash memory, etc.), or some combination of the two.
- This most basic configuration is illustrated in FIG.9 by dashed line 906.
- Computing device 900 may have additional features/functionality.
- computing device 900 may include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in FIG.9 by removable storage 908 and non-removable storage 910.
- Computing device 900 typically includes a variety of computer readable media.
- Computer readable media can be any available media that can be accessed by the device 900 and includes both volatile and non-volatile media, removable and non-removable media.
- Computer storage media include volatile and non-volatile, and removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
- Memory 904, removable storage 908, and non-removable storage 910 are all examples of computer storage media.
- Computer storage media include, but are not limited to, RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 900. Any such computer storage media may be part of computing device 900. Computing device 900 may contain communication connection(s) 912 that allow the device to communicate with other devices. Computing device 900 may also have input device(s) 914 such as a keyboard, mouse, pen, voice input device, touch input device, etc.
- Output device(s) 916 such as a display, speakers, printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here. It should be understood that the various techniques described herein may be implemented in connection with hardware components or software components or, where appropriate, with a combination of both. Illustrative types of hardware components that can be used include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.
- the methods and apparatus of the presently disclosed subject matter may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium where, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the presently disclosed subject matter.
- exemplary implementations may refer to utilizing aspects of the presently disclosed subject matter in the context of one or more stand-alone computer systems, the subject matter is not so limited, but rather may be implemented in connection with any computing environment, such as a network or distributed computing environment.
- aspects of the presently disclosed subject matter may be implemented in or across a plurality of processing chips or devices, and storage may similarly be effected across a plurality of devices.
- Such devices might include personal computers, network servers, and handheld devices, for example.
Abstract
An exemplary system (101) and method (500) are disclosed for an advanced web-based tool that can detect plagiarism and perform an in-depth analysis of outcomes for exams hosted on online learning platforms (150). Instructors and teaching assistants (130) can use the exemplary system to identify potential plagiarism cases, pinpoint concepts that students are struggling with, and improve the design of future exams based on the analysis provided by the exemplary system.
Description
Attorney Docket No.10034-223WO1 CHEATING DETECTION AND ADVANCED OUTCOME ANALYSIS FOR ONLINE EXAMS CROSS-REFERENCE TO RELATED APPLICATIONS [0001] This application claims priority to U.S. Provisional Patent Application Serial No. 63/385,246, filed on November 29, 2022, and entitled “Cheating Detection and Advanced Outcome Analysis for Online Exams”, the contents of which are hereby incorporated by reference in their entirety. BACKGROUND [0002] Cheating in online exams administered on online learning platforms poses a significant challenge due to the difficulty in detection. The virtual nature of these assessments makes it easier for students to exploit loopholes, such as using hidden resources or collaborating with others. Advanced cheating tactics, like screen mirroring and virtual machines, further complicate detection efforts. Currently, there exist no online tools that can be used to detect cheating among exam takers on online learning platforms. SUMMARY [0003] An exemplary system and method are disclosed for an advanced web-based tool that can detect plagiarism and perform an in-depth analysis of outcomes for exams hosted on online learning platforms. Instructors and teaching assistants can use the exemplary system to identify potential plagiarism cases, pinpoint concepts that students are struggling with, and improve the design of future exams based on the analysis provided by the exemplary system. [0004] The exemplary system may include several components, including an exam analysis engine back-end and an interactive web user interface front-end. The back-end may retrieve students’ submission data and activity logs from online exam platforms. It may calculate detailed statistics and response distributions for every question. It may also perform pairwise comparisons on all submissions using a wide variety of quantitative strategies, including but not limited to considering the similarity, rarity, entropy, and diversity of responses; the nature, and difficulty of questions; as well as the timestamps and logs of students’ activities. The system may generate reports containing evidence for
Attorney Docket No.10034-223WO1 suspicious cases. The interactive web user interface of the front-end may display reports for suspicious plagiarism cases and visualizations of exam outcomes calculated by the exam analysis engine. The system may further use a novel question design approach where questions are designed such that there are a large number of possible correct and incorrect answers. This may allow for greater evidence of plagiarism when multiple matching rare correct or rare incorrect answers are detected for pairs of exam takers. In addition, the system may further incorporate unique watermarks into answers as another tool for plagiarism detection. [0005] In some aspects, the techniques described herein relate to a computer- implemented method including: receiving, by one or more processors, an identifier of an exam, wherein the exam includes a plurality of questions; retrieving, by the one or more processors, a plurality of submissions for the identified exam, wherein each submission is associated with a different exam taker of the plurality of exam takers and includes a plurality of answers for the plurality of questions of the exam, and further where each answer of the plurality of answers is associated with a timestamp; based on the retrieved submission, generating, by the one or more processors, a suspicion score for each pair of exam takers from the plurality of exam takers; and generating, by the one or more processors, report including some or all of the generated suspicion scores and identifiers of the associated pairs of exam takers. [0006] In some aspects, the techniques described herein relate to a computer- implemented method, wherein further including: identifying pairs of exam takers with scores that satisfy a requirement; and for each pair of exam takers, generating a report for the pairs of exam takers. [0007] In some aspects, the techniques described herein relate to a computer- implemented method, wherein generating the report for a pair of exam takers includes: determinizing, based on the timestamps associated with the answer of the plurality of answers, questions where each exam taker provided an answer within a time duration; and providing identifiers of the determined questions or a total number of the determined questions in the generated report. [0008] In some aspects, the techniques described herein relate to a computer- implemented method, wherein generating the report for a pair of exam takers includes:
Attorney Docket No.10034-223WO1 identifying questions of the plurality of questions where each exam taker of the pair provided the same answer; and providing identifiers of the determined questions in the generated report. [0009] In some aspects, the techniques described herein relate to a method, wherein each exam taker is associated with a unique watermark, and further including: for each exam taker: extracting one or more unique watermarks from the plurality of answers of the submission associated with the exam taker; determining at least one extracted unique watermark that does not match the unique watermark associated with the exam taker; and in response to the determination, recording the exam taker as a plagiarist. [0010] In some aspects, the techniques described herein relate to a method, wherein the unique watermark includes a unique number in an invisible Unicode format. BRIEF DESCRIPTION OF THE DRAWINGS [0011] The accompanying figures, which are incorporated herein and form part of the specification, illustrate an image transmission system and method. Together with the description, the figures further serve to explain the principles of the system and method described herein and thereby enable a person skilled in the pertinent art to make and use the system and method. [0012] FIG.1 is an example environment for detecting cheating and plagiarism in exams; [0013] FIG.2 is an illustration of an example back-end for detecting cheating and plagiarism in exams; [0014] FIG.3 is an illustration of an example front-end for detecting cheating and plagiarism in exams; [0015] FIG.4 is an illustration of an example report; [0016] FIG.5 is an illustration of a method for detecting plagiarism in online exams; and [0017] FIG.6 is an illustration of a method for detecting plagiarism in online exams using watermarks; [0018] FIG.7 is an illustration of a method for detecting plagiarism in online exams using watermarks; [0019] FIG.8 is an illustration of an example exam activities timeline chart; and
Attorney Docket No.10034-223WO1 [0020] FIG.9 shows an exemplary computing environment in which example embodiments and aspects may be implemented. DETAILED DESCRIPTION [0021] To facilitate an understanding of the principles and features of various embodiments of the present invention, they are explained hereinafter with reference to their implementation in illustrative embodiments. [0022] FIG.1 is an example environment for detecting cheating and plagiarism in exams. As shown, the environment 100 includes an exam analysis system 101 in communication with a testing system 150, one or more exam takers 140, and one or more users 130 through a network 190. The network 190 may include a combination of private and public networks (e.g., the internet). The exam analysis system 101 and testing system 150 may each be executed by one or more general purpose computing devices such as the computing system 900 illustrated with respect to FIG.9. The user 130 and exam takers 140 may connect and interface with the testing system 150 and/or the exam analysis system 101 also using a general-purpose computing device. [0023] The user 130 may be a teaching professional (e.g., a professor, instructor, teaching assistant, or administrator) who creates and/or administers one or more exams 160 to one or more students or exam takers 140. Each exam 160 may include a plurality of questions. The types of questions may include multiple choice questions, true or false questions, multi- part questions, and questions where the exam taker is expected to enter or type an answer to the question. Any type of question may be supported. In some embodiments, some of the questions may include multiple correct and incorrect answers. [0024] In some embodiments, the exams 160 may be administered to exam takers 140 electronically via one or more testing systems 150. Generally, a user 130 (i.e., teaching professional) may use tools provided by the testing system 150 to create an exam 160, or may electronically upload an already created exam 160 to the testing system 150. One or more exam takers 140 may then connect to the testing system 150, and using a web browser or software provided by the testing system 150, electronically take the exam 160. The testing system 150 may enforce various rules about the exam 160 such as time limits
Attorney Docket No.10034-223WO1 associated with the exam 160. Example testing systems 150 include Canvas and Gradescope, for example. [0025] Each exam taker 140 may complete the exam 160 by providing one or more answers for each question of the exam 160. The set of answers received from an exam taker 140 for the questions in an exam 160 is referred to herein as a submission 170. The testing system 150 may record the submission 170 provided by each exam taker 140 and may further maintain an activity log 165 for each exam taker 140. Depending on the testing system 150 the activity log 165 may include information about the activity of an exam taker 140 such as such as a timestamp associated with each answer of the submission 170, and an IP address associated with each answer and/or the exam taker 140. Where the exam taker 140 changes an answer, the activity log 165 may include the various changes made to the answer and timestamps of when the changes were made. [0026] As may be appreciated, because of the online nature of the exams 160, there may be many opportunities for exam takers 140 to cheat or plagiarize each other’s answers. For example, a pair of exam takers 140 may take the exam 160 together in the same room using their personal computing devices and may collaborate on the answers to the questions. As another example, exam takers 140 may share answers to one or more questions with their friends via text message or social media. [0027] In order to solve the problems noted above and to help detect cheating and plagiarism among exam takers 140, the environment 100 may include the exam analysis system 101. As will be described further below, the system 101 can use the answers provided by the exam takers 140 in their submissions 170 and information from the activity logs 165 including timestamps, to identify pairs of exam takers 140 who may have plagiarized one or more questions of the exam 160. These exam takers 140 pairs may then be identified to the teaching professional associated with the exam 160. [0028] In addition, the exam analysis system 101 may use the answers and other information from the activity logs 165 to generate statistics regarding the exam 160 including the questions that the exam takers 140 most answered incorrectly or showed the largest amount of variation in the incorrect answers. These statistics may be used by the teaching professional to identify concepts that the exam takers 140 are struggling with and to identify questions that may be improved or reworded.
Attorney Docket No.10034-223WO1 [0029] In addition, the exam analysis system 101, may support regrading. The exam analysis system 101 may directly interface with a testing system 150 API to post grades for individual exam taker 140 submissions 170. With API level access, the system 101 may programmatically perform regrading using arbitrary regrading logic that isn’t supported by the native testing system 150. For example, the exam analysis system 101 may account for cascading errors in arithmetic-based questions. Assume there are three questions, and the second question depends on the exam taker 140 getting the first question right, and the third question depends on the exam taker 140 getting the second question right. If the exam taker 140 gets the first question wrong but uses the correct formula to calculate the answer for the second and third questions, with the exam analysis system 101, the first question can be marked incorrect, while the second and third questions can be marked correct for the exam taker 140. This is not possible in current online testing systems 150. [0030] As another example, the system 101 may allow for complex answer to partial credit mapping, as well as customized feedback per answer. For a hypothetical question, exam takers 140 submit one of three answers, A, B and C. Answer A is accepted for full credit. Answer B is accepted for half credit. Answer C is accepted for a quarter credit. The analysis system 101 allows us to give specific feedback for students who submitted B as to why their question only received half credit, and specific feedback for students who submitted answer C as to why their question only received a quarter credit. [0031] In addition, the analysis system 101 supports the streamlining of exam taker 140 regrade submissions as well. Exam takers 140 can log into a student-facing-portal provided by the analysis system 101 to submit a regrade request for their specific answers as well as an explanation as to why they believe their request is valid. The analysis system 101 may aggregate regrade requests from all exam takers 140, group them based on unique (answer, question) pairs so that duplicate regrade requests are only shown to a teaching professional once. To regrade, each teaching professional may then only have to view and grade identical regrade requests once which saves time and resources. [0032] As shown, the analysis system 101 includes a front-end 110 and a back-end 120. The front-end 110 may provide an interface through which users 130 (i.e., teaching professionals) may interact with the exam analysis system 101. The users 130 may use the interface provided by the front-end 110 to select an exam 160, view the results of the exam
[0032] As shown, the analysis system 101 includes a front-end 110 and a back-end 120. The front-end 110 may provide an interface through which users 130 (i.e., teaching professionals) may interact with the exam analysis system 101. The users 130 may use the interface provided by the front-end 110 to select an exam 160, view the results of the exam 160, and view the results of any analysis performed on the exam 160 by the exam analysis system 101.

[0033] The back-end 120, for one or more exams 160, may perform one or more analyses on the answers and other information from the activity logs 165 associated with the exams 160. The back-end 120 and front-end 110 are described further with respect to FIGS.2 and 3.

[0034] As shown in FIG.2, the back-end 120 includes several components including, but not limited to, a data retrieval engine 210, a statistics engine 220, and a report engine 230. More or fewer components may be supported. Each component may be implemented together or separately using one or more general purpose computing devices such as the computing device 900.

[0035] The back-end 120 may receive an exam identifier 221. The exam identifier 221 may be a number or other identifier and may identify a unique exam 160 stored by the testing system 150. The identified exam 160 may be an exam that has been completed by a plurality of exam takers 140.

[0036] The back-end 120 may receive the exam identifier 221 from the front-end 110. The exam identifier 221 may have been provided in response to the exam 160 being selected by a user 130. For example, a teaching professional may have selected the exam 160 in an interface provided by the front-end 110 to receive an analysis. Alternatively, the exam 160 may not have been selected, but all completed exams may be automatically processed by the back-end 120.

[0037] The data retrieval engine 210 may retrieve exam data associated with the exam from the testing system 150. Depending on the embodiment, the data retrieval engine 210 may retrieve the exam data using credentials for the testing system 150. The credentials may be associated with the exam analysis system 101 and/or provided by the user 130. In some embodiments, the data retrieval engine 210 may retrieve the exam data from the testing system 150 using an API provided by the testing system 150. Other methods may be used.

[0038] The exam data received by the data retrieval engine 210 for the identified exam 160 may include the submission 170 provided by each exam taker 140 for the exam 160 and
activity logs 165 for each exam taker 140. The activity log for an exam taker 140 may include timestamps for each answer provided by the exam taker 140.

[0039] The statistics engine 220 may generate a variety of statistics 225 about the identified exam 160. The generated statistics 225 may be used for a variety of purposes, including evaluating the performance of exam takers 140 both individually and as a group, evaluating the quality of the questions of the exam 160, and detecting cheating or plagiarism.

[0040] One example statistic 225 that may be generated is, for each question of the exam 160, the number of exam takers 140 that answered the question correctly. This statistic may be useful for determining if a question should be reworked (e.g., a low percentage may indicate that the question is too difficult or may be poorly worded) or if a topic associated with the question should be revisited (e.g., a low percentage may indicate the exam takers need more instruction on the topic associated with the question).

[0041] Another example statistic 225 may include the average time spent answering each question. The statistics engine 220 may calculate this statistic using the timestamps from the activity logs 165. The average time spent answering each question may be used to determine if a particular question is taking too long to complete or if further instruction on the topic associated with a question is needed.

[0042] The statistics engine 220 may generate statistics 225 for unique pairs of exam takers 140. These statistics may be used to determine if the exam takers 140 in a pair plagiarized each other or were otherwise cheating on one or more questions of the exam 160. One example of such a statistic is referred to herein as similarity. The similarity of the exam takers 140 is a measure of how similar the answers given by each of the exam takers 140 are. In some embodiments, the similarity of the pair may be calculated by determining the total number of questions for which the exam takers 140 provided matching answers, divided by the total number of questions in the exam 160. Other methods may be used.

[0043] As may be appreciated, a pair of exam takers 140 having a high similarity may not necessarily indicate that the exam takers 140 cheated or plagiarized each other. For example, an exam 160 with a high number of correct answers may likely have many pairs of exam takers 140 with high similarity.
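As a non-limiting sketch of the similarity statistic described in paragraph [0042] (matching answers divided by total questions), the following Python function could be used; the dictionary layout of a submission is an assumption for illustration.

```python
def similarity(sub_a, sub_b, question_ids):
    """Fraction of exam questions on which two submissions have matching answers.

    sub_a and sub_b are assumed to map question_id -> answer; question_ids is
    the full list of questions on the exam.
    """
    matches = sum(
        1 for q in question_ids
        if q in sub_a and q in sub_b and sub_a[q] == sub_b[q]
    )
    return matches / len(question_ids) if question_ids else 0.0
```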
[0044] Accordingly, to better distinguish exam takers 140, the statistics engine 220 may generate a statistic 225 referred to herein as a suspicion score for each pair of exam takers. The suspicion score may be calculated similarly to similarity but may also consider the rarity of each answer provided by the exam takers 140. The rarity of an answer is a measure of how rarely or infrequently the answer was given for a particular question by any of the exam takers 140 that completed the exam 160. Rarity of an answer may be calculated as (how many exam takers 140 submitted the answer) / (the total number of exam takers 140 in the class). The lower this number is, the rarer the answer is. As may be appreciated, if two exam takers 140 provide an answer to a question that was provided by very few exam takers 140, it is more indicative of plagiarism than if the two exam takers 140 provide an answer to a question that was provided by many exam takers 140. There may be both rare correct answers and rare incorrect answers.

[0045] In some embodiments, the suspicion score for a pair of exam takers 140 may be calculated using the following formula: $S = \sum_{i=1}^{n} -\log_{\phi}\left(\frac{m_i}{N}\right) \cdot \delta_i$, where $n$ is the number of questions in the exam 160, $\phi$ is a constant, $m_i$ is the number of exam takers 140 that had a same answer for a question $i$ in a submission 170, $N$ is the total number of exam takers 140, and $\delta_i$ is either 1 or 0 depending on whether the exam takers 140 had a same answer or not for the question $i$. The constant $\phi$ may be $\frac{1+\sqrt{5}}{2}$. Other constants may be used.
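A minimal Python sketch of the suspicion score formula of paragraph [0045] is shown below. The dictionary-based data structures and function name are assumptions for illustration only; the formula itself follows the definition above.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the constant suggested above; other constants may be used


def suspicion_score(sub_a, sub_b, all_submissions, question_ids, phi=PHI):
    """Compute S = sum_i -log_phi(m_i / N) * delta_i for one pair of exam takers.

    m_i is the number of exam takers who gave the same answer the pair shares
    for question i, N is the total number of exam takers, and delta_i is 1 only
    when the pair's answers match. all_submissions is assumed to map
    exam_taker_id -> dict of question_id -> answer.
    """
    n_takers = len(all_submissions)
    score = 0.0
    for q in question_ids:
        if q not in sub_a or q not in sub_b or sub_a[q] != sub_b[q]:
            continue  # delta_i = 0 contributes nothing to the sum
        shared_answer = sub_a[q]
        m = sum(1 for sub in all_submissions.values() if sub.get(q) == shared_answer)
        # Rarer shared answers (small m / N) contribute larger positive terms.
        score += -math.log(m / n_takers, phi)
    return score
```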
[0046] Another statistic 225 generated by the statistics engine 220 for each pair of exam takers 140 is known as suspicious activities. A suspicious activity is said to occur when the exam takers 140 in the exam taker 140 pair provide an identical answer to the same question at around the same time. Two answers may be said to be received at the same time from two exam takers 140 if they have timestamps that are within some window or duration of time of one another (i.e., close). The window may be one minute, five minutes, or ten minutes, for example. The statistics engine 220 may identify the questions of the exam that are associated with suspicious activities for the exam takers 140 of the exam taker 140 pair using the answers in the submissions 170 provided by the exam takers 140 and the timestamps in the activity logs 165 associated with the exam takers 140.
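The suspicious-activity check of paragraph [0046] might be sketched as follows; the per-question timestamp dictionaries and the choice of a five-minute default window are assumptions (the window is one of the example durations mentioned above).

```python
from datetime import timedelta


def suspicious_activities(sub_a, sub_b, logs_a, logs_b, window_minutes=5):
    """Return the questions where a pair gave identical answers at nearly the same time.

    sub_a and sub_b map question_id -> answer; logs_a and logs_b are assumed to
    map question_id -> datetime of the answer, as recovered from the activity logs.
    """
    window = timedelta(minutes=window_minutes)
    flagged = []
    for q, answer in sub_a.items():
        if sub_b.get(q) != answer:
            continue  # answers differ, so this cannot be a suspicious activity
        if q in logs_a and q in logs_b and abs(logs_a[q] - logs_b[q]) <= window:
            flagged.append(q)
    return flagged
```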
[0047] Another statistic 225 generated by the statistics engine 220 for each pair of exam takers 140 may be, for a pair of exam takers 140 having a high similarity score indicating a large number of similar answers, an indication of whether one of the exam takers 140 in the pair submitted a large number of answers in a short period of time. This may indicate that one of the exam takers 140 received some or all of the answers from the other exam taker 140. The statistics engine 220 may determine that the other exam taker 140 submitted the answers in a short period of time when the average time spent per question (as determined based on the timestamps) was below the average time for the other exam takers 140.

[0048] The report engine 230 may generate one or more reports 240 for an exam 160 based on the generated statistics 225. In some embodiments, the report 240 may include an identifier of each pair of exam takers 140 along with some of the generated statistics 225. For example, the report 240 may include the suspicion score generated for each exam taker 140 pair along with other statistics 225 such as the number of suspicious activities associated with each pair and/or the number of correct or incorrect answers the exam takers 140 have in common. In addition, information such as the average suspicion score, the lowest suspicion score, and the highest suspicion score may be displayed in the report 240 for context.

[0049] The report 240 may further include statistics about the exam such as the average score, highest score, lowest score, and the correct or incorrect percentage for each question. Where there are topics or subjects associated with each question, the report 240 may further show the number of incorrect and correct answers by subject. The users 130 may use these statistics 225 to determine the overall mastery of each topic or subject by their students.

[0050] The reports 240 may be stored by the report engine 230. The users 130 may then access the report for an exam 160 using the associated exam identifier 221.

[0051] The report engine 230 may allow users 130 to leave comments on the reports 240. These comments may be stored with the reports 240 and may be visible to other users 130 or to selected users 130. The users 130 may use the comments to discuss particular questions, exam takers 140, or pairs of exam takers 140.

[0052] The report engine 230 may allow users 130 to drill down on particular exam takers 140 or exam taker 140 pairs to receive more detailed reports 240 about the selected exam takers 140 or exam taker 140 pairs. For example, a report 240 may show a list of exam taker 140 pairs ranked by suspicion scores. A user 130 may select a particular exam taker 140 pair and may be presented with a report 240 showing each question of the exam 160 and
the answer and associated timestamp. The user 130 may use the information provided in the report 240 to determine if the exam takers 140 of the pair likely plagiarized each other.

[0053] In some embodiments, the report engine 230 may further use machine learning models to reduce the number of exam taker 140 pairs that are provided in the reports 240. The machine learning models may have been trained on previous exam taker 140 pairs that were determined to be either plagiarism or not plagiarism, including various statistics 225 generated for the exam taker 140 pairs. Depending on the embodiment, only exam taker 140 pairs flagged as likely to be plagiarism by the machine learning models may be included in the reports 240 by the report engine 230.

[0054] In some embodiments, the back-end 120 may provide for real-time cheating detection during an exam 160. In these embodiments, when an exam 160 is being taken, the data retrieval engine 210 may receive answers and timestamps soon after they are provided by the exam takers 140. The statistics engine 220 may then generate statistics 225 using the real-time data, and when suspicious pairs of exam takers 140 are detected (e.g., pairs with high suspicion scores or more than a threshold number of suspicious activities), the users 130 (e.g., teaching assistants) associated with the exam 160 may be notified.

[0055] FIG.4 is an illustration of an example report 240. As shown, the report 240 is for a selected pair of exam takers 140 and includes several windows 401, 403, and 405. In the window 401 are displayed various statistics 225 related to the pair of exam takers 140 (i.e., student 1 and student 2). The displayed statistics 225 include suspicion score, percentage of correct responses, percentage of incorrect responses, timestamp similarity, and suspicious activities. Other statistics 225 may be displayed.

[0056] In the window 405 are displayed some or all of the questions of the exam 160 and statistics 225 related to those questions. For example, the window 405 includes statistics 225 about the questions such as whether or not the answer provided for each question was identical for the pair of exam takers 140, whether the provided answer was correct, the rarity of the answer provided to the question, and the suspicion score generated for the answer. Other statistics 225 may be displayed.

[0057] In the window 403 is provided a chat interface through which the user 130, and other users 130, may read and provide comments on the pair of exam takers 140. For example, a group of teaching assistants for a college course may have a discussion related to
a suspicious exam taker 140 pair and may decide whether or not to formally accuse the exam takers 140 of plagiarism.

[0058] In some embodiments, to facilitate detection of plagiarism by the back-end 120, the questions in the exams 160 may be crafted such that there are multiple correct answers, rather than just one correct answer. By crafting multiple correct answers for multiple questions of the exam 160, it becomes more improbable that any pair of exam takers 140 will answer multiple questions with the same answer and not be cheating or plagiarizing each other.

[0059] This type of question design may be adapted to various question formats, including but not limited to: Ordering Questions, where exam takers 140 arrange a list of options; Matching Questions, where exam takers 140 associate each prompt with the correct description or label; and Categorizing Questions, where exam takers 140 classify items into specific categories.

[0060] For example, consider a question where students are required to arrange a list of 9 options in a specific order. Instead of designating only one sequence, such as "1, 2, 3, 4, 5, 6, 7, 8, 9", as correct, the criteria could be that "1, 2, 3" must precede "4, 5, 6", and "4, 5, 6" must come before "7, 8, 9". The internal order within each subgroup does not matter. Therefore, "1, 3, 2, 6, 5, 4, 9, 7, 8" would also be a valid answer, leading to a total of 3! x 3! x 3! = 216 correct answer combinations. In addition, the incorrect answer space is large, as there are 9! - 216 = 362,880 - 216 = 362,664 possible incorrect answers.

[0061] As another example, consider a question where exam takers 140 are asked to match the numbers 1-5 with the letters A-E. Suppose 1 can be A, B, or C; 2 can be B, C, or D; 3 can only be E; 4 can be C, D, or E; and 5 can be A or B; exam takers 140 have to assign one letter to each number. This leads to 3 x 3 x 1 x 3 x 2 = 54 correct answers and 5^5 - 54 = 3,125 - 54 = 3,071 incorrect answers.
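The answer-space counts for the ordering and matching examples above can be checked with the short Python sketch below. The validity criterion encodes the ordering rule described in paragraph [0060] (all of 1-3 before all of 4-6, all of 4-6 before all of 7-9, internal order free); the brute-force enumeration and the assertions are included only as an illustrative check.

```python
from itertools import permutations


def ordering_answer_is_correct(answer):
    """Check the example criterion: 1-3 before 4-6, and 4-6 before 7-9."""
    pos = {value: index for index, value in enumerate(answer)}
    first, middle, last = {1, 2, 3}, {4, 5, 6}, {7, 8, 9}
    return (max(pos[v] for v in first) < min(pos[v] for v in middle)
            and max(pos[v] for v in middle) < min(pos[v] for v in last))


# Brute-force enumeration of all 9! orderings confirms the counts quoted above.
correct_orderings = sum(ordering_answer_is_correct(p) for p in permutations(range(1, 10)))
assert correct_orderings == 216            # 3! * 3! * 3! internal orderings
assert 362880 - correct_orderings == 362664  # size of the incorrect answer space

# The matching example: 3 * 3 * 1 * 3 * 2 = 54 correct assignments out of 5^5.
allowed = {1: "ABC", 2: "BCD", 3: "E", 4: "CDE", 5: "AB"}
n_correct = 1
for letters in allowed.values():
    n_correct *= len(letters)
assert n_correct == 54 and 5 ** 5 - n_correct == 3071
```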
[0062] As another example, consider a question where exam takers 140 are asked to categorize items. Assume there are two categories, vegetables and fruits, and exam takers 140 are asked to classify tomato, carrot, and potato each into a category. Tomato can be classified as either a fruit or a vegetable (scientifically a fruit, commonly treated as a vegetable), carrot should be classified as a vegetable, and potato should be classified as a vegetable as well.

[0063] With traditional plagiarism detection methods that compare students' answers for questions that only have one or a few correct answers, it is difficult to catch competent cheaters who get most of the questions right and get high scores, as they will have identical correct answers for most of the questions. However, by carefully designing questions such that they have a large answer space with multiple correct answers, two competent exam takers 140 are unlikely to answer the same question with the exact same answer, given that there are multiple possible ways to answer the question correctly.

[0064] In some embodiments, to help detect plagiarism, a system for embedding watermarks into exam taker 140 answers is provided. When an exam taker 140 connects with the testing system 150 using a web browser associated with the exam taker 140, a plug-in or application associated with the testing system 150 and/or the exam analysis system 101 generates a unique watermark for the exam taker 140. The watermark may be a sequence of invisible Unicode characters. The unique watermark is associated with the exam taker 140 by the exam analysis system 101.

[0065] As the exam taker 140 takes the exam and provides written answers to questions, the watermark is inserted into or appended to the answer provided by the exam taker 140. For example, after the exam taker 140 has inserted a text answer into an answer box, the plug-in or application pastes the watermark into the answer box. Because the watermark is invisible, the exam taker 140 should not notice the watermark in the answer box. When the exam taker 140 submits their answer to the testing system 150, the answer, including the watermark, is then recorded by the testing system 150 as part of the submission 170.

[0066] In another embodiment, the question may direct the exam taker 140 to use another webpage under the control of the exam analysis system 101 to format their answer before submitting it. The webpage may insert the unique watermark into the answer as part of the formatting. When the exam taker 140 cuts and pastes the formatted answer back into their exam 160, the unique watermark is preserved and saved by the testing system 150 as part of the submission.
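One possible way to generate and attach an invisible Unicode watermark, as described in paragraphs [0064] through [0066], is sketched below. The specific zero-width characters used as the encoding alphabet and the 32-bit identifier length are assumptions for illustration; the disclosure only requires a sequence of invisible Unicode characters unique to the exam taker 140.

```python
import secrets

# Two zero-width characters provide a binary alphabet that is invisible in most
# text fields (assumed choice): zero-width space and zero-width non-joiner.
ZW0, ZW1 = "\u200b", "\u200c"


def make_watermark(bits=32):
    """Generate a unique, invisible watermark string for one exam taker or session."""
    value = secrets.randbits(bits)
    return "".join(ZW1 if (value >> i) & 1 else ZW0 for i in range(bits))


def add_watermark(answer_text, watermark):
    """Append the invisible watermark to a formatted answer before it is pasted back."""
    return answer_text + watermark
```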
[0067] The watermarks help the detection of plagiarism in the following way. At some later time, the exam taker 140 may allow a second exam taker 140 to view their answers. If the exam taker 140 shares their answers with the second exam taker 140 by “copying and pasting” their answers into a text document, the invisible watermark will likely be preserved in the text document. When the second exam taker 140 then copies and pastes the answer into the answer box of their exam 160, the invisible watermark will also be copied and pasted. When the second exam taker 140 submits the answer, the watermark of the first exam taker 140 will be stored with the submission 170 for the second exam taker 140. Accordingly, the presence of a unique watermark in the answers of the submission 170 that does not match the associated exam taker 140 may indicate that there has been plagiarism.

[0068] An advantage of the method for detecting plagiarism described above is that it may detect plagiarism even where an exam taker 140 has taken steps to modify an answer to conceal the plagiarism. For example, an exam taker 140 may copy and paste the answer to a question from a second exam taker 140 and then rewrite or modify the answer to obscure the plagiarism before submitting it. However, because the Unicode watermark is not affected by the modification, the plagiarism will still be detected via the unique watermark. Furthermore, where an external webpage is used to format the answers and insert the watermarks, the unique watermarks may be used by the exam analysis system 101 to detect plagiarism even if watermarks are not specifically supported by the testing system 150.

[0069] FIG.3 is an illustration of an example front-end 110. As shown, the front-end includes an interface 310. The user 130 may use the interface 310 to select an exam 160 for analysis. The interface 310 may be a web interface that the user 130 connects to using a web browser associated with their computing device. Alternatively, the interface 310 may be an API to which the user 130 connects using an app or other specialized application. Depending on the embodiment, the user 130 may connect to the interface 310 using credentials provided by their school or associated academic institution.

[0070] Through the interface 310, the user 130 may be presented with exam identifiers 221 associated with various exams that have been completed on the testing system 150. The user 130 may be presented with all of the available exam identifiers 221 or only those exam identifiers 221 that the user 130 is associated with or has permission to access.

[0071] The user 130 may select an exam identifier 221 for a desired exam 160 through the interface 310 for analysis. In response, the interface 310 may place the exam identifier 221
into a queue for the back-end 120 for processing. The back-end 120 may process the identified exams from the queue in order as described with respect to FIG.2. The user 130 may further interact with the one or more generated reports 240 through the interface 310 and may view and leave comments on the reports as described with respect to FIG.4.

[0072] FIG.5 is an illustration of a method 500 for detecting plagiarism in online exams. The method 500 may be implemented by the exam analysis system 101.

[0073] At 510, an identifier of an exam is received. The identifier of an exam 160 may be received by the exam analysis system 101 from a user 130. The exam 160 may have been electronically administered by a testing system 150. The user 130 may be a teaching professional and may have used a webpage or other user interface to select an exam 160 that they would like to receive a report for. Alternatively, after an exam 160 has been completed by all of the exam takers (or after a certain date), it may be automatically selected for analysis by the exam analysis system 101.

[0074] At 520, a plurality of submissions is retrieved. The plurality of submissions 170 may be retrieved by the exam analysis system 101 from the testing system 150. Each submission may be associated with an exam taker 140 and may include a plurality of answers that were provided by the exam taker 140 to the questions associated with the identified exam 160. In some embodiments, the exam analysis system 101 may normalize some or all of the answers in the plurality of submissions.

[0075] In addition, one or more activity logs 165 may be retrieved. Each activity log 165 may be associated with an exam taker 140 and may include timestamps for each answer provided by the exam taker 140 in their associated submission 170.

[0076] At 530, a suspicion score is calculated for each pair of exam takers. The suspicion score may be computed by the exam analysis system 101 for each unique pair of exam takers that are associated with a retrieved submission. The suspicion score for a pair of exam takers 140 may represent the likelihood that the exam takers 140 of the pair plagiarized each other or cheated together on the identified exam 160. The suspicion score may be based on one or more of the similarity, rarity, entropy, and diversity of the answers provided by the exam takers 140 in their submissions 170. In some embodiments, the suspicion score may be calculated using the formula described above. Other methods may be used.
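Scoring every unique pair, as called for at 530, might look like the following sketch; the injected pair_score callable (for example, the suspicion_score sketch shown earlier) and the data structures are assumptions for illustration.

```python
from itertools import combinations
from math import comb


def score_all_pairs(all_submissions, question_ids, pair_score):
    """Compute a score for every unique pair of exam takers.

    all_submissions maps exam_taker_id -> dict of question_id -> answer (an
    assumption); pair_score is any function taking (sub_a, sub_b,
    all_submissions, question_ids) and returning a number.
    """
    taker_ids = sorted(all_submissions)
    scores = {}
    for a, b in combinations(taker_ids, 2):
        scores[(a, b)] = pair_score(all_submissions[a], all_submissions[b],
                                    all_submissions, question_ids)
    # combinations() yields exactly C(n, 2) pairs; comb(700, 2) == 244650.
    assert len(scores) == comb(len(taker_ids), 2)
    return scores
```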
[0077] As may be appreciated, there may be a large number of unique exam taker pairs. For example, for an exam with 700 exam takers, there are 700 choose 2 = 244,650 unique exam taker pairs. In some embodiments, the exam analysis system 101 may use machine learning to reduce the number of pairs of exam takers that may be associated with plagiarism. In a roughly 1000-person class, the exam analysis system 101 achieved an 80% down-sampling of reports through machine learning methods alone. In some embodiments, large language models may further be used to allow for variable-length context windows, allowing question context in the natural language reports fed to the models.

[0078] At 540, exam statistics are generated. The exam statistics may be generated by the exam analysis system 101 based on the submissions 170 and may include statistics for each question such as the percentage of exam takers 140 that answered the question incorrectly, the number of exam takers 140 that answered the question correctly, and the average amount of time that each exam taker spent on the question. Other statistics may be generated.

[0079] At 550, a report is generated using the suspicion scores and exam statistics. The report 240 may be generated by the exam analysis system 101. The report 240 may include identifiers of each exam taker pair and the suspicion scores generated for each exam taker pair. The report 240 may include other information about each exam taker pair, such as the number of questions that each exam taker pair answered the same and the number of questions where each exam taker provided a same answer within a selected window of time. In some embodiments, the report 240 may be generated automatically or only in response to a user or teaching professional requesting that the report 240 be generated.
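Paragraphs [0053] and [0077] describe using machine learning to reduce the number of candidate pairs that appear in the reports. The sketch below shows one way such a filter could be built; the use of scikit-learn, the logistic regression model, the feature layout (one row per pair with columns such as suspicion score, matching rare answers, and suspicious activities), and the threshold are all assumptions for illustration and not the claimed implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def filter_pairs(train_features, train_labels, candidate_features, threshold=0.5):
    """Keep only candidate pairs the classifier considers likely plagiarism.

    train_features / train_labels come from pairs on earlier exams that were
    adjudicated as plagiarism (1) or not (0); candidate_features are the same
    statistics for the current exam's pairs. Returns indices of pairs to report.
    """
    model = LogisticRegression(max_iter=1000)
    model.fit(np.asarray(train_features), np.asarray(train_labels))
    probabilities = model.predict_proba(np.asarray(candidate_features))[:, 1]
    return [i for i, p in enumerate(probabilities) if p >= threshold]
```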
[0080] FIG.6 is an illustration of a method 600 for detecting plagiarism in online exams using watermarks. The method 600 may be implemented by the exam analysis system 101.

[0081] At 610, the exam taker is directed to a webpage to format their answer. While taking the exam 160, the question displayed to the exam taker 140 may have a link to a webpage that the exam taker 140 is instructed to use to format their answer in a specific way. The exam taker 140 may be directed by the testing system 150. The webpage may be associated with the exam analysis system 101.

[0082] At 620, the answer is received from the exam taker at the webpage. The exam taker 140 may type their answer into a field or other graphical user interface provided by the webpage.

[0083] At 630, the answer is formatted, and a watermark is added. The answer is formatted into a particular format by the webpage and/or the exam analysis system 101. As part of the formatting, a unique watermark is generated and added to the answer. The watermark may be an invisible Unicode watermark and may be invisible to the exam taker 140. Depending on the embodiment, the watermark may be unique to the exam taker 140 or to the particular answer formatting session. For example, the webpage may generate a new unique watermark for each session or use of the form.

[0084] At 640, the formatted answer is displayed. The formatted answer is displayed to the exam taker 140 in the webpage. Because the watermark is invisible, the exam taker 140 cannot see the watermark.

[0085] At 650, the exam taker is instructed to cut and paste the formatted answer into an answer field associated with the question. When the exam taker 140 pastes the answer into the answer field of the exam, the unique watermark is pasted with the formatted answer. The testing system 150 may then store the unique watermark with the answer as part of the submission 170 for the exam taker 140. Later, the exam analysis system 101 may look for the presence of unique watermarks that match their associated exam taker 140 when looking for plagiarism. For example, the presence of a unique watermark in an answer that does not match the one associated with an exam taker may indicate that the exam taker 140 copied the answer from the exam taker 140 associated with the watermark.

[0086] FIG.7 is an illustration of a method 700 for detecting plagiarism in online exams using watermarks. The method 700 may be implemented by the exam analysis system 101.

[0087] At 710, an answer associated with a question is received. The answer may be received by the exam analysis system 101. The answer may be part of a submission 170 associated with an exam taker 140 for an online exam 160. The exam taker 140 may be associated with a unique watermark that was assigned to the exam taker 140 by the exam analysis system 101. In some embodiments, the watermark may be a unique sequence of invisible Unicode characters.

[0088] At 720, one or more watermarks are extracted from the answer. The one or more watermarks may be extracted from the answer by the exam analysis system 101. For example, the exam analysis system 101 may scan the answer for strings of invisible Unicode text.
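A minimal sketch of the extraction and comparison steps 720 and 730 is shown below; the particular set of zero-width characters treated as the watermark alphabet and the data structures are assumptions for illustration.

```python
# Assumed watermark alphabet of invisible (zero-width) Unicode characters.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060"}


def extract_watermark(answer_text):
    """Pull the invisible zero-width characters out of an answer, in order."""
    return "".join(ch for ch in answer_text if ch in ZERO_WIDTH)


def check_submission(submission, expected_watermark):
    """Flag answers whose embedded watermark does not match the exam taker's own.

    submission maps question_id -> answer text (an assumption); a non-empty,
    non-matching watermark suggests the text was pasted from another exam taker.
    """
    flagged = []
    for question_id, text in submission.items():
        found = extract_watermark(text)
        if found and found != expected_watermark:
            flagged.append(question_id)
    return flagged
```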
[0089] At 730, whether the extracted watermarks match the watermark associated with the exam taker 140 is determined. If the extracted watermarks match the watermark associated with the exam taker 140, then the exam taker did not cut and paste the answer from a different exam taker 140, and the method 700 may exit at 740. Else, the method 700 may continue at 750.

[0090] At 750, possible plagiarism is reported. The plagiarism may be reported by the exam analysis system 101 to a user 130 or teaching professional associated with the exam 160. The report 240 may indicate the non-matching watermark and may identify the exam taker 140 associated with the non-matching watermark and the exam taker 140 associated with the submission 170.

[0091] FIG.8 is an illustration of an example exam activities timeline chart 240. The example exam activities timeline chart may be a type of report 240. The chart 240 of FIG.8 shows information about the questions of the exam 160 answered by a pair of exam takers 140. The user 130 may use the chart 240 to quickly see, for a pair of exam takers 140, for which questions each exam taker 140 had the same answer and when the answers were submitted.

[0092] In the example shown, each question is listed on the vertical axis. The timestamp associated with each question is listed on the horizontal axis. An answer from the first exam taker 140 of the pair for a question is represented by a first shade of grey, and an answer from the second exam taker 140 for the question is represented by a second shade of grey. Answers that are the same are shown as linked, with the length of the link proportional to the difference between the timestamps associated with each answer.

[0093] FIG.9 shows an exemplary computing environment in which example embodiments and aspects may be implemented. The computing device environment is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality.

[0094] Numerous other general purpose or special purpose computing device environments or configurations may be used. Examples of well-known computing devices, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, server computers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, network personal computers (PCs),
minicomputers, mainframe computers, embedded systems, distributed computing environments that include any of the above systems or devices, and the like.

[0095] Computer-executable instructions, such as program modules, being executed by a computer may be used. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Distributed computing environments may be used where tasks are performed by remote processing devices that are linked through a communications network or other data transmission medium. In a distributed computing environment, program modules and other data may be located in both local and remote computer storage media including memory storage devices.

[0096] With reference to FIG.9, an exemplary system for implementing aspects described herein includes a computing device, such as computing device 900. In its most basic configuration, computing device 900 typically includes at least one processing unit 902 and memory 904. Depending on the exact configuration and type of computing device, memory 904 may be volatile (such as random access memory (RAM)), non-volatile (such as read-only memory (ROM), flash memory, etc.), or some combination of the two. This most basic configuration is illustrated in FIG.9 by dashed line 906.

[0097] Computing device 900 may have additional features/functionality. For example, computing device 900 may include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in FIG.9 by removable storage 908 and non-removable storage 910.

[0098] Computing device 900 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by the device 900 and includes both volatile and non-volatile media, removable and non-removable media.

[0099] Computer storage media include volatile and non-volatile, and removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Memory 904, removable storage 908, and non-removable storage 910 are all examples of computer storage media. Computer storage media include, but are not limited to, RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other
memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 900. Any such computer storage media may be part of computing device 900.

[0100] Computing device 900 may contain communication connection(s) 912 that allow the device to communicate with other devices. Computing device 900 may also have input device(s) 914 such as a keyboard, mouse, pen, voice input device, touch input device, etc. Output device(s) 916 such as a display, speakers, printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here.

[0101] It should be understood that the various techniques described herein may be implemented in connection with hardware components or software components or, where appropriate, with a combination of both. Illustrative types of hardware components that can be used include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc. The methods and apparatus of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium where, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the presently disclosed subject matter.
[0103] Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Claims
Attorney Docket No.10034-223WO1 What is claimed is: 1. A computer-implemented method comprising: receiving, by one or more processors, an identifier of an exam, wherein the exam comprises a plurality of questions; retrieving, by the one or more processors, a plurality of submissions for the identified exam, wherein each submission is associated with a different exam taker of the plurality of exam takers and includes a plurality of answers for the plurality of questions of the exam, and further where each answer of the plurality of answers is associated with a timestamp; based on the retrieved submission, generating, by the one or more processors, a suspicion score for each pair of exam takers from the plurality of exam takers; and generating, by the one or more processors, a report comprising some or all of the generated suspicion scores and identifiers of the associated pairs of exam takers. 2. The method of claim 1, wherein the suspicion score for a pair of exam takers is based on a number of matching rare incorrect answers provided by the exam takers in the pair. 3. The method of claim 1, wherein the suspicion score for a pair of exam takers is based on a number of matching rare correct answers provided by the exam takers in the pair. 4. The method of claim 1, wherein there are a plurality of incorrect and correct answers to at least one question of the plurality of questions. 5. The method of claim 4, wherein the at least one question comprises one or more of an ordering question, a matching question, or a categorizing question. 6. The computer-implemented method of claim 1, wherein generating the report for a pair of exam takers comprises: determining, based on the timestamps associated with the answer of the plurality of answers, questions where each exam taker provided an answer within a time duration; and
providing identifiers of the determined questions or a total number of the determined questions in the generated report.

7. The computer-implemented method of claim 1, wherein generating the report for a pair of exam takers comprises: identifying questions of the plurality of questions where each exam taker of the pair provided a same answer; and providing identifiers of the identified questions in the generated report.

8. The computer-implemented method of claim 1, wherein generating the report for a pair of exam takers comprises: identifying questions of the plurality of questions where each exam taker of the pair provided a same answer; and providing identifiers of the identified questions in the generated report.

9. The computer-implemented method of claim 1, further comprising identifying a pair of exam takers where the exam takers in the pair provided a same answer to more than a threshold number of questions.

10. The computer-implemented method of claim 9, wherein at least some of the answers have similar timestamps.

11. The computer-implemented method of claim 9, wherein at least some of the answers are provided by one of the exam takers in a short period of time.

12. The method of claim 1, wherein the suspicion score S for a pair of exam takers is calculated using a formula: $S = \sum_{i=1}^{n} -\log_{\phi}(m_i/N) \cdot \delta_i$, where n is a number of the plurality of questions of the exam, $\phi$ is a constant, $m_i$ is a number of exam takers that had a same answer value for a question i in a submission, N is a total number of exam takers, and $\delta_i$ is either 1 or 0 depending on whether the exam takers had a same answer or not for the question i.
13. The computer-implemented method of claim 12, wherein $\phi$ is $\frac{1+\sqrt{5}}{2}$.
14. The computer-implemented method of claim 1, further comprising: while administering the exam: directing at least one exam taker to a webpage to format their answer to at least one question; receiving, at the webpage, the answer to the at least one question from the exam taker; formatting, by the webpage, the answer, wherein the formatting includes adding a unique watermark associated with the exam taker to the answer; displaying, by the webpage, the formatted answer to the exam taker; and instructing, by the webpage, the exam taker to copy the displayed formatted answer into an answer field associated with the at least one question.

15. The method of claim 1, wherein each exam taker is associated with a unique watermark, and further comprising: for each exam taker: extracting one or more unique watermarks from the plurality of answers of the submission associated with the exam taker; determining at least one extracted unique watermark that does not match the unique watermark associated with the exam taker; and in response to the determination, recording the exam taker as a plagiarist.

16. The method of claim 15, wherein the unique watermark comprises a unique number in an invisible Unicode format.

17. The method of claim 1, further comprising: while administering the exam: detecting pairs of users where answers to the same questions are received with timestamps that are close to each other; and
notifying at least one teaching professional associated with the exam in response to the detecting.

18. A system comprising: one or more processors; and a computer-readable medium storing computer-readable instructions that when executed by the one or more processors cause the system to: receive an identifier of an exam, wherein the exam comprises a plurality of questions; retrieve a plurality of submissions for the identified exam, wherein each submission is associated with a different exam taker of the plurality of exam takers and includes a plurality of answers for the plurality of questions of the exam, and further where each answer of the plurality of answers is associated with a timestamp; based on the retrieved submission, generate a suspicion score for each pair of exam takers from the plurality of exam takers; and generate a report comprising some or all of the generated suspicion scores and identifiers of the associated pairs of exam takers.

19. The system of claim 18, wherein the suspicion score S for a pair of exam takers is calculated using a formula: $S = \sum_{i=1}^{n} -\log_{\phi}(m_i/N) \cdot \delta_i$, where n is a number of the plurality of questions of the exam, $\phi$ is a constant, $m_i$ is a number of exam takers that had a same answer value for a question i in a submission, N is a total number of exam takers, and $\delta_i$ is either 1 or 0 depending on whether the exam takers had a same answer or not for the question i.

20. A non-transitory computer-readable medium storing computer-readable instructions that when executed by one or more processors cause a system to: receive an identifier of an exam, wherein the exam comprises a plurality of questions;
retrieve a plurality of submissions for the identified exam, wherein each submission is associated with a different exam taker of the plurality of exam takers and includes a plurality of answers for the plurality of questions of the exam, and further where each answer of the plurality of answers is associated with a timestamp; based on the retrieved submission, generate a suspicion score for each pair of exam takers from the plurality of exam takers; and generate a report comprising some or all of the generated suspicion scores and identifiers of the associated pairs of exam takers.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263385246P | 2022-11-29 | 2022-11-29 | |
| US63/385,246 | 2022-11-29 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024118753A1 true WO2024118753A1 (en) | 2024-06-06 |
Family
ID=91324900
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2023/081567 Ceased WO2024118753A1 (en) | 2022-11-29 | 2023-11-29 | Cheating detection and advanced outcome analysis for online exams |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024118753A1 (en) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120077177A1 (en) * | 2010-03-14 | 2012-03-29 | Kryterion, Inc. | Secure Online Testing |
| US20150143534A1 (en) * | 2013-11-21 | 2015-05-21 | Erica Christine Bowles | System and method for managing, tracking, and utilizing copy and/or paste events |
| US20160306986A1 (en) * | 2015-04-17 | 2016-10-20 | Dropbox, Inc. | Collection folder for collecting file submissions and using facial recognition |
| US9870713B1 (en) * | 2012-09-17 | 2018-01-16 | Amazon Technologies, Inc. | Detection of unauthorized information exchange between users |
| US20180232829A1 (en) * | 2017-02-10 | 2018-08-16 | International Business Machines Corporation | Dynamic irregularity management |
| WO2022082219A1 (en) * | 2020-10-14 | 2022-04-21 | The Regents Of The University Of California | Systems and methods for detecting collusion in student testing using graded scores or answers for individual questions |
-
2023
- 2023-11-29 WO PCT/US2023/081567 patent/WO2024118753A1/en not_active Ceased
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120077177A1 (en) * | 2010-03-14 | 2012-03-29 | Kryterion, Inc. | Secure Online Testing |
| US9870713B1 (en) * | 2012-09-17 | 2018-01-16 | Amazon Technologies, Inc. | Detection of unauthorized information exchange between users |
| US20150143534A1 (en) * | 2013-11-21 | 2015-05-21 | Erica Christine Bowles | System and method for managing, tracking, and utilizing copy and/or paste events |
| US20160306986A1 (en) * | 2015-04-17 | 2016-10-20 | Dropbox, Inc. | Collection folder for collecting file submissions and using facial recognition |
| US20180232829A1 (en) * | 2017-02-10 | 2018-08-16 | International Business Machines Corporation | Dynamic irregularity management |
| WO2022082219A1 (en) * | 2020-10-14 | 2022-04-21 | The Regents Of The University Of California | Systems and methods for detecting collusion in student testing using graded scores or answers for individual questions |
Non-Patent Citations (1)
| Title |
|---|
| RIZZO STEFANO GIOVANNI, BERTINI FLAVIO, MONTESI DANILO: "Fine-grain watermarking for intellectual property protection", EURASIP JOURNAL ON INFORMATION SECURITY, vol. 2019, no. 1, 1 December 2019 (2019-12-01), XP093183163, ISSN: 2510-523X, DOI: 10.1186/s13635-019-0094-2 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9043298B2 (en) | Platform for generating, managing and sharing content clippings and associated citations | |
| Wise et al. | A general approach to measuring test-taking effort on computer-based tests | |
| Purpura et al. | Conversion to remote proctoring of the community English language program online placement exam at Teachers College, Columbia University | |
| US20150161903A1 (en) | System and method for enabling crowd-sourced examination marking | |
| Correos | Teachers’ ICT literacy and utilization in English language teaching | |
| US20120045744A1 (en) | Collaborative University Placement Exam | |
| US20240379020A1 (en) | Systems and methods for electronic prediction of rubric assessments | |
| Dinndorf-Hogenson et al. | Applying the flipped classroom model to psychomotor skill acquisition in nursing | |
| Mwita et al. | Thematic Analysis of Qualitative Research Data: A Seven-Step Guide | |
| Lipschultz et al. | The effects of feedback accuracy and trainer rules on performance during an analogue task | |
| Borràs et al. | Traditional study abroad vs. ELFSA: Differences and similarities in L2 reading, vocabulary, and use | |
| WO2024118753A1 (en) | Cheating detection and advanced outcome analysis for online exams | |
| Paullet et al. | Can GPTZero detect if students are using artificial intelligence to create assignments? | |
| Koss | Writing and information literacy in a cryptology first-year seminar | |
| Bartolomé et al. | Validating item response processes in digital competence assessment through eye-tracking techniques | |
| CN113487458B (en) | A medical chain teaching management system based on the Internet | |
| Forgasz | Scholarly writing | |
| Freiling et al. | Reconstructing people's lives: A case study in teaching forensic computing | |
| TWI898502B (en) | An intelligent job hunting interactive matching system and method thereof | |
| Suseela et al. | Plagiarism and Academic Dishonesty: Study of Users’ Perceptions in the University of Hyderabad | |
| Stuart | Effectiveness of acceptance and commitment therapy (ACT) in enhancing psychological flexibility in adults: a systematic review; and, Wellbeing in retirement: the role of psychological flexibility, value-directed living and cognitive defusion | |
| Juan et al. | ChatGPT as a Tool to Foster Critical Thinking in a Human Physiology Course for Students of the Degree of Human Nutrition and Dietetics | |
| Wheldall et al. | Preliminary evidence for the validity of the new test of everyday reading comprehension | |
| Grünebaum et al. | Action Research in Guatemala: Addressing Educational Gaps through Teacher-Led Inquiry | |
| Stetter | Plagiarism and the Use of Blackboard’s TurnItIn |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23898785 Country of ref document: EP Kind code of ref document: A1 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 23898785 Country of ref document: EP Kind code of ref document: A1 |