User Guide

Note: The example data set used here is 500 files of random characters, with multiple snippets of Alice's Adventures in Wonderland inserted into the files to represent plagiarism. The file names are also random characters.

Before digging into this user guide, why not watch the 3-minute demonstration video? It shows P3D scanning 500 files and a review of the results.

For usage scenarios see How Plagiarism3D works.


Overview of software

Upon opening P3D for the first time, the screen will initially look like this:

The first window of P3D - named “Summary View” - consists of three panels:

  • Left:  “File Sets Importer”
    • used to import files to be scanned (drag and drop)
    • displays a tree of imported files 
  • Middle:  “Summary of File Matches”
    • displays results after scanning files or importing a saved results file
    • bottom of the panel provides three optional filters to simplify viewing the results
  • Right:  “Settings” and “Flagged Files”
    • Top section:  “Settings” (the defaults are usually adequate)
    • Bottom section:  “Flagged Files” panel lists files flagged as "plagiarized" or "suspicious" by the user

 Panel width can be resized by dragging the vertical lines separating the panels.


Importing test documents

  1. Open the P3D program.
  2. Import a previously saved results file (.p3d; these files are only readable by the program), OR
  3. Import at least one file set (a folder of files) to be scanned.

Both actions use the “File” > “Import…” menu at the top left of the screen.


Importing a saved results (.p3d) file

Importing a saved results file populates the centre “Summary” panel with the results, and the document set(s) associated with that results file appear in the left panel. The root node of the tree (initially called “Project_name”) is renamed to reflect the name of the saved results file.

A saved results file can also be imported into P3D by dragging and dropping the file onto the "Summary of File Matches" panel (centre).

Importing one or more document sets

P3D accepts multiple document types (.docx, .doc, .pdf, and .rtf), which can be mixed in a single scan. Files whose names do not end with one of these extensions are ignored, and a dialog window alerts the user to any attempt to import an unsupported format. All documents are converted to .txt format, and some conversions take longer than others. During conversion, document elements such as images and tables are removed from the files and are therefore not compared.
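Before importing a large folder, it can help to check which files would be skipped. The following is a minimal sketch of such a pre-check, not P3D's own logic: the function name `split_by_support` and the sample file names are hypothetical, and treating extensions case-insensitively is an assumption that this guide does not confirm.

```python
from pathlib import Path

# Extensions listed in this guide as accepted by P3D.
SUPPORTED_EXTENSIONS = {".docx", ".doc", ".pdf", ".rtf"}


def split_by_support(filenames):
    """Split file names into (supported, unsupported), judged by extension only.

    Assumption: extensions are compared case-insensitively.
    """
    supported, unsupported = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    ok, skipped = split_by_support(["essay1.docx", "essay2.PDF", "notes.txt", "figure.png"])
    print("expected to be imported:", ok)
    print("expected to be ignored:", skipped)
```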

Documents are treated in one of four ways:


  • Primary docs: the set of files to be scanned for plagiarism. This is the only mandatory file set. When importing a file set, it will be set as Primary by default.
  • Reference docs: e.g.  a set of student docs from a previous semester of the course. Text matches between Reference docs are not reported, but matches between Primary and Reference docs are reported. This file set is optional.
  • Wiki docs: a set of file(s) from Wikipedia or other online resources which may have been used as reference material by students. These files are treated as Reference docs. This file set is optional. In future versions, these files will be created from key words.
  • Deletion docs (future feature, not yet implemented): a set of document(s) containing text that the user expects to appear in Primary documents and that should not be considered plagiarism (e.g. lab manual instructions). This file set is optional.

 If required, there can be several of each of these document sets.
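The reporting rule implied by the list above can be written as a one-line check. This is a sketch of the rule as this guide describes it, not P3D's code: the function name `is_reported` is hypothetical, and the reading that a match is reported whenever at least one of the two files comes from a Primary set is an inference from the descriptions of Primary, Reference and Wiki docs.

```python
def is_reported(type_a, type_b):
    """Return True if a match between files of these two set types would be reported.

    Assumption: a match is reported when at least one file is from a Primary set.
    Wiki docs behave like Reference docs, so Reference/Wiki-only pairs are not reported.
    """
    return "Primary" in (type_a, type_b)


if __name__ == "__main__":
    for pair in [("Primary", "Primary"), ("Primary", "Reference"),
                 ("Primary", "Wiki"), ("Reference", "Reference"), ("Reference", "Wiki")]:
        print(pair, "reported" if is_reported(*pair) else "not reported")
```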


 It is HIGHLY RECOMMENDED that you organize files to be scanned into the appropriate groups as a set of folders on your own computer:


For example:

[Figure: How to organize]




Name the folders as shown below, with the first letter indicating the type of document. This allows P3D to automatically assign the correct data type to the folders.

Naming

P = Primary

R = Reference

W = Wiki
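The naming convention can be illustrated with a small lookup. This is a sketch only: the function name `guess_set_type` and the folder names are hypothetical, and ignoring letter case is an assumption. Falling back to Primary mirrors the statement above that an imported file set is Primary by default.

```python
# First letter of the folder name -> document set type, as listed in this guide.
PREFIX_TO_TYPE = {"P": "Primary", "R": "Reference", "W": "Wiki"}


def guess_set_type(folder_name):
    """Guess the set type from the first letter of a folder name.

    Assumptions: the letter is matched case-insensitively, and any other
    first letter falls back to "Primary" (the documented default type).
    """
    first_letter = folder_name.strip()[:1].upper()
    return PREFIX_TO_TYPE.get(first_letter, "Primary")


if __name__ == "__main__":
    # Hypothetical folder names, for illustration only.
    for name in ["P_2017_lab_reports", "R_2016_lab_reports", "W_enzyme_kinetics"]:
        print(name, "->", guess_set_type(name))
```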

It is easiest to drag and drop your document folders onto the "File Sets Importer" panel, but you can also use the “File Set” menu to select individual files or a folder of files (multiple directories and/or files can be selected at the same time). Individual files are imported into a folder, named after the files' parent folder on your computer, in the file tree in the left panel. The directories you selected appear immediately as nodes in the file tree, but the files within each directory are not shown as children of the parent node until they have been fully imported.

P3D will not accept multiple file sets that have the same name. Renaming a file set within P3D can be done by right-clicking on the file set node in the tree and selecting "Rename this file set," or by triple-clicking the file set node in the tree.
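Because file sets with the same name are rejected, it may be worth checking folder names before importing folders from different locations. The sketch below is one way to do that; the function name `duplicate_set_names` and the paths are hypothetical, and whether P3D compares names case-sensitively is not stated in this guide (the sketch compares them exactly).

```python
from collections import Counter
from pathlib import PurePosixPath


def duplicate_set_names(folder_paths):
    """Return folder names (last path component) that occur more than once."""
    names = [PurePosixPath(path).name for path in folder_paths]
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    folders = [
        "courses/bio101/P_reports",
        "courses/bio102/P_reports",
        "courses/bio101/R_2016",
    ]
    print("names that would clash:", duplicate_set_names(folders))
```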

Import time depends on the number and type of files. The status bar at the bottom of the P3D window displays the number of files imported and indicates when the import is complete.

If the user accidentally selects a file set which they do not wish to be included in the P3D submission, the user may right-click to “Remove this file set” once the file set has finished importing. That file set’s node will be deleted from the tree.


Settings

The Settings panel is located at the top of the right-side panel in the main P3D window. Defaults are usually adequate.

  • Minimum length of matching text string (Default=20): Matching text strings below this value will not be displayed. The lowest accepted value is 10.
  • Maximum number of results (Default=10,000): This field specifies the limit on the number of strings of matched text the user would like returned. For example, setting a value of 10,000 would result in the program returning the 10,000 longest strings of matched text (or the maximum number of strings longer than the specified minimum length of string). This field must contain a value larger than 0.
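The combined effect of the two settings can be sketched as follows. This illustrates the behaviour described above and is not P3D's implementation: the function name `select_matches` is hypothetical, and the input is simply a list of match lengths in characters.

```python
def select_matches(match_lengths, min_length=20, max_results=10_000):
    """Keep matches of at least min_length characters, then return the longest max_results.

    The defaults and limits (minimum length >= 10, maximum results > 0)
    follow the Settings description in this guide.
    """
    if min_length < 10:
        raise ValueError("minimum length of matching text string must be at least 10")
    if max_results <= 0:
        raise ValueError("maximum number of results must be larger than 0")
    kept = [length for length in match_lengths if length >= min_length]
    kept.sort(reverse=True)  # longest matches first
    return kept[:max_results]


if __name__ == "__main__":
    lengths = [5, 25, 19, 20, 300, 42]
    print(select_matches(lengths, min_length=20, max_results=3))
```

With the default of 10,000 results, the cap only takes effect for scans that produce more than 10,000 matches above the minimum length.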

Now click the “Run” button. The status bar at the bottom of the window indicates the scan is executing.  


Results

1. Summary of results

After a scan completes, the centre panel of the Summary View window displays P3D’s results table.

Pairs of files with matched text occupy rows in the results table. 

Each column in the table can be sorted in descending or ascending order by clicking the corresponding column header.

The four columns in the table are:

  • File IDs: For easier readability, all submitted files are assigned a unique file ID number. To view the file names associated with each file ID, hover over the row in the table - a text tip will appear with the file names listed in the order that their file IDs appear in the row entry.
  • Number of matches: This integer value indicates how many text strings longer than the specified minimum length are shared by the files listed in that row. A larger value often indicates plagiarism between the files in that row.
  • LCS length: This integer value indicates the length, in characters, of the longest string match in the pair of files.
  • Total score: The sum of the lengths of all matching character strings of text in the pair of files.
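To make the three numeric columns concrete, the sketch below derives them for one pair of short texts. P3D's matching algorithm is not described in this guide, so Python's `difflib.SequenceMatcher` is used here purely as a stand-in for finding shared strings; the function names and sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def matched_strings(text_a, text_b, min_length=20):
    """Return strings shared by both texts that are at least min_length characters long.

    Stand-in matcher: difflib's matching blocks, not P3D's own algorithm.
    """
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    return [
        text_a[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= min_length
    ]


def summarize_pair(matches):
    """Compute the three numeric columns of a results row from the matched strings."""
    lengths = [len(match) for match in matches]
    return {
        "number_of_matches": len(lengths),
        "lcs_length": max(lengths, default=0),  # length of the longest match
        "total_score": sum(lengths),            # sum of all match lengths
    }


if __name__ == "__main__":
    a = "The quick brown fox jumps over the lazy dog near the river bank."
    b = "A quick brown fox jumps high; later the lazy dog near the river slept."
    found = matched_strings(a, b)
    for match in found:
        print(len(match), repr(match))
    print(summarize_pair(found))
```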

 When the mouse pointer is held over a row in the Results Table, the names of the two files with matches are shown.


2. Detail of matches 

Select a row in the Results Table (i.e. a pair of files) and right-click to show a menu of options.

Select “View detail of files [x, y]” from menu (or from main Tools menu).

The left side of the window shows the number and distribution of the match sizes.

The right side of the window displays the strings of matched text and their lengths (in characters). Matches too long for the window can be viewed by right-clicking on a string and selecting “View full match of this text”.

This view gives the user a very quick and easy way to assess what kind of text has been matched in the pair of files (citations, student discussion, figure legends), providing the first step in evaluating whether work has been plagiarized. 

The user also has the option to delete any string matches (such as false positives) from the details window. 


The next step is to view the pair of files in a side-by-side comparison.

Click the “View files in Similarity View” button at the bottom of the Detail Window, or

Right-click on a file pair (row) in the Results Table and select "View files in Similarity View", or

Select a file pair (row) in the Results Table and use “Tools” > “View selected files in Similarity View.”


3. Similarity View

The Similarity View window provides a side-by-side comparison of a pair of text files. Strings of matched characters are highlighted by different colours.

Key Points:

Remember, some elements were stripped out of the original files when imported into P3D.

In this format, it is very easy to recognize where students have altered text to try to defeat plagiarism detection: synonyms may be used, and spelling and phrase order may be altered.

When displaying matches between large files such as theses, the "Compress" button hides the large sections of unmatched text, showing the blocks of matching text with short regions of flanking unmatched text.

To print or save the "Similarity View", P3D can export an HTML copy of this information by selecting “File” > “Export…”. Files displayed in the compressed format will also be exported in this compressed view.



 When the HTML page is viewed in a web browser, it contains a button (top right) that switches between a one-column and a two-column display. The single-column format is more appropriate for printing.


4. Flagging pairs of files

The colour of the text in the main Summary panel is used to give the status of the results.

BLACK:  results not reviewed

GREY:  results reviewed

RED:  results reviewed and flagged as plagiarized

ORANGE:  results reviewed and flagged as suspicious

Any document pair can be flagged by the user as “Suspicious” or “Plagiarized” from the "Similarity View" window (bottom bar), the "Detail" window (bottom bar), or the main "Summary View" window (by right-clicking on a row of the table and selecting “Mark [x, y] as suspicious” or “Mark [x, y] as plagiarized”). Flagged documents are listed in the right-side panel of the Summary View window. Files flagged as “suspicious” are shown in orange, while files flagged as “plagiarized” are shown in red.

The status of results can be modified by right-clicking on the row in the "Results Table" and selecting the desired classification. 


5. Table Filters

The main Results Table has three optional filters at the bottom of the panel; the filters can be used in combination.


  1. The top filter is a text field which accepts any quantity of file ID numbers, separated by commas or spaces. It is an AND/OR filter, meaning that an input of “1 2” will display the file match [1, 2] (if the file match exists) and all other file matches containing either file ID 1 or file ID 2. This is useful if you suspect one document may be involved in more than one instance of plagiarism and allows display of all document pairs that involve a particular document.
  2. The middle filter is an OR filter allowing users to display results based on status. By default (all check boxes unselected) all results are displayed. The filter allows for multiple check boxes to be selected.
  3. The bottom filter is an AND filter that filters results based on input file set name. Any file match involving a file within the specified file set will be displayed in the table.
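The three filters described above can be sketched as a single visibility check per row. This is an illustration under stated assumptions, not P3D's code: the `Row` structure, the function names, the status labels and the file set names are hypothetical, and the sketch assumes that a row must pass every filter that is in use, with an empty filter letting all rows through.

```python
import re
from dataclasses import dataclass


@dataclass
class Row:
    file_ids: tuple       # the two file IDs in the match, e.g. (1, 2)
    status: str           # e.g. "not reviewed", "reviewed", "suspicious", "plagiarized"
    file_sets: frozenset  # names of the file sets the two files belong to


def parse_ids(text):
    """Parse file IDs separated by commas and/or spaces into a set of integers."""
    return {int(token) for token in re.split(r"[,\s]+", text.strip()) if token}


def is_visible(row, id_text="", statuses=(), file_set=""):
    """Return True if the row passes every active filter (assumed combination rule)."""
    ids = parse_ids(id_text)
    if ids and not ids.intersection(row.file_ids):
        return False  # top filter: row must contain at least one listed file ID
    if statuses and row.status not in statuses:
        return False  # middle filter: row status must be one of the ticked boxes
    if file_set and file_set not in row.file_sets:
        return False  # bottom filter: a file in the row must belong to the named set
    return True


if __name__ == "__main__":
    rows = [
        Row((1, 2), "plagiarized", frozenset({"P_2017"})),
        Row((2, 5), "not reviewed", frozenset({"P_2017", "R_2016"})),
        Row((3, 4), "suspicious", frozenset({"P_2017"})),
    ]
    for row in rows:
        print(row.file_ids, is_visible(row, id_text="1, 2", file_set="R_2016"))
```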


Saving results

  • All data can be saved at any time by clicking “File” > “Save results file…”. These filenames must end with the extension “.p3d”, and the files are only readable by the program. Use “Export” to produce graphical versions of the plagiarized text from a pair of files.
  • Classifications for every document match are saved.
  • String matches that were deleted are not saved.
  • Results files store all the file sets associated with the results. This means that upon importing a saved results file, users will be able to obtain the text in the files from P3D, even if those files are no longer stored on the user’s local computer.

 

Glossary

  • Detail window:  A table and bar chart summarizing the matched strings in a document pair. Matches that are false positives can be deleted by the user from this window.
  • Document match:  An instance in the results where two or more files share at least one string of text in common. A file match is represented as a row in the Results table. Each document match has its own corresponding Detail Window and Similarity View window.
  • File set:  A collection (folder/directory) of documents grouped together for greater organization and convenience.
  • Results table:  Table in centre panel of Summary View window that presents an overview of P3D’s results. Each row of the table is a document match, with columns for file ID numbers in the file match, the number of string matches between the files, the LCS length, and the total number of characters matching in the file match.
  • Similarity View:  A window offering a side-by-side comparison of the two documents in a selected pair. Matched strings of text between the files are highlighted in corresponding colours. HTML files can be created and saved from this window.
  • String match:  A string of identical characters appearing in two documents.
  • Summary View:  The first window that appears when opening the program.
 Created by academics for academics            © Plagiarism3D 2017             efficient, effective and economic