Plagiarism: A Serious Malpractice in Latest Infrastructure through Webhunting

 

Palakben K. Parikh, Julee P. Soni, Ravi N. Patel, Urviben Y. Patel, Kaumil N. Modi, Hiren M. Marvaniya and Dhrubo Jyoti Sen

Department of Pharmaceutical Chemistry, Shri Sarvajanik Pharmacy College, Gujarat Technological University, Arvind Baug, Mehsana-384001, Gujarat,

 

 

ABSTRACT:

The modern wired generation has a serious hobby of webhunting through net-surfing and browsing. Collecting large amounts of data on various topics and re-formatting it for presentation on a desired platform has become a new style across the new and old generations, and it gives rise to the malpractice of plagiarism. Select, Copy and Paste are the three musketeers used to build a new entity: an assembled chapter, laid out on new pages in a presentable format, that is a new article engineered by manipulated fabrication under the umbrella of malpractice. Some detectors based on the latest software have been launched to control this malpractice.

 

 

INTRODUCTION:

Malpractice or Dishonesty or Misconduct is any type of cheating that occurs in relation to a formal exercise. It can include:

·         Plagiarism: The adoption or reproduction of original creations of another author (person, collective, organization, community or other type of author, including anonymous authors) without due acknowledgment.

·         Fabrication: The falsification of data, information, or citations in any formal academic exercise.

·         Deception: Providing false information to an instructor concerning a formal academic exercise—e.g., giving a false excuse for missing a deadline or falsely claiming to have submitted work.

·         Cheating: Any attempt to give or obtain assistance in a formal academic exercise (like an examination) without due acknowledgment.

·         Bribery or paid services: Giving certain test answers for money.

·         Sabotage: Acting to prevent others from completing their work. This includes cutting pages out of library books or willfully disrupting the experiments of others.

·         Professorial misconduct: Professorial acts that are academically fraudulent equate to academic fraud.

Academic dishonesty has been documented in almost every type of educational setting, from elementary school to graduate school, and has been met with varying degrees of approbation throughout history. Today, educated society tends to take a very negative view of academic dishonesty1,2.

 

Assemblage refers to a text "built primarily and explicitly from existing texts in order to solve a writing or communication problem in a new context". The concept was first proposed by Johndan Johnson-Eilola (author of Datacloud) and Stuart Selber in the journal Computers and Composition in 2007.

 


The notion of assemblages builds on remix and remix practices, which blur distinctions between invented and borrowed work3.

 

Contract cheating is a form of academic dishonesty in which students get others to complete their coursework for them by putting it out to tender. The term was coined in a 2006 study by Thomas Lancaster and Robert Clarke at the University of Central England in Birmingham (now known as Birmingham City University)4,5.

 

Dealing with contract cheating:

Every approach for dealing with contract cheating recognises the distinction between contract cheating and "plagiarism" (uncited copying from books, the web, etc.) and "collusion" (copying of the work of other students in the same cohort)6.

Contract cheating can be successfully prevented if all of the following actions are taken:

·         Do not re-use assignments: Too easy for students to copy from a previous year; common assignments appear on "essay mill" sites.

·         Individualise assignments: Harder for students to collude; easier to identify which student posted the assignment.

·         Vivas and tests to contribute to marks: Makes "outsourcing" a less easy option to pass a module; provides evidence for "non-originality". (A viva is an examination which is given orally in universities.)

·         Monitor known sites: And tell students that you are doing so; look for trends in the characteristics of contract cheating

·         Change academic regulations: Until changed, regulations only recognise plagiarism and collusion; types of evidence needed in contract cheating cases are different.

 

Figure-1

 

Copyscape is an online plagiarism detection service that checks whether similar text content appears elsewhere on the web. It was launched in 2004 by Indigo Stream Technologies, Ltd. Copyscape is used by content owners to detect cases of "content theft", in which content is copied without permission from one site to another. It is also used by content publishers to detect cases of content fraud, in which old content is repackaged and sold as new original content7.

 

Function:

Given a URL of the original content, Copyscape returns a list of web pages that contain similar text to all or parts of this content. It also shows the matching text highlighted on the found web page. Copyscape banners can be placed on a web page to warn potential plagiarists not to steal content.  Copyscape also provides two paid services: Copysentry monitors the web and sends notifications by email when new copies are found, and Copyscape Premium verifies the originality of content purchased by online content publishers.
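As a rough illustration of what highlighting the matching text can involve, the sketch below uses Python's standard difflib module to find runs of words shared by an original text and a candidate page. It is not Copyscape's actual method, which is not described here; the function name shared_passages and the min_words threshold are assumptions made only for this example.

```python
from difflib import SequenceMatcher


def shared_passages(original: str, candidate: str, min_words: int = 5) -> list[str]:
    """Return word runs of at least min_words that appear in both texts."""
    a = original.lower().split()
    b = candidate.lower().split()
    # autojunk=False stops difflib from ignoring very frequent words in long texts.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    passages = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            passages.append(" ".join(a[block.a:block.a + block.size]))
    return passages


original = "Plagiarism is the use of another author's words without due acknowledgment."
candidate = "Many agree that plagiarism is the use of another author's words and ideas."
print(shared_passages(original, candidate))
```

Short matches are dropped because common phrases of two or three words occur in unrelated texts; a real service would presumably tune this threshold and also handle punctuation and reordered text.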

 

Design and Independent Evaluation:

Copyscape uses the Google Web API to power its searches. Copyscape uses a set of algorithms to identify copied content that has been modified from its original form. Independent plagiarism software tests conducted by Professor Debora Weber-Wulff of the University of Applied Sciences in Berlin found Copyscape Premium to be the best performing service.

 

Limitations:

Copyscape finds online copies of textual content, but not of images or other media. Copyscape is not able to determine whether a copy is authorized or unauthorized, nor is it able to determine which of two websites copied the other. Both of these determinations are left up to users. Contacting the offending site to have content removed by DMCA notice is also left up to the user.

 

Copyright is the set of exclusive rights granted to the author or creator of an original work, including the right to copy, distribute and adapt the work. These rights can be licensed, transferred and/or assigned. Copyright lasts for a certain time period after which the work is said to enter the public domain. Copyright applies to a wide range of works that are substantive and fixed in a medium. Some jurisdictions also recognize "moral rights" of the creator of a work, such as the right to be credited for the work. The Statute of Anne 1709, full title "An Act for the Encouragement of Learning, by vesting the Copies of Printed Books in the Authors or purchasers of such Copies, during the Times therein mentioned", is now seen as the origin of copyright law. Since the 19th century, copyright has been described under the umbrella term intellectual property, along with patents and trademarks. Copyright has been internationally standardized, lasting between fifty and one hundred years from the author's death, or a shorter period for anonymous or corporate authorship. Generally, copyright is enforced as a civil matter, though some jurisdictions do apply criminal sanctions8.

 

Copyright infringement (or copyright violation) is the unauthorized or prohibited use of works covered by copyright law, in a way that violates one of the copyright owner's exclusive rights, such as the right to reproduce or perform the copyrighted work, or to make derivative works. For electronic and audio-visual media, unauthorized reproduction and distribution is also commonly referred to as piracy. An early reference to piracy in the context of copyright infringement was made by Daniel Defoe in 1703 when he said of his poem The True-Born Englishman that "Its being Printed again and again, by Pyrates". The practice of labeling the act of infringement as "piracy" predates statutory copyright law. Prior to the Statute of Anne 1709, the Stationers' Company of London in 1557 received a Royal Charter giving the company a monopoly on publication and tasking it with enforcing the charter. Those who violated the charter were labeled pirates as early as 1603. The legal basis for this usage dates from the same era, and has been consistently applied until the present time. Critics of the use of the term "piracy" to describe such practices contend that it is pejorative and unfairly equates copyright infringement with more sinister activity9.

 

Figure-2

 

Cryptomnesia occurs when a forgotten memory returns without its being recognised as such by the subject, who believes it is something new and original. It is a memory bias whereby a person may falsely recall generating a thought, an idea, a song, or a joke, not deliberately engaging in plagiarism but rather experiencing a memory as if it were a new inspiration. Sentences in scientific papers that are identical to sentences from some of the references used to write the paper often stem from cryptomnesia10,11.

 

Essay mill (or paper mill) is a ghostwriting service that sells essays and other homework writing to university and college students. Since plagiarism is a form of academic dishonesty or academic fraud, universities and colleges may investigate papers suspected to be from an essay mill by using Internet plagiarism detection software, which compares essays against a database of known essay mill essays and by orally testing students on the contents of their papers. However, many essay mills guarantee that a unique essay will be composed by a ghost author and pre-screened with plagiarism detection software before delivery, and as such will be undetectable as an essay mill product12.

 

Fair use is a doctrine in United States copyright law that allows limited use of copyrighted material without requiring permission from the rights holders, such as for commentary, criticism, news reporting, research, teaching or scholarship. It provides for the legal, non-licensed citation or incorporation of copyrighted material in another author's work under a four-factor balancing test. The term fair use originated in the United States. A similar principle, fair dealing, exists in some other common law jurisdictions. Civil law jurisdictions have other limitations and exceptions to copyright13.

 

Joke thievery is the act of performing and taking credit for comic material written by another person without their consent. This is a form of plagiarism and sometimes can be copyright infringement. A common term for joke thievery is "hacking", which is derived from the term "hackneyed" (meaning "overused and thus cheapened, or trite").

 

Journalism scandals are high-profile incidents or acts, whether intentional or accidental, that run contrary to the generally accepted ethics and standards of journalism, or otherwise violate the 'ideal' mission of journalism: to report news events and issues accurately and fairly14.

 

Ghostwriter is a professional writer who is paid to write books, articles, stories, reports, or other texts that are officially credited to another person. Celebrities, executives, and political leaders often hire ghostwriters to draft or edit autobiographies, magazine articles, or other written material. In music, ghostwriters are used in film score composition, as well as for writing songs and lyrics for popular music styles ranging from country to hip-hop. Ghostwriters may have varying degrees of involvement in the production of a finished work; while some ghostwriters are hired to edit and clean up a rough draft, in other cases, ghostwriters do most of the writing based on an outline provided by the credited author. For some projects, ghostwriters will do a substantial amount of research, as in the case of a ghostwriter who is hired to write an autobiography for a well-known person. Ghostwriters are also hired to write fiction in the style of an existing author, often as a way of increasing the number of books that can be published by a popular author (e.g., Tom Clancy, James Patterson). Ghostwriters will often spend from several months to a full year researching, writing, and editing nonfiction works for a client, and they are paid either per page, with a flat fee, or a percentage of the royalties of the sales, or some combination thereof. The ghostwriter is sometimes acknowledged by the author or publisher for his or her writing services15.

 

Music plagiarism is the use or close imitation of another author's music while representing it as one's own original work. Plagiarism in music now occurs in two contexts – with a musical idea (that is, a melody or motif) or sampling (taking a portion of one sound recording and reusing it in a different song)16.

 

Personal boundaries are guidelines, rules or limits that a person creates to identify for themselves what are reasonable, safe and permissible ways for other people to behave around them and how they will respond when someone steps outside those limits17.

 

Scientific misconduct is the violation of the standard codes of scholarly conduct and ethical behavior in professional scientific research18. A Lancet review on Handling of Scientific Misconduct in Scandinavian countries provides the following sample definitions (reproduced in The COPE report 1999):

·         Danish Definition: "Intention(al) or gross negligence leading to fabrication of the scientific message or a false credit or emphasis given to a scientist."

·         Swedish Definition: "Intention(al) distortion of the research process by fabrication of data, text, hypothesis, or methods from another researcher's manuscript form or publication; or distortion of the research process in other ways."

The consequences of scientific misconduct can be severe at a personal level for both perpetrators and any individual who exposes it. In addition there are public health implications attached to the promotion of medical or other interventions based on dubious research findings.

 

 

Figure-3

Plagiarism, as defined in the 1995 Random House Compact Unabridged Dictionary, is the use or close imitation of the language and thoughts of another author and the representation of them as one's own original work. Within academia, plagiarism by students, professors, or researchers is considered academic dishonesty or academic fraud, and offenders are subject to academic censure, up to and including expulsion. In journalism, plagiarism is considered a breach of journalistic ethics, and reporters caught plagiarizing typically face disciplinary measures ranging from suspension to termination of employment. Some individuals caught plagiarizing in academic or journalistic contexts claim that they plagiarized unintentionally, by failing to include quotations or give the appropriate citation. While plagiarism in scholarship and journalism has a centuries-old history, the development of the Internet, where articles appear as electronic text, has made the physical act of copying the work of others much easier.

 

Plagiarism is not the same as copyright infringement. While both terms may apply to a particular act, they are different transgressions. Copyright infringement is a violation of the rights of a copyright holder, when material protected by copyright is used without consent. On the other hand, plagiarism is concerned with the unearned increment to the plagiarizing author's reputation that is achieved through false claims of authorship19.

 

Etymology:

English Plagiarism (1615–25), earlier plagiary (1590–1600), derives from Latin plagiārius, "kidnapper", equivalent to plagium, "kidnapping", which contains Latin plaga ("snare", "net"), based on the Indo-European root *-plak, "to weave" (seen for instance in Greek plekein, Latin plectere, both meaning "to weave").

 

Many students feel pressured to complete papers well and quickly, and with the accessibility of new technology (the Internet) students can plagiarize by copying and pasting information from other sources. This is often easily detected by teachers for several reasons. First, students' choices of sources are frequently unoriginal; instructors may receive the same passage copied from a popular source from several students. Second, it is often easy to tell whether a student used his or her own "voice." Third, students may choose sources which are inappropriate, inaccurate, or off-topic. Fourth, lecturers may insist that submitted work is first submitted to an online plagiarism detector. In the academic world, plagiarism by students is a very serious offense that can result in punishments such as a failing grade on the particular assignment (typically at the high school level) or for the course (typically at the college or university level). For cases of repeated plagiarism, or for cases in which a student commits severe plagiarism (e.g., submitting a copied piece of writing as original work), a student may be suspended or expelled. In many universities, academic degrees or awards may be revoked as a penalty for plagiarism.

 

There is little academic research into the frequency of plagiarism in high schools; much of the research has investigated plagiarism at the post-secondary level4. Of the forms of cheating (including plagiarism, inventing data, and cheating during an exam), students admit to plagiarism more than any other. However, this figure decreases considerably when students are asked about the frequency of "serious" plagiarism (such as copying most of an assignment or purchasing a complete paper from a website). Recent use of plagiarism detection software (see below) gives a more accurate picture of this activity's prevalence.

 

For professors and researchers, plagiarism is punished by sanctions ranging from suspension to termination, along with the loss of credibility and integrity5. Charges of plagiarism against students and professors are typically heard by internal disciplinary committees, which students and professors have agreed to be bound by20.

 

Journalism:

Since journalism's main currency is public trust, a reporter's failure to honestly acknowledge their sources undercuts a newspaper or television news show's integrity and undermines its credibility. Journalists accused of plagiarism are often suspended from their reporting tasks while the charges are being investigated by the news organization. The ease with which electronic text can be reproduced from online sources has lured a number of reporters into acts of plagiarism: Journalists have been caught "copying-and-pasting" articles and text from a number of websites21.

 

Online plagiarism:

Content scraping is the phenomenon of copying and pasting material from Internet websites, affecting both established sites and blogs. Free online tools are becoming available to help identify plagiarism, and there is a range of approaches that attempt to limit online copying, such as disabling right clicking and placing warning banners regarding copyrights on web pages. Instances of plagiarism that involve copyright violation may be addressed by the rightful content owners sending a DMCA removal notice to the offending site-owner, or to the ISP that is hosting the offending site. Plagiarism is not only the mere copying of text, but also the presentation of another's ideas as one's own, regardless of the specific words or constructs used to express that idea. In contrast, many so-called plagiarism detection services can only detect blatant word-for-word copies of text.

 

Figure-4

 

Other contexts:

Generally, although plagiarism is often loosely referred to as theft or stealing, it has not been set as a criminal matter in the courts. Likewise, plagiarism has no standing as a criminal offense in the common law. Instead, claims of plagiarism are a civil law matter, which an aggrieved person can resolve by launching a lawsuit. Acts that may constitute plagiarism are in some instances treated as copyright infringement, unfair competition, or a violation of the doctrine of moral rights. The increased availability of intellectual property due to a rise in technology has furthered the debate as to whether copyright offences are criminal.

 

Chris just found some good stuff on the Web for his science report about sharks. He highlights a paragraph that explains that most sharks grow to be only 3 to 4 feet long and can't hurt people. Chris copies it and pastes it into his report. He quickly changes the font so it matches the rest of the report and continues his research.

 

Uh-oh. Chris just made a big mistake. Do you know what he did? He committed plagiarism (say: play-juh-rih-zem). Plagiarism is when you use someone else's words or ideas and pass them off as your own. It's not allowed in school, college, or beyond, so it's a good idea to learn the proper way to use resources, such as websites, books, and magazines.

 

Plagiarism is a form of cheating, but it's a little complicated so a kid might do it without understanding that it's wrong. Chris should have given the author and the website credit for the information. Why? Because Chris didn't know this information before he came to the website. These aren't his thoughts or ideas.

 

Defenses:

A famous passage of Laurence Sterne's 1767 Tristram Shandy condemns plagiarism by resorting to plagiarism. Oliver Goldsmith commented:

Sterne's Writings, in which it is clearly shown, that he, whose manner and style were so long thought original, was, in fact, the most unhesitating plagiarist who ever cribbed from his predecessors in order to garnish his own pages. It must be owned, at the same time, that Sterne selects the materials of his mosaic work with so much art, places them so well, and polishes them so highly, that in most cases we are disposed to pardon the want of originality, in consideration of the exquisite talent with which the borrowed materials are wrought up into the new form22.

 

Figure-5

 

On December 6, 2006, Thomas Pynchon joined a campaign by many other major authors to clear Ian McEwan of plagiarism charges by sending a typed letter to his British publisher, which was published in the Daily Telegraph newspaper. Playwright Wilson Mizner said "If you copy from one author, it's plagiarism. If you copy from two, it's research". American author Jonathan Lethem delivered a passionate defense of the use of plagiarism in art in his 2007 essay "The ecstasy of influence: A plagiarism" in Harper's. He wrote: "The kernel, the soul—let us go further and say the substance, the bulk, the actual and valuable material of all human utterances—is plagiarism" and "Don't pirate my editions; do plunder my visions. The name of the game is Give All. You, reader, are welcome to my stories. They were never mine in the first place, but I gave them to you".

 

Self-plagiarism, also known as "recycling fraud", is the reuse of significant, identical, or nearly identical portions of one's own work without acknowledging that one is doing so or without citing the original work. Articles of this nature are often referred to as duplicate or multiple publications. In addition to the ethical issue, this can be illegal if copyright of the prior work has been transferred to another entity. Typically, self-plagiarism is only considered to be a serious ethical issue in settings where a publication is asserted to consist of new material, such as in academic publishing or educational assignments23. It does not apply (except in the legal sense) to public-interest texts, such as social, professional, and cultural opinions usually published in newspapers and magazines.

 

In academic fields, self-plagiarism is when an author reuses portions of their own published and copyrighted work in subsequent publications, but without attributing the previous publication. Identifying self-plagiarism is often difficult because limited reuse of material is both legally accepted (as fair use) and ethically accepted. It is common for university researchers to rephrase and republish their own work, tailoring it for different academic journals and newspaper articles, to disseminate their work to the widest possible interested public. However, it must be borne in mind that these researchers also obey limits: If half an article is the same as a previous one, it will usually be rejected. One of the functions of the process of peer review in academic writing is to prevent this type of "recycling".

 

The concept of self-plagiarism:

The concept of "self-plagiarism" has been challenged as self-contradictory or an oxymoron. For example, Stephanie J. Bird argues that self-plagiarism is a misnomer, since by definition plagiarism concerns the use of others' material24. However, the phrase is used to refer to specific forms of potentially unethical publication. Bird identifies the ethical issues sometimes called "self-plagiarism" as those of "dual or redundant publication." She also notes that in an educational context, "self-plagiarism" may refer to the case of a student who resubmits "the same essay for credit in two different courses." As David B. Resnik clarifies, "Self-plagiarism involves dishonesty but not intellectual theft".

 

According to Patrick M. Scanlon:

"Self-plagiarism" is a term with some specialized currency. Most prominently, it is used in discussions of research and publishing integrity in biomedicine, where heavy publish-or-perish demands have led to a rash of duplicate and “salami-slicing” publication, the reporting of a single study's results in "least publishable units" within multiple articles. Roig (2002) offers a useful classification system including four types of self-plagiarism: duplicate publication of an article in more than one journal; partitioning of one study into multiple publications, often called salami-slicing; text recycling; and copyright infringement.

 

Self-plagiarism and codes of ethics:

Some academic journals, for example the Journal of International Business Studies, have codes of ethics which specifically refer to self-plagiarism. Some professional organizations like the Association for Computing Machinery (ACM) have created policies that deal specifically with self-plagiarism. Other organisations do not make specific reference to self-plagiarism: The American Political Science Association (APSA) has published a code of ethics which describes plagiarism as "deliberate appropriation of the works of others represented as one's own." It does not make any reference to self-plagiarism. It does say that when a thesis or dissertation is published "in whole or in part", the author is "not ordinarily under an ethical obligation to acknowledge its origins"24.

 

Figure-6

 

The American Society for Public Administration (ASPA) has published a code of ethics which says its members are committed to: "Ensure that others receive credit for their work and contributions," but it does not make any reference to self-plagiarism.

 

Factors that justify reuse:

Pamela Samuelson in 1994 identified several factors which excuse reuse of one's previously published work without the culpability of self-plagiarism25. She relates each of these factors specifically to the ethical issue of self-plagiarism, as distinct from the legal issue of fair use of copyright, which she deals with separately. Among other factors which may excuse reuse of previously published material Samuelson lists the following:

1.                   The previous work needs to be restated in order to lay the groundwork for a new contribution in the second work.

2.                   Portions of the previous work must be repeated in order to deal with new evidence or arguments.

3.                   The audience for each work is so different that publishing the same work in different places was necessary to get the message out.

4.                   The author thinks they said it so well the first time that it makes no sense to say it differently a second time.

 

Samuelson states she has relied on the "different audience" rationale when attempting to bridge interdisciplinary communities. She refers to writing for different legal and technical communities, saying: "there are often paragraphs or sequences of paragraphs that can be bodily lifted from one article to the other. And, in truth, I lift them." She refers to her own practice of converting "a technical article into a law review article with relatively few changes--adding footnotes and one substantive section" for a different audience.

 

Samuelson describes misrepresentation as the basis of self-plagiarism. She seems less concerned about reuse of descriptive materials than ideas and analytical content. She also states: "Although it seems not to have been raised in any of the self-plagiarism cases, copyright law's fair use defense would likely provide a shield against many potential publisher claims of copyright infringement against authors who reused portions of their previous works."

 

As a practical issue:

In addition to raising legal and ethical concerns, plagiarism is often a practical problem: it is frequently useful to consult the sources used by an author, and plagiarism makes this more difficult. Consulting sources is useful for a number of reasons:

·         An author may commit an error in how they interpret or use a source, and consulting the original source allows these errors to be detected.

·         Authors generally only supply the portions of prior works that are directly relevant to the work at hand. Other portions of their sources are likely to be relevant to later extensions and generalizations of their work.

·         As modern automated indexing methods become prevalent, references between works provide valuable information about their authoritativeness and how closely works are related; this helps to locate relevant works.

 

Organizational publications:

Plagiarism is presumably not an issue when organizations issue collective unsigned works since they do not assign credit for originality to particular people. For example, the American Historical Association's "Statement on Standards of Professional Conduct" (2005) regarding textbooks and reference books states that, since textbooks and encyclopedias are summaries of other scholars' work, they are not bound by the same exacting standards of attribution as original research and may be allowed a greater "extent of dependence" on other works. However, even such a book is expected not to make use of words, phrases, or paragraphs from another text or follow too closely the other text's arrangement and organization, and the authors of such texts are also expected to "acknowledge the sources of recent or distinctive findings and interpretations, those not yet a part of the common understanding of the profession"26.

 

Within an organization, in its own working documents, standards are looser but not non-existent. If someone helped with a report, they may expect to be credited. If a paragraph comes from a law report, a citation is expected to be written down. Technical manuals routinely copy facts from other manuals without attribution, because they assume a common spirit of scientific endeavor (as evidenced, for example, in free and open source software projects) in which scientists freely share their work.

 

The Microsoft Manual of Style for Technical Publications Third Edition (2003) by Microsoft does not even mention plagiarism, nor does Science and Technical Writing: A Manual of Style, Second Edition (2000) by Philip Rubens. The line between permissible literary and impermissible source code plagiarism, though, is apparently quite fine. As with any technical field, computer programming makes use of what others have contributed to the general knowledge.

 

Related topics:

·                     Academic dishonesty

·                     Assemblage

·                     Contract cheating

·                     Copyscape (website for detecting Internet plagiarism)

·                     Copyright

·                     Copyright infringement

·                     Credit (creative arts)

·                     Cryptomnesia

·                     Essay mill

·                     Fair use

·                     Joke thievery

·                     Journalism scandals (plagiarism, fabrication, omission)

·                     Ghostwriter

·                     List of plagiarism controversies

·                     Multiple publication

·                     Musical plagiarism

·                     Personal boundaries

·                     Plagiarism detection

·                     Scientific misconduct

·                     Source criticism

 

Plagiarism detection is the process of locating instances of plagiarism within a work or document. The widespread use of computers and the advent of the Internet have made it easier to plagiarize the work of others. Most cases of plagiarism are found in academia, where the documents are typically essays or reports. However, plagiarism can be found in virtually any field, including scientific papers, art designs, and source code. Detection can be either manual or computer-assisted. Manual detection requires substantial effort and excellent memory, and is impractical when too many documents must be compared or when the original documents are not available for comparison. Computer-assisted detection allows vast collections of documents to be compared to each other, making successful detection much more likely27.

 

Use of search engines:

An internet search engine can be used to look for certain keywords or key sentences from a suspected document on the World Wide Web. This method can be highly effective when used on small and characteristic fragments, for instance a poem or a poetic translation. Although it can easily detect blatant cases, it is less effective when the plagiarizer has mixed multiple small fragments from different sources, and will not return any relevant results if the search engine has not indexed the original source or sources. Also, considerable effort is required to investigate each suspected case28.

 

Figure-7

 

Plagiarism detection systems:

A plagiarism detection system compares suspect documents to a large collection (corpus) of other documents and attempts to match parts of the suspect document to parts of those in the corpus. As with search engines, no plagiarism can be detected unless the corpus contains the documents from which the suspect has copied29.
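The corpus comparison described above can be sketched as follows. This is only a minimal illustration, not the method of any particular product: it assumes documents are compared by overlapping runs of n words ("shingles"), and the function names, the sample corpus, and the threshold value are all hypothetical.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word runs (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(suspect, source, n=5):
    """Fraction of the suspect's shingles that also occur in the source."""
    suspect_set = shingles(suspect, n)
    if not suspect_set:
        return 0.0
    return len(suspect_set & shingles(source, n)) / len(suspect_set)


def best_matches(suspect, corpus, n=5, threshold=0.2):
    """Rank corpus documents (name -> text) whose score reaches the threshold."""
    scores = {name: containment(suspect, text, n) for name, text in corpus.items()}
    hits = [(name, score) for name, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical two-document corpus and a suspect sentence.
corpus = {
    "essay_a": "the quick brown fox jumps over the lazy dog near the river bank",
    "essay_b": "plagiarism detection compares a suspect document with a large corpus",
}
suspect = "yesterday the quick brown fox jumps over the lazy dog again"
matches = best_matches(suspect, corpus, n=4)
print(matches)
```

As the text notes, a source that is missing from the corpus cannot be matched: removing "essay_a" from the dictionary leaves the suspect sentence undetected.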

 

Academic program plagiarism:

Plagiarism in computer code is also frequent, and requires different tools from those used for text documents. Significant research has been dedicated to academic source-code plagiarism. A distinctive aspect of source-code plagiarism is that there are no essay mills, such as can be found in traditional plagiarism. Since most programming assignments expect students to write programs with very specific requirements, it is very difficult to find existing programs that meet them. Since integrating external code is often harder than writing it from scratch, most plagiarizing students choose to copy from their peers.

According to Roy and Cordy, source-code similarity detection algorithms can be classified as based on:

·         Strings - look for exact textual matches of segments, for instance five-word runs. Fast, but can be confused by renaming identifiers.

·         Tokens - as with strings, but using a lexer to convert the program into tokens first. This discards whitespace, comments, and identifier names, making the system more robust to simple text replacements. Most academic plagiarism detection systems work at this level, using different algorithms to measure the similarity between token sequences.

·         Parse Trees - build and compare parse trees. This allows higher-level similarities to be detected. For instance, tree comparison can normalize conditional statements, and detect equivalent constructs as similar to each other.

·         Program Dependency Graphs (PDGs) - a PDG captures the actual flow of control in a program, and allows much higher-level equivalences to be located, at a greater expense in complexity and calculation time.

·         Metrics - metrics capture 'scores' of code segments according to certain criteria; for instance, "the number of loops and conditionals", or "the number of different variables used". Metrics are simple to calculate and can be compared quickly, but can also lead to false positives: two fragments with the same scores on a set of metrics may do entirely different things.

·         Hybrid approaches - for instance, parse trees + suffix trees can combine the detection capability of parse trees with the speed afforded by suffix trees, a type of string-matching data structure.
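The difference between the string-based and token-based approaches in the list above can be illustrated with a short sketch. It is an assumption-laden example rather than the algorithm of any real detection system: it uses Python's standard `tokenize` module as the lexer, replaces identifiers and literals with the placeholders `ID`, `NUM` and `STR` (hypothetical names), and measures similarity with `difflib.SequenceMatcher`.

```python
import difflib
import io
import keyword
import tokenize

# Token types that carry no program content (layout and comments).
SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.INDENT,
        tokenize.DEDENT, tokenize.ENCODING, tokenize.ENDMARKER}


def normalized_tokens(source):
    """Lex Python source, replacing identifiers and literals by placeholders."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            result.append("ID")    # identifier names are discarded
        elif tok.type == tokenize.NUMBER:
            result.append("NUM")
        elif tok.type == tokenize.STRING:
            result.append("STR")
        else:
            result.append(tok.string)  # keywords and operators are kept
    return result


def token_similarity(a, b):
    """Similarity in [0, 1] between the normalized token sequences of a and b."""
    matcher = difflib.SequenceMatcher(
        None, normalized_tokens(a), normalized_tokens(b), autojunk=False)
    return matcher.ratio()


original = "def total(values):\n    s = 0\n    for v in values:\n        s += v\n    return s\n"
renamed = ("def add_up(items):  # sum the list\n    acc = 0\n"
           "    for x in items:\n        acc += x\n    return acc\n")

text_ratio = difflib.SequenceMatcher(None, original, renamed).ratio()
print("string similarity:", round(text_ratio, 2))
print("token similarity:", token_similarity(original, renamed))
```

Renaming every identifier and adding a comment lowers the plain string similarity, while the normalized token sequences stay identical, which is why token-level comparison is more robust to simple text replacements.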

 

The classification of source-code similarity approaches above was developed for code refactoring, not for academic plagiarism detection (an important goal of refactoring is to avoid duplicate code, referred to as code clones in the literature). The approaches are effective against different levels of similarity: low-level similarity refers to identical text, while high-level similarity can be due to similar specifications. In an academic setting, where all students are expected to code to the same specifications, functionally equivalent code (with high-level similarity) is entirely expected, and only low-level similarity is considered proof of cheating31.

 

Academic text-document plagiarism:

Most large-scale plagiarism detection systems use large internal databases (in addition to other resources) that grow with each additional document submitted for analysis. However, this feature is considered by some to be a violation of student copyright30.

The general design of academic plagiarism detection systems geared for text documents includes a number of factors, each listed here with its description and alternatives:

·         Scope of search - the public Internet (using search engines), institutional databases, or a local, system-specific database.

·         Analysis time - the delay between the time a document is submitted and the time when results are made available.

·         Document capacity / Batch processing - the number of documents the system can process per unit of time.

·         Check intensity - how often, and for which types of document fragments (paragraphs, sentences, fixed-length word sequences), the system queries external resources such as search engines.

·         Comparison algorithm type - the algorithms that define how the system compares documents against each other.

·         Precision and recall - the number of documents correctly flagged as plagiarized, compared to the total number of flagged documents (precision) and to the total number of documents that were actually plagiarized (recall). High precision means that few false positives were found, and high recall means that few false negatives were left undetected.
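The precision and recall factor described above can be made concrete with a small worked example. The document names and the two sets are invented for illustration; the function name `precision_recall` is likewise hypothetical.

```python
def precision_recall(flagged, plagiarized):
    """Return (precision, recall) for a set of flagged documents.

    flagged:     documents the system reported as plagiarized
    plagiarized: documents that were actually plagiarized
    """
    flagged, plagiarized = set(flagged), set(plagiarized)
    true_positives = len(flagged & plagiarized)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(plagiarized) if plagiarized else 0.0
    return precision, recall


# Hypothetical run: 4 documents flagged, 5 actually plagiarized, 2 in common.
flagged = {"doc1", "doc2", "doc3", "doc4"}
plagiarized = {"doc2", "doc3", "doc5", "doc6", "doc7"}
precision, recall = precision_recall(flagged, plagiarized)
print(precision, recall)
```

Here two of the four flagged documents are false positives (precision 0.5), and three of the five plagiarized documents are false negatives (recall 0.4).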

 

Figure-8

 

Multiple publications:

Duplicate publication refers to publishing the same intellectual material more than once, by the author or publisher. It does not refer to the unauthorized republication by someone else, which constitutes plagiarism, copyright violation, or both.

There are several forms of duplicate publication:

 

Legitimate derivatives:

This is deliberate republication in another format, such as the simultaneous publication of a motion picture and a tie-in book. In doing this, it is necessary to respect copyright, for the rights to a derivative of the original work remain with the author of the work, or with the publisher or other party to whom the author has assigned the copyright32.

 

Legitimate reformatting:

Self-plagiarism, by contrast, may be another kind of copyright violation. There are two basic kinds of self-plagiarism: similar (or identical) articles that appear one after another (republishing), and those that appear simultaneously (multiple submission)33.

 

Republishing of very similar works:

As a research paper is an implicit claim of furthering knowledge, the researcher must state exactly what the claim of novelty is. This lets the editor rate the article in view of the journal's policy. For example, most journals would reject a paper that has already been published in another journal, although some may tolerate (as IEEE does) a re-edited and expanded conference paper.

 

However, the line is blurred when safety issues are involved. For example, if an author publishes a study which shows that a particular product or design is faulty, then it is important that the information is spread as widely as possible. Such re-publication is justified in the public interest.

 

Multiple submission to journals:

Multiple submission is not plagiarism, but it is considered serious academic misbehavior. Even when a publication fee is paid, it nonetheless wastes the most important resource in academic publishing: the time and work of the referees and the editors. It also contributes to the problem it is intended to solve, the slow speed of editorial review. And there is the unfortunate possibility that more than one journal will accept the paper. As there is no time for feedback from readers, the same errors appear in various journals34,35.

 

Defense of multiple submission:

Duplicate submission can be defended. Academic editing is so slow that if an author waits until the decision of the first publisher is known, the submission to a second journal may take place a whole year later. (Some scientific journals, for example JOSAB, may hold a paper for more than a month just to identify possible reviewers.) Yet duplicate submission remains a waste of resources for those journals that do not get to publish the paper.

 

Consider a researcher who, after obtaining and verifying an extraordinary result, wants to publish it in Physical Review Letters but is afraid that the paper will be rejected and that competitors will obtain and publish the result earlier. So the researcher submits the same result to several national scientific journals of low impact factor around the world: 'Proceedings of Springfield University', 'Revista Científica de Guacatelamala', 'Le Courier Scientific de la Republic Democratic Cannibas', 'Научный Бюллетень Myxосранского Технологического Института', 'Journal der Angewandten Chemieinstituts von Kuyzad', 保險海套大学の仕訳 (six examples of low-impact-factor journals, which do not actually exist), and so on.

 

That is, if 15 journals accept the paper while only one of them may publish it, the reviewing effort is wasted for the other 14. Most, if not all, journals and popular magazines already tell prospective authors not to do this. By contrast, universities do not ban simultaneous applications to graduate schools, where a fee is requested for each application. If that fee-based model were also common in publishing, the "waste of resources" would amount only to money out of the applicant's own pocket.

 

Journal republishing:

It occasionally happens that an author publishes the same article twice, whether in the same or in different journals; in some cases the two articles differ only in their titles36.

 

Exposure of multiple publications:

With the advancement of the Internet, several tools are now available to aid in the detection of plagiarism and multiple publication within the biomedical literature. One tool, developed in 2006 by researchers in Dr. Harold Garner's laboratory at the University of Texas Southwestern Medical Center at Dallas, is Déjà Vu, an open-access database containing several thousand instances of duplicate publication.

 

Plagiarism Detector requirements:

http://www.plagiarism-detector.com/setup/pd_setup.exe

·         Microsoft Windows 2000, SP 0,1,2,3,4

·         Microsoft Windows XP, SP 0,1,2,3

·         Microsoft Windows Vista, SP 0,1

·         Microsoft Windows 7

·         Installation requires administrative privileges.

·         .NET Framework 2.0 is required.

·         Microsoft Office 2000 or higher is required to access MS Word documents.

·         Internet access is required. A broadband connection is recommended.

·         Microsoft Internet Explorer 6.0 or 7.0, Firefox 1.5 or 2.0, Mozilla 1.7

·         Intel® Pentium® III or equivalent processor

·         100 MB of free storage space

·         128 MB of RAM (256 MB recommended for large documents)

REFERENCES:

1.        Donald L. McCabe, Linda Klebe Trevino, and Kenneth D. Butterfield, "Academic Integrity in Honor Code and Non-Honor Code Environments: A Qualitative Investigation", The Journal of Higher Education 70, no. 2 (March-April 1999), 213.

2.        Donald L. McCabe, Kenneth D. Butterfield, and Linda Klebe Trevino, "Faculty and Academic Integrity: The Influence of Current Honor Codes and Past Honor Code Experiences," Research in Higher Education 44, no. 3 (June 2003), 368.

3.        Selber and Johnson-Eilola, Plagiarism, Originality, Assemblage, Computers and Composition, Vol. 24, No. 4. (2007), pp. 375-403.

4.        "Student cheats contract out work". BBC/bbc.com. 2006-06-12.

5.        Liz Lightfoot (2006-06-13). "Cheating students put assignments out to tender on the internet". Telegraph/telegraph.co.uk. http://www.telegraph.co.uk/news/main.jhtml?xml=/news/2006/06/13/ncheat13.xml&sSheet=/news/2006/06/13/ixuknews.html/. http://www.igi-pub.com/referenc

6.        http://plagiat.htw-berlin.de/software/2008/

7.        Dowd, Raymond J. (2006). Copyright Litigation Handbook (1st ed.). Thomson West.

8.        Castle Rock Entertainment, Inc. v. Carol Publishing Group, 150 F.3d 132, 140 (2nd Cir. 1998).

9.        Taylor, F. K. (1965). Cryptomnesia and plagiarism. British Journal of Psychiatry, 111, 1111–1118.

10.     Brown, A. S., and Halliday, H. E. (1991). Cryptomnesia and source memory difficulties. American Journal of Psychology, 104, 475–490.

11.     James Page. 2004. 'Cyber-pseudepigraphy: A New Challenge for Higher Education Policy and Management'. Journal of Higher Education Policy and Management. 26(3):429-433.

12.     Depoorter, Ben; Parisi, Francesco (2002). "Fair Use and Copyright Protection: A Price Theory Explanation". International Review of Law and Economics 21 (4): 453–473

13.     http://stason.org/TULARC/art/hack-stand-up-comedy

14.     Studdert et al., (2004) Financial Conflicts of Interest in Physicians' Relationships with the Pharmaceutical Industry — Self-Regulation in the Shadow of Federal Prosecution, NEJM 351:1891-2000

15.     J. Michael Keyes, "Musical Musings: The Case for Rethinking Music Copyright Protection", 10 Mich. Telecomm. Tech. L. Rev. 407 (2004)

16.     http://www.outofthefog.net/CommonNonBehaviors/Boundaries.html

17.     Nylenna, Magne; Daniel Andersen, Gisela Dahiquist, Matti Sarvas, Asbjørn Aakvaag (July 3, 1999). "Handling of scientific dishonesty in the Nordic countries". The Lancet 354

18.     Kock, N., and Davison, R. (2003). Dealing with plagiarism in the IS research community: A look at factors that drive plagiarism and ways to address them. MIS Quarterly, 27(4), 511-532

19.     Clarke, R. (2006). Plagiarism by academics: More complex than it seems. Journal of the Association for Information Systems, 7(2), 91-121

20.     http://www.famousplagiarists.com/

21.     See for example Dellavalle, Robert P., Banks, Marcus A. and Ellis, Jeffrey I. (2007). "Frequently asked questions regarding self-plagiarism: How to avoid recycling fraud." Journal of the American Academy of Dermatology, Vol. 57 (3), September, pp.527. doi:10.1016/j.physletb.2003.10.071

22.     Broome, Marion E. (2004). "Self-plagiarism: oxymoron, fair use, or scientific misconduct?" Nursing Outlook, Vol. 52 (6), November, pp.273-274.

23.     Scanlon, Patrick M. (2007). "Song from myself: an anatomy of self-plagiarism." Plagiary: cross-disciplinary studies in plagiarism, fabrication and falsification, Vol. 2 (1), pp.1-1

24.     "A Case Study on Computer Programs", Global dimensions of intellectual property rights in science and technology, Part 3, Editors Mitchel B. Wallerstein, Mary Ellen Mogee, Roberta A. Schoen, National Academies Press, 1993

25.     http://www.historians.org/PUBS/Free/ProfessionalStandards.cfm

26.     http://www.plagiarism-detector.com/

27.     Ross, Nancy; Wolfram, Dietmar (2000). "End user searching on the Internet: An analysis of term pair topics submitted to the Excite search engine". Journal of the American Society for Information Science 51 (10): 949–958.

28.     Xie, M.; et al. (1998). "Quality dimensions of Internet search engines". Journal of Information Science 24 (5): 365–372.

29.     http://www.plagiarismscanner.com/

30.     Anderson, G. L. (1999). Cyberplagiarism: A look at the Web term paper sites. College and Research Library News, 60, 371-73, 394.

31.     Auer, N. J., and Krupar, E. M. (2001). Mouse click plagiarism: The role of technology in plagiarism and the librarian's role in combatting it. Library Trends, 49, 415-433.

32.     A.Giesen (2004). "Thin-disk solid-state lasers". Proceedings of SPIE 5620: 112–127

33.     http://www.daveyp.com/blog/

34.     http://en.wikipedia.org/wiki/Multiple_publication

35.     http://www.djreprints.com/licensing/syndication.html

 

 

Received on 25.11.2010

Accepted on 12.12.2010        

© A & V Publication all rights reserved

Research J. Science and Tech.  2(6): Nov. -Dec. 2010: 134-144