Plagiarism: A Serious Malpractice in Latest Infrastructure through Webhunting
Palakben K. Parikh, Julee P. Soni, Ravi N. Patel, Urviben Y. Patel, Kaumil N. Modi, Hiren M. Marvaniya and Dhrubo Jyoti Sen
Department of Pharmaceutical Chemistry, Shri Sarvajanik Pharmacy College, Gujarat Technological University, Arvind Baug, Mehsana-384001, Gujarat, India
ABSTRACT:
The modern wired generation has made a serious hobby of webhunting through
net-surfing and browsing. Collecting large amounts of data on various topics and
re-formatting it for presentation on a desired platform has become a common
style across the new and old generations, and it gives rise to the malpractice
of plagiarism. Select, Copy and Paste are the three musketeers used to build
a new entity: an assembled chapter, laid out on new pages in a presentable
format, that passes as a new article engineered by manipulated fabrication
under the umbrella of malpractice. Some software detectors have been launched
to control this malpractice.
INTRODUCTION:
Malpractice or Dishonesty or Misconduct is any type of cheating
that occurs in relation to a formal exercise. It can include:
· Plagiarism: The adoption or reproduction of original creations of another author (person, collective, organization, community or other type of author, including anonymous authors) without due acknowledgment.
· Fabrication: The falsification of data, information, or citations in any formal academic exercise.
· Deception: Providing false information to an instructor concerning a formal academic exercise, e.g., giving a false excuse for missing a deadline or falsely claiming to have submitted work.
· Cheating: Any attempt to give or obtain assistance in a formal academic exercise (like an examination) without due acknowledgment.
· Bribery or paid services: Giving certain test answers for money.
· Sabotage: Acting to prevent others from completing their work. This includes cutting pages out of library books or willfully disrupting the experiments of others.
· Professorial misconduct: Professorial acts that are academically fraudulent equate to academic fraud.
Academic dishonesty has been documented in almost every
type of educational setting, from elementary
school to graduate school, and has been met with
varying degrees of approbation throughout history. Today, educated society
tends to take a very negative view of academic dishonesty1,2.
Assemblage
refers to a text "built primarily and explicitly from existing texts in
order to solve a writing or communication problem in a new context". The
concept was first proposed by Johndan Johnson-Eilola (author of Datacloud) and Stuart Selber
in the journal, Computers and Composition, in 2007.
The notion of assemblages builds on remix and remix
practices, which blur distinctions between invented and borrowed work3.
Contract
cheating is a form of academic dishonesty in which students
get others to complete their coursework for them by putting it out to tender.
The term was coined in a 2006 study by Thomas Lancaster and Robert Clarke at the University of Central England in Birmingham
(now known as Birmingham City
University)4,5.
Dealing with contract cheating:
Every
approach for dealing with contract cheating recognises
the distinction between contract cheating and "plagiarism" (uncited copying from books, web etc) and
"collusion" (copying of the work of other students in the same cohort)6.
Contract
cheating can be successfully prevented if all of the following actions are
taken:
· Do not re-use assignments: Too easy for students to copy from a previous year; common assignments appear on "essay mill" sites.
· Individualise assignments: Harder for students to collude; easier to identify which student posted the assignment.
· Vivas and tests to contribute to marks: Makes "outsourcing" a less easy option to pass a module; provides evidence for "non-originality". (A viva is an examination which is given orally in universities.)
· Monitor known sites: And tell students that you are doing so; look for trends in the characteristics of contract cheating.
· Change academic regulations: Until changed, regulations only recognise plagiarism and collusion; types of evidence needed in contract cheating cases are different.
Figure-1
Copyscape is an online plagiarism detection service that checks whether
similar text content appears elsewhere
on the web. It was launched in 2004 by Indigo
Stream Technologies, Ltd. Copyscape is used by
content owners to detect cases of "content theft", in which content
is copied without permission from one site to another. It is also used by
content publishers to detect cases of content fraud, in which
old content is repackaged and sold as new original content7.
Given a URL of the original content, Copyscape returns a list of web pages that contain similar
text to all or parts of this content. It also shows the matching text
highlighted on the found web page. Copyscape banners
can be placed on a web page to warn potential plagiarists not to steal
content. Copyscape
also provides two paid services: Copysentry monitors
the web and sends notifications by email when new copies are
found, and Copyscape Premium verifies the originality
of content purchased by online content publishers.
Copyscape uses the Google Web
API to power its searches. Copyscape uses
a set of algorithms
to identify copied content that has been modified from its original form.
Independent plagiarism software tests conducted by Professor Debora
Weber-Wulff of the University of
Applied Sciences in Berlin found Copyscape
Premium to be the best performing service.
Copyscape finds online copies of textual content, but not of images or other media.
Copyscape is not able to determine whether a copy is
authorized or unauthorized, nor is it able to determine which
of two websites copied the other. Both of these determinations are left up to
users. Contacting the offending site to have content removed by DMCA notice is also left
up to the user.
Copyright is
the set of exclusive rights granted to the author or
creator of an original work, including the right to copy, distribute and adapt
the work. These rights can be licensed, transferred and/or assigned. Copyright
lasts for a certain time period after which the work is said to enter the public domain.
Copyright applies to a wide range of works that are substantive and fixed in a
medium. Some jurisdictions also recognize "moral rights" of the creator of a
work, such as the right to be credited for the work. The Statute of
Anne 1709, full title "An Act for the Encouragement of
Learning, by vesting the Copies of Printed Books in the Authors or purchasers
of such Copies, during the Times therein mentioned", is now seen as the
origin of copyright law. Since the 19th century, copyright has been described under
the umbrella term intellectual property along with patents
and trademarks.
Copyright has been internationally standardized, lasting between fifty and one
hundred years from the author's death, or a shorter period for anonymous or
corporate authorship. Generally, copyright is enforced as a civil matter, though some jurisdictions do
apply criminal
sanctions8.
Copyright infringement (or copyright
violation) is the unauthorized or prohibited use of works covered by copyright law,
in a way that violates one of the copyright owner's exclusive
rights, such as the right to reproduce or perform the copyrighted
work, or to make derivative works. For electronic
and audio-visual media, unauthorized reproduction and distribution is also
commonly referred to as piracy. An early reference to piracy in the context of
copyright infringement was made by Daniel Defoe
in 1703 when he said of his satirical poem The True-Born Englishman
that "Its being Printed
again and again, by Pyrates". The practice of
labeling the act of infringement as "piracy" predates statutory
copyright law. Prior to the Statute of
Anne 1709, the Stationers' Company of London in 1557 received
a Royal Charter
giving the company a monopoly on publication and tasking it with enforcing the
charter. Those who violated the charter were labeled pirates as early as 1603.
The legal basis for this usage dates from the same era, and has been
consistently applied until the present time. Critics of the use of the term
"piracy" to describe such practices contend that it is pejorative and
unfairly equates copyright infringement with more sinister activity9.
Figure-2
Cryptomnesia occurs when a forgotten memory returns without its
being recognised as such by the subject, who believes
it is something new and original. It is a memory bias
whereby a person may falsely recall generating a thought, an idea, a song, or a
joke, not deliberately engaging in plagiarism
but rather experiencing a memory as if it were a new inspiration. Sentences in
scientific papers that are identical to sentences from some of the references
used to write the paper often stem from cryptomnesia10,11.
Essay mill (or
paper mill) is a ghostwriting
service that sells essays
and other homework
writing to university and college students. Since plagiarism
is a form of academic dishonesty or academic
fraud, universities and colleges may investigate papers suspected to
be from an essay mill by using Internet plagiarism detection software, which
compares essays against a database of known essay mill essays and by orally testing
students on the contents of their papers. However, many essay mills guarantee
that a unique essay will be composed by a ghost author and pre-screened with
plagiarism detection software before delivery, and as such will be undetectable
as an essay mill product12.
Fair use is a doctrine
in United States copyright law that allows
limited use of copyrighted material without requiring permission from the
rights holders, such as for commentary, criticism, news reporting, research,
teaching or scholarship. It provides for the legal, non-licensed citation or
incorporation of copyrighted material in another author's work under a
four-factor balancing test. The term
fair use originated in the
United States. A similar principle, fair dealing,
exists in some other common law jurisdictions. Civil law jurisdictions have other limitations and exceptions to
copyright13.
Joke thievery
is the act of performing and taking credit for comic material written by
another person without their consent. This is a form of plagiarism
and sometimes can be copyright infringement. A common term for
joke thievery is "hacking", which is derived from the
term "hackneyed" (meaning "overused and thus cheapened, or trite").
Journalism scandals are high-profile incidents or acts, whether intentional or accidental,
that run contrary to the generally accepted ethics and standards of journalism,
or otherwise violate the 'ideal' mission of journalism:
to report news events and issues accurately and fairly14.
Ghostwriter is
a professional writer
who is paid to write books, articles, stories, reports, or other texts that are
officially credited to another person. Celebrities,
executives, and political leaders often hire ghostwriters to draft or edit autobiographies,
magazine articles, or other written material. In music, ghostwriters are used
in film score composition, as well as for writing songs and lyrics for popular
music styles ranging from country to hip-hop.
Ghostwriters may have varying degrees of involvement in the production of a
finished work; while some ghostwriters are hired to edit and clean up a rough
draft, in other cases, ghostwriters do most of the writing based on an outline
provided by the credited author. For some projects, ghostwriters will do a
substantial amount of research, as in the case of a ghostwriter who is hired to
write an autobiography for a well-known person. Ghostwriters are also hired to
write fiction in the style of an existing author, often as a way of increasing
the number of books that can be published by a popular author (e.g., Tom
Clancy, James Patterson). Ghostwriters will often spend from several months to
a full year researching, writing, and editing nonfiction
works for a client, and they are paid either per page, with a flat fee, or a
percentage of the royalties of the sales, or some combination thereof. The
ghostwriter is sometimes acknowledged by the author or publisher for his or her
writing services15.
Music plagiarism
is the use or close imitation of another author's music while representing it
as one's own original work. Plagiarism in music now occurs in two contexts – with a musical idea (that is, a melody or motif)
or sampling (taking a portion
of one sound recording and reusing it in a different song)16.
Personal boundaries are guidelines, rules or limits that a person creates to identify for
themselves what are reasonable, safe and permissible ways for other people to
behave around them and how they will respond when someone steps outside those
limits17.
Scientific misconduct is the violation of the standard codes of scholarly conduct and ethical behavior in professional scientific research18. A Lancet review on Handling of Scientific Misconduct in Scandinavian countries provides the following sample definitions (reproduced in The COPE report 1999):
· Danish Definition: "Intention(al) or gross negligence leading to fabrication of the scientific message or a false credit or emphasis given to a scientist"
· Swedish Definition: "Intention(al) distortion of the research process by fabrication of data, text, hypothesis, or methods from another researcher's manuscript form or publication; or distortion of the research process in other ways."
The consequences of scientific misconduct can be severe
at a personal level for both perpetrators and any individual who exposes it. In
addition there are public health implications attached to the promotion of
medical or other interventions based on dubious research findings.
Figure-3
Plagiarism, as defined in the 1995 Random House Compact Unabridged Dictionary, is the use or close
imitation of the language and thoughts of another author and the representation
of them as one's own original work. Within academia,
plagiarism by students, professors, or researchers is considered academic dishonesty or academic fraud, and
offenders are subject to academic censure, up to and including expulsion. In journalism,
plagiarism is considered a breach of journalistic ethics, and reporters caught plagiarizing
typically face disciplinary measures ranging from suspension to termination of
employment. Some individuals caught plagiarizing in academic or journalistic
contexts claim that they plagiarized unintentionally, by failing to include quotations
or give the appropriate citation. While plagiarism in scholarship and journalism has a
centuries-old history, the development of the Internet,
where articles appear as electronic text, has made the physical act of copying
the work of others much easier.
Plagiarism is not the same as copyright infringement. While both terms
may apply to a particular act, they are different transgressions. Copyright
infringement is a violation of the rights of a copyright holder, when material
protected by copyright is used without consent. On the other hand, plagiarism
is concerned with the unearned increment to the plagiarizing author's reputation
that is achieved through false claims of authorship19.
Etymology:
English
Plagiarism (1615–25), earlier plagiary (1590–1600), derives from Latin plagiārius,
"kidnapper", equivalent to plagium, "kidnapping", which contains Latin plaga
("snare", "net"), based on the Indo-European root *-plak, "to weave" (seen for
instance in Greek plekein, Latin plectere,
both meaning "to weave").
Many
students feel pressured to complete papers well and quickly, and with the
accessibility of new technology (the Internet) students can plagiarize by
copying and pasting information from other sources. This is often easily
detected by teachers for several reasons. First, students' choices of sources
are frequently unoriginal; instructors may receive the same passage copied from
a popular source from several students. Second, it is often easy to tell
whether a student used his or her own "voice." Third, students may
choose sources which are inappropriate, inaccurate, or off-topic. Fourth,
lecturers may insist that submitted work is first submitted to an online
plagiarism detector. In the academic world, plagiarism by students is a very
serious offense that can result in punishments such as a failing grade on the
particular assignment (typically at the high school level) or for the course
(typically at the college or university level). For cases of repeated
plagiarism, or for cases in which a student commits severe plagiarism (e.g.,
submitting a copied piece of writing as original work), a student may be
suspended or expelled. In many universities, academic degrees or awards may be
revoked as a penalty for plagiarism.
There
is little academic research into the frequency of plagiarism in high schools.
Much of the research investigated plagiarism at the post-secondary level4.
Of the forms of cheating (including plagiarism, inventing data, and cheating
during an exam), students admit to plagiarism more than any other. However, this
figure decreases considerably when students are asked about the frequency of
"serious" plagiarism (such as copying most of an assignment or
purchasing a complete paper from a website). Recent use of plagiarism detection
software (see below) gives a more accurate picture of this activity's
prevalence.
For
professors and researchers, plagiarism is punished by sanctions ranging from
suspension to termination, along with the loss of credibility and integrity5.
Charges of plagiarism against students and professors are typically heard by
internal disciplinary committees, which students and professors have agreed to
be bound by20.
Journalism:
Since
journalism's main currency is public trust, a reporter's failure to honestly
acknowledge their sources undercuts a newspaper or television news show's
integrity and undermines its credibility. Journalists accused of plagiarism are
often suspended from their reporting tasks while the charges are being
investigated by the news organization. The ease with which electronic text can
be reproduced from online sources has lured a number of reporters into acts of
plagiarism: Journalists have been caught "copying-and-pasting"
articles and text from a number of websites21.
Online plagiarism:
Content
scraping is the practice of copying and pasting material from Internet websites,
affecting both established sites and blogs. Free online tools are becoming
available to help identify plagiarism, and there is a range of approaches that
attempt to limit online copying, such as disabling right clicking and placing warning banners
regarding copyrights on web pages. Instances of plagiarism that involve
copyright violation may be addressed by the rightful content owners sending a DMCA removal notice to the
offending site-owner, or to the ISP that is hosting the offending site. Plagiarism is not only
the mere copying of text, but also the presentation of another's ideas as one's
own, regardless of the specific words or constructs used to express that idea.
In contrast, many so-called plagiarism detection services can only detect
blatant word-for-word copies of text.
Figure-4
Other contexts:
Generally,
although plagiarism is often loosely referred to as theft or stealing, it has
not been set as a criminal matter in the courts. Likewise, plagiarism has no
standing as a criminal offense in
the common law.
Instead, claims of plagiarism are a civil law matter, which an aggrieved
person can resolve by launching a lawsuit. Acts that may constitute plagiarism
are in some instances treated as copyright infringement, unfair competition, or a violation of the
doctrine of moral rights. The increased availability of intellectual property due to a rise in
technology has furthered the debate as to whether copyright offences are
criminal.
Chris just found some good stuff on the Web for his
science report about sharks. He highlights a paragraph that explains that most
sharks grow to be only 3 to 4 feet long and can't hurt people. Chris copies it
and pastes it into his report. He quickly changes the font so it matches the
rest of the report and continues his research.
Uh-oh. Chris just made a big mistake. Do you know what he
did? He committed plagiarism (say: play-juh-rih-zem). Plagiarism is when you
use someone else's words or ideas and pass them off as your own. It's not allowed
in school, college, or beyond, so it's a good idea to learn the proper way to
use resources, such as websites, books, and magazines.
Plagiarism is a form of cheating,
but it's a little complicated so a kid might do it without understanding that
it's wrong. Chris should have given the author and the website credit for the
information. Why? Because Chris didn't know this information before he came to
the website. These aren't his thoughts or ideas.
Defenses:
A
famous passage of Laurence Sterne's 1767 Tristram Shandy condemns plagiarism by resorting to plagiarism. Oliver
Goldsmith commented:
Sterne's
Writings, in which it is clearly shown, that he, whose manner and style were so
long thought original, was, in fact, the most unhesitating plagiarist who ever
cribbed from his predecessors in order to garnish his own pages? It must be
owned, at the same time, that Sterne selects the materials of his mosaic work
with so much art, places them so well, and polishes them so highly, that in
most cases we are disposed to pardon the want of originality, in consideration
of the exquisite talent with which the borrowed materials are wrought up into
the new form22.
Figure-5
On
December 6, 2006, Thomas Pynchon joined a campaign by many
other major authors to clear Ian McEwan of
plagiarism charges by sending a typed letter to his British publisher, which
was published in the Daily
Telegraph newspaper. Playwright Wilson Mizner said "If you copy from one author, it's plagiarism. If you copy from two, it's research". American author Jonathan
Lethem delivered a passionate defense of the use of plagiarism in
art in his 2007 essay "The ecstasy of influence: A plagiarism" in Harper's. He wrote: "The
kernel, the soul—let us go further and say the substance, the bulk, the actual
and valuable material of all human utterances—is plagiarism" and
"Don't pirate my editions; do plunder my visions. The name of the game is
Give All. You, reader, are welcome to my stories. They were never mine in the first place, but I gave them to you".
Self-plagiarism, also known as "recycling fraud", is the reuse
of significant, identical, or nearly identical portions of one’s own work
without acknowledging that one is doing so or without citing the original work.
Articles of this nature are often referred to as duplicate or multiple publications. In
addition to the ethical issue, this can be illegal if copyright of the prior
work has been transferred to another entity. Typically, self-plagiarism is only
considered to be a serious ethical issue in settings where a publication is
asserted to consist of new material, such as in academic publishing or
educational assignments23. It does not apply (except in the legal
sense) to public-interest texts, such as social, professional, and cultural
opinions usually published in newspapers and magazines.
In
academic fields, self-plagiarism is when an author reuses portions of their own
published and copyrighted work in subsequent publications, but without
attributing the previous publication. Identifying self-plagiarism is often
difficult because limited reuse of material is both legally accepted (as fair use)
and ethically accepted. It is common for university researchers to rephrase and
republish their own work, tailoring it for different academic journals and
newspaper articles, to disseminate their work to the widest possible interested
public. However, it must be borne in mind that these researchers also obey
limits: If half an article is the same as a previous one, it will usually be rejected.
One of the functions of the process of peer review
in academic writing is to prevent this type of "recycling".
The concept of self-plagiarism:
The
concept of "self-plagiarism" has been challenged as
self-contradictory or an oxymoron. For example, Stephanie J. Bird argues that
self-plagiarism is a misnomer, since by definition plagiarism concerns the use
of others' material24. However, the phrase is used to refer to
specific forms of potentially unethical publication. Bird identifies the
ethical issues sometimes called "self-plagiarism" as those of
"dual or redundant publication." She also notes that in an educational
context, "self-plagiarism" may refer to the case of a student who
resubmits "the same essay for credit in two different courses." As
David B. Resnik clarifies, "Self-plagiarism
involves dishonesty but not intellectual theft".
According to Patrick M. Scanlon:
"Self-plagiarism"
is a term with some specialized currency. Most prominently, it is used in
discussions of research and publishing integrity in biomedicine, where heavy
publish-or-perish demands have led to a rash of duplicate and “salami-slicing”
publication, the reporting of a single study's results in "least publishable units"
within multiple articles. Roig (2002) offers a useful
classification system including four types of self-plagiarism: duplicate
publication of an article in more than one journal; partitioning of one study
into multiple publications, often called salami-slicing; text recycling; and
copyright infringement.
Self-plagiarism and codes of ethics:
Some
academic journals have codes of ethics which specifically refer to
self-plagiarism, for example the Journal of International Business Studies.
Some professional organizations like the Association for Computing Machinery
(ACM) have created policies that deal specifically with self-plagiarism. Other organisations do not make specific reference to
self-plagiarism: The American Political Science Association (APSA) has
published a code of ethics which describes plagiarism as "deliberate
appropriation of the works of others represented as one's own." It does
not make any reference to self-plagiarism. It does say that when a thesis or
dissertation is published "in whole or in part", the author is
"not ordinarily under an ethical obligation to acknowledge its
origins"24.
Figure-6
The
American Society for Public Administration (ASPA) has published a code of
ethics which says its members are committed to: "Ensure that others
receive credit for their work and contributions," but it does not make any
reference to self-plagiarism.
Factors that justify reuse:
Pamela
Samuelson in 1994 identified several factors which excuse reuse of
one's previously published work without the culpability of self-plagiarism25.
She relates each of these factors specifically to the ethical issue of
self-plagiarism, as distinct from the legal issue of fair use of copyright,
which she deals with separately. Among other factors which may excuse reuse of
previously published material Samuelson lists the following:
1. The previous work needs to be restated in order to lay the groundwork for a new contribution in the second work.
2. Portions of the previous work must be repeated in order to deal with new evidence or arguments.
3. The audience for each work is so different that publishing the same work in different places was necessary to get the message out.
4. The author thinks they said it so well the first time that it makes no sense to say it differently a second time.
Samuelson
states she has relied on the "different audience" rationale when
attempting to bridge interdisciplinary communities. She refers to writing for
different legal and technical communities, saying: "there are often
paragraphs or sequences of paragraphs that can be bodily lifted from one
article to the other. And, in truth, I lift them." She refers to her own
practice of converting "a technical article into a law review article with
relatively few changes--adding footnotes and one substantive section" for
a different audience.
Samuelson
describes misrepresentation as the basis of self-plagiarism. She seems less
concerned about reuse of descriptive materials than ideas and analytical
content. She also states: "Although it seems not to have been raised in any of
the self-plagiarism cases, copyright law's fair use defense would likely
provide a shield against many potential publisher claims of copyright
infringement against authors who reused portions of their previous works."
As a practical issue:
In
addition to legal and ethical concerns, plagiarism is frequently also a
practical issue, in that it is frequently useful to consult the sources used by
an author, and plagiarism makes this more difficult. There are a number of
reasons why this is useful:
· An author may commit an error in how they interpret or use a source, and consulting the original source allows these errors to be detected.
· Authors generally only supply the portions of prior works that are directly relevant to the work at hand. Other portions of their sources are likely to be relevant to later extensions and generalizations of their work.
· As modern automated indexing methods become prevalent, references between works provide valuable information about their authoritativeness and how closely works are related; this helps to locate relevant works.
Organizational publications:
Plagiarism
is presumably not an issue when organizations issue collective unsigned works
since they do not assign credit for originality to particular people. For
example, the American Historical Association's
"Statement on Standards of Professional Conduct" (2005) regarding
textbooks and reference books states that, since textbooks and encyclopedias
are summaries of other scholars' work, they are not bound by the same exacting
standards of attribution as original research and may be allowed a greater
"extent of dependence" on other works. However, even such a book should
not make use of words, phrases, or paragraphs from another text or follow too
closely the other text's arrangement and organization, and the authors of such
texts are also expected to "acknowledge the sources of recent or
distinctive findings and interpretations, those not yet a part of the common
understanding of the profession"26.
Within
an organization, in its own working documents, standards are looser but not
non-existent. If someone helped with a report, they may expect to be credited.
If a paragraph comes from a law report, a citation is expected to be written
down. Technical manuals routinely copy facts from other manuals without
attribution, because they assume a common spirit of scientific endeavor (as
evidenced, for example, in free and open source software projects) in which
scientists freely share their work.
The
Microsoft Manual of Style for
Technical Publications Third Edition (2003) by Microsoft does not even
mention plagiarism, nor does Science
and Technical Writing: A Manual of Style, Second Edition (2000) by
Philip Rubens. The line between permissible literary and impermissible source
code plagiarism, though, is apparently quite fine. As with any technical field,
computer programming makes use of what others have contributed to the general
knowledge.
Plagiarism detection is the process of locating instances of plagiarism
within a work or document. The widespread use of computers and the advent of
the Internet have made it easier to plagiarize the work
of others. Most cases of plagiarism are found in academia, where documents are
typically essays or reports. However, plagiarism can be found in virtually any
field, including scientific papers, art designs, and source code. Detection can
be either manual or computer-assisted. Manual detection requires substantial
effort and excellent memory, and is impractical in cases where too many
documents must be compared, or original documents are not available for
comparison. Computer-assisted detection allows vast collections of documents to
be compared to each other, making successful detection much more likely27.
Use of search engines:
An internet search engine
can be used to look for certain keywords or key sentences from a suspected
document on the World Wide Web. This method can be highly effective when used
on small and characteristic fragments, for instance a poem or a poetic
translation. Although it can easily detect blatant cases, it is less effective
when the plagiarizer has mixed multiple small fragments from different sources,
and will not return any relevant results if the search engine has not indexed
the original source or sources. Also, considerable effort is required to
investigate each suspected case28.
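The idea of searching for small, characteristic fragments can be sketched in a few lines of Python. The function below is only an illustration: the name `pick_query_phrases`, the window length, and the scoring heuristic (favoring long words that occur once in the document) are assumptions made here, not part of any search engine or detection product. It returns quoted phrases that an investigator could paste into a search engine by hand.

```python
import re
from collections import Counter


def pick_query_phrases(text, n_words=6, k=2):
    """Return up to k quoted n-word phrases that look distinctive.

    Heuristic (an assumption for illustration): a window scores higher
    when it contains long words that occur only once in the document.
    """
    words = re.findall(r"[A-Za-z']+", text)
    counts = Counter(w.lower() for w in words)

    def score(start):
        window = words[start:start + n_words]
        return sum(len(w) for w in window
                   if counts[w.lower()] == 1 and len(w) > 4)

    # Rank every window start position, best score first.
    starts = sorted(range(len(words) - n_words + 1), key=score, reverse=True)

    chosen = []
    for start in starts:
        if len(chosen) >= k:
            break
        # Keep only windows that do not overlap an already chosen one.
        if all(abs(start - other) >= n_words for other in chosen):
            chosen.append(start)

    # Quote each phrase so a search engine treats it as an exact match.
    return ['"' + " ".join(words[s:s + n_words]) + '"' for s in sorted(chosen)]


sample = ("The moon was a ghostly galleon tossed upon cloudy seas and the road "
          "was a ribbon of moonlight over the purple moor")
for phrase in pick_query_phrases(sample):
    print(phrase)
```

Exact-phrase queries of this kind only help when the search engine has indexed the source, which is the limitation noted above.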
Figure-7
Plagiarism detection systems:
A
plagiarism detection system compares suspect documents to a large collection
(corpus) of other documents and attempts to match parts of the suspect document
to parts of those in the corpus. As with search engines, no plagiarism can be
detected unless the corpus contains the documents from which the suspect has
copied29.
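A minimal sketch of corpus matching is given here, assuming a simple word n-gram ("shingle") comparison. The function names, the five-word window, and the 0.2 threshold are illustrative assumptions and do not describe any particular commercial system.

```python
import re


def shingles(text, n=5):
    """Return the set of overlapping n-word sequences in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(suspect, source, n=5):
    """Fraction of the suspect's n-word shingles that also occur in source."""
    suspect_shingles = shingles(suspect, n)
    if not suspect_shingles:
        return 0.0
    shared = suspect_shingles & shingles(source, n)
    return len(shared) / len(suspect_shingles)


def best_matches(suspect, corpus, n=5, threshold=0.2):
    """Return (name, score) pairs for corpus documents at or above threshold,
    highest score first. corpus maps a document name to its text."""
    scores = [(name, containment(suspect, doc, n)) for name, doc in corpus.items()]
    hits = [pair for pair in scores if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

If the copied source is absent from `corpus`, `best_matches` returns an empty list, which mirrors the limitation described above.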
Academic program plagiarism:
Plagiarism in computer code is also frequent, and requires different tools than
those used for textual document plagiarism. Significant research has been
dedicated to academic source-code plagiarism. A distinctive aspect of
source-code plagiarism is that there are no essay mills, such as can be found in
traditional plagiarism. Since most programming assignments expect students to
write programs with very specific requirements, it is very difficult to find
existing programs that meet them. Since integrating external code is often
harder than writing it from scratch, most plagiarizing students choose to copy
from their peers.
According to Roy and Cordy, source-code similarity detection algorithms can be
classified as based on:
·
Strings - look for
exact textual matches of segments, for instance five-word runs. Fast, but can
be confused by renaming identifiers.
·
Tokens - as with
strings, but using a lexer to convert the program into tokens first. This
discards whitespace, comments, and identifier names, making the system more
robust to simple text replacements. Most academic plagiarism detection systems
work at this level, using different algorithms to measure the similarity
between token sequences.
·
Parse Trees
- build and compare parse trees. This allows higher-level similarities to be
detected. For instance, tree comparison can normalize conditional statements,
and detect equivalent constructs as similar to each other.
·
Program
Dependency Graphs (PDGs) - a PDG captures the actual flow of control
in a program, and allows much higher-level equivalences to be located, at a
greater expense in complexity and calculation time.
·
Metrics - metrics
capture 'scores' of code segments according to certain criteria; for instance,
"the number of loops and conditionals", or "the number of
different variables used". Metrics are simple to calculate and can be
compared quickly, but can also lead to false positives: two fragments with the
same scores on a set of metrics may do entirely different things.
·
Hybrid approaches
- for instance, parse trees + suffix trees can combine the detection
capability of parse trees with the speed afforded by suffix trees, a type of
string-matching data structure.
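Three of the levels listed above (strings, tokens, and metrics) can be contrasted with a short Python sketch. It relies only on the standard library modules `difflib`, `tokenize`, `keyword`, and `ast`. The function names, the placeholder `ID`, and the two sample programs are assumptions made for illustration, and the sketch handles Python source only. It is not the algorithm of any specific detection system.

```python
import ast
import difflib
import io
import keyword
import tokenize


def string_similarity(a, b):
    """String level: similarity ratio of the raw source text."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def normalized_tokens(source):
    """Token level: lex the program, drop comments and layout tokens,
    and replace every identifier with the placeholder ID."""
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")
        else:
            out.append(tok.string)
    return out


def token_similarity(a, b):
    """Similarity ratio of the two normalized token sequences."""
    return difflib.SequenceMatcher(
        None, normalized_tokens(a), normalized_tokens(b)).ratio()


def metrics(source):
    """Metric level: two simple scores for a code fragment."""
    nodes = list(ast.walk(ast.parse(source)))
    return {
        "loops_and_conditionals": sum(
            isinstance(n, (ast.For, ast.While, ast.If)) for n in nodes),
        "distinct_variables": len(
            {n.id for n in nodes if isinstance(n, ast.Name)}),
    }


original = (
    "def total(values):\n"
    "    s = 0\n"
    "    for v in values:\n"
    "        s += v  # accumulate\n"
    "    return s\n"
)
renamed = (
    "def add_up(numbers):\n"
    "    result = 0\n"
    "    for item in numbers:\n"
    "        result += item\n"
    "    return result\n"
)

print(string_similarity(original, renamed))
print(token_similarity(original, renamed))
print(metrics(original), metrics(renamed))
```

Renaming identifiers lowers the string-level score, while the token sequences of the two samples are identical. The metric scores also coincide here, but equal metrics alone do not show that two fragments do the same thing, which is the false-positive risk noted in the list.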
The classification above was developed for code refactoring, not for academic
plagiarism detection (an important goal of refactoring is to avoid duplicate
code, referred to in the literature as code clones). The approaches are
effective against different levels of similarity: low-level similarity refers
to identical text, while high-level similarity can be due to similar
specifications. In an academic setting, where all students are expected to code
to the same specifications, functionally equivalent code (with high-level
similarity) is entirely expected, and only low-level similarity is considered
proof of cheating31.
Academic text-document plagiarism:
Most large-scale plagiarism detection systems use large internal databases (in
addition to other resources) that grow with each additional document submitted
for analysis. However, some consider this feature a violation of student
copyright30.
The general design of academic plagiarism detection systems geared for text
documents includes a number of factors:
Factor | Description and alternatives
Scope of search | In the public internet, using search engines / Institutional databases / Local, system-specific database.
Analysis time | Delay between the time a document is submitted and the time when results are made available.
Document capacity / Batch processing | Number of documents the system can process per unit of time.
Check intensity | How often, and for which types of document fragments (paragraphs, sentences, fixed-length word sequences), the system queries external resources such as search engines.
Comparison algorithm type | The algorithms the system uses to compare documents against each other.
Precision and Recall | Number of documents correctly flagged as plagiarized compared to the total number of flagged documents (precision), and to the total number of documents that were actually plagiarized (recall). High precision means that few false positives were found, and high recall means that few false negatives were left undetected.
Figure-8
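The precision and recall factor can be made concrete with a small calculation. The sketch below assumes that documents are identified by arbitrary labels and that the set of truly plagiarized documents is known, which in practice it rarely is. The function name and the sample labels are hypothetical.

```python
def precision_recall(flagged, plagiarized):
    """Return (precision, recall) for a set of flagged documents.

    precision = correctly flagged / all flagged
    recall    = correctly flagged / all actually plagiarized
    """
    flagged, plagiarized = set(flagged), set(plagiarized)
    true_positives = len(flagged & plagiarized)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(plagiarized) if plagiarized else 0.0
    return precision, recall


# Hypothetical run: four documents flagged, three actually plagiarized,
# two of the flagged ones are correct.
precision, recall = precision_recall(
    flagged={"d1", "d2", "d3", "d4"},
    plagiarized={"d2", "d3", "d5"},
)
print(precision, recall)
```

In this hypothetical run, half of the flagged documents are false positives (precision 0.5), and one plagiarized document out of three goes undetected (recall 2/3).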
Duplicate publication refers to publishing the same intellectual material
more than once, by the author or publisher. It does not refer to the unauthorized republication
by someone else, which constitutes plagiarism,
copyright violation, or both.
There are several forms of duplicate publication:
Legitimate derivatives:
This is deliberate republication in another format,
such as the simultaneous publication of a motion picture and a tie-in book. In
doing this, it is necessary to respect copyright, for the rights to a
derivative of the original work remain with the author of the work, or the
publisher or other party to whom the author has assigned the copyright32.
Self-plagiarism is another kind of copyright violation. There are two basic
kinds of self-plagiarism: similar (or identical) articles may appear one after
another (republishing) or simultaneously (multiple submission)33.
As a research paper is an implicit claim of furthering knowledge, the
researcher must state exactly what the claim of novelty is. This lets the
editor rate the article in view of the journal's policy. For example, most
journals would reject a paper already published in another journal, although
some may tolerate (as IEEE does) a re-edited and expanded conference paper.
However, the line is blurred when safety issues
are involved. For example, if an author publishes a study which shows that a
particular product or design is faulty, then it is important that the
information is spread as widely as possible. Such re-publication is justified
in the public interest.
Multiple submission is not plagiarism, but it is considered serious academic
misbehavior. Even when a publication fee is paid, it nonetheless wastes the
most important resource in academic publishing, the time and work of the
referees and the editors, and contributes to the problem it is intended to
solve: the slow speed of editorial review. There is also the unfortunate
possibility that more than one journal will accept the paper. As there is no
time for feedback from readers, the same errors appear in various journals34,35.
Duplicate submission can be defended. Academic editing is so slow that if an
author waits until the decision of the first publisher is known, the submission
to the second journal may take place a whole year later. (Some scientific
journals, for example JOSAB, may hold a paper for more than a month while
identifying candidate reviewers.) Yet duplicate submission wastes the resources
of those journals that review the paper but do not get to publish it.
Consider a researcher who, after obtaining and verifying an extraordinary
result, wants to publish it in Physical Review Letters but is afraid that the
paper will be rejected and that competitors will obtain and publish the result
earlier. So the researcher submits the same result to several national
scientific journals of low impact factor around the world: 'Proceedings of
Springfield University', 'Revista Científica de Guacatelamala', 'Le Courier
Scientific de la Republic Democratic Cannibas', 'Научный Бюллетень Мухосранского
Технологического Института', 'Journal der Angewandten Chemieinstituts von
Kuyzad', '保險海套大学の仕訳' (six examples of low-impact-factor journals, none of
which actually exist), and so on.
That is, if 15 journals consider that paper while only one of them may publish
it, the effort is wasted for 14 of them. Most journals and popular magazines
already tell prospective authors not to do that. By contrast, a university does
not ban applicants to its graduate school from applying elsewhere at the same
time. If that model (requesting a fee) were also popular in the publication
field, the "waste of resource" would be limited to the money out of the
applicant's own pocket.
It occasionally happens that an author publishes the same article twice,
whether in the same or different journals. For example, compare papers and;
these articles differ only in their titles36.
With the advancement of the internet, there are now
several tools available to aid in the detection of plagiarism
and multiple publications within biomedical literature. One tool, developed in
2006 by researchers in Dr. Harold Garner's laboratory at the University of
Texas Southwestern Medical Center at Dallas, is Déjà Vu, an open-access database
containing several thousand instances of duplicate publication.
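How a pair of publications that differ only in their titles might be flagged can be sketched as follows. This is not a description of how Déjà Vu works. The record layout (`title`, `body`), the function names, and the 0.9 threshold are assumptions chosen for illustration, and the pairwise comparison shown here would be too slow for a large literature database.

```python
import difflib
from itertools import combinations


def normalize(text):
    """Lowercase the text and split it into words, ignoring layout."""
    return text.lower().split()


def likely_duplicates(articles, threshold=0.9):
    """Return (id_a, id_b, ratio) for pairs whose bodies are near-identical.

    articles maps an identifier to a dict with 'title' and 'body' keys.
    Titles are deliberately ignored, so retitled copies are still caught.
    """
    pairs = []
    for id_a, id_b in combinations(sorted(articles), 2):
        ratio = difflib.SequenceMatcher(
            None,
            normalize(articles[id_a]["body"]),
            normalize(articles[id_b]["body"]),
        ).ratio()
        if ratio >= threshold:
            pairs.append((id_a, id_b, ratio))
    return pairs
```

Pairs returned by such a check are only candidates. Each one still has to be inspected by a person, since legitimate derivatives and re-edited conference papers can also score highly.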
Plagiarism Detector requirements:
http://www.plagiarism-detector.com/setup/pd_setup.exe
·
Microsoft Windows 2000, SP 0, 1, 2, 3, 4
·
Microsoft Windows XP, SP 0, 1, 2, 3
·
Microsoft Windows Vista, SP 0, 1
·
Installation requires administrative privileges.
·
.NET Framework 2.0 is required.
·
Microsoft Office 2000 or higher is required to access MS Word documents.
·
Internet access is required. A broadband connection is recommended.
·
Microsoft Windows 7
·
Microsoft Internet Explorer 6.0 or 7.0, Firefox 1.5 or 2.0, Mozilla 1.7
·
Intel® Pentium® III or equivalent processor
·
100 MB of free storage space
·
128 MB of RAM (256 MB recommended for large documents).
REFERENCES:
1. Donald L. McCabe, Linda Klebe Trevino, and Kenneth D. Butterfield, "Academic
Integrity in Honor Code and Non-Honor Code Environments: A Qualitative
Investigation", The Journal of Higher Education 70, no. 2 (March-April 1999), 213.
2. Donald L. McCabe, Kenneth D. Butterfield, and Linda Klebe Trevino, "Faculty
and Academic Integrity: The Influence of Current Honor Codes and Past Honor Code
Experiences," Research in Higher Education 44, no. 3 (June 2003), 368.
3. Selber and Johnson-Eilola, Plagiarism, Originality, Assemblage, Computers
and Composition, Vol. 24, No. 4. (2007), pp. 375-403.
4. "Student cheats contract out work". BBC/bbc.com. 2006-06-12.
5. Liz Lightfoot (2006-06-13). "Cheating students put assignments out to tender
on the internet". Telegraph/telegraph.co.uk.
http://www.telegraph.co.uk/news/main.jhtml?xml=/news/2006/06/13/ncheat13.xml&sSheet=/news/2006/06/13/ixuknews.html/.
http://www.igi-pub.com/referenc
6. http://plagiat.htw-berlin.de/software/2008/
7. Dowd, Raymond J. (2006). Copyright Litigation Handbook (1st ed.). Thomson West.
8. Castle Rock Entertainment, Inc. v. Carol Publishing Group, 150 F.3d 132, 140
(2nd Cir. 1998).
9. Taylor, F. K. (1965). Cryptomnesia and plagiarism. British Journal of
Psychiatry, 111, 1111–1118.
10. Brown, A. S., and Halliday, H. E. (1991). Cryptomnesia and source memory
difficulties. American Journal of Psychology, 104, 475–490.
11. James Page. 2004. 'Cyber-pseudepigraphy: A New Challenge for Higher
Education Policy and Management'. Journal of Higher Education Policy and
Management. 26(3):429-433.
12. Depoorter, Ben; Parisi, Francesco (2002). "Fair Use and Copyright
Protection: A Price Theory Explanation". International Review of Law and
Economics 21 (4): 453–473.
13. http://stason.org/TULARC/art/hack-stand-up-comedy
14. Studdert et al., (2004) Financial Conflicts of Interest in Physicians'
Relationships with the Pharmaceutical Industry: Self-Regulation in the Shadow of
Federal Prosecution, NEJM 351:1891-2000
15. J. Michael Keyes, "Musical Musings: The Case for Rethinking Music Copyright
Protection", 10 Mich. Telecomm. Tech. L. Rev. 407 (2004)
16. http://www.outofthefog.net/CommonNonBehaviors/Boundaries.html
17. Nylenna, Magne; Daniel Andersen, Gisela Dahlquist, Matti Sarvas, Asbjørn
Aakvaag (July 3, 1999). "Handling of scientific dishonesty in the Nordic
countries". The Lancet 354
18. Kock, N., and Davison, R. (2003). Dealing with plagiarism in the IS research
community: A look at factors that drive plagiarism and ways to address them.
MIS Quarterly, 27(4), 511-532
19. Clarke, R. (2006). Plagiarism by academics: More complex than it seems.
Journal of the Association for Information Systems, 7(2), 91-121
20. http://www.famousplagiarists.com/
21. See for example Dellavalle, Robert P., Banks, Marcus A. and Ellis, Jeffrey
I. (2007). "Frequently asked questions regarding self-plagiarism: How to avoid
recycling fraud." Journal of the American Academy of Dermatology, Vol. 57 (3),
September, pp.527. doi:10.1016/j.physletb.2003.10.071
22. Broome, Marion E. (2004). "Self-plagiarism: oxymoron, fair use, or
scientific misconduct?" Nursing Outlook, Vol. 52 (6), November, pp.273-274.
24. "A Case Study on Computer Programs", Global dimensions of intellectual
property rights in science and technology, Part 3, Editors Mitchel B.
Wallerstein, Mary Ellen Mogee, Roberta A. Schoen, National Academies Press, 1993
25. http://www.historians.org/PUBS/Free/ProfessionalStandards.cfm
26. http://www.plagiarism-detector.com/
27. Ross, Nancy; Wolfram, Dietmar (2000). "End user searching on the Internet:
An analysis of term pair topics submitted to the Excite search engine". Journal
of the American Society for Information Science 51 (10): 949–958.
28. Xie, M.; et al. (1998). "Quality dimensions of Internet search engines".
Journal of Information Science 24 (5): 365–372.
29. http://www.plagiarismscanner.com/
30. Anderson, G. L. (1999). Cyberplagiarism: A look at the Web term paper sites.
College and Research Library News, 60, 371-73, 394.
31. Auer, N. J., and Krupar, E. M. (2001). Mouse click plagiarism: The role of
technology in plagiarism and the librarian's role in combatting it. Library
Trends, 49, 415-433.
32. A. Giesen (2004). "Thin-disk solid-state lasers". Proceedings of SPIE 5620:
112–127
33. http://www.daveyp.com/blog/
34. http://en.wikipedia.org/wiki/Multiple_publication
35. http://www.djreprints.com/licensing/syndication.html
Received on 25.11.2010
Accepted on 12.12.2010
© A & V Publication all rights reserved
Research J. Science and Tech. 2(6): Nov.-Dec. 2010: 134-144