The Lie Detector: Explorations in the Automatic Recognition of Deceptive Language Page: 4
Training                       Test            NB      SVM
DEATH PENALTY + BEST FRIEND    ABORTION        62.0%   61.0%
ABORTION + BEST FRIEND         DEATH PENALTY   58.7%   58.7%
ABORTION + DEATH PENALTY       BEST FRIEND     58.7%   53.6%
AVERAGE                                        59.8%   57.8%
Table 3: Cross-topic classification results
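The cross-topic setup in Table 3 (train on two topics, test on the third) can be sketched as a leave-one-topic-out loop. The following is a minimal sketch, not the authors' implementation: it uses a small hand-written multinomial Naive Bayes with add-one smoothing in place of the NB and SVM classifiers used in the experiments, and the function names and TOY_CORPUS are hypothetical, made up purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Fit a multinomial Naive Bayes model from (tokens, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the most probable label, using add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_topic_accuracy(corpus):
    """Hold out each topic in turn, train on the others, test on the held-out one."""
    results = {}
    for held_out, test_docs in corpus.items():
        train_docs = [
            doc
            for topic, docs in corpus.items()
            if topic != held_out
            for doc in docs
        ]
        model = train_nb(train_docs)
        correct = sum(predict_nb(model, toks) == label for toks, label in test_docs)
        results[held_out] = correct / len(test_docs)
    return results


# Hypothetical toy data, built to be trivially separable (not from the paper).
TOY_CORPUS = {
    "ABORTION": [(["i", "believe"], "truthful"), (["you", "always"], "deceptive")],
    "DEATH PENALTY": [(["i", "think"], "truthful"), (["you", "certainly"], "deceptive")],
    "BEST FRIEND": [(["i", "feel"], "truthful"), (["you", "truly"], "deceptive")],
}

if __name__ == "__main__":
    for topic, accuracy in cross_topic_accuracy(TOY_CORPUS).items():
        print(f"test on {topic}: {accuracy:.1%}")
```

On the toy corpus every held-out topic is classified perfectly only because the examples were constructed that way; the figures in Table 3 come from the real data sets. The AVERAGE row of Table 3 is the unweighted mean of the three rows above it, for example (62.0 + 58.7 + 58.7) / 3 ≈ 59.8 for NB.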
Class     Score   Sample words
METAPH    1.71    god, die, sacred, mercy, sin, dead, hell, soul, lord, sins
YOU       1.53    you, thou
OTHER     1.47    she, her, they, his, them, him, herself, himself, themselves
HUMANS    1.31    person, child, human, baby, man, girl, humans, individual, male, person, adult
CERTAIN   1.24    always, all, very, truly, completely, totally
OPTIM     0.57    best, ready, hope, accepts, accept, determined, accepted, won, super
I         0.59    I, myself, mine
FRIENDS   0.63    friend, companion, body
SELF      0.64    our, myself, mine, ours
INSIGHT   0.65    believe, think, know, see, understand, found, thought, feels, admit
Table 4: Dominant word classes in deceptive text, along with sample words.
human-related word classes (YOU, OTHER, HUMANS) represent detachment from the self, as if the speaker were trying to keep their own self out of the lies. In contrast, the classes of words that are closely connected to the self (I, FRIENDS, SELF) are lacking from deceptive text and are dominant instead in truthful statements, where the speaker is comfortable identifying herself with the statements.
Also interesting is the fact that words related to certainty (CERTAIN) are more dominant in deceptive texts, which is probably explained by the speaker's need to explicitly use truth-related words to emphasize the (fake) "truth" and thus hide the lies. In contrast, belief-oriented vocabulary (INSIGHT), such as believe, feel, think, is more frequently encountered in truthful statements, where the presence of the real truth does not require truth-related words for emphasis.
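One way to read the scores in Table 4 is as a ratio: the share of the deceptive text covered by a word class divided by the share of the truthful text covered by the same class, so that values above 1 mark classes dominant in deceptive text and values below 1 mark classes dominant in truthful text. The sketch below assumes that definition; the function names and the two toy token lists are hypothetical and are not taken from the data sets.

```python
from collections import Counter


def coverage(tokens, word_class):
    """Fraction of the corpus tokens that belong to the word class."""
    if not tokens:
        raise ValueError("corpus is empty")
    counts = Counter(tokens)
    # set() guards against a word listed twice in the class definition
    return sum(counts[word] for word in set(word_class)) / len(tokens)


def dominance(deceptive_tokens, truthful_tokens, word_class):
    """Coverage in deceptive text divided by coverage in truthful text (assumed definition)."""
    truthful_coverage = coverage(truthful_tokens, word_class)
    if truthful_coverage == 0:
        raise ValueError("word class never occurs in the truthful corpus")
    return coverage(deceptive_tokens, word_class) / truthful_coverage


# Hypothetical toy corpora, six tokens each (not from the paper).
deceptive = "you know you are always right".split()
truthful = "i believe you are my friend".split()

YOU = ["you", "thou"]
I = ["i", "myself", "mine"]

if __name__ == "__main__":
    print("YOU:", dominance(deceptive, truthful, YOU))
    print("I:", dominance(deceptive, truthful, I))
```

In the toy example the YOU class covers twice as large a share of the deceptive tokens as of the truthful ones, so it scores above 1, while the I class never appears in the deceptive tokens and scores 0.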
In this paper, we explored automatic techniques for the recognition of deceptive language in written texts. Through experiments carried out on three data sets, we showed that truthful and lying texts are separable, and this property holds for different data sets. An analysis of classes of salient features indicated some interesting patterns of word usage in deceptive texts, including detachment from the self and vocabulary that emphasizes certainty. In future work, we plan to explore the role played by affect and the possible integration of automatic emotion analysis into the recognition of deceptive language.
Mihalcea, Rada, 1974- & Strapparava, Carlo, 1962-. The Lie Detector: Explorations in the Automatic Recognition of Deceptive Language, paper, 2009; (https://digital.library.unt.edu/ark:/67531/metadc31019/m1/4/: accessed March 24, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT College of Engineering.