Online deception is a major problem and is especially prevalent in text, with examples ranging from e-mail spam and fake online reviews to fake news. Modern natural language processing (NLP) techniques can classify text based on complex correlations among textual features, such as word or character sequences, optionally augmented with grammatical or semantic information. This project develops and investigates NLP methods for detecting different types of textual deception, with a particular emphasis on hate speech and trolling. Specifically, we identify weaknesses in existing detection schemes by devising attacks against them, and we aim to make the methods more resistant to such attacks.
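As a rough illustration of the word- and character-sequence features mentioned above, the following minimal Python sketch compares how much of a text's feature set survives a small perturbation. The example strings, the function names, and the single-inserted-space perturbation are all assumptions chosen for illustration; this is not the project's actual feature pipeline or attack.

```python
from collections import Counter


def word_ngrams(text: str, n: int = 1) -> Counter:
    """Count word n-grams after lowercasing and whitespace tokenization."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count character n-grams (spaces included) after lowercasing."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def overlap(a: Counter, b: Counter) -> float:
    """Fraction of the features in `a` (with multiplicity) that also occur in `b`."""
    total = sum(a.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared feature.
    return sum((a & b).values()) / total


# Hypothetical spam-like text and a perturbed copy with one inserted space.
original = "claim your free prize"
perturbed = "claim your fr ee prize"

word_overlap = overlap(word_ngrams(original), word_ngrams(perturbed))
char_overlap = overlap(char_ngrams(original), char_ngrams(perturbed))

# The word feature for "free" disappears entirely from the perturbed text,
# while most character trigrams around it are still present.
free_survives_as_word = ("free",) in word_ngrams(perturbed)
```

In this toy case the inserted space removes the word feature for "free" altogether, whereas only the character trigrams that span the insertion point are lost. This hints at why the choice between word- and character-level features matters when a detector is expected to withstand simple text perturbations.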

Because NLP methods are not only a tool for improving security but also a potential privacy threat, since they enable profiling and deanonymization, we also study countermeasures for avoiding detection. Our goal here is to develop automatic methods that prevent author profiling by transforming text in ways that remove or alter the relevant category-indicative features.

An additional line of study concerns the automatic generation of deceptive text, such as the production of fake online reviews for crowdturfing. We are developing text generation methods that take both grammatical and semantic factors into account, in order to maximize the text’s appropriateness to its context and its believability to a human reader. Our goal is to estimate the extent to which attacks generated in this way can be mitigated by automatic detection methods.


Aalto University:

  • N. Asokan (Professor)
  • Tommi Gröndahl (Doctoral candidate)
  • Mika Juuti (Doctoral candidate)

Waseda University/National Institute of Information and Communications Technology (NICT), Japan:

  • Bo Sun

University of Padua, Italy:

  • Luca Pajola (Master’s student)