Online deception is a major problem and is especially prevalent in textual material, with examples ranging from e-mail spam through fake online reviews to fake news. Modern natural language processing (NLP) techniques allow classifying text based on complex correlations between textual features involving word or character sequences, potentially augmented with additional grammatical or semantic information. This project aims to develop and investigate NLP methods for both offense and defense involving different types of textual deception. In particular, we identify weaknesses in existing detection schemes by devising attacks against them, and aim to make the methods more resistant to such attacks. Our current and recent work includes:
- Hate-speech detection: Hateful and offensive speech in social networks and other online fora is a persistent and toxic problem. We explore the effectiveness of text analysis techniques in detecting hate speech. Our paper, which systematically and empirically compares state-of-the-art hate-speech detection techniques, appears at the ACM Workshop on Artificial Intelligence and Security.
- Automatic generation of restaurant reviews: Online reviews play an important role in swaying people’s opinions about products and services. We explored techniques for generating review text that take into account both grammatical and semantic factors to maximize the text’s appropriateness to the context as well as its believability for a human reader. Our goal is to estimate the extent to which attacks generated in this way can be mitigated by automatic detection methods. Our paper on automatic fake restaurant review generation appeared at the European Symposium on Research in Computer Security (ESORICS 2018).
- Author identification and deanonymization: Because NLP methods are not only a tool for improving security but also a privacy threat, allowing profiling or deanonymization, we also study counter-methods to avoid detection. Our goal here is to develop automatic methods to avoid author profiling by transforming text in ways that remove or alter relevant category-indicative features. Our paper on writing style transfer appears in Privacy Enhancing Technologies (PETS2020).
- Data augmentation for improving toxic language classification: Toxic language datasets tend to be small and imbalanced. We investigate the effect of training set augmentation via synthetic data generated from original toxic documents. Our paper comparing multiple techniques and classifier architectures has been accepted to Findings of ACL: EMNLP 2020.
- September 2020: Our paper A little goes a long way: Improving toxic language classification despite data scarcity has been accepted to Findings of ACL: EMNLP 2020. arXiv preprint: https://arxiv.org/abs/2009.12344
- June 2020: Our paper Effective writing style transfer via combinatorial paraphrasing has been accepted to Privacy Enhancing Technologies Symposium (PETS2020). arXiv preprint: https://arxiv.org/abs/1905.13464
- May 2019: Our paper Text Analysis in Adversarial Settings: Does Deception Leave a Stylistic Trace? by Tommi Gröndahl and N. Asokan has been accepted to ACM Computing Surveys (ACM CSUR) 2019. arXiv preprint: https://arxiv.org/abs/1902.08939
- September 2018: Our paper on automatic fake restaurant review generation was presented at the European Symposium on Research in Computer Security (ESORICS 2018).
- August 2018: Our paper on hate speech detection has been accepted for presentation and publication at the ACM Workshop on Artificial Intelligence and Security.
- N. Asokan (Professor)
- Tommi Gröndahl (Doctoral candidate)
- Mika Juuti (PhD, postdoctoral researcher)
- Andrei Kazlouski (Research Assistant)
- Luca Pajola (Research Assistant)