Machine-Learning-Based Spam Mail Detector

Panem Charanarur, Harsh Jain, G. Srinivasa Rao, Debabrata Samanta, Sandeep Singh Sengar*, Chaminda Thushara Hewage

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The proliferation of spam emails, a predominant form of online harassment, has elevated the significance of email in daily life. As a consequence, a substantial portion of individuals remain vulnerable to fraudulent activities. Despite Gmail’s “spam mail filtration system,” its effectiveness is not absolute. It occasionally misclassifies legitimate messages, confining them to the spam folder, or overlooks potentially harmful spam emails. These errors are false positives and false negatives, respectively. This research scrutinizes the historical data, cookies, caches, Session Restores, flash artifacts, and super cookies of Internet Explorer, Firefox, and Chrome on the Windows 10 platform. Data was collected through Chrome, Firefox, and Internet Explorer, operating within the Windows 10 environment. It has been observed that browsers store user behavior data on the host computer’s hard drive. The implications of this study hold substantial value for computer forensics researchers, law enforcement professionals, and digital forensics experts. The study leverages Python, alongside pertinent libraries such as pandas, NumPy, Matplotlib, scikit-learn, and Flask, to facilitate its investigation. The experimental results and analysis show that the KN and NB algorithms achieve the best accuracy and precision scores compared to the other algorithms.
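To make the accuracy and precision comparison in the abstract concrete, the following is a minimal sketch of how a spam classifier can be trained and scored. It is not the authors' implementation: it assumes "NB" refers to Naive Bayes, uses only the Python standard library instead of scikit-learn, and runs on a tiny invented dataset. The function names (train_nb, predict_nb, accuracy_and_precision) and all messages are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(messages, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized messages."""
    class_counts = Counter(labels)        # number of messages per class
    word_counts = defaultdict(Counter)    # per-class token frequencies
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior for one message."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore tokens never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy_and_precision(y_true, y_pred, positive="spam"):
    """Accuracy over all messages; precision for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_pos = sum(t != positive and p == positive for t, p in pairs)
    accuracy = correct / len(pairs)
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    return accuracy, precision


# Invented toy data, for illustration only.
train_messages = [
    "win free prize now",
    "free cash offer click now",
    "meeting agenda for monday",
    "lunch with the project team",
]
train_labels = ["spam", "spam", "ham", "ham"]
test_messages = ["free prize offer", "project meeting monday"]
test_labels = ["spam", "ham"]

model = train_nb(train_messages, train_labels)
predictions = [predict_nb(model, m) for m in test_messages]
accuracy, precision = accuracy_and_precision(test_labels, predictions)
print(predictions, accuracy, precision)
```

Precision matters here because a false positive sends a legitimate message to the spam folder, which is the failure mode the abstract highlights. A model can score high accuracy and still have poor precision when spam is rare, so the two metrics are reported together.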

Original language: English
Article number: 858
Journal: SN Computer Science
Volume: 4
Issue number: 6
Publication status: Published - 8 Nov 2023

Keywords

  • Cookies
  • E-mail
  • Inbox
  • Spam mail
  • Trap
  • Trash folder
