Unsafe websites include both malicious sites and inappropriate ones, such as those hosting questionable or offensive content. Website reputation systems are intended to help ordinary users steer away from these unsafe sites. However, assigning safety ratings to websites typically involves human input. Consequently, the process is time-consuming, costly, and does not scale. This has resulted in two major problems: (i) a significant proportion of the web remains unrated, and (ii) there is an unacceptable time lag before new websites are rated. In this project, we investigate whether we can efficiently and effectively predict the eventual reputation rating of an unrated website. We also investigate whether similar techniques can be used to automatically categorize websites.

Results

  • Paper: LookAhead: Augmenting Crowdsourced Website Reputation Systems With Predictive Modeling (full version at arXiv)
  • Poster
  • Talk