Topick: Accurate Topic Distillation for User Streams

ABSTRACT

Users of today's information networks need to digest large amounts of data. Therefore, tools that ease the task of filtering the relevant content are becoming necessary. One way to achieve this is to identify the users who generate content on a certain topic of interest. However, due to the diversity and ambiguity of the shared information, automatically assigning users to topics is challenging. We propose Topick, a system that leverages state-of-the-art techniques and tools to automatically distill high-level topics for a given user. Topick exploits both the user's stream and profile information to accurately identify the most relevant topics. The results are synthesised as a set of stars associated with each topic, designed to give an intuition about the topics encompassed in the user's stream and the confidence in the results. Our prototype achieves a precision of 70% or more, with a recall of 60%, relative to manual labeling.
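To make the reported figures concrete, here is a minimal sketch of how precision and recall can be computed for one user by comparing the distilled topics against a manually labeled set. The function name, the topic names, and the per-user set comparison are illustrative assumptions; the exact evaluation protocol used for Topick is not described on this page.

```python
def precision_recall(predicted, labeled):
    """Return (precision, recall) of predicted topics against manual labels."""
    predicted, labeled = set(predicted), set(labeled)
    hits = len(predicted & labeled)  # topics that appear in both sets
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(labeled) if labeled else 0.0
    return precision, recall


# Hypothetical data: these topic names are made up for illustration.
predicted = {"technology", "sports", "politics"}
labeled = {"technology", "politics", "science", "music"}

p, r = precision_recall(predicted, labeled)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.67 recall=0.50"
```

Precision penalizes topics the system reports that the annotator did not assign, while recall penalizes annotated topics the system missed.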
 
To get a sense of how Topick works, you can try it here, watch the video demo, or check the paper. For each user query, Topick uses:
  •   Twitter's API to get users' latest tweets, and
  •    to analyze users' profile URLs. 

Note that Topick does not store user tweets, might not classify users who tweet in languages other than English, performs only text analysis, and recognizes only Twitter screen names (i.e., Twitter ids).