Compare articles, find the Real Reason behind peaks


The emergence of large-scale online digital systems that harvest and mediate access to collective human behavior and knowledge will have an unprecedented impact on social science research.

However, many researchers lack the expertise and computational resources necessary to transform the raw data resulting from these systems into meaningful metrics that can be employed in scientific modeling and analysis.

When individual researchers or research groups do have both the expertise and the computational resources, the fruits of their labor are often siloed and seldom shared with the wider scientific community.

In this project, we aim to develop a set of tools that will provide researchers with real-time access to meaningful, rich and context-inclusive metrics on emerging events throughout the world.

To accomplish this, we will provide real-time analysis of raw data on the content, context, production and consumption of information on Wikipedia.

We will provide API access to high-level measures on emerging events as well as interface tools for visualization, exploration and analysis.

Our end goal is to let researchers easily pull the high-level output of these tools and incorporate or embed it into their existing analysis methodologies without having to acquire advanced technical skills.

We have built a proof-of-concept prototype of an event detection and exploration system. It uses processed Wikipedia activity data going back to 2008 and is updated on an hourly basis.

We characterize the consumption and production of information through hourly traffic and page edit statistics for all (over 32 million) Wikipedia pages, together with page content (including structured revision history), and we export the results in efficient data formats.
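As a rough illustration of how a peak in an hourly traffic series might be flagged, here is a minimal sketch. It is not the method used by the prototype: the function name `find_peaks`, the 24-hour window, the threshold of 5, and the sample numbers are all assumptions made for this example. Each hour is compared against the median of the preceding hours, scaled by the median absolute deviation.

```python
from statistics import median


def find_peaks(hourly_views, window=24, threshold=5.0):
    """Return indices of hours whose view count is unusually high
    relative to the preceding `window` hours.

    Hypothetical helper for illustration only.
    """
    peaks = []
    for i in range(window, len(hourly_views)):
        history = hourly_views[i - window:i]
        base = median(history)
        # Median absolute deviation: a robust estimate of spread.
        mad = median(abs(v - base) for v in history)
        # Guard against a perfectly flat baseline (MAD of zero).
        scale = max(mad, 1.0)
        if (hourly_views[i] - base) / scale > threshold:
            peaks.append(i)
    return peaks


# Made-up hourly view counts: a steady baseline with one spike at index 25.
views = [100, 102, 98, 101, 99, 103] * 4 + [100, 900, 101]
print(find_peaks(views))  # prints [25]
```

A median-based baseline is used here because a single large spike barely shifts it, so the hour after a peak is not flagged by mistake. A real system would likely also need to account for daily and weekly traffic cycles.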

The current tool allows real-time (hourly) indexed search, variable-scale visualization, and comparison of information consumption and production statistics across pages. It is comparable to Google Trends in scope, yet it provides several additional options (such as the ability to compare cross-cultural trends across language-matched analogous pages), a more dynamic interface, and immediate, easy data export.

For any help, questions, or ideas, please contact us.


  • Use the examples on the right to explore interesting use cases