Published on Jun 4, 2014
Common Crawl meets MIA -- Gathering and Crunching Open Web Data.
As the largest and most diverse collection of information in human history, the Web can grant us tremendous insight, if only we can understand it better. For example, Web crawl data can be used to spot trends and identify patterns in politics, economics, health, popular culture, and many other aspects of life. It provides an immensely rich corpus for scientific research, technological advancement, and innovative new businesses. It is crucial for our information-based society that the Web be openly accessible to anyone who wants to use it.
In this Data Talk, we present two projects that set out to democratize access to public Web data and to put the means of analysis in the hands of virtually anyone. Join us as Lisa Green and Jordan Mendelson present Common Crawl, a Web crawl made publicly accessible for further research and dissemination. In a second talk, Peter Adolphs introduces MIA, a cloud-based platform for analyzing Web-scale data sets with a toolbox of natural language processing algorithms.