Googlewhacks for Fun and Profit

Uploaded on Sep 22, 2008

Googlewhacks for Fun and Profit
Jonathan Lansey

We study how the number of Internet search results returned for a multi-word query depends on the number of results returned when each word is searched for individually. We derive a model of result counts for multi-word queries from the total number of pages indexed by Google, applying the Zipf power law to the distribution of words per page on the Internet and Heaps' law to unique word counts. Using data from 351 word pairs, each with exactly one hit when the two words are searched for together, and a Zipf law coefficient determined in other studies, we estimate the Heaps' law coefficient for the indexed World Wide Web (about 8 billion pages) to be beta = 0.52. Previous studies used fewer than 20,000 pages. We validate the method with a different set of word pairs and with word triplets. We show through examples how the model can automatically analyze the relatedness of word pairs, assigning each a value we call "Strength of Associativity." We then use the model to compare the index sizes of competing search giants Yahoo and Google.
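
The abstract does not spell out the model itself, so the following is only a rough sketch of two ingredients it mentions, not the talk's actual derivation. It shows the textbook form of Heaps' law (vocabulary size V = k * n^beta) and a naive independence baseline for how many pages would contain both words of a pair. The function names, the constant k, and the example hit counts are hypothetical; only beta = 0.52 and the index size of about 8 billion pages come from the abstract. The observed-to-expected ratio at the end is a generic relatedness measure and should not be read as the talk's "Strength of Associativity."

```python
def heaps_vocabulary(n: float, k: float, beta: float) -> float:
    """Textbook Heaps' law: number of distinct words in a corpus of size n."""
    return k * n ** beta


def expected_joint_hits(hits_a: float, hits_b: float, total_pages: float) -> float:
    """Naive baseline: pages containing both words if they occurred independently.

    P(a and b) = P(a) * P(b), so expected pages = N * (a/N) * (b/N) = a*b/N.
    This is an assumption for illustration, not the model from the talk.
    """
    return hits_a * hits_b / total_pages


def observed_to_expected(observed: float, hits_a: float, hits_b: float,
                         total_pages: float) -> float:
    """Ratio above 1 suggests the words co-occur more often than chance."""
    return observed / expected_joint_hits(hits_a, hits_b, total_pages)


if __name__ == "__main__":
    N = 8e9       # index size quoted in the abstract (about 8 billion pages)
    BETA = 0.52   # Heaps' law coefficient estimated in the abstract
    K = 1.0       # hypothetical constant; the abstract does not give one

    # Hypothetical single-word hit counts for a word pair.
    a, b = 4e6, 2e6
    print("expected joint hits:", expected_joint_hits(a, b, N))
    print("ratio for 50,000 observed hits:", observed_to_expected(5e4, a, b, N))
    print("vocabulary growth when n doubles:",
          heaps_vocabulary(2.0, K, BETA) / heaps_vocabulary(1.0, K, BETA))
```

Under this baseline, a Googlewhack (a pair with exactly one joint hit) corresponds to single-word counts whose product is close to the index size, which hints at why such pairs carry information about N.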
