Collecting, Analysing, and Exploiting Failure Data from...

Uploaded on Oct 8, 2007

Google Tech Talks
October 27, 2006

ABSTRACT
Component failure in large-scale IT installations is becoming an ever larger problem as the number of processors, memory chips, and disks in a single cluster approaches a million. Yet virtually no data on failures in real systems is publicly available, forcing researchers to base their work on anecdotes and back-of-the-envelope calculations. In this talk, we will present results from our analysis of failure data from 26 large-scale production systems at three different organizations, including two high-performance computing sites and one large internet service provider. Our results indicate that several commonly made assumptions about failures might not...

Category:

Howto & Style

License:

Standard YouTube License

All Comments (1)

  • Rooticians can benefit from watching this. It needs an introduction; the speaker does not tell the audience what the lecture will cover.