The Little Engine(s) That Could: Scaling Online Social Networks

Uploaded on Mar 16, 2011

Google Tech Talk (see below)
June 17, 2010

Presented by Josep M. Pujol.

ABSTRACT

The difficulty of partitioning social graphs has introduced new system design challenges for scaling Online Social Networks (OSNs). Vertical scaling by resorting to full replication can be a costly proposition. Scaling horizontally by partitioning and distributing data among multiple servers (e.g., with key-value stores built on DHTs) can suffer from expensive inter-server communication and other performance issues. Such challenges have often led to costly re-architecting efforts for popular OSNs like Twitter and Facebook.
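To make the inter-server communication cost concrete, here is a minimal Python sketch. It is an illustration only, not code from the talk: the toy graph, the function names `hash_server` and `remote_servers_touched`, and the use of CRC32 as the placement hash are all assumptions. It places users on servers by hashing their ids, as a key-value store might, and reports which other servers must be contacted to read one user's neighbors.

```python
import zlib


def hash_server(user: str, num_servers: int) -> int:
    """Place a user by hashing the id (hypothetical stand-in for a DHT)."""
    return zlib.crc32(user.encode("utf-8")) % num_servers


def remote_servers_touched(graph, user, placement):
    """Servers other than the user's own that hold at least one neighbor."""
    home = placement[user]
    return {placement[n] for n in graph[user]} - {home}


# Toy undirected social graph (made up for illustration).
graph = {
    "ana": {"bo", "cy", "di"},
    "bo": {"ana", "cy"},
    "cy": {"ana", "bo"},
    "di": {"ana"},
}

# Hash placement ignores the graph structure entirely.
placement = {user: hash_server(user, 3) for user in graph}

for user in sorted(graph):
    remote = remote_servers_touched(graph, user, placement)
    print(user, "home:", placement[user], "remote servers:", sorted(remote))
```

Because the hash ignores who is connected to whom, a read of one user's neighborhood can fan out to several servers, which is the cost the abstract refers to.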

We design, implement, and evaluate SPAR, a Social Partitioning and Replication middleware that mediates between the application and the database layer of an OSN. SPAR exploits the underlying social graph structure to partition user data and selectively replicate users, ensuring that each user's neighbors' data is co-located on that user's machine. The gains from this are multi-fold: application developers can assume local semantics, i.e., develop as they would for a single machine; scalability is achieved by adding commodity machines with low memory and network I/O requirements; and N+K redundancy is achieved at a fraction of the cost.
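The co-location guarantee can be illustrated with a short Python sketch. This is not the SPAR algorithm, whose partitioning heuristic the abstract does not describe. It only takes an assumed assignment of master copies to servers and computes which users would need a replica on each server so that every neighbor lookup stays local. The names `replicas_for_local_semantics` and `replication_overhead`, and the toy graph, are hypothetical.

```python
from collections import defaultdict


def replicas_for_local_semantics(graph, master):
    """Map each server to the users it must replicate.

    A neighbor whose master copy lives on a different server than the
    user's master needs a replica on the user's server.
    """
    replicas = defaultdict(set)
    for user, neighbors in graph.items():
        home = master[user]
        for neighbor in neighbors:
            if master[neighbor] != home:
                replicas[home].add(neighbor)
    return dict(replicas)


def replication_overhead(replicas, num_users):
    """Average number of extra replicas per user."""
    return sum(len(users) for users in replicas.values()) / num_users


# Two triangles joined by the single edge c-d (made-up example).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

# Assumed placement that follows the community structure.
master = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

replicas = replicas_for_local_semantics(graph, master)
print(replicas)
print(replication_overhead(replicas, len(graph)))
```

With a placement that follows the graph's communities, only the two users on the bridging edge need replicas. A placement that scatters friends across servers would require many more, which is why exploiting the social graph structure matters.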

We provide a complete system design, an extensive evaluation based on datasets from Twitter, Orkut, and Facebook, and a working implementation. We show that SPAR performs well in reducing overhead and in gracefully handling the high dynamics experienced by an OSN. We implement a Twitter-like application, evaluate SPAR with MySQL and Cassandra using real datasets, and show significant gains in requests per second and a reduction in network traffic.

Speaker Info:

Josep M. Pujol is a member of the Telefonica Research Labs in Barcelona (http://research.tid.es/), working at the intersection of social networks, search, and system scalability. Prior to Telefonica he was a post-doc at the University of Michigan, affiliated with the Center for the Study of Complex Systems and the Department of Epidemiology, where he worked on modeling infection transmission and dose-response models. Josep earned his PhD from the Universitat Politecnica de Catalunya with a dissertation on the effects of social structure in artificial societies. Further information is available at http://research.tid.es/jmps/

Category:

Science & Technology

License:

Standard YouTube License

All Comments (6)

  • Very impressive. Is there a publicly available implementation (open source or not)?

  • Is there any software to manage data distribution via torrent or otherwise? How do you manage non public data when you are using distributed clients for storage?

  • What happens at recovery time? When 40%, 60% or more of your subscribers go offline in one day, due to data access issues etc?

  • Interesting topic but like yeah you know like BORING!!!

  • If he says like one time I'm going to explode :S

  • Long Videos FTW :D
