The RPiCluster

Video by Josh Kiepert

Published on May 17, 2013

Documentation, Source code, and EagleCAD designs: https://bitbucket.org/jkiepert/rpiclu...

Summary:
The RPiCluster is a 33-node Beowulf cluster built using Raspberry Pis (RPis). During my dissertation work at Boise State University I needed a cluster to run a distributed simulation I've been developing. The RPiCluster is the result. Each of the 33 RPis is overclocked to 1GHz and runs Arch Linux. This demo shows the RPiCluster running a parallel program I developed using MPI to control all of the RGB LEDs installed on each of the nodes.

The Whole Story:
The RPiCluster project was started in Spring 2013 in response to a need during my PhD dissertation research. My research is currently focused on developing a novel data sharing system for wireless sensor networks to facilitate in-network collaborative processing of sensor data. In the process of developing this system it became clear that the most expedient way to test many of the ideas was to create a distributed simulation rather than to develop directly on the final target embedded hardware. Thus, I began developing a distributed simulation in which each simulation node behaves like a wireless sensor node (along with its inherent communications limitations) and, as such, interacts with all other simulation nodes within a LAN. This approach provided truly asynchronous behavior and actual network communication between nodes, which enabled better emulation of real wireless sensor network behavior.

So, why would I want to build a Beowulf cluster using Raspberry Pis? The Raspberry Pi has a relatively slow CPU by modern standards. It has limited RAM, slow USB-based 10/100 Ethernet, and its operating system runs directly from an SD card. None of these "features" is ideal for a cluster computer! Well, there are several reasons. First, when your dissertation work requires the use of a cluster, it is nice to ensure that there is one available all the time. Second, RPis provide a unique feature in that they have external low-level hardware interfaces for embedded systems use, such as I2C, SPI, UART, and GPIO. This is very useful to electrical engineers (like myself) who need to test embedded hardware on a large scale. Third, having user-only access to a cluster (which is the case for most student-accessible systems) is fine if the cluster has all the necessary tools installed. If not, however, you must work with the cluster administrator to get things working. Thus, by building my own cluster I could directly outfit it with anything I might need. Finally, RPis are cheap! The RPi platform has to be one of the cheapest ways to create a cluster of 32 nodes. The cost for an RPi with an 8GB SD card is ~$45. For comparison, each node in one of the clusters available to students here at BSU was about $1,250. So, for not much more than the price of one PC-based node, I could create a 32-node Raspberry Pi cluster!
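The cost argument above can be checked with a few lines of arithmetic. This is only a rough sketch using the approximate per-node prices quoted in the description; it counts compute nodes alone, and the variable and function names are illustrative, not taken from the project.

```python
# Approximate figures quoted in the description above.
RPI_NODE_COST_USD = 45     # one RPi with an 8GB SD card
PC_NODE_COST_USD = 1250    # one node of the BSU student cluster
NODE_COUNT = 32


def cluster_cost(node_cost, nodes):
    """Cost of the compute nodes only (no switches, power supplies, or case)."""
    return node_cost * nodes


rpi_total = cluster_cost(RPI_NODE_COST_USD, NODE_COUNT)
pc_total = cluster_cost(PC_NODE_COST_USD, NODE_COUNT)

# How many PC-based nodes the whole RPi cluster costs.
ratio_to_one_pc_node = rpi_total / PC_NODE_COST_USD

print(f"32 RPi nodes: ${rpi_total}")
print(f"32 PC nodes:  ${pc_total}")
print(f"RPi cluster costs {ratio_to_one_pc_node:.2f}x one PC node")
```

Under these figures the 32 RPi nodes come to roughly 15% more than a single PC-based node, which matches the "not much more" claim.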

Update: The BeagleBone Black was not available when I started this project, but had it been, I would have chosen it over the Raspberry Pi. It costs the same once you include an SD card for the RPi, but it has 2GB of onboard flash storage for the operating system. It also uses a Cortex-A8 ARM processor running at 1GHz.

Cluster Performance:
I measured basic computing performance in a number of ways (see the paper). MPI performance was measured using HPL (http://www.netlib.org/benchmark/hpl/). The RPiCluster achieved 10+ GFLOPS peak with 32 nodes running HPL. The single 3.1GHz Xeon E3-1225 (quad-core) system I used for comparison showed about 40 GFLOPS peak (when the HPL problem was optimized for the Xeon system).
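To put the peak numbers above in per-node terms, here is a small back-of-the-envelope sketch. It treats "10+ GFLOPS" as exactly 10 (a lower bound) and assumes perfectly linear scaling with node count, which real HPL runs over 10/100 Ethernet would not achieve; the names are illustrative.

```python
# Peak HPL figures quoted above (approximate).
CLUSTER_GFLOPS = 10.0   # "10+ GFLOPS", treated here as a lower bound
CLUSTER_NODES = 32
XEON_GFLOPS = 40.0      # single quad-core Xeon E3-1225 system

# Average contribution of one RPi node to the cluster's peak.
gflops_per_node = CLUSTER_GFLOPS / CLUSTER_NODES

# Nodes needed to match the Xeon, ASSUMING linear scaling (optimistic).
nodes_to_match_xeon = XEON_GFLOPS / gflops_per_node

print(f"Per-node peak: {gflops_per_node:.4f} GFLOPS")
print(f"Nodes to match the Xeon (ideal scaling): {nodes_to_match_xeon:.0f}")
```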

When I run the HPL problem that achieves 10 GFLOPS on the RPiCluster, the Xeon system achieves only about 2 GFLOPS. This is because the HPL problem size is so large that it causes paging on the Xeon system. The Xeon system has 8GB of RAM (~6GB usable after OS, etc.) whereas the RPiCluster has about 16GB of RAM (~15GB usable after OS, etc.).
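The paging effect can be illustrated with a rough memory estimate. HPL factors a dense N x N matrix of double-precision values, so the matrix alone needs about 8 * N^2 bytes. The sketch below assumes the problem is sized to fill all usable memory and that "GB" means 2**30 bytes; both are simplifying assumptions, and the actual problem sizes used are in the paper, not here.

```python
import math

BYTES_PER_DOUBLE = 8
GIB = 2**30  # assumption: the description's "GB" is treated as 2**30 bytes

# Usable memory figures quoted above (approximate).
CLUSTER_USABLE_BYTES = 15 * GIB
XEON_USABLE_BYTES = 6 * GIB


def matrix_bytes(n):
    """Memory for a dense n x n double-precision matrix."""
    return BYTES_PER_DOUBLE * n * n


def max_problem_size(usable_bytes):
    """Largest n whose n x n matrix of doubles fits in usable_bytes."""
    return math.isqrt(usable_bytes // BYTES_PER_DOUBLE)


n_cluster = max_problem_size(CLUSTER_USABLE_BYTES)
n_xeon = max_problem_size(XEON_USABLE_BYTES)

print(f"Largest N fitting the cluster's memory: {n_cluster}")
print(f"Largest N fitting the Xeon's memory:    {n_xeon}")

# A problem sized for the cluster overflows the Xeon's RAM, forcing paging.
overflow_bytes = matrix_bytes(n_cluster) - XEON_USABLE_BYTES
print(f"Overflow on the Xeon: {overflow_bytes / GIB:.1f} GiB")
```

Any matrix sized close to the cluster's limit is well over twice what the Xeon can hold in RAM, so much of it spills to disk and throughput collapses.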

More information: http://coen.boisestate.edu/ece/raspbe...

  • License

    Standard YouTube License
