UH Researcher Contributes to Popular Software Package Open MPI 2.0.0

Contributions Helped Increase Speed of Reading and Writing Files

Today’s biggest challenges in science are being solved, in part, by high performance computers. From predicting the weather to simulating drug design, high performance computers provide the speed and power to answer these big questions.

Edgar Gabriel, associate professor of computer science in the College of Natural Sciences and Mathematics, contributed to the release of Open MPI 2.0.0, software that allows the different components within a high performance computer to communicate with each other.

This new release includes major contributions by Gabriel’s research group, which are the result of six years of research and development. Open MPI 2.0.0 contains features that account for newer technologies while still retaining a user-friendly interface.

High-Performance Computers: A Conglomeration of Computers

High performance computers are a conglomeration of many, many computers, with each computer called a node. Getting high performance computers to operate efficiently is a huge logistical challenge.

Solving problems using high performance computers requires breaking up the initial data into smaller subproblems, then allocating these subproblems to different nodes, which then run individual calculations. Then, the results from the many, many calculations get bundled up and released in the form of answers. The ability to run these problems in parallel is what gives high performance computers their speed and power.
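The split, compute and combine pattern described above can be sketched in a few lines of Python. This is only an illustration, not Open MPI code: the threads stand in for nodes, and the function names and the sum-of-squares workload are hypothetical choices made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_nodes):
    """Divide data into num_nodes contiguous, nearly equal chunks."""
    base, extra = divmod(len(data), num_nodes)
    chunks, start = [], 0
    for rank in range(num_nodes):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def node_work(chunk):
    """Hypothetical per-node calculation: a sum of squares."""
    return sum(x * x for x in chunk)


def run_in_parallel(data, num_nodes):
    """Hand one chunk to each worker, then combine the partial results."""
    chunks = split_into_chunks(data, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_results = list(pool.map(node_work, chunks))
    return sum(partial_results)
```

Calling `run_in_parallel(list(range(10)), 3)` splits ten numbers across three workers and adds up their partial sums. On a real high performance computer, each chunk would live on a separate node and the combining step would travel over the network.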

The calculations being performed by different nodes often have to communicate with each other, as solving one subproblem may depend on results from another subproblem.

For example, calculations for weather predictions are accomplished by breaking the predictions down into smaller areas, then calculating weather changes in each small area. However, since weather patterns move from one area to another, the calculations in each area are in part dependent on the changes in the weather nearby.
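The weather example can be reduced to a toy version: a row of cells in which each cell's new value depends on its two neighbors. In the sketch below, which is an assumption-laden illustration rather than a real weather model, each simulated node borrows one boundary cell from each neighboring area before calculating. The function names and the averaging rule are hypothetical.

```python
def smooth(values):
    """Average each interior cell with its two neighbors; the ends stay fixed."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out


def smooth_decomposed(values, num_nodes):
    """Same calculation, but split across simulated nodes."""
    n = len(values)
    # Cell ranges owned by each node: [bounds[r], bounds[r + 1])
    bounds = [n * r // num_nodes for r in range(num_nodes + 1)]
    result = []
    for rank in range(num_nodes):
        lo, hi = bounds[rank], bounds[rank + 1]
        # Borrow one cell from each neighboring area. On a real machine
        # this is the step that requires messages between nodes.
        halo_lo = max(lo - 1, 0)
        halo_hi = min(hi + 1, n)
        updated = smooth(values[halo_lo:halo_hi])
        # Keep only the cells this node owns; discard the borrowed ones.
        result.extend(updated[lo - halo_lo:hi - halo_lo])
    return result
```

With the borrowed cells in place, `smooth_decomposed(values, 3)` returns the same list as `smooth(values)`. Without them, the cells at the edge of each area would be computed from incomplete information.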

Open MPI, which stands for Open Message Passing Interface, coordinates all of this: directing the initial data into the nodes, facilitating communication among nodes during the calculation process, and then writing the results into an output file.

Reading and Writing Files: A Limiting Factor in Speed

“One of the limiting factors in a high performance computer’s speed is the ability to read input files and write result files quickly. Otherwise this step takes up a big chunk of the overall time,” Gabriel said. “One of our contributions was to develop new techniques to solve this problem.”

Gabriel was involved in the initial development of Open MPI in 2003, an experience he describes as an “enormously satisfying” gathering of experts “all banging their heads against each other and against the wall to agree on something.”

His research for the 2.0.0 release focused on maximizing the efficiency in two areas: reading the initial input files and writing the result files.

Accessing Data in the Right Order Improves Performance

“If you access data in the right order, then you will have significantly better performance,” Gabriel said. “If you don’t have the right order, then you end up wasting a lot of time because the disk has to rotate first into the correct position.”

On a single computer, accessing files in the wrong order simply leads to a brief stall; the computer freezes up or slows down. In a high-performance computer with thousands of nodes, accessing data in the wrong order leads to huge delays.

Gabriel’s contributions focused on organizing and optimizing how high performance computers access data. This included organizing the input of initial data into individual nodes, as well as the output of results.

“The key difference in our approach was to take advantage of the fact that we have the access pattern of an entire group of nodes,” Gabriel said. This approach has led to increased efficiency of high-performance computers, by helping to reduce the time it takes to read input files and write output files.
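One way to picture the benefit of knowing the access pattern of a whole group of nodes is the sketch below. It is a simplified illustration and not the technique implemented in Open MPI 2.0.0, whose details the article does not describe. It assumes each node asks for a list of (offset, length) pieces of a file; the function names are hypothetical.

```python
def merge_requests(requests):
    """Sort (offset, length) requests and merge those that touch or overlap."""
    merged = []
    for offset, length in sorted(requests):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # Continues the previous range, so extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


def plan_collective_read(requests_per_node):
    """Pool every node's requests, then plan reads in file order."""
    all_requests = [req for node in requests_per_node for req in node]
    return merge_requests(all_requests)


# Three simulated nodes, each reading every third 4-byte record.
requests_per_node = [
    [(0, 4), (12, 4)],   # node 0
    [(4, 4), (16, 4)],   # node 1
    [(8, 4), (20, 4)],   # node 2
]
plan = plan_collective_read(requests_per_node)
```

Viewed one node at a time, the example consists of six small, scattered reads. Viewed as a group, the requests cover one contiguous 24-byte region, so `plan` holds a single read that proceeds through the file in order.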

“Our computers are getting bigger and faster, while the problems that we are trying to solve are getting more and more complex. As these high-performance computers get larger, avoiding these steps where the computer slows down, such as reading large input files, is actually getting more challenging,” Gabriel said.

Contributions also included the work of many students in Gabriel’s research group, including four graduate students devoted full time to the Open MPI project over multiple years. Other major contributors to Open MPI 2.0.0 were Los Alamos National Laboratory, University of Tennessee, Cisco Systems, Intel and IBM.

- Rachel Fairbank, College of Natural Sciences and Mathematics