Research Areas - Scientific Computing Research in the Computer Science Department at Carnegie Mellon
Scientists seek to understand nature by using a mix of theory, experimentation, and computer modeling. Theorists explain things using mathematical models such as partial differential equations. Experimentalists measure natural phenomena and collect and analyze the resulting experimental data. Computer modelers develop computer programs that produce synthetic data, which can then be collected and analyzed.
Figure 1: The computer modeling process.
Starting with some real-life phenomenon, called the physical system, computer modeling consists of a series of transformations from one intermediate model to another (Figure 1). The physical system is first abstracted into a physical model that identifies the scale and scope of the phenomenon of interest. The physical model is then transformed into a mathematical model, typically a set of continuous partial differential equations, which often have no closed-form solutions. In these cases, the mathematical model must be transformed into a discrete numerical model (such as a mesh) that can be solved on a computer. The numerical model is then transformed into a computer model, which is a program (often called a solver or a simulation) that produces synthetic data approximating the behavior of the physical system. Finally, the raw synthetic data from the solver is validated against experimental data, reduced, and transformed into a more compact representation (such as a graph or an image) that hopefully provides the modeler with some insight into the original physical system.
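This pipeline can be made concrete with a toy example. The sketch below (illustrative only, not code from any of the projects described here; all parameter choices are arbitrary) takes the 1D heat equation as the mathematical model, discretizes it on a uniform grid as the numerical model, and time-steps it as the computer model to produce synthetic data.

```python
import numpy as np

# Mathematical model: the 1D heat equation u_t = alpha * u_xx on [0, 1].
# Numerical model: finite differences on a uniform grid (a 1D "mesh").
# Computer model (solver): explicit time stepping producing synthetic data.

alpha = 1.0                 # diffusivity (physical-model parameter)
nx, nt = 51, 500            # grid points and time steps (numerical-model choices)
dx = 1.0 / (nx - 1)
dt = 0.4 * dx**2 / alpha    # satisfies the explicit stability limit dt <= dx^2 / (2 alpha)

x = np.linspace(0.0, 1.0, nx)
u = np.sin(np.pi * x)       # initial condition from the physical model

for _ in range(nt):
    # Second-order centered difference for u_xx; endpoints held at 0 (Dirichlet).
    u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2])

# "Synthetic data" to validate: the exact solution is
# sin(pi x) * exp(-pi^2 alpha t), so the peak should decay accordingly.
t_final = nt * dt
print(u.max())  # ≈ exp(-pi**2 * t_final) ≈ 0.45
```

Validating the solver's output against the known exact solution mirrors, in miniature, the final validation step of Figure 1.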
Scientific computing is the research area that tackles problems associated with the activities along the bottom row of Figure 1: (a) building numerical models, (b) building computer models, and (c) querying and analyzing experimental and synthetic data. By nature, scientific computing is broad and multi-disciplinary, drawing on aspects of physics, applied mathematics, algorithms, computational geometry, parallel and distributed computing, database systems, graphics, signal processing, and visualization.
Researchers in CSD span all aspects of scientific computing. Blelloch, Miller, and O’Hallaron work on problems of representing and generating massive hexahedral and tetrahedral meshes. James is pioneering new methods for building special data-driven numerical models that enable him to quickly and accurately simulate complex moving and colliding objects. O’Hallaron develops software tools that simplify the construction of large-scale parallel finite element simulations. Blelloch and Miller are developing new Lagrangian methods for creating meshes that move along with the objects they are modeling. Ailamaki (adjunct) and O’Hallaron work on new techniques for representing, indexing, and querying meshes and data that are stored in databases.
Much of our research is performed under the umbrella of larger projects that provide both a specific application focus, and a social and funding mechanism for working with partners in the domain sciences and engineering. For example, O’Hallaron has worked with earth scientists and civil engineers for over a decade on the Quake project, which is pioneering techniques for modeling the motion of the ground during strong earthquakes in regions like the Los Angeles basin. Blelloch and Miller work with mathematicians, engineers and physicians on the Sangria project, which is pioneering methods for modeling the flow of blood. Ailamaki (adjunct) is working with astronomers and astrophysicists on database techniques for the World-Wide Telescope (WWT), a virtual observatory that federates astronomy and astrophysics databases at a global scale. James works with engineers at Boeing on the interactive simulation of the parts of the 777 aircraft that jiggle, such as wires and hoses.
There are a number of broad, crosscutting themes that run throughout our work:
Modeling things that move. Although not part of any grand scheme, we tend as a group to be interested in things that move. For example, modeling the motion of the ground during earthquakes (O’Hallaron), modeling blood flow with moving meshes (Blelloch and Miller), modeling colliding chairs, stampeding animal herds, and fields of wind-blown irises (James).
Working closely with domain scientists and engineers. It is difficult to make an impact in scientific computing without developing close ties with the domain scientists and engineers who will be the ultimate customers. As we shall see, our group is very good at establishing these ties with other Carnegie Mellon departments as well as with external groups.
Exploiting the memory hierarchy. With processor speeds charging ahead of disk and memory speeds, the memory hierarchy is having an increasingly important effect on the performance of scientific computing programs. The importance of this trend is reflected in our work, including mesh representations with good locality of reference (Blelloch), computational database systems where the entire scientific computing process works by querying and updating databases (O’Hallaron and Ailamaki (adjunct)), and automatic cache-friendly schema design for astronomy databases (Ailamaki (adjunct)).
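As an illustration of the locality theme, one common trick is to renumber mesh nodes along a space-filling curve so that spatially nearby nodes also end up nearby in memory. The sketch below (a generic illustration, not the specific representation used in the work above; the node coordinates are made up) computes 2D Z-order (Morton) keys by bit interleaving and sorts a handful of nodes by them.

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of integer coordinates (x, y) into a single
    Z-order (Morton) key; spatially nearby points tend to get nearby keys."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits at even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits at odd positions
    return key

# Renumber mesh nodes so that spatially nearby nodes are stored near each
# other in memory, improving locality of reference for mesh traversals.
nodes = [(3, 5), (3, 4), (12, 1), (2, 5), (13, 0)]
order = sorted(range(len(nodes)), key=lambda i: morton2d(*nodes[i]))
print(order)  # the three nodes clustered near (3, 5) come before the two near (13, 0)
```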
1 Earthquake Ground Motion Modeling (Quake)
O’Hallaron, Bielak (CEE), and Ghattas (CEE) founded the Quake project in 1993, and have worked together continuously since then. The ambitious goal of the project is to perform accurate 3D simulations of ground motion during strong earthquakes in the Los Angeles Basin of Southern California. The group is currently partnered with the Southern California Earthquake Center (SCEC), a consortium of dozens of universities and government agencies centered at the University of Southern California.
The Quake project has had a significant impact on both the earth science and the computer science communities. O’Hallaron and Shewchuk (a student co-advised by O’Hallaron and Miller and now a tenured associate professor at Berkeley) developed a toolkit called Archimedes that automates the process of building parallel unstructured finite element models. Earth scientists have used it to successfully simulate the 1994 Northridge and 1995 Kobe quakes, among others. Because of their accuracy, Archimedes models are now recognized as the standard reference models by the SCEC ground motion community. As part of his award-winning 1997 thesis, Shewchuk developed a 2D mesh generator called Triangle that became the world standard mesh generator, used by thousands around the globe, eventually winning the 2003 Wilkinson Prize for Numerical Software. O’Hallaron developed a benchmark program (183.EQUAKE) based on the Archimedes solver that was included in the influential SPEC CPU2000 benchmark suite. Later, parallelized versions of 183.EQUAKE were included in the SPEC OMP2001 benchmark suite.
The members of the Quake project received the 1998 SCS Allen Newell Award and the 2004/2005 CIT Outstanding Research Award, and in 2003 received the prestigious Gordon Bell Prize for Special Achievement, which is awarded through an international competition at the SC conference.
2 Blood Flow Modeling (Sangria)
Ghattas, Miller, Blelloch, and Walkington founded the Sangria Project, which brings together applied mathematicians, biochemists, bioengineers, computational fluid dynamicists, computer scientists, continuum mechanicists, hemorheologists, numerical analysts, and transplant surgeons at Carnegie Mellon University, the University of Pittsburgh Medical Center (UPMC), and the University of Washington (Antaki, Blelloch, Ghattas, Griffith, Kamenava, Miller, Turkiyyah, Walkington, Wu). The project aims to develop and apply advanced parallel geometric and numerical algorithms and software for simulating complex flows with dynamic interfaces, in particular the flow of blood, which is a mixture of interacting gel-filled solid cells and fluid plasma. A major challenge faced in simulating such flows is resolving the moving interfaces between the cells and the plasma. Lagrangian methods are ideally suited for such problems, since interfaces are naturally represented and propagated. However, the motion results in meshes whose nodes and elements move with the objects they are modeling, dynamically changing their shapes. Such meshes can become hopelessly distorted over time unless they are regularly regenerated.
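To make the mesh-distortion problem concrete, here is a minimal sketch (illustrative, not the Sangria project's actual criterion) of a standard triangle quality measure, normalized area over the sum of squared edge lengths, together with a simple trigger that flags a mesh for regeneration when any element's quality drops below a threshold.

```python
import math

def triangle_quality(a, b, c):
    """Scale-invariant quality in (0, 1]: 1 for an equilateral triangle,
    approaching 0 as the triangle degenerates into a sliver."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    area2 = abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))  # 2 * area
    l2 = ((bx - ax)**2 + (by - ay)**2 +
          (cx - bx)**2 + (cy - by)**2 +
          (ax - cx)**2 + (ay - cy)**2)                          # sum of squared edges
    return 2.0 * math.sqrt(3.0) * area2 / l2 if l2 > 0 else 0.0

def needs_remesh(nodes, triangles, threshold=0.3):
    """After the nodes have moved, regenerate the mesh if any element
    has become too distorted."""
    return min(triangle_quality(*(nodes[i] for i in t)) for t in triangles) < threshold
```

In a Lagrangian simulation, a check like this would run after each time step: as the moving nodes flatten elements into slivers, the minimum quality decays until the remeshing machinery must rebuild the mesh.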
Blelloch and Miller and their colleagues have developed groundbreaking new techniques for dynamically regenerating these moving meshes. Their results are exciting because they enable domain scientists to study important problems such as the potential for clotting in new artificial heart designs.
The results from the Sangria project build upon years of theoretical and algorithmic research, most recently under the umbrella of the Aladdin Center. Miller is a top researcher in spectral graph theory. Blelloch has worked for a number of years on methods for parallel Delaunay triangulation and on compact and efficient ways to represent and query unstructured tetrahedral meshes.
3 Computational Database Systems (CoDS)
This is a recent spinoff (2004) from the Quake project, funded by the NSF SEI+II program, which is developing tools and techniques for performing the entire scientific computing process directly on databases (Ailamaki (adjunct), Ganger, and O’Hallaron). One of the breakthroughs is a computational database system, called Weaver, that is able to generate massive octree meshes with billions of elements on a desktop Linux box in a few hours. For many years, the Quake project was stuck at meshes with 100 million elements; it was impossible to generate larger ones because existing mesh generators required that the entire mesh be stored in main memory. Weaver smashed that barrier and enabled the project to perform the simulations that led to the 2003 Gordon Bell Award.
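The idea behind octree meshing can be sketched in a few lines. The following toy example (a 2D quadtree analogue for brevity, with a made-up refinement criterion; it is not Weaver's algorithm) recursively subdivides cells until a user-supplied predicate says the local resolution is sufficient.

```python
def refine(cell, depth, max_depth, need_refine, leaves):
    """Recursively subdivide a square cell (x, y, size) until need_refine
    says it is fine enough (or max_depth is reached); collect leaf cells."""
    x, y, size = cell
    if depth == max_depth or not need_refine(x, y, size):
        leaves.append(cell)
        return
    half = size / 2.0
    for dx in (0.0, half):
        for dy in (0.0, half):
            refine((x + dx, y + dy, half), depth + 1, max_depth, need_refine, leaves)

# Example: refine more finely near the origin (standing in for a region,
# such as a soft soil basin, that needs higher resolution).
leaves = []
refine((0.0, 0.0, 1.0), 0, 5,
       lambda x, y, s: (x * x + y * y) < 0.1 and s > 0.1,
       leaves)
```

The resulting leaf cells tile the domain exactly, with small cells clustered where the predicate demanded resolution; an out-of-core generator applies the same recursion while keeping only a sliver of the tree in memory at a time.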
Recently, we developed a system called Hercules, a novel, massively parallel, end-to-end system for octree-based finite element simulations that includes an in situ parallel mesh generator, mesh partitioner and load balancer, solver, and visualization system in one integrated package. The system is currently deployed at SCEC and at PSC. Some of us (O’Hallaron and Ailamaki (adjunct)) are also conducting new research on algorithms and indexing schemes that will allow for efficient queries of massive unstructured 3D tetrahedral meshes.
4 Automated Scientific Data Management
In astronomical datasets such as the Sloan Digital Sky Survey, the data resides in a massive table with hundreds of columns and millions of rows, where each row represents an astronomical object. Queries of such large tables are inefficient unless enough redundant data structures (e.g., indexes) are built on them, which may double or triple the size of the database. Working with astronomers and astrophysicists (Connelly and Nicol), Ailamaki (adjunct) has developed a novel tool, called Autopart, that automatically designs the appropriate schema for a scientific database given some representative queries, thus relieving astronomers of having to become database experts in order to query their data.
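A simple flavor of query-driven schema design can be sketched as follows. This toy heuristic (an illustration of vertical partitioning in general, not Autopart's actual algorithm; the workload and column names are made up) groups columns by the exact set of queries that reference them, so each query reads narrower tables.

```python
from collections import defaultdict

def vertical_partitions(queries):
    """Group columns by the exact set of queries that reference them: a
    simple affinity heuristic in which columns that are always accessed
    together land in the same partition."""
    usage = defaultdict(set)
    for qid, cols in enumerate(queries):
        for col in cols:
            usage[col].add(qid)
    groups = defaultdict(list)
    for col, qids in usage.items():
        groups[frozenset(qids)].append(col)
    return list(groups.values())

# A hypothetical workload over an astronomy table: each query is the set
# of columns it touches.
workload = [
    {"objID", "ra", "dec"},          # positional query
    {"objID", "ra", "dec", "z"},     # redshift cone search
    {"objID", "u", "g", "r"},        # photometry query
]
partitions = vertical_partitions(workload)
```

Here "ra" and "dec" land in one partition and the photometric columns in another, so the positional queries never pay to scan photometry data. A real tool must also weigh row widths, join costs, and replication, which this sketch ignores.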
A recent (2004) continuation of the Autopart effort, funded by NSF SEI+II, involves Ailamaki (adjunct) and computer scientists and astronomers from Johns Hopkins University (Burns, Szalay, and Thakar). The aim is to provide smart caching of data for the World-Wide Telescope (WWT), a virtual observatory that federates astronomy and astrophysics databases at a global scale, with the goal of making it available to everyone. In its current form, however, increasing the number of sites and users in the WWT leads inevitably to a network crisis. To avert the crisis, the project is developing caching technology based on bypass-yield caching and self-organizing database storage. Ailamaki (adjunct) is leading the effort in self-organizing database storage, which automates storage management and database organization, turning the cache into an administration-free appliance.
5 Data-Driven Approaches for Accelerating the Physical Simulation of Deformable Systems
James is developing novel and powerful data-driven approaches for accelerating the simulation of systems whose objects move and jiggle and collide. The basic idea is to precompute a numerical model that is tailored to a specific application, and then use that model to speed up the simulation and rendering steps. For example, depending on the application, the precomputed models might be impulse response functions, polynomials in some reduced coordinate space, precomputed databases, or parametric models. Such models can then be applied to applications such as non-linear dynamics, collision detection, global illumination, hardware rendering, sound processing, haptic rendering, and even distributed computing (by reducing the amount of information that needs to be transferred over the wire).
Since arriving at Carnegie Mellon three years ago, James has successfully applied this data-driven technique to a number of applications. Using a precomputed dynamics database and reduced-coordinate motion libraries of plants blowing under various wind conditions, he is able to visualize beautiful outdoor scenes such as fields with thousands of wind-blown flowers, under dynamic wind, lighting, and viewing conditions, all at interactive rates. In another application, he uses precomputed reduced-coordinate parametric shape models to simulate a cascade of thousands of flexible, colliding lawn chairs in several hours on a personal computer, instead of months. He also combines collision detection with reduced-coordinate cubic polynomials to enable force-feedback rendering of distributed contact with complex large-deformation systems at kilohertz rates.
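The precompute-then-reduce idea can be sketched with a standard subspace construction. The toy example below (illustrative only, not James's method; the snapshot data is synthetic) extracts a low-dimensional basis from full-state snapshots via the SVD, so the online phase advances only a few reduced coordinates and reconstructs the full state on demand.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline: collect full-state snapshots from an expensive simulation.
# Here we fabricate 1000 snapshots of a 200-DOF system that secretly
# lives in a 3-dimensional subspace.
snapshots = rng.standard_normal((1000, 3)) @ rng.standard_normal((3, 200))

# Extract a reduced basis from the centered snapshots via the SVD.
U, S, Vt = np.linalg.svd(snapshots - snapshots.mean(0), full_matrices=False)
k = int(np.sum(S > 1e-8 * S[0]))   # effective rank of the snapshot data
basis = Vt[:k]                     # k x 200 reduced basis

# Online: advance only the k reduced coordinates instead of all 200 DOF,
# then reconstruct the full state for rendering or collision queries.
q = np.zeros(k)                    # reduced coordinates
q += 0.1                           # stand-in for a cheap reduced-space time step
full_state = q @ basis             # reconstructed 200-DOF configuration
```

The payoff is that the online cost scales with k rather than with the full number of degrees of freedom, which is what makes interactive rates feasible for systems that would otherwise take hours per frame.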
The work has already had significant impact in academia and industry. All of the results were reported at SIGGRAPH, the dominant graphics conference. James is currently working with Boeing commercial aerospace on the interactive simulation of parts in the 777 aircraft that deform, such as wires and hoses. The collision detection work was featured in the Aug 2004 edition of Wired Magazine.