Research Areas - Formal Methods Research in the Computer Science Department at Carnegie Mellon
CSD faculty: Jonathan Aldrich (ISR), Stephen Brookes, David Brumley (ECE), Randy Bryant, Edmund Clarke, Anupam Datta (ECE), David Garlan, Robert Harper, Frank Pfenning, Andre Platzer, John Reynolds, Reid Simmons, Jeannette Wing,
Logical errors in computer hardware and software can have significant economic and societal impact. The Pentium floating point error cost Intel an estimated 500 million dollars. In 2002 NIST reported that software errors cost the US economy roughly 60 billion dollars per year.
Errors in embedded systems that are used more and more in safety-critical applications, such as “drive-by-wire” and implantable medical devices, can lead to loss of human life.
The formal methods group at Carnegie Mellon aims to help hardware and software engineers build more reliable systems. Formal methods are mathematically-based techniques for specifying and verifying system properties. Specification is the process of describing a system and its desired properties; verification, the process of showing that a system satisfies these properties. The mathematics underlying specification and verification draw on formal logic and discrete mathematics. The precision of mathematics lets us reveal ambiguity, incompleteness, and inconsistency in a system’s design. Its mathematical basis gives formal methods a crucial advantage over informal ones: mechanization. Our techniques scale beyond what people can do by hand and avoid fallibility of human reasoning.
At Carnegie Mellon we promote lightweight formal methods. Rather than try to specify all properties of an entire system and attempt to do a complete proof of correctness, we advocate specifying critical properties of a critical part of a system and focus on finding errors, rather than prove correctness. “Spec ’n Check” is our mantra.
At Carnegie Mellon our primary strength in formal methods is model checking. Model checking is a technique that relies on building a finite model of a system and checking that a desired property holds in that model. Roughly speaking, the check is performed as an exhaustive state space search that is guaranteed to terminate since the model is finite. The technical challenge in model checking is in devising algorithms and data structures that allow us to handle large search spaces. Model checking is fast, automatic, and supports partial specifications; above all, its tour de force is that it produces counterexamples, which usually represent subtle errors in design, and thus can be used to aid in debugging. Model checking is a proven success in hardware and protocol verification; the current trend is to apply model checking to software.
1 Current Research
In the mid-80s Carnegie Mellon researchers made a major breakthrough in the state space explosion challenge of model checking. The full story is in our Distinguishing Characteristics section. In a nutshell, McMillan, a Carnegie Mellon graduate student, used Bryant’s ordered binary decision diagrams to represent the states and state transitions in Clarke’s model checking technique. This enabled model checkers to go from handling 104 states to 1020 states and beyond. BDDs became the data structure of choice for building symbolic model checkers throughout the next two decades of research.
At Carnegie Mellon, we continue to investigate new algorithms and data structures (Bryant, Clarke, Simmons, Wing), not just for pushing the limits of model checking, but toward the grander goal of checking the more and more complex systems we face in the future.
Our techniques include: binary moment diagrams, bounded model checking, compositional reasoning, counterexample explanation, counterexample-guided abstraction and refinement, generation of graphs of counterexamples, model checking real-time systems, parameterized model checking, predicate abstraction, and probabilistic verification of discrete-event systems. We also are looking at integrating other verification techniques (Bryant, Clarke) such as boolean satisfiability solvers, decision procedures, and proof checking with our work.
In the area of specification, we develop new models, logics, and languages (Brookes, Reynolds, Wing) for expressing system properties and system behavior. Most recently we have been investigating how to extend Reynolds’s separation logic, used for reasoning about mutable data structures, to Brookes’s trace semantics, used for modeling concurrent programs. (See Section on Principles of Programming for further details on trace semantics and separation logics.)
Our strength in model checking built from an original focus on verifying digital hardware. Using symbolic model checking we were the first to find nontrivial errors in an IEEE standard when we verified the cache coherence protocol in the IEEE Futurebus+ Standard. We showed how the corrected Pentium floating point division algorithm could be verified using model checking. We continue to develop techniques for verifying circuits by symbolic simulation, with levels of abstraction ranging from transistors to very high-level representations (Bryant, Clarke). More recently, we have developed more abstract representations of hardware, where words of data are viewed as symbolic (possibly integer) values, rather than collections of bits. This representation enables a verification tool to focus on the complexities of the control logic while using a vastly simpler view of data. With our approach we can verify out-of-order microprocessors and other hardware systems that are far beyond tools such as finite-state model checkers. Outcomes of this research include: a new class of decision procedures for fragments of first-order logic based on an eager encoding into propositional logic; the verification tool UCLID for modeling and verifying infinite-state systems; and new approaches to predicate abstraction and predicate discovery that have widespread application.
Model Checking Software Systems:
Given the success of model checking hardware systems, we have turned our attention to software systems. Early work (Wing) on verifying protocols in distributed file systems, networking protocols, and failure recovery protocols for RAID systems has now evolved into two more ambitious thrusts: model checking source code (Bryant, Clarke) and model checking software architectures (Garlan, Simmons). We are currently developing software model checkers: one for Microsoft bytecode that will support C, C++, C#, and Java; CBMC, a bounded model checker for C; and Magic, for modular analysis of concurrent C programs. We have applied CBMC to C models of ASICS provided by nVIDIA, part of a train controller from General Electric, and cryptographic algorithms. We have applied Magic to the Micro-C operating system, OpenSSL, an industrial interprocess communication module, and a metal casting controller.
Our long-standing interest in the design of languages for formally specifying software systems has moved from the level of specifying the behavior of program modules (Wing) to that of specifying software architectures (Garlan, Simmons). Wright is a formal architectural description language, based on process algebras, for describing the interactions between software components. Acme is a language and environment for rapid customization to architectural styles. We have applied such formalisms to describing task-level architectures for robots; debugging architectural standards such as the High-Level Architecture standard for distributed simulation; bridging component mismatches such as component wrappers; and reasoning about architectural styles such as publish-subscribe and self-healing systems.
Our current targets of application are in two areas: embedded and autonomous systems (Clarke, Garlan, Simmons, Wing) and security (Bryant, Clarke, Wing). Embedded and autonomous systems perceive and interact with a physical, and often adverse environment, e.g., Mars. They sense continuous variables such as air temperature and water pressure, and in response, digitally control devices. We face two technical challenges in verifying such systems: modeling their environment and modeling the interaction between a discrete system and a continuous system. Modeling an environment with a high degree of uncertainty has led us to invent stochastic probabilistic model checking and to study Markov models and game-theoretic models. Modeling interactions between a system and its environment has led us to explore hybrid systems which model both discrete modes and continuous dynamics. In this domain, we are applying our work to systems such as automotive drive-by-wire, chemical plants, insulin and infusion pumps, and mobile robots.
We have had a long-standing interest in the use of formal methods for security, dating to early work in the 80s on extending temporal logic for reasoning about security (Wing), to work in the 90s on the use of model checking and theory generation to reason about authentication and e-commerce protocols (Clarke, Wing), to work in the 00s on the use of model checking to generate attack graphs automatically (Wing). More recently, we have turned our attention to software security (Bryant, Clarke, Wing). We are looking at code-level vulnerabilities such as buffer overruns and format string errors, as well as design-level vulnerabilities such as software composition flaws. We are also studying the use of model checking to generate proof-carrying code; as a first step toward identifying a fragment of separation logic to handle C programs with pointers, we developed a proof checker for separation logic.
The formal methods group at Carnegie Mellon has collectively designed from scratch and regularly teach: four of the five core courses of the Carnegie Mellon Master’s of Software Engineering program, an undergraduate and graduate course on model checking, and a graduate course on separation logic. Clarke co-authored with Grumberg and Peled the book, Model Checking; and Garlan has co-authored two books on software architecture (one with Shaw and the other with SEI colleagues).
The Specification and Verification Center at Carnegie Mellon is directed by Wing, in collaboration with Clarke, Garlan, Simmons, and Krogh of ECE. Our current funding comes from ARO, the Gigascale Systems Research Center, NASA, NSF, ONR, the Semiconductor Research Consortium, plus small amounts from corporate sponsors such as General Motors and Microsoft. Through a large NASA grant, we recently joined forces with Aldrich and Scherlis of ISRI since our approaches, formal methods and program analysis, complement each other and since our target application domains, embedded systems and space software, share common dependability requirements.
At Carnegie Mellon, the formal methods group overlaps with the faculty groups in programming languages and software engineering. We also have outside collaborations with colleagues at Dortmund University, the Federal University of Minas Gerais (Brazil), the Food and Drug Administration, Ford Motor Company, Honeywell, the Jet Propulsion Laboratory, Microsoft, NASA/Ames, Queens University (Canada), the Swiss Federal Institute of Technology (ETH Zurich), the Technion (Israel), the Technical University of Munich, the Software Engineering Institute, University of Southern California, and the University of Washington.
As society becomes more and more dependent on software, companies and government agencies are recognizing the need for more dependable systems. Software companies such as Microsoft and traditional engineering companies such as Honeywell have embarked on “Software Excellence” initiatives. Our position of strength and leadership in formal methods at Carnegie Mellon attracts the attention of such companies who seek best and better practices in software engineering; thus we expect our work to play a key role in such initiatives. While we pursue making fundamental technical advances in the field, we also eagerly work with companies and government agencies to transfer our technology. We expect to see a growing demand for our research results, our students, and our educational programs in the future.
The major technical challenge to our field is the computational complexity caused by the state explosion inherent in analyzing hardware and software systems. Sources of this state explosion are concurrency, rich data types, sophisticated programming language features, and interactions with an unpredictable environment. One already fruitful direction we are taking is to integrate our formal analyses with traditional program analysis and automated theorem proving techniques.
The most pragmatic challenge remains: getting engineers to use formal methods. Toward this challenge, we need to design more perspicuous specification languages, to develop a suite of tools that together have greater coverage of a system’s behavior, and finally, to educate the next generation of software and hardware engineers by introducing formal methods into the standard computer science and engineering curricula.
|CSD Home Webteam ^ Top SCS Home|