Doctoral Thesis Proposal - Margarida Ferreira

June 2, 2025, 10:00am - 11:30am
Location: In Person and Virtual - ET - Traffic21 Classroom, Gates Hillman 6501 and Zoom

Speaker: MARGARIDA FERREIRA, Ph.D. Student, Computer Science Department, Carnegie Mellon University
https://marghrid.github.io/

Synthesis of Stateful Programs from Execution Traces

Execution traces are a valuable source of information in modern computing systems. They are continuously collected and used for system debugging, monitoring, and optimization, and they capture behavior across diverse scenarios, from routine operations to edge cases. This thesis investigates how execution traces can serve as specifications for program synthesis, enabling reverse engineering and analysis of complex systems and automation of traditionally manual tasks without explicit user input.

This proposal presents three synthesis frameworks, Abagnale, Syren, and HyGLAD, that illustrate the challenges of this problem across multiple applications and how we overcome them.

Abagnale reverse-engineers the behavior of congestion control algorithms (CCAs) from network traces. Network traces contain no information about the implementation of the CCA; they display only the effects of its execution on the network. Thus, Abagnale must simulate each candidate solution under the same network conditions to assess whether it exhibits the same behavior. To capture all the different behaviors, we work with traces showing hundreds of executions, making trace filtering and parallelization paramount to Abagnale's viability.

Syren allows users to generate arbitrary programs from partial traces that contain some of the function calls made by the program. Syren uses optimizing rewrites to introduce control flow in the program. These rewrites track the data used by the functions visible in the trace; this data is then used to generate the function calls not visible in the trace with an example-based, syntax-guided synthesizer.

HyGLAD synthesizes, from execution logs, regex-based anomaly filters that flag deviations from a system's expected behavior. In this case, our goal is not to reverse-engineer the system itself but to synthesize a model of its execution.

As future work, we propose to develop a fourth synthesis approach to automate data-aware business processes. We will use logs collected from human-executed processes as traces and synthesize implementations that model the task logic, filtering out the inconsistencies and errors that are unavoidable in human-generated logs.

Thesis Committee
Ruben Martins (Co-chair)
Inês Lynce (Co-chair, Instituto Superior Técnico)
Justine Sherry
Fraser Brown
João F. Ferreira (Instituto Superior Técnico)
Nate Foster (Cornell University)

Additional Information
In Person and Zoom Participation. See announcement.