5th Year Thesis Presentation - Frank Chen
September 4, 2025, 1:30pm - 3:00pm
Location: In Person - Newell-Simon 3305
Speaker: Frank Chen, Master's Student, Computer Science Department, Carnegie Mellon University
https://www.linkedin.com/in/zhuofan-chen-4167191b1/

Towards Utilizing Cached Context for Faster and Smarter Code Agents

Recent advances in Large Language Models (LLMs) have enabled the development of coding agents that can autonomously perform tasks such as code generation, debugging, and patch validation. Proprietary industry systems (e.g., Colab AI, Claude Code, Cursor Agent) and open-source frameworks (e.g., Gemini CLI, SWE-Agent, AutoCodeRover, Agentless) have demonstrated both practicality and popularity in real-world software engineering workflows. Despite these successes, existing agentic frameworks face two common challenges: providing accurate and complete context information, and ensuring low-latency patch generation under heavy workloads. Although prior work has proposed partial solutions addressing either context retrieval or latency, little attention has been paid to the joint optimization of both. Ideally, joint optimization should improve both performance and speed without incurring additional cost, which is particularly critical in modern fast-paced, iterative software engineering environments.

This thesis investigates integrating existing state-of-the-art approaches to these challenges, specifically RepoGraph for smart context retrieval on the frontend and CacheBlend for fast inference on the backend. To verify the feasibility of this integration, we evaluate the performance of the frontend across multiple LLMs under realistic code-agent scenarios, and measure the latency improvement of the backend against systems such as vLLM and SGLang on traces generated by the frontend under the same benchmark.
The results highlight the trade-offs between cost, efficiency, and performance, and argue for integrated solutions that balance these factors. Finally, we suggest a preliminary design for an end-to-end system that combines the benefits of RepoGraph and CacheBlend via a simple adapter module, along with optimizations to the original algorithms. Overall, the findings suggest promising directions toward building a robust, production-ready coding agent that is both fast and high-performing.

Thesis Committee
Rashmi K. Vinayak (Chair)
Zhihao Jia

For More Information: amalloy@cs.cmu.edu