Jignesh Patel

Professor

Office 9103 Gates and Hillman Centers

Email jignesh@cmu.edu

Phone (412) 268-1453

Department
Computer Science Department

Administrative Support Person
Patricia Loring

Research Interests
Systems
Data-Intensive and Cloud Computing
Databases

Advisees
Christos Laspias

CSD Courses Taught

15445 - Spring, 2024

15645 - Spring, 2024

Biography

Jignesh Patel is a professor in the Computer Science Department at Carnegie Mellon University. His research focuses on data management, emphasizing both system efficiency (e.g., scalable data platforms) and human efficiency (e.g., designing LLM-based query interfaces). His papers have been recognized as best papers at top database conferences, including SIGMOD and VLDB. He is a fellow of the AAAS, ACM, and IEEE. He has also received several teaching awards and is a co-founder of four startups.

Research/Teaching Statement

My research is in database management, and it is organized into two themes. The first theme focuses on designing data processing methods that work synergistically with hardware to deliver high performance to database applications. The second theme focuses on expanding the scope of data platforms, and one aspect of this theme is using a no-code approach to make data science accessible to a broad set of users.

A common aspect of my research is building actual systems to fully understand the end-to-end impact of the underlying research.

High-Performance Database Systems: Harnessing Hardware-Software Synergies

Database systems have a long history of working synergistically with hardware to deliver high performance. For most of this five-decade history, these synergies were realized in a decoupled manner. For its part, the hardware community was operating under the umbrella of Moore’s Law and continually producing faster processors, and denser and faster storage. Once these devices were ready, the database community adapted their software to these new devices. This mode of operation worked well for decades as hardware was getting faster at a breakneck speed. If database applications demanded twice the performance every 2-3 years and the underlying hardware improved at the same rate, then the database industry could meet application demands by simply ensuring that the database software used the new hardware effectively.

The slowing of Moore’s Law over the last decade has shifted this dynamic. Without a dramatic new approach, we are headed toward a future in which performance-hungry database applications will either use more and more resources (using parallel database technologies) or will need to find a way to live with lower performance. Neither option is satisfactory: the former is not a path to endless scaling (and is expensive in terms of both cost and energy efficiency), and the latter is not acceptable in many application settings.

In my research group, we are exploring an alternative path: reimagining the algorithms used inside a database engine so that they can work at “bare metal speeds.” We pair this with a hardware-software collaboration in which we work closely with architects to rethink the entire database stack on new hardware that aims to package computing and storage closely together. These two research thrusts build on each other and drive toward a simple vision: to deliver to database applications the effect of Moore’s Law, but through a true hardware-software synergistic approach.

Democratizing Data Analytics

The other theme of my research focuses on the productivity of end users of data platforms. A key current project targets data science users: it allows a user to express a range of data science and analytics tasks using a no-code approach. The appeal of this approach is that it gives end users a natural way to express a broad range of complex tasks, including data wrangling, structured query processing (SQL), machine learning, and visualization, through a unified no-code interface. This research has become the central focus of a startup called DataChat.