Friday, October 9, 2015 - 12:00pm to 1:00pm
Location: Traffic 21 Classroom, 6501 Gates & Hillman Centers
Speaker: THOMAS TAUBER-MARSHALL, Ph.D. Student http://www.cs.cmu.edu/~twmarsha/
Many big data applications run on data sets that are constantly changing, such as calculating the PageRank of every page on the internet. Rather than repeatedly rerunning the entire calculation to keep the results up to date, which can be expensive, incremental computation lets you update the output of previous calculations with minimal effort. ThomasDB is a distributed, fault-tolerant system designed to make it easy to make many suitable big data calculations incremental. ThomasDB accomplishes this through a technique known as self-adjusting computation, which tracks relationships between data and code in a structure called a dynamic dependency graph. This allows ThomasDB to identify which portions of the execution are affected by changes to the data set and to re-execute only those parts. Joint work with Andy Pavlo and Umut Acar. In partial fulfillment of the speaking requirement.
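
As a rough illustration of the idea behind self-adjusting computation described in the abstract, the toy sketch below records which computations read which values and, when an input changes, re-runs only the computations that depend on it. This is not ThomasDB's implementation or API: the names `Node`, `set_input`, and `propagate` are hypothetical, the example is single-machine with no fault tolerance, and the simple worklist used here stands in for the more careful ordering a real system would need.

```python
from collections import deque


class Node:
    """A value in a toy dependency graph (hypothetical, for illustration only).

    A node with fn=None is an input; otherwise it is computed from `inputs`.
    """

    def __init__(self, fn=None, inputs=(), value=None):
        self.fn = fn
        self.inputs = list(inputs)
        self.readers = []   # nodes that read this node's value
        self.runs = 0       # how many times fn has been executed
        self.value = value
        for node in self.inputs:
            node.readers.append(self)   # record the dependency edge
        if fn is not None:
            self._run()

    def _run(self):
        self.runs += 1
        self.value = self.fn(*(n.value for n in self.inputs))


def propagate(changed):
    """Re-run only the nodes reachable from `changed`; stop where values settle."""
    queue = deque(changed.readers)
    while queue:
        node = queue.popleft()
        old = node.value
        node._run()
        if node.value != old:
            queue.extend(node.readers)


def set_input(node, value):
    """Change an input and update the affected parts of the computation."""
    if node.value == value:
        return
    node.value = value
    propagate(node)


# Three inputs feeding two partial results and a total.
a, b, c = Node(value=1), Node(value=2), Node(value=3)
left = Node(lambda x, y: x + y, [a, b])
right = Node(lambda z: z * 2, [c])
total = Node(lambda l, r: l + r, [left, right])

set_input(c, 10)   # only `right` and `total` are re-executed; `left` is reused
```

After the update, `left` has still run only once, while `right` and `total` have each run twice. The saving is trivial here, but it grows with the share of the computation that a change leaves untouched.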