My research goal is to develop science and technology to account for information flows in complex systems, including big data systems and cryptographic protocols.
A specific focus is on accountability in big data systems that employ machine learning. We are developing theories and tools that can be used to provide oversight of complex information processing ecosystems (including big data systems) to ensure that they respect privacy and other desirable values in the personal data protection area, such as fairness and transparency. This includes foundations, methods, and tools for detecting violations, explaining decisions made by machine learning systems, attributing responsibility for violations, and correcting responsible entities to avoid future violations. The technical work is informed by and applied to significant practical privacy problems in a broad range of sectors, including Web and healthcare privacy.
Significant recent results include the following:
- Algorithmic transparency via Quantitative Input Influence -- an approach to measuring the causal influence of features on the decisions of a machine-learned classifier [IEEE S & P 2016]
- The first statistically rigorous methodology for information flow experiments to discover personal data use by black-box Web services [CSF 2015]; the AdFisher tool, which implements an augmented version of this methodology to enable discovery at scale; and its application in the first study to demonstrate statistically significant evidence of discrimination in online behavioral advertising [PETS 2015] (see also the FAQ on this study and AdFisher)
- The first complete logical specification of all disclosure-related clauses of the HIPAA Privacy Rule for healthcare privacy [WPES 2010], and audit algorithms that apply to it and, more generally, to a rich class of policies (fragments of metric first-order temporal logic) [CCS 2011, CAV 2014, CCS 2015]
- The first formal semantics for purpose restrictions on information use and associated audit algorithms [IEEE S & P 2012, ESORICS 2013]
- A formalization of privacy as contextual integrity [IEEE S & P 2006] (see also the White House's Consumer Privacy Bill of Rights)
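To give a flavor of the Quantitative Input Influence idea above, here is a minimal sketch of measuring a feature's causal influence by randomized intervention: resample one feature from its marginal distribution while holding the others fixed, and record how often the classifier's decision changes. This is an illustrative simplification, not the published QII measure (which also covers set and marginal influences and aggregate metrics); the `qii` function and its parameters are hypothetical names chosen for this sketch.

```python
import random

def qii(classifier, dataset, feature_idx, samples=100, seed=0):
    """Estimate the causal influence of one feature on a classifier's
    decisions: for each data point, repeatedly replace that feature
    with a value drawn from its marginal distribution (i.e., the value
    it takes in a randomly chosen row) and count how often the
    classifier's output flips. (Illustrative sketch only.)"""
    rng = random.Random(seed)
    changes = 0
    total = 0
    for x in dataset:
        base = classifier(x)
        for _ in range(samples):
            x2 = list(x)
            # intervene: resample this feature from its marginal
            x2[feature_idx] = rng.choice(dataset)[feature_idx]
            if classifier(x2) != base:
                changes += 1
            total += 1
    return changes / total
```

For a classifier that ignores a feature entirely, the estimated influence of that feature is exactly zero, which matches the causal intuition: intervening on an unused input can never change the decision.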
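The information flow experiments behind AdFisher rest on nonparametric hypothesis testing: treatment and control groups of browser agents differ in one input (e.g., a browsing behavior), and significance of any difference in outputs (e.g., ads shown) is assessed by permuting group labels. The following is a minimal sketch of a standard two-sample permutation test under that null hypothesis, not AdFisher's actual implementation; the function name and statistic (difference of group means) are assumptions of this sketch.

```python
import random

def permutation_test(treatment, control, trials=10000, seed=0):
    """Two-sample permutation test: under the null hypothesis that
    group assignment does not affect the measurements, a random
    relabeling of the pooled data should produce a difference in
    group means at least as extreme as the observed one about as
    often as chance allows. Returns the estimated p-value."""
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)
    pooled = list(treatment) + list(control)
    n = len(treatment)
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        # difference of means under a random relabeling
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / trials
```

Because the test makes no distributional assumptions about the measurements, it remains valid for the arbitrary, opaque output distributions produced by black-box Web services, which is what makes this style of analysis suitable for such experiments.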