Lantern: A Query Language for Visual Concept Retrieval
Modern visual data analytics increasingly relies on large, unlabeled visual datasets from sources such as home webcams, vehicle dashcams, and satellites. However, most modern tools for Big Data analytics lack primitives that are both fluent and efficient for understanding visual data. This thesis presents Lantern, a query language and runtime for describing and finding "visual concepts" in databases of images and videos. I define visual concepts as compositions of "things" (e.g., objects, faces) and spatial relations between things (e.g., above, around). I implemented a runtime that both finds visual concepts in images and tracks them across videos. The system scales horizontally by scheduling image processing operations across a cluster and vertically by letting the user accelerate their tasks on GPUs and multicore CPUs. I evaluated my prototype on several applications, including object detection error analysis, face detection and blurring in video, and interactive queries for visual data exploration.
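To make the idea of a compositional visual concept concrete, the following is a minimal sketch, not Lantern's actual syntax: it assumes detections arrive as labeled bounding boxes, and the names `Box`, `above`, and `find_concept` are invented for illustration.

```python
# Hypothetical sketch of matching a composed visual concept ("person above
# bicycle") over object detections; all names here are illustrative, not
# Lantern's real API.
from dataclasses import dataclass

@dataclass
class Box:
    label: str
    x: float  # left edge
    y: float  # top edge (smaller y is higher in the image)
    w: float
    h: float

def above(a: Box, b: Box) -> bool:
    # a is "above" b when a's bottom edge is no lower than b's top edge.
    return a.y + a.h <= b.y

def find_concept(boxes, label_a, label_b, relation):
    # A visual concept here is a pair of "things" plus a spatial relation;
    # return every pair of detections that satisfies the composition.
    return [(a, b)
            for a in boxes
            for b in boxes
            if a.label == label_a and b.label == label_b and relation(a, b)]

detections = [Box("person", 10, 5, 20, 30), Box("bicycle", 12, 40, 25, 20)]
matches = find_concept(detections, "person", "bicycle", above)
```

In this toy form, a query is just a label pair plus a relation predicate; a real runtime would additionally push such predicates down to detectors and trackers running across a cluster.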