Alexander Hauptmann

Research Professor

Office 5519 Gates & Hillman Centers

Email alex@cs.cmu.edu

Phone (412) 268-1448

Department
Language Technologies Institute

Website
http://www-2.cs.cmu.edu/~alex/

Research Statement

My research interests revolve around the integration of text, image, video and audio analysis. In the Informedia Project we have built the News-on-Demand application, an instantiation of the Informedia Digital Video Library idea that is based completely on automatic methods for processing television and radio news. By combining the strengths of speech recognition, natural language processing, information retrieval and interface design, the system is able to overcome some of the shortfalls inherent in each of the component technologies.

My goal is to utilize large corpora of "found data", that is, data already available through the Internet or other readily accessible open sources, to improve speech and natural language processing by exploiting advantages across different modalities. It has become clear in recent years that large volumes of text, image, video and audio can be easily stored and made available for research and applications. However, most of these sources were not produced with computer processing in mind. My intention is to design and build intelligent, understanding programs that help process data from these sources and make the data useful for other applications. This data can be used to improve speech recognition, image understanding, natural language processing and machine learning, as well as information retrieval. The challenge is to find the right data, process it into a suitable form for training, learning or re-use, and build mechanisms that can successfully utilize this data.

Speech and multimedia technology is about to make a major impact on our daily interaction with computers. What is needed at this point are clear demonstrations of the advantages of integrated speech and multimedia interfaces.

Recent Publications

Alexander Hauptmann (2022) Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2022, Argus++: Robust Real-time Activity Detection for Unconstrained Video Streams with Overlapping Cube Proposals, Page(s): 112-121.

Alexander Hauptmann (2022) IEEE Transactions on Pattern Analysis and Machine Intelligence, Contrastive Adaptation Network for Single- and Multi-Source Domain Adaptation, Vol: 44, Issue: 4, Page(s): 1793-1804.

Alexander Hauptmann (2022) Neurocomputing, Deep Discrete Cross-Modal Hashing with Multiple Supervision, Vol: 486, Page(s): 215-224.

Alexander Hauptmann (2022) IEEE Transactions on Pattern Analysis and Machine Intelligence, TN-ZSTAD: Transferable Network for Zero-Shot Temporal Activity Detection.

Qian Y, Kang G, Yu L, Liu W, Hauptmann AG (2022) Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2022, TRM: Temporal Relocation Module for Video Recognition, Page(s): 151-160.