Visual Inference Lab, TU Darmstadt
https://www.visinf.tu-darmstadt.de
Researcher at Visual Inference Lab, working on robust semantic analysis of heterogeneous scenes from multi-modal data streams
for PATRIP Foundation: Planning, implementation, and maintenance of a data exchange platform
at several companies: Administration of Linux/Unix servers, setup and management of high-availability servers and load balancers
xGQA: Cross-Lingual Visual Question Answering
Jonas Pfeiffer, Gregor Geigle, Aishwarya Kamath, Jan-Martin O. Steitz, Stefan Roth, Ivan Vulić, and Iryna Gurevych,
in Findings of the Association for Computational Linguistics (ACL), 2022.
Recent advances in multimodal vision and language modeling have predominantly focused on the English language, mostly due to the lack of multilingual multimodal datasets to steer modeling …
TxT: Crossmodal End-to-End Learning with Transformers
Jan-Martin O. Steitz, Jonas Pfeiffer, Iryna Gurevych, and Stefan Roth,
in Proc. of the 43rd DAGM German Conference on Pattern Recognition (GCPR), 2021, Best Paper Honorable Mention.
Reasoning over multiple modalities, e.g. in Visual Question Answering (VQA), requires an alignment of semantic concepts across domains. Despite the widespread success of end-to-end learning, …
Multi-view X-ray R-CNN
Jan-Martin O. Steitz, Faraz Saeedan, and Stefan Roth,
in Proc. of the 40th German Conference on Pattern Recognition (GCPR), 2018.
Motivated by the detection of prohibited objects in carry-on luggage as a part of avionic security screening, we develop a CNN-based object detection approach for multi-view X-ray image data. Our …