J-M Steitz

Skills

Python

Julia

C/C++ and CUDA

JavaScript and SCSS

Experience

2020 – today

Visual Inference Lab, TU Darmstadt

https://www.visinf.tu-darmstadt.de

Researcher working on robust semantic analysis of heterogeneous scenes from multi-modal data streams

2015 – today

IT Consultant

http://www.patrip.org

for PATRIP Foundation: Planning, implementation, and maintenance of a data exchange platform

2004 – 2009

SysAdmin

at several companies: Administration of Linux/Unix servers, setup and management of high-availability servers and load balancers

Publications

Adapters Strike Back

Jan-Martin O. Steitz and Stefan Roth,
in Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

Adapters provide an efficient and lightweight mechanism for adapting trained transformer models to a variety of different tasks. However, they have often been found to be outperformed by other …

OpenAccess Code Poster

xGQA: Cross-Lingual Visual Question Answering

Jonas Pfeiffer, Gregor Geigle, Aishwarya Kamath, Jan-Martin O. Steitz, Stefan Roth, Ivan Vulić, and Iryna Gurevych,
in Findings of the Association for Computational Linguistics (ACL), 2022.

Recent advances in multimodal vision and language modeling have predominantly focused on the English language, mostly due to the lack of multilingual multimodal datasets to steer modeling …

Preprint DOI

TxT: Crossmodal End-to-End Learning with Transformers

Jan-Martin O. Steitz, Jonas Pfeiffer, Iryna Gurevych, and Stefan Roth,
in Proc. of the 43rd DAGM German Conference on Pattern Recognition (GCPR), 2021, Best Paper Honorable Mention.

Reasoning over multiple modalities, e.g. in Visual Question Answering (VQA), requires an alignment of semantic concepts across domains. Despite the widespread success of end-to-end learning, …

Preprint DOI Talk video
