Subhaditya's KB

FLAVA

Sep 18, 2024

architecture

FLAVA: a Foundational Language and Vision Alignment Model
A foundational vision and language alignment model that performs well on all three target modalities: 1) vision, 2) language, and 3) vision & language.
Uses a single holistic, universal model as a "foundation" that targets all modalities at once.
Evaluated on a wide range of 35 tasks spanning these target modalities.
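
To make the "single model, all modalities" idea concrete, here is a toy sketch of one model object that accepts an image, text, or both and routes to the matching path. It is not FLAVA's actual implementation: `ToyUnifiedModel`, `encode_image`, `encode_text`, `fuse`, and `DIM` are all hypothetical names, and the "encoders" are trivial stand-ins chosen only so the example runs without any dependencies.

```python
from typing import List, Optional, Sequence, Tuple

DIM = 4  # hypothetical embedding size, chosen only for this toy example


def encode_image(pixels: Sequence[float]) -> List[float]:
    """Stand-in image encoder: repeats the mean pixel value DIM times."""
    mean = sum(pixels) / len(pixels)
    return [mean] * DIM


def encode_text(tokens: Sequence[str]) -> List[float]:
    """Stand-in text encoder: repeats the mean token length DIM times."""
    mean_len = sum(len(t) for t in tokens) / len(tokens)
    return [mean_len] * DIM


def fuse(image_vec: List[float], text_vec: List[float]) -> List[float]:
    """Stand-in multimodal encoder: element-wise mean of both vectors."""
    return [(i + t) / 2 for i, t in zip(image_vec, text_vec)]


class ToyUnifiedModel:
    """One model object serving vision, language, and vision & language."""

    def encode(
        self,
        image: Optional[Sequence[float]] = None,
        text: Optional[Sequence[str]] = None,
    ) -> Tuple[str, List[float]]:
        if image is None and text is None:
            raise ValueError("provide an image, text, or both")
        if text is None:
            return "vision", encode_image(image)
        if image is None:
            return "language", encode_text(text)
        # Both inputs present: combine the two unimodal embeddings.
        return "vision & language", fuse(encode_image(image), encode_text(text))


model = ToyUnifiedModel()
print(model.encode(image=[0.0, 1.0]))
print(model.encode(text=["a", "bcd"]))
print(model.encode(image=[0.0, 1.0], text=["a", "bcd"]))
```

The point of the sketch is only the interface: callers use the same object for all three target modalities instead of switching between separate vision, language, and vision & language models.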
