From language models to multimodal models
Language models have remarkable qualities. Their ability to analyze complex human language queries, which comes from training on the immense volumes of textual data accessible on the Internet, was enough to provoke enthusiasm. However, these algorithms model only one component of human perception: text. Multimodal models aim to overcome this limitation by natively processing different