CovMatch: Cross-Covariance Guided Multimodal Dataset Distillation with Trainable Text Encoder
CovMatch introduces a novel distillation framework that jointly optimizes image and text encoders to achieve robust cross-modal alignment using synthetic dat...
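
To make the idea of cross-covariance guidance concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that image and text features have already been extracted as paired rows, and that the synthetic set is compared with the real set through the squared Frobenius distance between their image-text cross-covariance matrices. The function names `cross_covariance` and `cov_match_loss`, the feature dimensions, and the choice of distance are all illustrative assumptions; the summary above does not specify the exact objective.

```python
import numpy as np

def cross_covariance(img_feats, txt_feats):
    """Cross-covariance of paired features.

    img_feats: (n, d_img), txt_feats: (n, d_txt); row i of each is one pair.
    Returns a (d_img, d_txt) matrix, normalized by n (biased estimate).
    """
    img_c = img_feats - img_feats.mean(axis=0, keepdims=True)
    txt_c = txt_feats - txt_feats.mean(axis=0, keepdims=True)
    return img_c.T @ txt_c / img_feats.shape[0]

def cov_match_loss(real_img, real_txt, syn_img, syn_txt):
    """Hypothetical matching loss: squared Frobenius distance between the
    real and synthetic cross-covariance matrices (an assumed objective)."""
    diff = cross_covariance(real_img, real_txt) - cross_covariance(syn_img, syn_txt)
    return float(np.sum(diff ** 2))

# Random placeholder features standing in for encoder outputs.
rng = np.random.default_rng(0)
real_img = rng.normal(size=(256, 8))
real_txt = rng.normal(size=(256, 6))
syn_img = rng.normal(size=(16, 8))
syn_txt = rng.normal(size=(16, 6))

loss = cov_match_loss(real_img, real_txt, syn_img, syn_txt)
```

In a distillation setting, a quantity of this kind would be minimized with respect to the synthetic data (and, per the title, with a trainable text encoder), which requires an automatic-differentiation framework rather than plain NumPy; the sketch only shows how the statistic itself is computed.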