IPAB Workshop - 11/12/25 | IPAB | School of Informatics

Speaker: Zhenzhang Ye

Title: Training-free Identification of Class-Discriminative Patch Sets in CLIP for Few-Shot Image Recognition

Abstract: Few-shot classification performance of large vision–language models (VLMs) varies widely across datasets, especially when class names provide weak or ambiguous semantic cues. We introduce a training-free approach that integrates local discriminative patches into VLMs to reduce reliance on textual prompts. Our method identifies patches with high intra-class consistency and low inter-class ambiguity, forming a Class-Discriminative Patch Set (CDPS) for each category. Using CDPS, we enhance VLM recognition through a hybrid classifier that combines global text-conditioned features with local patch-based similarity.
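
As a rough illustration of the pipeline the abstract outlines, the sketch below scores support-set patches, keeps the top k per class as a CDPS, and blends global and local similarities at inference. Everything concrete here is an assumption made for illustration and is not taken from the talk: the scoring rule (intra-class consistency minus inter-class ambiguity, both measured by cosine similarity), the max-then-mean local similarity, the mixing weight alpha, and the function names select_cdps and hybrid_logits. Features are assumed to be L2-normalized CLIP embeddings that have already been extracted.

```python
import numpy as np

def select_cdps(patches, k):
    """Pick k patches per class (hypothetical scoring rule, not the speaker's).

    patches: (C, S, P, D) L2-normalized support patch features for
             C classes, S shots per class, P patches per image.
    Returns: (C, k, D) array, one patch set per class.
    """
    C, S, P, D = patches.shape
    flat = patches.reshape(C, S * P, D)
    owner = np.repeat(np.arange(S), P)           # source image of each patch
    cdps = np.empty((C, k, D))
    for c in range(C):
        cand = flat[c]                           # (S*P, D)
        # Intra-class consistency: best match in every *other* image of the class.
        best = (cand @ cand.T).reshape(S * P, S, P).max(axis=2)   # (S*P, S)
        other_img = np.ones((S * P, S), dtype=bool)
        other_img[np.arange(S * P), owner] = False
        intra = (best * other_img).sum(axis=1) / max(S - 1, 1)
        # Inter-class ambiguity: best match among patches of all other classes.
        rest = np.concatenate([flat[o] for o in range(C) if o != c])
        inter = (cand @ rest.T).max(axis=1)
        top = np.argsort(-(intra - inter))[:k]   # assumed score: intra - inter
        cdps[c] = cand[top]
    return cdps

def hybrid_logits(query_global, query_patches, text_feats, cdps, alpha=0.5):
    """Blend global text similarity with local patch similarity (alpha assumed).

    query_global: (D,), query_patches: (Q, D), text_feats: (C, D), cdps: (C, k, D).
    """
    global_sim = text_feats @ query_global                         # (C,)
    # For each CDPS patch take its best-matching query patch, then average.
    local_sim = (cdps @ query_patches.T).max(axis=2).mean(axis=1)  # (C,)
    return alpha * global_sim + (1 - alpha) * local_sim

# Usage with random stand-in features (real use would pass CLIP embeddings).
rng = np.random.default_rng(0)
unit = lambda x: x / np.linalg.norm(x, axis=-1, keepdims=True)
support = unit(rng.normal(size=(3, 4, 16, 32)))   # 3 classes, 4 shots, 16 patches
sets = select_cdps(support, k=5)
logits = hybrid_logits(unit(rng.normal(size=32)), unit(rng.normal(size=(16, 32))),
                       unit(rng.normal(size=(3, 32))), sets)
print(sets.shape, logits.shape)
```

In this sketch the local term does not depend on class names at all, which is one plausible way to read "reduce reliance on textual prompts"; the actual selection criterion and fusion rule are those presented in the talk.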