1 link tagged with all of: evaluation + zero-shot + semantic-segmentation
Links
TextRegion is a training-free framework that generates text-aligned region tokens by combining frozen image-text models with segmentation masks, and it reports strong zero-shot performance on tasks such as semantic segmentation and multi-object grounding. It supports direct evaluation and inference on custom images once the setup and dataset preparation guidelines have been followed. It builds on several existing models and is available for use and citation under the MIT License.
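To make the idea of a text-aligned region token concrete, here is a minimal toy sketch. It is not TextRegion's actual implementation: it assumes plain average pooling of patch features under each mask, followed by cosine similarity against text embeddings. All names (`mask_pool`, `classify_regions`, the toy vectors) are hypothetical and stand in for the outputs of a frozen image-text model and a segmentation model.

```python
import math

def mask_pool(patch_features, mask):
    """Average the patch features selected by a binary mask into one region token.
    Assumption: simple mean pooling, used here only for illustration."""
    selected = [f for f, m in zip(patch_features, mask) if m]
    if not selected:
        raise ValueError("mask selects no patches")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify_regions(patch_features, masks, text_embeddings):
    """Assign each mask the label whose text embedding is closest to its region token."""
    labels = []
    for mask in masks:
        token = mask_pool(patch_features, mask)
        best = max(text_embeddings, key=lambda name: cosine(token, text_embeddings[name]))
        labels.append(best)
    return labels

# Toy stand-ins: four patches with 2-D features, two masks, two class prompts.
patch_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
masks = [[1, 1, 0, 0], [0, 0, 1, 1]]
text_embeddings = {"cat": [1.0, 0.0], "grass": [0.0, 1.0]}

labels = classify_regions(patch_features, masks, text_embeddings)
print(labels)
```

Because no component is trained, zero-shot segmentation in this sketch reduces to a nearest-label lookup per mask; the real framework may pool and align features differently.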