Advanced image segmentation for research & editing.
Segment Anything (SAM) by Meta AI is a computer vision research model that lets users segment objects in any image with a single click. It is a promptable segmentation system with zero-shot generalization: it can segment unfamiliar objects and images without additional training. It accepts a wide range of input prompts specifying what to segment, including interactive points and boxes, and can return multiple valid masks when a prompt is ambiguous. The output masks can be fed to other AI systems, tracked across video frames, used in image editing applications, lifted to 3D, or used for creative tasks. The model is efficient enough to power the data engine used to build its training dataset: an image encoder runs once per image, and a lightweight mask decoder can then run in a web browser in just a few milliseconds per prompt. The image encoder requires a GPU for efficient inference, while the prompt encoder and mask decoder can run directly in PyTorch or be converted to ONNX and run efficiently on CPU or GPU on any platform that supports ONNX Runtime. The model was trained on the SA-1B dataset, which contains over 11 million licensed, privacy-preserving images and more than 1.1 billion segmentation masks.
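The encode-once, prompt-many design described above can be illustrated with a small toy sketch. This is not SAM and does not use its weights or API: ToyPromptableSegmenter, its tolerances parameter, and mask_area are hypothetical names invented for illustration, and the "embedding" is only a copy of a tiny grayscale image. In the official segment_anything Python package, the corresponding split is between SamPredictor.set_image (runs the image encoder) and SamPredictor.predict (runs the prompt encoder and mask decoder).

```python
from typing import List, Optional, Sequence, Tuple

Image = List[List[int]]   # toy grayscale image: rows of pixel intensities
Mask = List[List[bool]]   # boolean mask with the same shape as the image


class ToyPromptableSegmenter:
    """Toy stand-in for an encode-once, prompt-many segmenter (not SAM itself)."""

    def __init__(self) -> None:
        self.encoder_calls = 0
        self.decoder_calls = 0
        self._embedding: Optional[Image] = None

    def set_image(self, image: Image) -> None:
        # "Heavy" step, run once per image. Here it is only a copy.
        self.encoder_calls += 1
        self._embedding = [list(row) for row in image]

    def predict(
        self, point: Tuple[int, int], tolerances: Sequence[int] = (0, 60)
    ) -> List[Mask]:
        # "Light" step, run once per prompt against the cached embedding.
        # One mask is returned per tolerance, mimicking several candidate
        # masks for an ambiguous click (assumption: similarity by intensity).
        if self._embedding is None:
            raise RuntimeError("call set_image() before predict()")
        self.decoder_calls += 1
        row, col = point
        seed = self._embedding[row][col]
        return [
            [[abs(value - seed) <= tol for value in r] for r in self._embedding]
            for tol in tolerances
        ]


def mask_area(mask: Mask) -> int:
    """Number of pixels selected by a mask."""
    return sum(cell for row in mask for cell in row)


image = [
    [10, 10, 200],
    [10, 50, 200],
    [10, 10, 200],
]

segmenter = ToyPromptableSegmenter()
segmenter.set_image(image)                 # encoder runs once
tight, loose = segmenter.predict((1, 1))   # click on the centre pixel
other = segmenter.predict((0, 2))          # second click reuses the embedding

print(mask_area(tight), mask_area(loose))  # 1 6
print(segmenter.encoder_calls, segmenter.decoder_calls)  # 1 2
```

The click on the centre pixel yields a tight mask (that pixel alone) and a looser one (the pixel plus its similar neighbours), which is the toy analogue of returning several valid masks for one ambiguous prompt. Both clicks share a single encoder pass, which is why per-prompt latency can stay low even when the encoder itself is expensive.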