Open-SAT: LLM-Enhanced Query Embedding for Satellite Image Retrieval
Researchers propose Open-SAT, a training-free algorithm that uses large language models (LLMs) to refine query embeddings and improve open-vocabulary object retrieval in satellite imagery. The method targets the difficulty of aligning natural language queries with satellite images, a task on which vision-language models like CLIP often struggle. Open-SAT operates entirely at inference time: an LLM refines the text embedding of the user's query, and a vector database of image embeddings handles efficient retrieval. Because it requires no additional training, the approach is practical for real-world applications. The paper is available on arXiv under ID 2605.05344.
Key facts
- Open-SAT is a training-free query embedding refinement algorithm.
- It uses LLMs to refine text embeddings at inference time.
- The method improves alignment between user queries and satellite imagery.
- Vision-language models like CLIP are used for image embeddings.
- A vector database stores image embeddings for efficient retrieval.
- The approach addresses open-vocabulary object retrieval challenges.
- The paper is available on arXiv under ID 2605.05344.
- The algorithm does not require additional training.
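The pipeline described above (refine the query embedding with an LLM at inference time, then search stored image embeddings) can be illustrated with a rough sketch. The summary does not say how Open-SAT performs the refinement, so the code below swaps in a common stand-in, averaging the embeddings of the original query and LLM-written rephrasings; it should not be read as the authors' algorithm. Every name here is hypothetical: `toy_text_embedding` is a hash-based placeholder for a CLIP-style text encoder, `stub_llm_expand` is a placeholder for an LLM call, and the in-memory list searched by `search` stands in for a vector database.

```python
import hashlib
import math

DIM = 64  # toy embedding size (assumption); real encoders use larger vectors


def normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def toy_text_embedding(text):
    """Hypothetical stand-in for a CLIP-style text encoder (hashed bag of words)."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        sign = 1.0 if digest[1] % 2 == 0 else -1.0
        vec[digest[0] % DIM] += sign
    return normalize(vec)


def stub_llm_expand(query):
    """Hypothetical placeholder for an LLM call that rephrases the query."""
    canned = {
        "airport": ["runway with parked aircraft", "airport terminal and taxiway"],
    }
    return canned.get(query, [])


def refine_query_embedding(query, expand=stub_llm_expand, embed=toy_text_embedding):
    """Average the embeddings of the query and its rephrasings, then renormalize.

    This is an assumed refinement strategy, not the one from the paper.
    """
    texts = [query] + list(expand(query))
    vectors = [embed(t) for t in texts]
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    return normalize(mean)


def search(index, query_vec, top_k=3):
    """Brute-force similarity search over (name, unit_vector) pairs.

    Stands in for a vector database; dot product equals cosine similarity
    when both vectors are unit length.
    """
    scored = [(sum(a * b for a, b in zip(vec, query_vec)), name) for name, vec in index]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


if __name__ == "__main__":
    # Captions are embedded here only as a toy substitute for image embeddings.
    tiles = {
        "tile_001": "runway with parked aircraft",
        "tile_002": "dense forest canopy",
        "tile_003": "harbor with cargo ships",
    }
    index = [(name, toy_text_embedding(caption)) for name, caption in tiles.items()]
    for name, score in search(index, refine_query_embedding("airport")):
        print(name, round(score, 3))
```

Nothing in this sketch is trained or fine-tuned: the encoder and the stored vectors stay fixed, and only the query vector changes at inference time, which is the property the summary attributes to Open-SAT.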
Entities
Institutions
- arXiv