VectorArk: A New VLM for Practical Image Vectorization
Researchers have developed VectorArk, a vision-language model (VLM) for image vectorization. Earlier VLM-based approaches performed well mainly in synthetic settings; VectorArk is designed to generalize to real-world images, including images produced by unknown rasterization techniques and outputs of text-to-image generators. It represents shapes as rounded polygons, a format that is easier for the model to learn and that yields smooth, visually clean shapes. A degradation model improves its robustness to imperfect inputs. In experiments on several datasets, VectorArk outperforms previous methods in geometric completeness and artifact suppression. The paper is available on arXiv under the identifier 2605.24398.
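To make the idea of a rounded polygon concrete, here is a minimal Python sketch that turns a list of vertices and a corner radius into an SVG path. The summary does not describe VectorArk's actual parameterization, so everything here is an assumption for illustration: the function name `rounded_polygon_path`, the single shared radius, and the use of quadratic Bezier curves to round each corner.

```python
import math


def rounded_polygon_path(vertices, radius):
    """Return an SVG path string for a closed polygon with rounded corners.

    Each corner is replaced by a quadratic Bezier curve whose control point
    is the original vertex. This is a generic rounding scheme chosen for
    illustration, not the representation defined in the VectorArk paper.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")

    corners = []
    for i in range(n):
        px, py = vertices[i - 1]          # previous vertex (wraps around)
        vx, vy = vertices[i]              # corner being rounded
        nx, ny = vertices[(i + 1) % n]    # next vertex (wraps around)

        len_prev = math.hypot(px - vx, py - vy)
        len_next = math.hypot(nx - vx, ny - vy)
        if len_prev == 0 or len_next == 0:
            raise ValueError("consecutive vertices must be distinct")

        # Clamp the cut distance so neighbouring corners never overlap.
        d = min(radius, len_prev / 2, len_next / 2)

        start = (vx + (px - vx) * d / len_prev, vy + (py - vy) * d / len_prev)
        end = (vx + (nx - vx) * d / len_next, vy + (ny - vy) * d / len_next)
        corners.append((start, (vx, vy), end))

    def fmt(point):
        return f"{point[0]:.2f},{point[1]:.2f}"

    parts = [f"M {fmt(corners[0][0])}"]
    for k, (start, ctrl, end) in enumerate(corners):
        if k > 0:
            parts.append(f"L {fmt(start)}")      # straight edge to the next corner
        parts.append(f"Q {fmt(ctrl)} {fmt(end)}")  # rounded corner
    parts.append("Z")
    return " ".join(parts)


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(rounded_polygon_path(square, radius=2))
```

One appeal of such a format is that a shape is described by a short list of vertices plus a rounding parameter, rather than by a long sequence of free-form curve control points.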
Key facts
- VectorArk is a VLM-based model for image vectorization.
- It uses a rounded polygon representation for smooth primitives.
- A degradation model improves robustness to real-world inputs.
- Outperforms previous methods on geometric completeness and artifact suppression.
- Tested across multiple datasets.
- Addresses poor generalization of prior VLM methods to real-world images.
- Designed to handle images from unknown rasterization techniques and text-to-image generators.
- Paper available on arXiv (2605.24398).
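The degradation idea listed above can be pictured with a toy example: take a clean raster image and deliberately corrupt it, so that a model sees inputs resembling imperfect real-world images. The sketch below is only an assumption about what such a chain might look like. The function name `degrade`, the specific steps (box downsampling, Gaussian noise, gray-level quantization), and all parameter values are hypothetical and are not taken from the paper.

```python
import random


def degrade(image, scale=2, noise_std=8.0, levels=16, seed=0):
    """Apply a toy degradation chain to a grayscale image.

    `image` is a list of rows with pixel values in 0..255. The steps are
    box-filter downsampling, additive Gaussian noise, and quantization to
    `levels` gray levels. They are illustrative assumptions, not the
    degradation model used by VectorArk.
    """
    if scale < 1 or levels < 2:
        raise ValueError("scale must be >= 1 and levels must be >= 2")

    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    height, width = len(image), len(image[0])
    step = 255 / (levels - 1)  # spacing between allowed gray levels

    out = []
    for y in range(0, height - height % scale, scale):
        row = []
        for x in range(0, width - width % scale, scale):
            # Average a scale x scale block (loses fine detail).
            block = [image[y + dy][x + dx]
                     for dy in range(scale) for dx in range(scale)]
            value = sum(block) / len(block)
            # Add sensor-like noise.
            value += rng.gauss(0.0, noise_std)
            # Snap to the nearest allowed gray level (mimics banding).
            value = round(value / step) * step
            row.append(int(min(255, max(0, round(value)))))
        out.append(row)
    return out


clean = [[0, 0, 255, 255],
         [0, 0, 255, 255],
         [255, 255, 0, 0],
         [255, 255, 0, 0]]
print(degrade(clean))
```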
Entities
Institutions
- arXiv