LLaVA - Large Language and Vision Assistant
Open-source vision-language model for conversational image understanding.
When to use LLaVA
Use when:
- Building vision-language chatbots
- Visual question answering (VQA)
- Image description and captioning
- Multi-turn image conversations
- Visual instruction following
- Document understanding with images
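As an illustration of the multi-turn image conversation use case, here is a minimal sketch of how a conversation might be flattened into one text prompt. It assumes the "USER: <image>\n... ASSISTANT:" layout used by LLaVA-1.5-style checkpoints; other LLaVA versions use different templates. The helper name `build_llava_prompt` is hypothetical and is not part of the LLaVA codebase. The sketch also omits the system prompt and end-of-sequence separators that a real template may include.

```python
from typing import List, Optional, Tuple

# Assumption: LLaVA-1.5-style prompts mark the image position with this placeholder.
IMAGE_TOKEN = "<image>"


def build_llava_prompt(turns: List[Tuple[str, Optional[str]]]) -> str:
    """Flatten a conversation into a single prompt string (hypothetical helper).

    Each turn is (user_message, assistant_reply). Use None as the reply for
    the final turn, which the model is expected to complete.
    """
    parts = []
    for index, (user_message, assistant_reply) in enumerate(turns):
        # The image placeholder is attached to the first user turn only.
        image_prefix = f"{IMAGE_TOKEN}\n" if index == 0 else ""
        segment = f"USER: {image_prefix}{user_message} ASSISTANT:"
        if assistant_reply is not None:
            segment += f" {assistant_reply}"
        parts.append(segment)
    return " ".join(parts)


prompt = build_llava_prompt([
    ("What is in this image?", "A cat on a sofa."),
    ("What color is the cat?", None),
])
print(prompt)
```

The resulting string ends with an open "ASSISTANT:" turn, so a model given this prompt together with the image would generate the next reply. Check the template against the model card of the specific checkpoint before relying on it.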
Metrics:
- 23,000+ GitHub stars
- Targets GPT-4V-level capabilities
- Apache 2.0 License
- Multiple model sizes (7B-34B para
[Description truncated. See the full README on GitHub.]