ARTFEED — Contemporary Art Intelligence

Gemma 4 VLA Demo Runs on Jetson Orin Nano Super

ai-technology · 2026-04-24

NVIDIA's Asier Arranz has published a tutorial showing Gemma 4, a vision-language-action (VLA) model, running on a Jetson Orin Nano Super (8 GB). The demo uses a Logitech C920 webcam for vision and a USB keyboard to trigger voice interaction. Rather than relying on keyword triggers, the model decides autonomously whether a question requires vision. The stack is llama.cpp serving a Gemma 4 GGUF together with its vision projector (mmproj). The tutorial walks through system packages, the Python environment, RAM optimization, model serving, and running the demo; a Docker-based text-only alternative is also provided for Jetson Orin. The project is available on GitHub at asierarranz/Google_Gemma.

Key facts

  • Gemma 4 VLA demo runs on Jetson Orin Nano Super (8 GB).
  • Model decides autonomously when to use vision based on context.
  • Hardware includes Logitech C920 webcam and USB keyboard.
  • Uses llama.cpp with Gemma 4 GGUF and vision projector (mmproj).
  • Tutorial covers system packages, Python environment, RAM optimization.
  • Docker-based text-only alternative available for Jetson Orin.
  • Project on GitHub: asierarranz/Google_Gemma.
  • First run downloads Parakeet STT, Kokoro TTS, and voice prompts.
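The voice loop implied by the facts above (Parakeet for speech-to-text, the Gemma 4 server for reasoning, Kokoro for text-to-speech) can be sketched as a three-stage pipeline. The function names and composition below are illustrative stand-ins, not the tutorial's actual API:

```python
from typing import Callable

def make_voice_loop(transcribe: Callable[[bytes], str],
                    ask_model: Callable[[str], str],
                    speak: Callable[[str], bytes]) -> Callable[[bytes], bytes]:
    """Compose STT -> LLM -> TTS into one turn of a voice demo.
    Each stage is injected as a callable, so the sketch stays
    independent of the real Parakeet/llama.cpp/Kokoro interfaces."""
    def one_turn(audio_in: bytes) -> bytes:
        question = transcribe(audio_in)   # Parakeet STT stand-in
        answer = ask_model(question)      # Gemma 4 via llama.cpp stand-in
        return speak(answer)              # Kokoro TTS stand-in
    return one_turn

# Stubbed stages make the flow runnable without any models installed.
turn = make_voice_loop(
    transcribe=lambda audio: "what do you see",
    ask_model=lambda q: f"(model answer to: {q})",
    speak=lambda text: text.encode("utf-8"),
)
reply_audio = turn(b"raw-mic-audio")
```

Injecting the stages keeps the loop testable and lets each model be swapped independently, which matters on an 8 GB device where the tutorial's RAM optimization step decides what can be resident at once.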

Entities

People

  • Asier Arranz

Institutions

  • NVIDIA
  • Hugging Face
  • Jetson AI Lab
  • GitHub
