Overview

MoondreamService provides local image analysis and question answering using the Moondream model. It runs entirely on your local machine and supports hardware acceleration through CUDA, Intel XPU, and Apple MPS, which makes it a good fit for privacy-focused computer vision applications.

Moondream Vision API Reference

Pipecat’s API methods for Moondream vision integration

Example Implementation

Browse examples using Moondream vision

Moondream Documentation

Official Moondream model documentation

Hugging Face Model

Access Moondream model on Hugging Face

Installation

To use Moondream services, install the required dependencies:
pip install "pipecat-ai[moondream]"

Prerequisites

Local Model Setup

Before using Moondream vision services, check the following:
  1. Model Download: The first run automatically downloads the Moondream model from Hugging Face, so network access is required
  2. Hardware Configuration: Set up CUDA, Intel XPU, or Apple MPS for optimal performance
  3. Storage Space: Ensure sufficient disk space for the model files
  4. Memory Requirements: Ensure adequate RAM/VRAM for model inference

Hardware Acceleration

The service automatically detects and uses the best available hardware:
  • Intel XPU: Requires intel_extension_for_pytorch
  • NVIDIA CUDA: For GPU acceleration
  • Apple Metal (MPS): For Apple Silicon optimization
  • CPU: Fallback option for any system
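
The detection order above, combined with the data types listed in the Notes section, can be sketched as a small pure function. This is an illustrative sketch only, not Pipecat's actual implementation: the name select_device and its boolean parameters are hypothetical. In a real setup the flags would typically come from PyTorch probes such as torch.cuda.is_available() and torch.backends.mps.is_available().

```python
def select_device(
    has_xpu: bool,
    has_cuda: bool,
    has_mps: bool,
    use_cpu: bool = False,
) -> tuple[str, str]:
    """Return (device, dtype) using the priority XPU > CUDA > MPS > CPU.

    Hypothetical helper: the availability flags are passed in by the caller
    so the logic can be read and tested without any accelerator installed.
    """
    if use_cpu:
        # Explicit override: skip detection entirely.
        return "cpu", "float32"
    if has_xpu:
        return "xpu", "float32"
    if has_cuda:
        return "cuda", "float16"
    if has_mps:
        return "mps", "float16"
    # Nothing else available: fall back to CPU.
    return "cpu", "float32"


# A machine with both CUDA and MPS flags set resolves to CUDA.
device, dtype = select_device(has_xpu=False, has_cuda=True, has_mps=True)
```

Passing use_cpu=True short-circuits the checks, which mirrors the Hardware Override option described under Configuration Options.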

Configuration Options

  • Model Selection: Choose Moondream model version and revision
  • Hardware Override: Force CPU usage if needed
  • Local Processing: Complete privacy with no external API calls
No API keys are required. After the initial model download, Moondream runs entirely locally for complete privacy and control.

Configuration

model
str
default:"vikhyatk/moondream2"
deprecated
Hugging Face model identifier for the Moondream model. Deprecated in v0.0.105. Use settings=MoondreamService.Settings(model=...) instead.
revision
str
default:"2025-01-09"
Specific model revision to use.
use_cpu
bool
default:"False"
Whether to force CPU usage instead of hardware acceleration. When False, the service automatically detects and uses the best available device (Intel XPU, CUDA, MPS, or CPU).
settings
MoondreamService.Settings
default:"None"
Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using MoondreamService.Settings(...). See Service Settings for details.
Parameter | Type | Default   | Description
model     | str  | NOT_GIVEN | Moondream model identifier. (Inherited from base settings.)
NOT_GIVEN values are omitted, letting the service use its own defaults ("vikhyatk/moondream2" for model). Only parameters that are explicitly set are included.
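
The "omit what was not given" behavior can be illustrated with a sentinel pattern. The sketch below is a standalone illustration and does not reproduce Pipecat's internals: the _NotGiven class, the Settings dataclass, the DEFAULTS mapping, and the resolve function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, fields
from typing import Any


class _NotGiven:
    """Hypothetical sentinel type meaning 'the caller did not set this field'."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass
class Settings:
    # Every field defaults to the sentinel rather than to a concrete value.
    model: Any = NOT_GIVEN


# Service-side defaults, applied only where the caller left a field unset.
DEFAULTS = {"model": "vikhyatk/moondream2"}


def resolve(settings: Settings) -> dict[str, Any]:
    """Merge explicitly set fields over the defaults, skipping NOT_GIVEN."""
    given = {
        f.name: getattr(settings, f.name)
        for f in fields(settings)
        if getattr(settings, f.name) is not NOT_GIVEN
    }
    return {**DEFAULTS, **given}


unset = resolve(Settings())
explicit = resolve(Settings(model="example/other-model"))
```

Using a dedicated sentinel instead of None keeps "not set" distinguishable from a value the caller deliberately passed.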

Usage

Basic Setup

from pipecat.services.moondream import MoondreamService

vision = MoondreamService()

With Settings and CPU Override

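from pipecat.services.moondream import MoondreamService

vision = MoondreamService(
    revision="2025-01-09",  # pin a specific model revision
    use_cpu=True,  # force CPU even if an accelerator is available
    settings=MoondreamService.Settings(
        model="vikhyatk/moondream2",  # Hugging Face model identifier
    ),
)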
The model constructor parameter is deprecated as of v0.0.105. Pass settings=MoondreamService.Settings(model=...) instead. See the Service Settings guide for migration details.

Notes

  • First-run download: The model is automatically downloaded from Hugging Face on first use. Ensure sufficient disk space and network access.
  • Hardware auto-detection: When use_cpu=False (the default), the service detects available hardware in this priority order: Intel XPU, NVIDIA CUDA, Apple Metal (MPS), then CPU.
  • Data types: CUDA and MPS use float16 for faster inference, while XPU and CPU use float32.
  • Blocking inference: Model inference is a blocking call, so image analysis runs in a separate thread via asyncio.to_thread to keep the event loop responsive.
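
The threading note can be illustrated with a minimal, self-contained sketch. It assumes a hypothetical blocking function, describe_image, standing in for model inference (time.sleep simulates the work); it is not the service's real code. A heartbeat task runs alongside it to show that the event loop keeps servicing other coroutines while the blocking call executes in a worker thread.

```python
import asyncio
import time


def describe_image(image_bytes: bytes, question: str) -> str:
    """Hypothetical stand-in for a blocking model inference call."""
    time.sleep(0.2)  # simulate compute-bound work
    return f"answer to {question!r} for {len(image_bytes)} bytes"


async def heartbeat(ticks: list[int]) -> None:
    """Record three ticks; this only progresses if the loop is not blocked."""
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.05)


async def main() -> tuple[str, list[int]]:
    ticks: list[int] = []
    beat = asyncio.create_task(heartbeat(ticks))
    # Run the blocking function in a worker thread and await its result.
    answer = await asyncio.to_thread(describe_image, b"\x00" * 16, "What is this?")
    await beat
    return answer, ticks


answer, ticks = asyncio.run(main())
```

asyncio.to_thread requires Python 3.9 or later. Calling describe_image directly inside main would instead stall the heartbeat until the call returned.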