feat(types): add Audio class and audio field to Message for multimodal models by Ghraven · Pull Request #654 · ollama/ollama-python

Ghraven · 2026-04-29T04:23:44Z

Summary

Adds a dedicated Audio class and audio field to Message, mirroring the existing Image pattern.

Closes #650

Motivation

Currently, audio data must be passed via the images key, which is confusing and blocks future models that support both images and audio simultaneously. This PR adds a first-class audio field so callers can pass audio data cleanly:

import ollama

ollama.chat(
    model="gemma4:e2b",
    messages=[{
        "role": "user",
        "content": "Transcribe this",
        "audio": ["recording.wav"],   # clear, not crammed into images
    }]
)

Changes

ollama/_types.py

Added Audio(BaseModel) class after Image, with identical serialisation logic:
- Path / bytes → base64-encodes the data
- str path that exists on disk → base64-encodes the file
- str with a known audio extension that doesn't exist on disk → raises ValueError with a clear message
- str that is already valid base64 → passed through unchanged
- Any other string → raises ValueError
- Known audio extensions: mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm

Added audio: Optional[Sequence[Audio]] = None field to Message (after images), with the same docstring style as images.
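
For reviewers who want the serialisation rules listed above in one place, here is a rough sketch written as a plain standard-library function. It is not the code in this PR, which implements the logic as a serializer on a pydantic model mirroring Image; the names encode_audio and AUDIO_EXTENSIONS and the error messages are hypothetical.

```python
import base64
from pathlib import Path
from typing import Union

# Extensions listed in the PR description; the constant name is hypothetical.
AUDIO_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}


def encode_audio(value: Union[str, bytes, Path]) -> str:
    """Return a base64 string for `value`, following the rules listed above."""
    # bytes and Path objects are always base64-encoded.
    if isinstance(value, bytes):
        return base64.b64encode(value).decode()
    if isinstance(value, Path):
        return base64.b64encode(value.read_bytes()).decode()

    # A str naming an existing file is read from disk and encoded.
    try:
        path = Path(value)
        if path.is_file():
            return base64.b64encode(path.read_bytes()).decode()
    except (OSError, ValueError):
        # For example, a long base64 string can exceed the OS file name limit.
        pass

    # A str that ends in a known audio extension but is not on disk.
    suffix = value.rsplit(".", 1)[-1].lower() if "." in value else ""
    if suffix in AUDIO_EXTENSIONS:
        raise ValueError(f"File {value} does not exist")

    # A str that is already valid base64 passes through unchanged.
    try:
        base64.b64decode(value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        raise ValueError(
            "Invalid audio data, expected base64 string or path to audio file"
        ) from None
    return value
```

The extension check runs before the base64 check so that a mistyped file name such as "recording.wav" produces a "file does not exist" error rather than a generic "invalid data" one.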

Compatibility

- No breaking changes: audio is optional and defaults to None.
- The serialisation behaviour is consistent with Image, so the wire format is already what the Ollama server expects (raw base64).
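
To make the wire-format point concrete, the sketch below builds a message payload by hand using only the standard library. The helper build_message is hypothetical, and the shape of the audio field (a list of raw base64 strings, like images) is an assumption taken from the description above rather than from the server's documentation.

```python
import base64
import json
from typing import List


def build_message(content: str, audio_blobs: List[bytes]) -> dict:
    """Hypothetical helper: a user message with audio as raw base64 strings."""
    message = {"role": "user", "content": content}
    if audio_blobs:
        # Assumption: the optional field is left out when there is no audio.
        message["audio"] = [base64.b64encode(b).decode() for b in audio_blobs]
    return message


# b"RIFF" stands in for the first bytes of a WAV file.
payload = json.dumps(build_message("Transcribe this", [b"RIFF"]))
```

A message built without audio has no audio key at all, so its payload is the same as before this change.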

Commit 75fb010: feat(types): add Audio class and audio field to Message for multimoda… …l support

Ghraven wants to merge 1 commit into ollama:main from Ghraven:feat/audio-message-field
