Skip to content

Latest commit

 

History

History
93 lines (68 loc) · 7.2 KB

File metadata and controls

93 lines (68 loc) · 7.2 KB

Python SDK for Numaflow

Build black License Release Version

pynumaflow is the Python SDK for Numaflow, a Kubernetes-native stream processing framework. Write a Python function, wire it to a server class, and Numaflow handles the gRPC transport, autoscaling, and deployment — no boilerplate required. The SDK supports synchronous and asynchronous execution models, and both function-based and class-based handler styles.

Installation

pip install pynumaflow
Build & develop locally

This project uses uv for dependency management and packaging. To build the package locally, run the following command from the root of the project.

make setup

To run unit tests:

make test

To format code style using black and ruff:

make lint

Setup pre-commit hooks:

pre-commit install

Capabilities

The SDK covers the full range of Numaflow extension points. Each capability maps to a dedicated set of server classes and handler interfaces.

Tip

Each capability below links to working examples in both function-based and class-based handler styles. See the full examples directory for all implementations.

Description API Reference
User-Defined Functions (UDFs) Process and transform stream data — Map, Reduce, Reduce Stream, Map Stream, Batch Map, Accumulator Map · Reduce · Reduce Stream · Map Stream · Batch Map · Accumulator
User-Defined Sources (UDSource) Ingest data from custom sources with read, ack, pending, and partition handlers Sourcer · Source Transform
User-Defined Sinks (UDSink) Deliver data to custom destinations with per-message acknowledgment Sinker
Side Inputs Broadcast slow-changing reference data to UDF vertices without passing it through the pipeline Side Input

Choosing Your Server Type

Each functionality is served by a dedicated server class. Choose the server type that matches your workload characteristics:

Sync Async
Concurrency Model Multithreaded asyncio event loop
Handler Signature def handler(...) async def handler(...)
GIL Behaviour Subject to GIL Subject to GIL
Typical Workloads Stateless transforms I/O-bound operations

Server Class Reference

Functionality Server Class(es)
UDSource SourceAsyncServer
UDSink SinkServer, SinkAsyncServer
Side Input SideInputServer
Map MapServer, MapAsyncServer
Reduce ReduceAsyncServer
Reduce Stream ReduceStreamAsyncServer
Map Stream MapStreamAsyncServer
Batch Map BatchMapAsyncServer
Accumulator AccumulatorAsyncServer
Source Transform SourceTransformServer, SourceTransformAsyncServer

All server types accept handlers in two styles:

  • Function-based — pass a plain def or async def directly to the server. Best for simple, stateless logic.
  • Class-based — inherit from the corresponding base class (e.g., Mapper, Reducer, Sinker) and implement the handler method. Useful when your handler needs initialization arguments, internal state, or helper methods.

The linked examples above demonstrate both styles for each functionality.

Contributing

For SDK development workflow, testing against a live pipeline, and adding new examples, see the Developer Guide. For general contribution guidelines, see the Numaproj Contributing Guide.