otter-stream-onnx


High-performance ONNX Runtime integration for cross-platform neural network inference with GPU acceleration support.

Module Overview

The ONNX module provides production-ready integration with ONNX Runtime, supporting multiple execution providers (CPU, CUDA, TensorRT, DirectML) for high-performance neural network inference in streaming applications.


ONNX Inference Engine

Neural Networks

Production-ready ONNX Runtime inference with automatic provider selection and optimization.

  • CPU, CUDA, TensorRT, DirectML execution providers
  • Automatic tensor conversion and shape handling
  • Batch inference with configurable sizes
  • Thread-safe session management
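The "batch inference with configurable sizes" behavior above can be sketched as a helper that splits a large flat input into fixed-size batches. This is an illustrative sketch only; `BatchSplitter` and `splitIntoBatches` are hypothetical names, not part of the otter-stream-onnx API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of configurable batch sizing: split N samples,
// each of `sampleSize` floats, into chunks of at most `batchSize` samples.
// Names here are assumptions for the example, not the module's API.
public class BatchSplitter {
    public static List<float[]> splitIntoBatches(float[] samples, int sampleSize, int batchSize) {
        int total = samples.length / sampleSize;  // number of samples in the flat array
        List<float[]> batches = new ArrayList<>();
        for (int start = 0; start < total; start += batchSize) {
            int count = Math.min(batchSize, total - start);
            float[] batch = new float[count * sampleSize];
            System.arraycopy(samples, start * sampleSize, batch, 0, batch.length);
            batches.add(batch);
        }
        return batches;
    }
}
```

The last batch may be smaller than `batchSize`, which is why a configurable rather than fixed batch shape matters for dynamic-axis ONNX models.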

Session Management

Core

Efficient ONNX Runtime session lifecycle management with metadata extraction.

  • InferenceSession - Wrapper for OrtSession
  • Load from file paths or byte arrays
  • Input/output metadata retrieval
  • Automatic resource cleanup
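The "automatic resource cleanup" point above follows the standard try-with-resources pattern: a session wrapper implements `AutoCloseable` so the native ONNX Runtime handle is released even when inference throws. The stub below is a minimal sketch of that pattern; `ManagedSession` stands in for the real `InferenceSession` and its fields are illustrative.

```java
// Sketch of the automatic-cleanup pattern: an AutoCloseable session wrapper.
// This stub stands in for the real InferenceSession; in real code the
// constructor would create an OrtSession and close() would release it.
public class ManagedSession implements AutoCloseable {
    private final String modelPath;
    private boolean closed = false;

    public ManagedSession(String modelPath) {
        this.modelPath = modelPath;  // hypothetical: real code loads the model here
    }

    public boolean isClosed() { return closed; }

    @Override
    public void close() {
        closed = true;               // hypothetical: real code frees native resources here
    }
}

// Usage: the session is released when the block exits, even on exceptions.
// try (ManagedSession session = new ManagedSession("model.onnx")) { ... }
```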

Implementing ONNX Inference

Step-by-step guide to integrate ONNX models into your Flink pipeline.

  1. Add Maven Dependency
    <dependency>
        <groupId>com.codedstreams</groupId>
        <artifactId>otter-stream-onnx</artifactId>
        <version>1.0.16</version>
    </dependency>
  2. Export Your PyTorch/TensorFlow Model to ONNX
    # PyTorch to ONNX (Python; assumes `model` is your trained torch.nn.Module)
    import torch
    dummy_input = torch.randn(1, 3, 224, 224)
    torch.onnx.export(model, dummy_input, "model.onnx")
    
    # TensorFlow to ONNX (shell command; requires the tf2onnx package)
    python -m tf2onnx.convert --saved-model tensorflow_model \
        --output model.onnx
  3. Configure and Initialize Engine
    ModelConfig modelConfig = ModelConfig.builder()
        .modelId("image-classifier")
        .modelPath("s3://my-bucket/models/resnet50.onnx")
        .format(ModelFormat.ONNX)
        .modelOptions(Map.of(
            "providers", List.of("CUDAExecutionProvider", "CPUExecutionProvider"),
            "intra_op_num_threads", 4
        ))
        .build();
    
    OnnxInferenceEngine engine = new OnnxInferenceEngine();
    engine.initialize(modelConfig);
  4. Perform Inference
    Map<String, Object> inputs = new HashMap<>();
    inputs.put("input", imagePixels);  // float[] array matching the model's input shape
    
    InferenceResult result = engine.infer(inputs);
    float[] predictions = result.getOutput("output");
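The `imagePixels` array in step 4 must be flattened in the layout the model expects; for the export example above that is NCHW with shape (1, 3, 224, 224), while decoded images are usually interleaved HWC. A sketch of that conversion (method name and lack of normalization are assumptions for the example):

```java
// Illustrative conversion from interleaved HWC pixels (r,g,b per pixel)
// to the flat NCHW float[] layout ONNX image models typically expect.
public class TensorLayout {
    public static float[] hwcToNchw(float[] hwc, int height, int width, int channels) {
        float[] nchw = new float[hwc.length];
        for (int c = 0; c < channels; c++) {
            for (int h = 0; h < height; h++) {
                for (int w = 0; w < width; w++) {
                    // gather channel c across all pixels into one contiguous plane
                    nchw[c * height * width + h * width + w] =
                        hwc[(h * width + w) * channels + c];
                }
            }
        }
        return nchw;
    }
}
```

Real pipelines typically also normalize pixel values (e.g. mean/std scaling) before inference; that step is model-specific and omitted here.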

Supported Execution Providers

| Provider                  | Platform       | Hardware               | Performance         |
|---------------------------|----------------|------------------------|---------------------|
| CUDAExecutionProvider     | Linux, Windows | NVIDIA GPU             | Highest             |
| TensorRTExecutionProvider | Linux, Windows | NVIDIA GPU             | Highest (Optimized) |
| CPUExecutionProvider      | All            | CPU                    | Good                |
| DirectMLExecutionProvider | Windows        | GPU (AMD/Intel/NVIDIA) | High                |
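The provider list passed in the `ModelConfig` above is ordered by preference, with CPU as the universal fallback. The selection logic can be sketched as first-available matching; `ProviderSelector` and `select` are hypothetical names for illustration, not the module's API.

```java
import java.util.List;
import java.util.Set;

// Sketch of ordered provider fallback: return the first preferred provider
// that is available at runtime, defaulting to CPU, which is always present.
public class ProviderSelector {
    public static String select(List<String> preferred, Set<String> available) {
        for (String provider : preferred) {
            if (available.contains(provider)) {
                return provider;
            }
        }
        return "CPUExecutionProvider";  // CPU execution is always available
    }
}
```

This mirrors how a preference list like `["CUDAExecutionProvider", "CPUExecutionProvider"]` degrades gracefully on machines without an NVIDIA GPU.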

Maven Dependency

<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>otter-stream-onnx</artifactId>
    <version>1.0.16</version>
</dependency>