# otter-stream-onnx
High-performance ONNX Runtime integration for cross-platform neural network inference with GPU acceleration support.
## Module Overview
The ONNX module provides production-ready integration with ONNX Runtime, supporting multiple execution providers (CPU, CUDA, TensorRT, DirectML) for high-performance neural network inference in streaming applications.
## ONNX Inference Engine

Production-ready ONNX Runtime inference with automatic provider selection and optimization.
- CPU, CUDA, TensorRT, DirectML execution providers
- Automatic tensor conversion and shape handling
- Batch inference with configurable sizes
- Thread-safe session management
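To make the tensor-conversion and batching bullets above concrete, here is a minimal sketch of the bookkeeping involved: flattening a batch of fixed-size feature vectors into the single contiguous `float[]` an ONNX Runtime tensor of shape `[batch, features]` expects, and partitioning records into fixed-size batches. The class and method names are illustrative, not part of the module's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper sketching batch flattening and partitioning.
public class BatchFlattener {

    // Flatten a batch of equal-length rows into one contiguous array
    // (row-major layout, matching an ONNX tensor of shape [batch, features]).
    public static float[] flatten(List<float[]> batch, int featureSize) {
        float[] out = new float[batch.size() * featureSize];
        for (int i = 0; i < batch.size(); i++) {
            float[] row = batch.get(i);
            if (row.length != featureSize) {
                throw new IllegalArgumentException("row " + i + " has length " + row.length);
            }
            System.arraycopy(row, 0, out, i * featureSize, featureSize);
        }
        return out;
    }

    // Partition a list of records into batches of at most batchSize.
    public static List<List<float[]>> partition(List<float[]> records, int batchSize) {
        List<List<float[]>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            batches.add(records.subList(i, Math.min(i + batchSize, records.size())));
        }
        return batches;
    }
}
```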
## Session Management

Efficient ONNX Runtime session lifecycle management with metadata extraction.
- `InferenceSession` - a wrapper around ONNX Runtime's `OrtSession`
- Load from file paths or byte arrays
- Input/output metadata retrieval
- Automatic resource cleanup
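The "automatic resource cleanup" bullet typically rests on the `AutoCloseable` pattern: a wrapper that frees its native ONNX Runtime session when closed, so callers can rely on try-with-resources. The sketch below illustrates the pattern only; the class name and fields are hypothetical, not the module's actual `InferenceSession` implementation.

```java
// Hypothetical sketch of the wrapper pattern behind InferenceSession.
// Implementing AutoCloseable lets try-with-resources release the
// underlying native session deterministically.
public class ManagedSession implements AutoCloseable {
    private final String modelPath;
    private boolean closed = false;

    public ManagedSession(String modelPath) {
        this.modelPath = modelPath; // real code would create an OrtSession here
    }

    public boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true; // real code would free the native OrtSession here
    }
}
```

Callers would then write `try (ManagedSession s = new ManagedSession("model.onnx")) { ... }` and the session is released even if inference throws.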
## Implementing ONNX Inference

A step-by-step guide to integrating ONNX models into your Flink pipeline.
### 1. Add the Maven Dependency

```xml
<dependency>
  <groupId>com.codedstreams</groupId>
  <artifactId>otter-stream-onnx</artifactId>
  <version>1.0.16</version>
</dependency>
```
### 2. Export Your PyTorch/TensorFlow Model to ONNX

```python
# PyTorch to ONNX
import torch

dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model.onnx")
```

```bash
# TensorFlow to ONNX
python -m tf2onnx.convert --saved-model tensorflow_model \
    --output model.onnx
```
### 3. Configure and Initialize the Engine

```java
ModelConfig modelConfig = ModelConfig.builder()
    .modelId("image-classifier")
    .modelPath("s3://my-bucket/models/resnet50.onnx")
    .format(ModelFormat.ONNX)
    .modelOptions(Map.of(
        "providers", List.of("CUDAExecutionProvider", "CPUExecutionProvider"),
        "intra_op_num_threads", 4
    ))
    .build();

OnnxInferenceEngine engine = new OnnxInferenceEngine();
engine.initialize(modelConfig);
```
### 4. Perform Inference

```java
Map<String, Object> inputs = new HashMap<>();
inputs.put("input", imagePixels); // float[] array

InferenceResult result = engine.infer(inputs);
float[] predictions = result.getOutput("output");
```
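For a classifier like the ResNet-50 example above, the raw output vector is usually post-processed with an argmax to pick the predicted class. A minimal sketch (the helper name is illustrative, not part of the module's API):

```java
// Hypothetical post-processing helper: return the index of the
// highest score in the model's raw output vector (argmax).
public class Postprocess {
    public static int argmax(float[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("empty score vector");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

Applied to the snippet above: `int predictedClass = Postprocess.argmax(predictions);`.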
## Supported Execution Providers
| Provider | Platform | Hardware | Performance |
|---|---|---|---|
| CUDAExecutionProvider | Linux, Windows | NVIDIA GPU | Highest |
| TensorRTExecutionProvider | Linux, Windows | NVIDIA GPU | Highest (Optimized) |
| CPUExecutionProvider | All | CPU | Good |
| DirectMLExecutionProvider | Windows | GPU (AMD/Intel/NVIDIA) | High |
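Automatic provider selection amounts to ordered fallback over a preference list like the `providers` option shown earlier: try each requested execution provider in turn and use the first one available on the host, falling back to CPU. The sketch below illustrates the idea only; it is not the module's actual selection logic.

```java
import java.util.List;
import java.util.Set;

// Illustrative sketch of ordered execution-provider fallback.
public class ProviderSelector {

    // Return the first preferred provider that is available on this
    // host; CPUExecutionProvider is always available as a last resort.
    public static String select(List<String> preferred, Set<String> available) {
        for (String provider : preferred) {
            if (available.contains(provider)) {
                return provider;
            }
        }
        return "CPUExecutionProvider";
    }
}
```

On a GPU-less host, `select(List.of("CUDAExecutionProvider", "CPUExecutionProvider"), availableProviders)` would resolve to `CPUExecutionProvider`.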
## Maven Dependency

```xml
<dependency>
  <groupId>com.codedstreams</groupId>
  <artifactId>otter-stream-onnx</artifactId>
  <version>1.0.16</version>
</dependency>
```