Dependency Graph
All engine modules depend on ml-inference-core. The SQL module depends on
ml-inference-core and optionally on whichever engine modules are on the runtime
classpath. Flink is always provided scope — it must never be bundled in the shaded JAR.
```
otter-streams (parent pom)
 |
 +-- ml-inference-core          <-- InferenceEngine, InferenceResult, ModelCache, metrics
 |
 +-- otter-stream-sql           <-- depends on: ml-inference-core
 |
 +-- otter-stream-onnx          <-- depends on: ml-inference-core, onnxruntime
 +-- otter-stream-tensorflow    <-- depends on: ml-inference-core, tensorflow-core-platform
 +-- otter-streams-xgboost      <-- depends on: ml-inference-core, xgboost4j
 +-- otter-stream-pmml          <-- depends on: ml-inference-core, jpmml-evaluator
 +-- otter-stream-pytorch      <-- depends on: ml-inference-core, djl-pytorch-engine
 +-- otter-stream-remote        <-- depends on: ml-inference-core, okhttp, aws-sdk
 |
 +-- otter-stream-examples      (not published)
```
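In a job POM this means declaring Flink explicitly while keeping it out of packaging. A minimal sketch; the Flink artifact and version shown are illustrative, not mandated by this project:

```xml
<!-- Flink comes from the cluster runtime; provided scope keeps it out of the shaded JAR. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>1.18.1</version>
    <scope>provided</scope>
</dependency>
```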
ml-inference-core
The foundation of the library. Every other module declares a compile dependency on this artifact. It contains no ML framework code — only the contracts, configuration objects, caching layer, and Flink async function wiring that all engines share.
Key classes
| Class / Interface | Package | Purpose |
|---|---|---|
| InferenceEngine<C> | ...inference.engine | Top-level contract all engines implement. Typed on a config record. |
| InferenceResult | ...inference.model | Holds output tensors, latency, and success flag. |
| InferenceException | ...inference.exception | Checked exception thrown by engine.infer(). |
| ModelCache | ...sql.loader | Caffeine-backed LRU singleton keyed by model name. |
| InferenceConfig | ...inference.config | Builder for batch size, timeouts, retries, caching, metrics. |
| AsyncModelInferenceFunction | ...inference.function | Flink AsyncFunction wrapper for use in DataStream pipelines. |
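To make the wiring concrete, here is a minimal DataStream sketch. AsyncDataStream is plain Flink; the InferenceConfig builder options, the AsyncModelInferenceFunction constructor, and the OnnxInferenceEngine name are assumptions for illustration, not confirmed signatures:

```java
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ScoringJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<float[]> features = env.fromElements(
                new float[] {0.1f, 0.2f}, new float[] {0.3f, 0.4f});

        // Builder option names are assumptions; the real ones live on InferenceConfig.
        InferenceConfig config = InferenceConfig.builder()
                .batchSize(32)
                .timeoutMs(250)
                .build();

        // Hypothetical engine from an engine module; any InferenceEngine works here.
        InferenceEngine<?> engine = new OnnxInferenceEngine("/models/fraud.onnx");

        // The (engine, config) constructor is assumed; AsyncDataStream is plain Flink.
        DataStream<InferenceResult> scored = AsyncDataStream.unorderedWait(
                features,
                new AsyncModelInferenceFunction<>(engine, config),
                250, TimeUnit.MILLISECONDS, // per-request timeout
                100);                       // max in-flight requests

        scored.print();
        env.execute("otter-streams scoring");
    }
}
```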
Maven
```xml
<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>ml-inference-core</artifactId>
    <version>1.0.17</version>
</dependency>
```
otter-stream-sql
Provides everything needed to use Otter Streams from Flink SQL without writing Java operators.
The module registers a scalar UDF, a lookup table function, and a full dynamic table connector
(ml-inference) that the Flink planner discovers via Java SPI.
Key classes
| Class | Type | SQL usage |
|---|---|---|
| MLInferenceFunction | ScalarFunction | ml_score(features, 'model-name') |
| MLInferenceLookupFunction | TableFunction | Temporal join source |
| MLInferenceDynamicTableFactory | DynamicTableSourceFactory | 'connector' = 'ml-inference' |
| MLInferenceDynamicTableSource | LookupTableSource | Lookup join runtime provider |
| SqlInferenceConfig | Config record | Parses DDL WITH options |
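A minimal sketch of both entry points from a TableEnvironment. The connector identifier and the ml_score function name come from the table above; the WITH option keys and the events source table are assumptions for illustration:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SqlScoringJob {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Lookup table backed by the ml-inference connector, which the planner
        // finds via SPI. The WITH option keys are assumptions; the authoritative
        // set is whatever SqlInferenceConfig parses.
        tEnv.executeSql(
            "CREATE TABLE model_scores ("
          + "  features ARRAY<FLOAT>,"
          + "  score DOUBLE"
          + ") WITH ("
          + "  'connector' = 'ml-inference',"
          + "  'model.name' = 'fraud-v3'"
          + ")");

        // Scalar UDF usage; 'events' stands in for any existing source table.
        tEnv.executeSql(
            "SELECT ml_score(features, 'fraud-v3') AS score FROM events");
    }
}
```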
For Flink SQL deployments, use the *-flink-udf classifier artifact so that ml-inference-core and all
transitive runtime deps are bundled. See the Shaded JAR Build section.
Maven
```xml
<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>otter-stream-sql</artifactId>
    <version>1.0.17</version>
</dependency>
```
otter-stream-onnx
Wraps Microsoft's ONNX Runtime Java API. Supports CPU, CUDA, and TensorRT execution providers
with automatic provider fallback. Input tensors are constructed from Map<String, Object>
with automatic float32 / int64 conversion.
Supported execution providers
| Provider | Hardware | Requires |
|---|---|---|
| CPUExecutionProvider | Any CPU | Default; no extra dependencies |
| CUDAExecutionProvider | NVIDIA GPU | CUDA 11.x + cuDNN |
| TensorRTExecutionProvider | NVIDIA GPU | TensorRT 8+ |
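The fallback behaviour maps onto ONNX Runtime's SessionOptions. A minimal sketch of the pattern, written directly against the ai.onnxruntime API (the model path is illustrative, and the module presumably layers TensorRT into the same chain):

```java
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

public final class ProviderFallback {
    /** Opens a session with CUDA when available, otherwise the default CPU provider. */
    public static OrtSession open(String modelPath) throws OrtException {
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        OrtSession.SessionOptions options = new OrtSession.SessionOptions();
        try {
            options.addCUDA(0); // prefer the CUDA provider on GPU 0
        } catch (OrtException gpuUnavailable) {
            options.close();
            options = new OrtSession.SessionOptions(); // CPU provider is the default
        }
        return env.createSession(modelPath, options);
    }
}
```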
Maven
```xml
<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>otter-stream-onnx</artifactId>
    <version>1.0.17</version>
</dependency>
```
otter-stream-tensorflow
Loads TensorFlow SavedModel directories and performs inference via the official TensorFlow Java API.
Signature names and tensor names are discovered automatically from the model's saved_model.pb.
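Under the hood this is the standard SavedModelBundle flow from the TensorFlow Java API. A minimal sketch against the tensorflow-core 0.4.x line (the model path, the "features" input name, and the tensor shape are illustrative; the call API differs slightly across versions):

```java
import java.util.Map;

import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;
import org.tensorflow.ndarray.StdArrays;
import org.tensorflow.types.TFloat32;

public final class SavedModelSmokeTest {
    public static void main(String[] args) {
        try (SavedModelBundle bundle = SavedModelBundle.load("/models/churn", "serve")) {
            try (TFloat32 input = TFloat32.tensorOf(
                    StdArrays.ndCopyOf(new float[][] {{0.1f, 0.2f, 0.3f}}))) {
                // "serving_default" is the conventional signature key; the input
                // name depends on the exported model.
                Map<String, Tensor> outputs = bundle.function("serving_default")
                        .call(Map.of("features", input));
                outputs.forEach((name, tensor) -> {
                    System.out.println(name + " -> " + tensor.shape());
                    tensor.close();
                });
            }
        }
    }
}
```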
tensorflow-core-platform includes native shared libraries (~400 MB per platform).
Use the -linux-x86_64 or -linux-gpu-x86_64 classifiers in production
to avoid bundling all platforms in the shaded JAR.
Maven
```xml
<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>otter-stream-tensorflow</artifactId>
    <version>1.0.17</version>
</dependency>
```
otter-streams-xgboost
Loads XGBoost models using XGBoost4J. Converts feature maps to a DMatrix and invokes
Booster.predict(). Thread-safe via per-call DMatrix construction. Supports binary
classification (binary:logistic), multi-class, and regression objectives.
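The per-call DMatrix pattern looks roughly like this in plain XGBoost4J; a minimal sketch where the model path and feature layout are illustrative:

```java
import ml.dmlc.xgboost4j.java.Booster;
import ml.dmlc.xgboost4j.java.DMatrix;
import ml.dmlc.xgboost4j.java.XGBoost;
import ml.dmlc.xgboost4j.java.XGBoostError;

public final class XgbScorer {
    private final Booster booster;

    public XgbScorer(String modelPath) throws XGBoostError {
        this.booster = XGBoost.loadModel(modelPath); // .bin, .json, or .ubj
    }

    /** One row in, one score out; a fresh DMatrix per call keeps this thread-safe. */
    public float score(float[] features) throws XGBoostError {
        DMatrix row = new DMatrix(features, 1, features.length, Float.NaN);
        try {
            return booster.predict(row)[0][0];
        } finally {
            row.dispose(); // free native memory immediately rather than at GC time
        }
    }
}
```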
Supported file formats
| Format | Extension | Notes |
|---|---|---|
| Binary | .bin, .model | Fastest load time; not human-readable |
| JSON | .json | Portable; readable; XGBoost ≥ 1.0 |
| UBJSON | .ubj | Compact binary JSON; XGBoost ≥ 1.7 |
Maven
```xml
<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>otter-streams-xgboost</artifactId>
    <version>1.0.17</version>
</dependency>
```
otter-stream-pmml
Evaluates PMML 4.x documents using the JPMML-Evaluator library. Supports logistic regression, decision trees, random forests, gradient boosted trees, neural networks, and naïve Bayes. Preprocessing transformations defined inside the PMML document are applied automatically.
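Internally this follows the standard JPMML-Evaluator flow. A minimal sketch against the 1.6.x API, where fields are keyed by String (the file path and feature names are illustrative):

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

import org.jpmml.evaluator.Evaluator;
import org.jpmml.evaluator.EvaluatorUtil;
import org.jpmml.evaluator.FieldValue;
import org.jpmml.evaluator.InputField;
import org.jpmml.evaluator.LoadingModelEvaluatorBuilder;

public final class PmmlScorer {
    public static void main(String[] args) throws Exception {
        Evaluator evaluator = new LoadingModelEvaluatorBuilder()
                .load(new File("/models/churn.pmml"))
                .build();
        evaluator.verify();

        // Raw inputs; PMML-defined preprocessing runs inside prepare().
        Map<String, Object> raw = Map.of("age", 42, "balance", 1234.5);

        Map<String, FieldValue> arguments = new LinkedHashMap<>();
        for (InputField field : evaluator.getInputFields()) {
            String name = field.getName();
            arguments.put(name, field.prepare(raw.get(name)));
        }

        // decodeAll strips JPMML wrapper types from the result values.
        Map<String, ?> results = evaluator.evaluate(arguments);
        System.out.println(EvaluatorUtil.decodeAll(results));
    }
}
```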
Maven
```xml
<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>otter-stream-pmml</artifactId>
    <version>1.0.17</version>
</dependency>
```
otter-stream-remote
Routes inference calls to external endpoints: REST APIs, AWS SageMaker, GCP Vertex AI, and Azure ML. Uses OkHttp for HTTP/1.1 and gRPC for binary protocol communication. Includes configurable retry, timeout, and circuit-breaker policies. Useful when the model is too large to embed in the Flink JAR or when a dedicated model server is preferred.
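For the REST path, the timeout and retry behaviour maps onto a plain OkHttp client. A minimal sketch; the endpoint URL, payload shape, and retry count are illustrative, and the real module adds circuit breaking on top:

```java
import java.io.IOException;
import java.time.Duration;

import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

public final class RestScorer {
    private static final MediaType JSON = MediaType.parse("application/json; charset=utf-8");

    private final OkHttpClient client = new OkHttpClient.Builder()
            .connectTimeout(Duration.ofMillis(500))
            .callTimeout(Duration.ofSeconds(2))
            .build();

    /** POSTs a feature payload to the model server, retrying transient failures. */
    public String score(String jsonFeatures) throws IOException {
        Request request = new Request.Builder()
                .url("https://models.example.internal/score") // illustrative endpoint
                .post(RequestBody.create(jsonFeatures, JSON))
                .build();

        IOException lastFailure = null;
        for (int attempt = 0; attempt < 3; attempt++) {
            try (Response response = client.newCall(request).execute()) {
                if (response.isSuccessful()) {
                    return response.body().string();
                }
                // Non-2xx: fall through and retry.
            } catch (IOException e) {
                lastFailure = e;
            }
        }
        throw lastFailure != null ? lastFailure
                : new IOException("model server kept returning non-2xx responses");
    }
}
```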
Maven
```xml
<dependency>
    <groupId>com.codedstreams</groupId>
    <artifactId>otter-stream-remote</artifactId>
    <version>1.0.17</version>
</dependency>
```