Interface InferenceEngine<T>

- Type Parameters:
  T - the underlying model type (e.g., SavedModelBundle for TensorFlow)
- All Known Implementing Classes:
  HttpInferenceClient, LocalInferenceEngine, OnnxInferenceEngine, PmmlInferenceEngine, RemoteInferenceEngine, SageMakerInferenceClient, TensorFlowGraphDefEngine, TensorFlowInferenceEngine, TensorFlowSavedModelEngine, TorchScriptInferenceEngine, VertexAIInferenceClient, XGBoostInferenceEngine

public interface InferenceEngine<T>

Core interface for ML inference engines in Otter Stream. All inference engines (TensorFlow, ONNX, PyTorch, XGBoost, etc.) implement this interface to provide a uniform API for model loading and prediction.
Lifecycle:
1. initialize(ModelConfig) - Load and prepare the model
2. isReady() - Verify engine is ready for inference
3. infer(Map) or inferBatch(Map[]) - Make predictions
4. close() - Release resources
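The four lifecycle steps can be sketched end to end. Everything below is a simplified stand-in: the real ModelConfig, InferenceResult, and InferenceException carry more than shown (and the real InferenceException is checked; it is unchecked here to keep the sketch short), and ConstantEngine is a hypothetical engine, not one of the listed implementations.

```java
import java.util.Map;

// Simplified stand-ins for the real Otter Stream types, so the
// lifecycle can be shown self-contained.
class InferenceException extends RuntimeException {
    InferenceException(String msg) { super(msg); }
}

class ModelConfig {
    final String modelPath;
    ModelConfig(String modelPath) { this.modelPath = modelPath; }
}

class InferenceResult {
    final Map<String, Object> predictions;
    InferenceResult(Map<String, Object> predictions) { this.predictions = predictions; }
}

// ConstantEngine is a hypothetical engine that "loads" instantly and
// always predicts the same score.
class ConstantEngine {
    private boolean ready;

    void initialize(ModelConfig config) {               // 1. load and prepare the model
        if (config == null) throw new InferenceException("missing config");
        ready = true;
    }

    boolean isReady() { return ready; }                 // 2. verify readiness

    InferenceResult infer(Map<String, Object> inputs) { // 3. predict
        if (!ready) throw new InferenceException("engine not initialized");
        return new InferenceResult(Map.of("score", 0.75));
    }

    void close() { ready = false; }                     // 4. release resources
}

public class LifecycleDemo {
    public static void main(String[] args) {
        ConstantEngine engine = new ConstantEngine();
        engine.initialize(new ModelConfig("/models/demo"));
        if (engine.isReady()) {
            InferenceResult result = engine.infer(Map.of("feature", 1.0));
            System.out.println(result.predictions.get("score")); // prints 0.75
        }
        engine.close();
    }
}
```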
Implementation Example:

    public class MyCustomEngine implements InferenceEngine<MyModel> {
        private MyModel model;
        // ... implement initialize, infer, inferBatch, isReady,
        // getCapabilities, getMetadata, getModelConfig, and close
    }

- Since:
  1.0.0
- Author:
  Nestor Martourez, Sr. Software and Data Streaming Engineer @ CodedStreams
- See Also:
LocalInferenceEngine
Nested Class Summary
Nested Classes
  Modifier and Type: static class
  Class: InferenceEngine.EngineCapabilities
  Description: Describes the capabilities of an inference engine.
-
Method Summary
All Methods · Instance Methods · Abstract Methods

Modifier and Type                   Method                                        Description
void                                close()                                       Closes the inference engine and releases all resources.
InferenceEngine.EngineCapabilities  getCapabilities()                             Gets the capabilities of this inference engine.
ModelMetadata                       getMetadata()                                 Gets metadata about the loaded model.
ModelConfig                         getModelConfig()                              Gets the configuration used to initialize this engine.
InferenceResult                     infer(Map<String,Object> inputs)              Performs inference on a single input.
InferenceResult                     inferBatch(Map<String,Object>[] batchInputs)  Performs batch inference on multiple inputs.
void                                initialize(ModelConfig config)                Initializes the inference engine with the given configuration.
boolean                             isReady()                                     Checks if the engine is ready for inference operations.
Method Detail
-
initialize
void initialize(ModelConfig config) throws InferenceException
Initializes the inference engine with the given configuration. Loads the model and prepares the engine for inference operations.
- Parameters:
  config - model configuration
- Throws:
  InferenceException - if initialization fails
-
infer
InferenceResult infer(Map<String,Object> inputs) throws InferenceException
Performs inference on a single input.
- Parameters:
  inputs - map of input name to input value
- Returns:
  inference result containing predictions
- Throws:
  InferenceException - if inference fails
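As a sketch of the input contract, the caller keys each input by name. ThresholdEngine, the feature names "age" and "income", and the fixed decision rule below are all made up for illustration; real input names come from the loaded model (see getMetadata()).

```java
import java.util.Map;

// Hypothetical engine: a fixed rule stands in for a real model, and the
// return type is simplified to Map instead of InferenceResult.
class ThresholdEngine {
    Map<String, Object> infer(Map<String, Object> inputs) {
        double age = ((Number) inputs.get("age")).doubleValue();
        double income = ((Number) inputs.get("income")).doubleValue();
        String label = (age > 30 && income > 50_000) ? "approve" : "review";
        return Map.of("label", label);
    }
}

public class InferDemo {
    public static void main(String[] args) {
        ThresholdEngine engine = new ThresholdEngine();
        // Inputs are keyed by name, values are model-specific objects.
        Map<String, Object> result =
            engine.infer(Map.of("age", 42, "income", 80_000));
        System.out.println(result.get("label")); // prints approve
    }
}
```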
-
inferBatch
InferenceResult inferBatch(Map<String,Object>[] batchInputs) throws InferenceException
Performs batch inference on multiple inputs. Batch inference is typically more efficient than multiple single inferences.
- Parameters:
  batchInputs - array of input maps
- Returns:
  inference result containing batch predictions
- Throws:
  InferenceException - if inference fails
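To see why one inferBatch call is typically cheaper than a loop over infer, the toy engine below counts model invocations: the batch path runs the "model" once for N inputs. CountingEngine and its double-based return types are simplifications; the real methods return InferenceResult.

```java
import java.util.Map;

// Toy engine that tracks how many times the underlying "model" runs.
class CountingEngine {
    int modelRuns = 0;

    // One model invocation covers the whole batch.
    double[] inferBatch(Map<String, Object>[] batchInputs) {
        modelRuns++;
        double[] scores = new double[batchInputs.length];
        for (int i = 0; i < batchInputs.length; i++) {
            scores[i] = ((Number) batchInputs[i].get("feature")).doubleValue() * 2;
        }
        return scores;
    }

    // One model invocation per input.
    double infer(Map<String, Object> inputs) {
        modelRuns++;
        return ((Number) inputs.get("feature")).doubleValue() * 2;
    }
}

public class BatchDemo {
    public static void main(String[] args) {
        CountingEngine engine = new CountingEngine();

        @SuppressWarnings("unchecked")
        Map<String, Object>[] batch = new Map[] {
            Map.of("feature", 1.0), Map.of("feature", 2.0), Map.of("feature", 3.0)
        };
        double[] scores = engine.inferBatch(batch); // one call, three predictions
        System.out.println(scores[2] + " after " + engine.modelRuns + " model run(s)");
        // prints 6.0 after 1 model run(s)
    }
}
```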
-
getCapabilities
InferenceEngine.EngineCapabilities getCapabilities()
Gets the capabilities of this inference engine.
- Returns:
  engine capabilities (batching, GPU support, etc.)
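A caller might branch on capabilities before choosing between infer and inferBatch. The accessor shape below (a supportsBatching flag) is an assumption for illustration; only the EngineCapabilities type itself appears on this page.

```java
// Assumed, illustrative stand-in for InferenceEngine.EngineCapabilities;
// the real accessor names are not shown on this page.
class Capabilities {
    final boolean supportsBatching;
    Capabilities(boolean supportsBatching) { this.supportsBatching = supportsBatching; }
}

public class RoutingDemo {
    // Prefer one batched call when supported, else fall back to N single calls.
    static int callsNeeded(Capabilities caps, int batchSize) {
        return caps.supportsBatching ? 1 : batchSize;
    }

    public static void main(String[] args) {
        System.out.println(callsNeeded(new Capabilities(true), 8));  // prints 1
        System.out.println(callsNeeded(new Capabilities(false), 8)); // prints 8
    }
}
```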
-
close
void close() throws InferenceException
Closes the inference engine and releases all resources. After calling this method, the engine should not be used again.
- Throws:
  InferenceException - if cleanup fails
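Because close() must run even when inference throws, cleanup belongs in a finally block (or try-with-resources, if InferenceEngine extends AutoCloseable, which this page does not state). EngineStub below is a hypothetical stand-in that only tracks whether close() has run.

```java
// Hypothetical stand-in engine, just enough to show the cleanup pattern.
class EngineStub {
    boolean closed = false;
    void initialize(String config) { /* load model */ }
    String infer(String input) { return "ok:" + input; }
    void close() { closed = true; } // release resources here
}

public class CloseDemo {
    public static void main(String[] args) {
        EngineStub engine = new EngineStub();
        try {
            engine.initialize("model.cfg");
            System.out.println(engine.infer("x")); // prints ok:x
        } finally {
            engine.close(); // runs even if initialize or infer throws
        }
        System.out.println(engine.closed); // prints true
    }
}
```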
-
isReady
boolean isReady()
Checks if the engine is ready for inference operations.
- Returns:
  true if engine is initialized and ready
-
getMetadata
ModelMetadata getMetadata()
Gets metadata about the loaded model.
- Returns:
  model metadata including inputs, outputs, and format
-
getModelConfig
ModelConfig getModelConfig()
Gets the configuration used to initialize this engine.
- Returns:
  model configuration