| Class | Description |
| --- | --- |
| BatchInferenceProcessor | Batches inference requests for improved throughput. |
| InferenceCircuitBreaker | Circuit breaker to prevent cascading failures in inference operations. |
| InferenceContext | Context for inference execution with metadata. |
| InferenceEngineFactory | Factory for creating inference engines based on model format. |
| InferenceMetricsCollector | Collects and tracks inference metrics per model. |
| InferenceMetricsCollector.ModelMetrics | Metrics for a single model. |
| InferenceRetryHandler | Handles retry logic for failed inference operations. |
| ModelHealthChecker | Performs periodic health checks on loaded models. |
| ModelHealthChecker.HealthStatus | Health status for a single model. |
| ModelSession | Manages stateful inference sessions. |
| ModelWarmupUtility | Utility for warming up models to avoid cold start latency. |
| TensorFlowGraphDefEngine | TensorFlow GraphDef (frozen graph) inference engine. |
| TensorFlowSavedModelEngine | TensorFlow SavedModel inference engine. |
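To illustrate the pattern behind a class like `InferenceCircuitBreaker`, the following is a minimal, hypothetical sketch of a circuit breaker: after a threshold of consecutive failures it opens and fails fast, then after a timeout it half-opens to probe with a trial request. The class name, method names, and thresholds here are assumptions for illustration, not the actual API of the classes listed above.

```java
/**
 * Hypothetical circuit-breaker sketch (illustrative only, not the
 * actual InferenceCircuitBreaker API). CLOSED: requests pass through.
 * OPEN: requests are rejected immediately. HALF_OPEN: one trial
 * request is allowed to test whether the backend has recovered.
 */
public class CircuitBreakerSketch {
    public enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openTimeoutMillis;
    private int consecutiveFailures = 0;
    private long openedAt = 0;
    private State state = State.CLOSED;

    public CircuitBreakerSketch(int failureThreshold, long openTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.openTimeoutMillis = openTimeoutMillis;
    }

    /** Returns true if a request may proceed at the given timestamp. */
    public synchronized boolean allowRequest(long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt >= openTimeoutMillis) {
                state = State.HALF_OPEN; // probe with one trial request
                return true;
            }
            return false; // fail fast while the breaker is open
        }
        return true; // CLOSED or HALF_OPEN
    }

    /** A successful call resets the failure count and closes the breaker. */
    public synchronized void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    /** A failed call may trip the breaker into OPEN. */
    public synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = nowMillis;
        }
    }

    public synchronized State state() { return state; }
}
```

A caller would check `allowRequest` before dispatching inference, skipping the backend entirely while the breaker is open; this is what prevents a struggling model server from being buried under retries.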