Class InferenceConfig


  • public class InferenceConfig
    extends Object
    Comprehensive configuration for ML inference operations in Apache Flink streams.

    This class configures all aspects of inference including model settings, performance tuning (batching, timeouts), retry policies, and monitoring.

    Key Configuration Areas:

    • Model: Which model to use and how to load it
    • Performance: Batch size and timeout settings
    • Reliability: Retry logic for failed inferences
    • Monitoring: Metrics collection enablement
    • Engine: Custom engine-specific options

    Usage Example:

    
     // Configure TensorFlow inference
     ModelConfig modelConfig = ModelConfig.builder()
         .modelId("sentiment-model")
         .modelPath("/models/sentiment.pb")
         .format(ModelFormat.TENSORFLOW_SAVEDMODEL)
         .build();
    
     InferenceConfig config = InferenceConfig.builder()
         .modelConfig(modelConfig)
         .batchSize(32)                    // Process 32 records at once
         .timeout(Duration.ofSeconds(30))   // 30 second timeout
         .maxRetries(3)                     // Retry up to 3 times
         .enableMetrics(true)               // Collect performance metrics
         .build();
    
     // Use in a Flink stream via the async I/O API
     // (DataStream has no async() method; async functions are applied
     //  through AsyncDataStream)
     DataStream<Prediction> predictions = AsyncDataStream.unorderedWait(
         input,
         new AsyncModelInferenceFunction<>(config, engineFactory),
         config.getTimeoutMs(), TimeUnit.MILLISECONDS);
     
    Since:
    1.0.0
    Author:
    Nestor Martourez, Sr Software and Data Streaming Engineer @ CodedStreams
    See Also:
    ModelConfig
    • Constructor Detail

      • InferenceConfig

        public InferenceConfig(ModelConfig modelConfig,
                               int batchSize,
                               long timeoutMs,
                               int maxRetries,
                               boolean enableMetrics,
                               Map<String, Object> engineOptions)
        Constructs inference configuration.
        Parameters:
        modelConfig - model configuration
        batchSize - number of records to batch together
        timeoutMs - inference timeout in milliseconds
        maxRetries - maximum retry attempts for failed inferences
        enableMetrics - whether to collect metrics
        engineOptions - engine-specific configuration options
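        The engineOptions parameter carries settings that only a particular inference engine understands, as opaque key/value pairs. A minimal sketch of assembling such a map (the option keys shown are hypothetical examples, not keys defined by this class; consult your engine's documentation for the real ones):

        ```java
        import java.util.HashMap;
        import java.util.Map;

        public class EngineOptionsExample {
            // Builds an engine-specific options map.
            // Keys and values here are illustrative placeholders.
            public static Map<String, Object> buildOptions() {
                Map<String, Object> engineOptions = new HashMap<>();
                engineOptions.put("intraOpParallelism", 4);  // hypothetical: threads per op
                engineOptions.put("useGpu", Boolean.FALSE);  // hypothetical: CPU-only inference
                return engineOptions;
            }

            public static void main(String[] args) {
                Map<String, Object> opts = buildOptions();
                System.out.println(opts.size()); // prints 2
            }
        }
        ```

        Because the values are typed as Object, each engine implementation is responsible for validating and casting the entries it reads.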
    • Method Detail

      • builder

        public static InferenceConfig.Builder builder()
        Creates a new builder for InferenceConfig.
        Returns:
        a new builder instance
      • getBatchSize

        public int getBatchSize()
        Returns:
        the number of records batched per inference call
      • getTimeoutMs

        public long getTimeoutMs()
        Returns:
        the inference timeout in milliseconds
      • getMaxRetries

        public int getMaxRetries()
        Returns:
        the maximum number of retry attempts for failed inferences
      • isEnableMetrics

        public boolean isEnableMetrics()
        Returns:
        true if metrics collection is enabled
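
        A minimal sketch of the retry semantics that getMaxRetries() implies, written as a self-contained helper in plain Java (the helper and its name are illustrative, not part of this class; the actual retry behavior is implemented by the inference functions that consume this config):

        ```java
        import java.util.concurrent.Callable;

        public class RetryExample {
            // Runs the task once, then retries up to maxRetries additional
            // times on failure; rethrows the last exception if all attempts fail.
            static <T> T withRetries(Callable<T> task, int maxRetries) throws Exception {
                Exception last = null;
                for (int attempt = 0; attempt <= maxRetries; attempt++) {
                    try {
                        return task.call();
                    } catch (Exception e) {
                        last = e; // record failure and retry
                    }
                }
                throw last;
            }

            public static void main(String[] args) throws Exception {
                int[] calls = {0};
                // Fails twice, then succeeds: with maxRetries = 3 this returns "ok".
                String result = withRetries(() -> {
                    if (calls[0]++ < 2) throw new RuntimeException("transient");
                    return "ok";
                }, 3);
                System.out.println(result); // prints "ok"
            }
        }
        ```

        Note the off-by-one convention assumed here: maxRetries counts retries after the first attempt, so a value of 3 permits up to four invocations in total.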