Class HttpInferenceClient

  • All Implemented Interfaces:
    InferenceEngine<Void>

    public class HttpInferenceClient
    extends RemoteInferenceEngine
    HTTP-based remote inference client for REST API model endpoints.

    This engine sends inference requests to remote HTTP endpoints using REST APIs. It supports authentication headers, configurable timeouts, and JSON-based request/response formats. It is well suited to integrating with model serving frameworks such as the TensorFlow Serving REST API and TorchServe, or with custom model endpoints.

    Supported Features:

    • REST API Integration: Standard HTTP POST with JSON payloads
    • Authentication: Custom headers via AuthConfig
    • Timeout Configuration: Connect, read, and write timeouts
    • Connection Validation: HEAD requests to verify endpoint availability
    • JSON Serialization: Automatic Java Map ↔ JSON conversion

    Configuration Example:

    
     ModelConfig config = ModelConfig.builder()
         .modelId("http-model")
         .endpointUrl("https://api.modelserver.com/v1/predict")
         .modelOption("connectTimeout", "10")
         .modelOption("readTimeout", "30")
         .modelOption("writeTimeout", "30")
         .authConfig(AuthConfig.builder()
             .addHeader("Authorization", "Bearer token123")
             .addHeader("X-API-Key", "key456")
             .build())
         .build();
    
     HttpInferenceClient client = new HttpInferenceClient();
     client.initialize(config);
     

    Inference Example:

    
     Map<String, Object> inputs = new HashMap<>();
     inputs.put("feature1", 0.5);
     inputs.put("feature2", "text");
     inputs.put("feature3", new float[]{0.1f, 0.2f, 0.3f});
    
     InferenceResult result = client.infer(inputs);
     Map<String, Object> predictions = result.getOutputs();
     

    HTTP Request Details:

    Method: POST
    Content-Type: application/json
    Timeout: 30 seconds (configurable)
    Authentication: Custom headers

    Error Handling:

    All failures are reported as InferenceException: initialization errors, HTTP request failures, timeouts, and response-parsing errors from infer and validateConnection.

    Performance Considerations:

    • OkHttp connection pooling for HTTP/1.1 and HTTP/2
    • Configurable timeouts to prevent hanging requests
    • Single-threaded by default (use async patterns for high throughput)
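    The single-threaded default above can be worked around by dispatching blocking calls onto an executor. A minimal sketch, assuming a stand-in function in place of the real client: `blockingInfer` substitutes for `client.infer(...)`, and the pool size and helper name are illustrative, not part of this API.

    ```java
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.function.Function;

    public class AsyncInferSketch {
        // Dispatches blocking inference calls onto a pool so several requests
        // can be in flight at once; `blockingInfer` stands in for client.infer(...).
        static <T> List<CompletableFuture<T>> submitAll(
                ExecutorService pool,
                Function<Map<String, Object>, T> blockingInfer,
                List<Map<String, Object>> requests) {
            return requests.stream()
                    .map(in -> CompletableFuture.supplyAsync(() -> blockingInfer.apply(in), pool))
                    .toList();
        }

        public static void main(String[] args) {
            ExecutorService pool = Executors.newFixedThreadPool(4);
            // Stand-in for a real inference call: returns the number of input features
            Function<Map<String, Object>, Integer> fakeInfer = Map::size;
            List<CompletableFuture<Integer>> futures = submitAll(pool, fakeInfer,
                    List.of(Map.of("feature1", 0.5), Map.of("feature1", 0.5, "feature2", "text")));
            System.out.println(futures.get(0).join() + "," + futures.get(1).join());
            pool.shutdown();
        }
    }
    ```

    Because each future wraps an independent blocking call, total throughput scales with the pool size rather than being bounded by one request at a time.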

    Thread Safety:

    OkHttpClient is thread-safe and can be shared across threads. However, each HttpInferenceClient instance should be used from a single thread or synchronized externally.
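    One hedged way to follow this rule when a single instance must be shared is to serialize access externally with a lock. The wrapper below is purely illustrative: `infer` stands in for `HttpInferenceClient.infer(Map)`, and the counter represents per-instance mutable state the lock protects.

    ```java
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantLock;

    public class SynchronizedClientSketch {
        // Illustrative wrapper: serializes access to a shared client instance,
        // as the thread-safety note requires. `infer` stands in for
        // HttpInferenceClient.infer(Map); `calls` represents per-instance
        // mutable state that the lock protects.
        private final ReentrantLock lock = new ReentrantLock();
        private int calls;

        Object infer(Map<String, Object> inputs) {
            lock.lock();
            try {
                calls++;
                return inputs.size(); // stand-in result
            } finally {
                lock.unlock();
            }
        }

        int calls() {
            return calls;
        }

        public static void main(String[] args) {
            SynchronizedClientSketch client = new SynchronizedClientSketch();
            client.infer(Map.of("feature1", 0.5));
            System.out.println(client.calls());
        }
    }
    ```

    The simpler alternative, per the note above, is one client instance per thread, which needs no locking at all.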

    Since:
    1.0.0
    Author:
    Nestor Martourez, Sr Software and Data Streaming Engineer @ CodedStreams
    See Also:
    RemoteInferenceEngine, OkHttpClient, ObjectMapper
    • Constructor Detail

      • HttpInferenceClient

        public HttpInferenceClient()
    • Method Detail

      • initialize

        public void initialize(ModelConfig config)
                        throws InferenceException
        Initializes the HTTP inference client with connection configuration.

        Configures OkHttpClient with timeout settings from model options:

        • connectTimeout: Connection establishment timeout (default: 30s)
        • readTimeout: Response read timeout (default: 30s)
        • writeTimeout: Request write timeout (default: 30s)

        Also initializes ObjectMapper for JSON serialization.
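        Given the option names and 30-second defaults listed above, the option parsing might look like the following sketch. The helper name and the assumption that values are whole seconds encoded as strings (as in the configuration example) are illustrative.

        ```java
        import java.time.Duration;
        import java.util.Map;

        public class TimeoutOptionsSketch {
            // Hypothetical helper: reads a timeout option (whole seconds as a
            // string, as in the configuration example) with the documented
            // 30-second default.
            static Duration timeoutOption(Map<String, String> options, String key) {
                return Duration.ofSeconds(Long.parseLong(options.getOrDefault(key, "30")));
            }

            public static void main(String[] args) {
                Map<String, String> options = Map.of("connectTimeout", "10");
                System.out.println(timeoutOption(options, "connectTimeout").getSeconds()); // 10
                System.out.println(timeoutOption(options, "readTimeout").getSeconds());    // 30 (default)
            }
        }
        ```

        The resulting Durations would then be handed to OkHttpClient.Builder via its connectTimeout, readTimeout, and writeTimeout methods.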

        Specified by:
        initialize in interface InferenceEngine<Void>
        Overrides:
        initialize in class RemoteInferenceEngine
        Parameters:
        config - model configuration containing endpoint URL and timeout options
        Throws:
        InferenceException - if initialization fails
      • infer

        public InferenceResult infer(Map<String,Object> inputs)
                              throws InferenceException
        Sends inference request to remote HTTP endpoint.

        Request flow:

        1. Serialize inputs to JSON using ObjectMapper
        2. Create HTTP POST request with JSON body
        3. Add authentication headers from AuthConfig
        4. Execute request with configured timeouts
        5. Parse JSON response back to Map
        6. Return InferenceResult with timing information
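        The flow above can be sketched with the JDK's built-in HttpRequest (the client itself uses OkHttp and ObjectMapper; the endpoint, token, and body below reuse the example values from this page). This covers steps 2-4; step 1 would produce the JSON via ObjectMapper, and steps 5-6 would parse the response.

        ```java
        import java.net.URI;
        import java.net.http.HttpRequest;
        import java.time.Duration;

        public class InferRequestSketch {
            // Mirrors steps 2-4 using the JDK's HttpRequest: a POST with a JSON
            // body, auth headers, and a per-request timeout. Step 1 would produce
            // jsonBody via ObjectMapper; steps 5-6 would parse the response.
            static HttpRequest build(String endpoint, String bearerToken, String jsonBody) {
                return HttpRequest.newBuilder(URI.create(endpoint))
                        .timeout(Duration.ofSeconds(30)) // documented 30s default
                        .header("Content-Type", "application/json")
                        .header("Authorization", "Bearer " + bearerToken)
                        .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                        .build();
            }

            public static void main(String[] args) {
                HttpRequest request = build("https://api.modelserver.com/v1/predict",
                        "token123", "{\"feature1\":0.5}");
                System.out.println(request.method()); // POST
                System.out.println(request.headers().firstValue("Content-Type").orElse(""));
            }
        }
        ```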
        Specified by:
        infer in interface InferenceEngine<Void>
        Specified by:
        infer in class RemoteInferenceEngine
        Parameters:
        inputs - map of input names to values (must be JSON-serializable)
        Returns:
        inference result containing outputs and request timing
        Throws:
        InferenceException - if HTTP request fails, times out, or response parsing fails
      • validateConnection

        public boolean validateConnection()
                                   throws InferenceException
        Validates connection to remote endpoint using HTTP HEAD request.

        Sends a lightweight HEAD request to verify:

        • Endpoint is reachable
        • Endpoint responds to HTTP requests
        • Authentication works (if configured)
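        The documented success criterion (any 2xx status) reduces to a simple range check, sketched here independently of any HTTP library:

        ```java
        public class HeadCheckSketch {
            // The documented success criterion: any 2xx status from the HEAD request.
            static boolean isSuccess(int statusCode) {
                return statusCode >= 200 && statusCode < 300;
            }

            public static void main(String[] args) {
                System.out.println(isSuccess(204)); // true
                System.out.println(isSuccess(404)); // false
            }
        }
        ```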
        Specified by:
        validateConnection in class RemoteInferenceEngine
        Returns:
        true if HEAD request succeeds (2xx status)
        Throws:
        InferenceException - if validation request fails (network error, timeout)
      • getMetadata

        public ModelMetadata getMetadata()
        Gets metadata about the remote model.
        Returns:
        model metadata (currently returns null; subclasses may override to provide an implementation)
      • getModelConfig

        public ModelConfig getModelConfig()
        Gets the model configuration.
        Returns:
        the model configuration used for initialization