Class InferenceSession


  • public class InferenceSession
    extends Object
    Wrapper class for ONNX Runtime sessions providing simplified access to inference capabilities.

    This class encapsulates the ONNX Runtime OrtSession and OrtEnvironment to provide a cleaner API for loading and managing ONNX models. It supports loading models from both file paths and byte arrays, and provides access to model metadata.

    Key Responsibilities:

    • Manage ONNX Runtime environment and session lifecycle
    • Provide access to model input/output metadata
    • Handle resource cleanup through the close() method
    • Support multiple model loading strategies

    Usage Example:

    
     // Load from file
     InferenceSession session = new InferenceSession(
         "model.onnx",
         new OrtSession.SessionOptions(),
         OrtEnvironment.getEnvironment()
     );
     try {
         // Get input metadata
         Map<String, NodeInfo> inputs = session.getInputMetadata();

         // Use the underlying session for inference
         OrtSession ortSession = session.getSession();
     } finally {
         // Clean up, even if one of the calls above throws
         session.close();
     }
     

    Thread Safety:

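    ONNX Runtime permits concurrent run() calls on a single OrtSession, but close() must not overlap with an in-flight inference call. For multi-threaded scenarios, share one InferenceSession and close it only after all inference has finished, create a separate InferenceSession per thread, or synchronize the calls made on the session returned by getSession().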
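A conservative way to share one session between threads is to funnel every call through a single lock, which also keeps close() from racing with a call that is still running. The sketch below is a minimal illustration with a generic stand-in resource, so it compiles without ONNX Runtime on the classpath; GuardedResource and Action are hypothetical names and are not part of this API.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper (not part of InferenceSession): one lock guards use and close. */
final class GuardedResource<R extends AutoCloseable> implements AutoCloseable {

    /** Like java.util.function.Function, but allowed to throw checked exceptions. */
    interface Action<S, T> {
        T apply(S resource) throws Exception;
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final R resource;
    private boolean closed; // only read or written while holding the lock

    GuardedResource(R resource) {
        this.resource = resource;
    }

    /** Runs the action while holding the lock; rejects calls made after close(). */
    <T> T use(Action<R, T> action) throws Exception {
        lock.lock();
        try {
            if (closed) {
                throw new IllegalStateException("resource is closed");
            }
            return action.apply(resource);
        } finally {
            lock.unlock();
        }
    }

    /** Closes the wrapped resource once; later calls are no-ops. */
    @Override
    public void close() throws Exception {
        lock.lock();
        try {
            if (!closed) {
                closed = true;
                resource.close();
            }
        } finally {
            lock.unlock();
        }
    }
}
```

Assuming InferenceSession implements AutoCloseable, a GuardedResource<InferenceSession> could wrap it, with each inference call written as guarded.use(s -> s.getSession().run(inputs)). Serializing all calls trades throughput for simplicity, so it suits low-contention workloads best.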

    Resource Management:

    Always call close() when finished with the session to release native resources. The try-with-resources pattern does this automatically, even when an exception is thrown inside the block:

    
     try (InferenceSession session = new InferenceSession(...)) {
         // Use session
     }
     
    Since:
    1.0.0
    Author:
    Nestor Martourez, Sr Software and Data Streaming Engineer @ CodedStreams
    See Also:
    OrtSession, OrtEnvironment, OnnxInferenceEngine
    • Constructor Detail

      • InferenceSession

        public InferenceSession(String modelPath,
                                ai.onnxruntime.OrtSession.SessionOptions options,
                                ai.onnxruntime.OrtEnvironment environment)
                         throws Exception
        Loads an ONNX model from a file path.
        Parameters:
        modelPath - path to the ONNX model file
        options - session configuration options
        environment - ONNX Runtime environment
        Throws:
        Exception - if model loading fails
      • InferenceSession

        public InferenceSession(byte[] modelBytes,
                                ai.onnxruntime.OrtSession.SessionOptions options,
                                ai.onnxruntime.OrtEnvironment environment)
                         throws Exception
        Loads an ONNX model from a byte array.

        Useful for loading models from memory or network streams.

        Parameters:
        modelBytes - byte array containing the ONNX model
        options - session configuration options
        environment - ONNX Runtime environment
        Throws:
        Exception - if model loading fails
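When the model arrives as a stream (a classpath resource or a network download, for example), it first has to be read fully into memory before it can be passed to this constructor. A minimal sketch using only the JDK follows; ModelBytes and readFully are hypothetical names and are not part of this API.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of the documented API. */
final class ModelBytes {
    private ModelBytes() {}

    /** Reads the whole stream into a byte array and closes it. Requires Java 9+. */
    static byte[] readFully(InputStream in) throws IOException {
        try (InputStream stream = in) {
            return stream.readAllBytes();
        }
    }
}
```

The returned array can then be passed as the modelBytes argument. Note that the whole model is held in the Java heap while the session is created, which matters for large models.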
    • Method Detail

      • getInputMetadata

        public Map<String, ai.onnxruntime.NodeInfo> getInputMetadata()
                         throws ai.onnxruntime.OrtException
        Gets metadata about model inputs.
        Returns:
        map of input names to NodeInfo describing input tensors
        Throws:
        ai.onnxruntime.OrtException - if metadata retrieval fails
      • getOutputMetadata

        public Map<String, ai.onnxruntime.NodeInfo> getOutputMetadata()
                         throws ai.onnxruntime.OrtException
        Gets metadata about model outputs.
        Returns:
        map of output names to NodeInfo describing output tensors
        Throws:
        ai.onnxruntime.OrtException - if metadata retrieval fails
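For tensor inputs and outputs, the value returned by NodeInfo.getInfo() is a TensorInfo, whose getShape() method returns a long[] in which ONNX Runtime reports dynamic dimensions as -1. The sketch below shows one way to make such a shape readable in logs and to check whether it is fully fixed. It works on a plain long[] so that it runs without ONNX Runtime on the classpath; Shapes, describe, and isStatic are hypothetical names and are not part of this API.

```java
import java.util.StringJoiner;

/** Hypothetical helper, not part of the documented API. */
final class Shapes {
    private Shapes() {}

    /** Formats a shape for logging, printing negative (dynamic) dimensions as "?". */
    static String describe(long[] shape) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (long dim : shape) {
            joiner.add(dim < 0 ? "?" : Long.toString(dim));
        }
        return joiner.toString();
    }

    /** Returns true when no dimension is dynamic, so buffers can be sized up front. */
    static boolean isStatic(long[] shape) {
        for (long dim : shape) {
            if (dim < 0) {
                return false;
            }
        }
        return true;
    }
}
```

For a shape of {-1, 3, 224, 224}, describe returns "[?, 3, 224, 224]" and isStatic returns false.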
      • getSession

        public ai.onnxruntime.OrtSession getSession()
        Gets the underlying ONNX Runtime session.

        Provides direct access to the ONNX Runtime API for advanced use cases.

        Returns:
        the OrtSession instance
      • close

        public void close()
        Closes the session and releases native resources.

        This method is idempotent and can be called multiple times. Always call this method when finished with the session to prevent native memory leaks.
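An idempotent close() of this kind is commonly built around an atomic flag, so that the native release runs exactly once however many times close() is called. The sketch below illustrates the pattern under that assumption; it is not the actual source of this class. NativeHandleOwner is a hypothetical stand-in, and the Runnable stands in for the native release call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a class that owns a native handle. */
final class NativeHandleOwner implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final Runnable releaseNative;

    NativeHandleOwner(Runnable releaseNative) {
        this.releaseNative = releaseNative;
    }

    boolean isClosed() {
        return closed.get();
    }

    @Override
    public void close() {
        // compareAndSet succeeds for exactly one caller, so the release runs once.
        if (closed.compareAndSet(false, true)) {
            releaseNative.run();
        }
    }
}
```

With this pattern, an explicit close() after a try-with-resources block is harmless, because the second call finds the flag already set and returns without touching the native resource.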