Class FraudDetectionExample


  • public class FraudDetectionExample
    extends Object
    Example demonstrating real-time fraud detection using OtterStream's ML inference capabilities.

    This example shows how to integrate machine learning models into Apache Flink streaming pipelines for real-time fraud detection on financial transactions. It demonstrates:

    • Integration of ONNX models with Flink's async I/O
    • Real-time inference on streaming transaction data
    • Risk classification based on model predictions
    • Configuration of inference parameters for production use

    Pipeline Architecture:

     Transaction Source → Async Inference → Risk Classification → Output
          ↓                     ↓                  ↓
       Synthetic    Fraud Detection     🚨/⚠️/✅ Labels
       Transactions    Model
     

    Key Features Demonstrated:

    1. Async Model Inference: Non-blocking ML inference using AsyncModelInferenceFunction
    2. ONNX Integration: Loading and running ONNX models via OnnxInferenceEngine
    3. Stream Processing: Real-time processing with configurable timeout and retry logic
    4. Risk Categorization: Three-tier risk classification (HIGH/MEDIUM/LOW)

    Transaction Features:

    The synthetic transaction generator creates realistic transaction data with features including:

    • Transaction amount (0-1000)
    • Time of day (0-23 hours)
    • Geographic location (10 regions)
    • Merchant category (20 categories)
    • Customer history (chargebacks, account age)
    • Device type (5 device categories)

    Running the Example:

    
     // 1. Place your ONNX model at: models/fraud_detection.onnx
     // 2. Run the example:
     mvn exec:java -Dexec.mainClass="com.codedstreams.otterstream.examples.FraudDetectionExample"
     

    Expected Output:

     🚨 HIGH RISK - Transaction: txn_42 - Probability: 0.95
     ✅ LOW RISK - Transaction: txn_43 - Probability: 0.23
     ⚠️ MEDIUM RISK - Transaction: txn_44 - Probability: 0.78
     

    Performance Considerations:

    • Async I/O: Uses Flink's async operators to prevent blocking on model inference
    • Batch Size: Configurable batch processing (default: 32) for throughput optimization
    • Timeout: 5-second inference timeout prevents pipeline stalls
    • Retries: Automatic retry (3 attempts) for transient failures
    Since:
    1.0.0
    Author:
    Nestor Martourez, Sr Software and Data Streaming Engineer @ CodedStreams
    See Also:
    AsyncModelInferenceFunction, OnnxInferenceEngine, InferenceConfig
    • Constructor Detail

      • FraudDetectionExample

        public FraudDetectionExample()
    • Method Detail

      • main

        public static void main​(String[] args)
                         throws Exception
        Main entry point for the fraud detection pipeline.

        Sets up and executes the complete streaming pipeline with ML inference.

        Parameters:
        args - command line arguments (not used)
        Throws:
        Exception - if pipeline execution fails