EventHub Acquisition Service

Spring Boot + Apache Camel project skeleton for acquiring normalized EventHub point events from multiple providers/sources.

The current version intentionally focuses on acquisition. Final canonical storage and deduplication are out of scope for now and will be decided later. The included PostgreSQL schema is a small acquisition-stage store, provided so that the project can be run and tested end-to-end.

Architecture

source-specific Camel input route
    -> source-specific mapper
    -> EventHubEventDto
    -> common EventHub acquisition route
    -> validation
    -> package-key creation from tenant + EventSource + event family + date/window
    -> aggregation / batching
    -> chronological sorting inside the batch
    -> acquisition package handoff

Namespace

at.procon.eventhub

Main model decisions

1. One event = one time point

EventHubEventDto has exactly one timestamp:

occurredAt

There is no generic duration, endTime, validFrom, or validTo. If a source row represents an interval, the mapper may emit separate point events, for example one DRIVE event with lifecycle START and one DRIVE event with lifecycle END.
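
A minimal sketch of how a mapper might split an interval row into point events. The names ActivityInterval, PointEvent and IntervalSplitter are hypothetical and not part of the project. The ":start" suffix follows the externalSourceEventId examples in this README; the ":end" suffix is an assumption.

```java
import java.time.OffsetDateTime;
import java.util.List;

// Hypothetical source row: an interval as a provider might deliver it.
record ActivityInterval(String sourceRowId, String activityType,
                        OffsetDateTime start, OffsetDateTime end) {}

// Simplified stand-in for EventHubEventDto: exactly one timestamp.
record PointEvent(String externalSourceEventId, String eventType,
                  String lifecycle, OffsetDateTime occurredAt) {}

final class IntervalSplitter {
    private IntervalSplitter() {}

    /** Emits a START event, plus an END event when the interval is already closed. */
    static List<PointEvent> split(ActivityInterval row) {
        PointEvent start = new PointEvent(row.sourceRowId() + ":start",
                row.activityType(), "START", row.start());
        if (row.end() == null) {
            // Open interval: the END event can only be emitted once the end is known.
            return List.of(start);
        }
        PointEvent end = new PointEvent(row.sourceRowId() + ":end",
                row.activityType(), "END", row.end());
        return List.of(start, end);
    }
}
```

Each emitted event keeps its own stable externalSourceEventId, so a re-import of the same source row yields the same two identifiers.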

2. Tenant is package-level

tenantKey identifies the owner/client/account for the package. It is required for acquisition grouping and future master-data resolution.

{
  "tenantKey": "kralowetz"
}

Organisation is not mandatory in the incoming event. It can later be derived from resolved driver/vehicle + occurredAt.

3. EventSource replaces sourceTable/sourceSystem

The acquisition context is represented by EventSourceDto:

{
  "providerKey": "TACHOGRAPH",
  "sourceKind": "VEHICLE_UNIT",
  "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
  "sourceInstanceKey": "main-tachograph-db",
  "tenantProviderSettingKey": "kralowetz-tachograph-prod",
  "externalFleetKey": null
}

Examples (providerKey / sourceKind, optionally followed by sourceKey):

TACHOGRAPH / VEHICLE_UNIT
TACHOGRAPH / DRIVER_CARD
YELLOWFOX / TELEMATICS_PLATFORM / YELLOWFOX_D8
FLEETBOARD / TELEMATICS_PLATFORM / FLEETBOARD_POSITION

EventSource describes the acquisition context only. It should not be part of the canonical real-world event identity, because a VU event and a driver-card event may describe the same real event.

4. Source-side master references, no incoming internal IDs

The incoming DTO does not require an internal driverId or vehicleId, because in normal ingestion those IDs are not known yet.

Driver reference:

"driverRef": {
  "sourceEntityId": "driver-100",
  "driverCard": {
    "nation": "AT",
    "number": "D123456789"
  }
}

Vehicle reference:

"vehicleRef": {
  "sourceEntityId": "vehicle-200",
  "vin": "WDB9634031L123456",
  "vehicleRegistration": {
    "nation": "AT",
    "number": "W-12345"
  }
}

VIN is optional. Events that come only from a driver card may carry just the nation-scoped vehicle registration number (VRN):

"vehicleRef": {
  "sourceEntityId": null,
  "vin": null,
  "vehicleRegistration": {
    "nation": "AT",
    "number": "W-12345"
  }
}

This allows late resolution once VU or master data connects the VRN to a VIN.
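
Late resolution could look roughly like the sketch below. LateVinResolver and its in-memory lookup map are hypothetical. A real resolver would presumably also scope the lookup by tenant and occurredAt, which this sketch leaves out for brevity.

```java
import java.util.Map;

// Simplified stand-ins for the vehicleRef structure shown above.
record VehicleRegistration(String nation, String number) {}

record VehicleRef(String sourceEntityId, String vin,
                  VehicleRegistration vehicleRegistration) {}

final class LateVinResolver {
    // Assumed lookup: registration -> VIN, filled from VU or master data.
    private final Map<VehicleRegistration, String> vinByRegistration;

    LateVinResolver(Map<VehicleRegistration, String> vinByRegistration) {
        this.vinByRegistration = vinByRegistration;
    }

    /** Returns a copy with the VIN filled in, or the same ref if nothing can be resolved. */
    VehicleRef resolve(VehicleRef ref) {
        if (ref.vin() != null || ref.vehicleRegistration() == null) {
            return ref; // already resolved, or nothing to look up
        }
        String vin = vinByRegistration.get(ref.vehicleRegistration());
        if (vin == null) {
            return ref; // stays VRN-only until matching data arrives
        }
        return new VehicleRef(ref.sourceEntityId(), vin, ref.vehicleRegistration());
    }
}
```

Records provide value-based equals and hashCode, which is what makes VehicleRegistration usable as a map key here.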

5. Generic normalized eventDetails

Reusable event-specific properties are stored in:

"eventDetails": {
  "type": "DRIVER_ACTIVITY",
  "attributes": {
    "cardSlot": "DRIVER",
    "cardStatus": "INSERTED",
    "drivingStatus": "SINGLE"
  }
}

Raw provider values stay in payload:

"payload": {
  "raw": {
    "cardSlot": 0,
    "cardStatus": 0,
    "drivingStatus": 0
  }
}

This keeps the acquisition DTO generic while preserving meaningful normalized fields.
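
As an illustration, a source-specific mapper could translate the raw codes into the normalized attributes like this. DriverActivityNormalizer is a hypothetical name. The labels for code 0 are taken from the example above; the labels for code 1 (CO_DRIVER, NOT_INSERTED, CREW) and the UNKNOWN fallback are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DriverActivityNormalizer {
    private DriverActivityNormalizer() {}

    /** Builds the normalized eventDetails.attributes map from raw provider codes. */
    static Map<String, String> attributes(int cardSlot, int cardStatus, int drivingStatus) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("cardSlot", label(cardSlot, "DRIVER", "CO_DRIVER"));
        attributes.put("cardStatus", label(cardStatus, "INSERTED", "NOT_INSERTED"));
        attributes.put("drivingStatus", label(drivingStatus, "SINGLE", "CREW"));
        return attributes;
    }

    // Only codes 0 and 1 are mapped; anything else stays visible as UNKNOWN
    // while the original value remains available in payload.raw.
    private static String label(int code, String zero, String one) {
        if (code == 0) {
            return zero;
        }
        if (code == 1) {
            return one;
        }
        return "UNKNOWN";
    }
}
```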

Package-level acquisition request

For external/manual ingestion, the preferred request shape is:

{
  "package": {
    "tenantKey": "kralowetz",
    "eventSource": {
      "providerKey": "TACHOGRAPH",
      "sourceKind": "VEHICLE_UNIT",
      "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod"
    },
    "eventFamily": "DRIVER_ACTIVITY",
    "businessDate": "2026-04-28",
    "requestedFrom": "2026-04-28T00:00:00+02:00",
    "requestedTo": "2026-04-29T00:00:00+02:00",
    "externalPackageId": "TACHOGRAPH:VEHICLE_UNIT:DRIVER_ACTIVITY:2026-04-28"
  },
  "events": [
    {
      "externalSourceEventId": "TACHOGRAPH:VEHICLE_UNIT:activity:456:start",
      "driverRef": {
        "sourceEntityId": "driver-100",
        "driverCard": {
          "nation": "AT",
          "number": "D123456789"
        }
      },
      "vehicleRef": {
        "sourceEntityId": "vehicle-200",
        "vin": "WDB9634031L123456",
        "vehicleRegistration": {
          "nation": "AT",
          "number": "W-12345"
        }
      },
      "occurredAt": "2026-04-28T08:00:00+02:00",
      "eventDomain": "DRIVER_ACTIVITY",
      "eventType": "DRIVE",
      "lifecycle": "START",
      "eventDetails": {
        "type": "DRIVER_ACTIVITY",
        "attributes": {
          "cardSlot": "DRIVER",
          "cardStatus": "INSERTED",
          "drivingStatus": "SINGLE"
        }
      },
      "payload": {
        "raw": {
          "activity": 3,
          "cardSlot": 0,
          "cardStatus": 0,
          "drivingStatus": 0
        }
      }
    }
  ]
}

Routes

Source-specific input routes

direct:yellowfox-d8-booking-input
direct:telematics-position-input
direct:tachograph-activity-input
direct:eventhub-package-input
direct:eventhub-manual-input

Common route

direct:eventhub-normalized-input
    -> validate EventHubEventDto
    -> create package key from tenant + EventSource/package context
    -> seda:eventhub-batch-input
    -> aggregate by eventhub.packageKey
    -> sort by occurredAt inside the batch
    -> EventHubIngestionService.ingest(...)
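
The grouping and sorting steps of the common route can be sketched in plain Java, without Camel, as follows. NormalizedEvent and PackageBatcher are hypothetical names, and the "|"-separated key layout is an assumption; the actual eventhub.packageKey format may differ.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a validated event together with its package context.
record NormalizedEvent(String tenantKey, String providerKey, String sourceKind,
                       String sourceKey, String eventFamily, LocalDate businessDate,
                       String externalSourceEventId, OffsetDateTime occurredAt) {

    // Assumed key layout: tenant + EventSource + event family + business date.
    String packageKey() {
        return String.join("|", tenantKey, providerKey, sourceKind, sourceKey,
                eventFamily, businessDate.toString());
    }
}

final class PackageBatcher {
    private PackageBatcher() {}

    /** Groups events by package key and sorts each batch chronologically. */
    static Map<String, List<NormalizedEvent>> batch(List<NormalizedEvent> events) {
        Map<String, List<NormalizedEvent>> batches = new LinkedHashMap<>();
        for (NormalizedEvent event : events) {
            batches.computeIfAbsent(event.packageKey(), key -> new ArrayList<>()).add(event);
        }
        // Compare as instants so events with different UTC offsets order correctly.
        Comparator<NormalizedEvent> chronological =
                Comparator.comparing(event -> event.occurredAt().toInstant());
        batches.values().forEach(batch -> batch.sort(chronological));
        return batches;
    }
}
```

In the route itself the grouping is done by the Camel aggregator correlated on eventhub.packageKey; the sketch only shows the intended result of one completed aggregation.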

REST endpoints

POST /api/eventhub/acquisition/yellowfox/d8-bookings
POST /api/eventhub/acquisition/telematics/positions
POST /api/eventhub/acquisition/tachograph/activities
POST /api/eventhub/acquisition/packages
POST /api/eventhub/acquisition/events

Example: tachograph driver-card activity with VRN only

curl -X POST http://localhost:8080/api/eventhub/acquisition/tachograph/activities \
  -H "Content-Type: application/json" \
  -d '[
    {
      "tenantKey": "kralowetz",
      "sourceKind": "DRIVER_CARD",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod",
      "externalSourceEventId": "TACHOGRAPH:DRIVER_CARD:activity:789:start",
      "driverRef": {
        "sourceEntityId": "driver-100",
        "driverCard": {
          "nation": "AT",
          "number": "D123456789"
        }
      },
      "vehicleRef": {
        "sourceEntityId": null,
        "vin": null,
        "vehicleRegistration": {
          "nation": "AT",
          "number": "W-12345"
        }
      },
      "occurredAt": "2026-04-28T08:00:00+02:00",
      "activityType": "DRIVE",
      "lifecycle": "START",
      "cardSlot": "DRIVER",
      "cardStatus": "INSERTED",
      "drivingStatus": "SINGLE",
      "payload": {
        "raw": {
          "activity": 3,
          "cardSlot": 0,
          "cardStatus": 0,
          "drivingStatus": 0
        }
      }
    }
  ]'

The mapper creates:

Tenant = kralowetz
EventSource = TACHOGRAPH / DRIVER_CARD / TACHOGRAPH_DRIVER_CARD
EventDomain = DRIVER_ACTIVITY
EventType = DRIVE
Lifecycle = START
EventDetails.type = DRIVER_ACTIVITY
VehicleRef = VRN-only, VIN can be resolved later
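
How the tachograph mapper might derive the EventSource from the request fields is sketched below. EventSource (a simplified stand-in for EventSourceDto) and TachographEventSources are hypothetical names. Building sourceKey as providerKey + "_" + sourceKind matches the tachograph examples in this README but is an assumption; other providers such as YELLOWFOX use a different sourceKey.

```java
// Simplified stand-in for EventSourceDto (externalFleetKey omitted).
record EventSource(String providerKey, String sourceKind, String sourceKey,
                   String sourceInstanceKey, String tenantProviderSettingKey) {}

final class TachographEventSources {
    private static final String PROVIDER_KEY = "TACHOGRAPH";

    private TachographEventSources() {}

    /** Builds the acquisition context for a tachograph request. */
    static EventSource from(String sourceKind, String sourceInstanceKey,
                            String tenantProviderSettingKey) {
        // The provider is implied by the endpoint; only the source kind varies.
        String sourceKey = PROVIDER_KEY + "_" + sourceKind;
        return new EventSource(PROVIDER_KEY, sourceKind, sourceKey,
                sourceInstanceKey, tenantProviderSettingKey);
    }
}
```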

Start PostgreSQL

docker compose up -d

Run the service

mvn spring-boot:run

Check acquisition packages

select p.received_at,
       p.tenant_key,
       s.provider_key,
       s.source_kind,
       s.source_key,
       p.event_family,
       p.business_date,
       p.status,
       p.event_count
from eventhub.data_package p
join eventhub.event_source s on s.id = p.event_source_id
order by p.received_at desc;

Check acquired events

select occurred_at,
       driver_source_entity_id,
       driver_card_nation,
       driver_card_number,
       vehicle_source_entity_id,
       vehicle_vin,
       vehicle_registration_nation,
       vehicle_registration_number,
       event_domain,
       event_type,
       lifecycle,
       event_details,
       payload
from eventhub.acquired_event
order by occurred_at desc;

Next implementation steps

  1. Add source-specific SQL extraction routes for the tachograph DB event families:
    • activities from CardActivity/VUActivity
    • card insert/withdraw from CardVehiclesUsed/IWCycle
    • positions from places/GNSS/border/load-unload sources
    • border crossings
    • load/unload
    • specific conditions: out-of-scope and ferry/train
    • speeding events
  2. Keep each extractor package-scoped by tenant + EventSource + eventFamily + businessDate/import window.
  3. Add master-data resolution later:
    • driver by tenant + driver card nation/number + occurredAt
    • vehicle by tenant + VIN or tenant + registration nation/number + occurredAt
    • late resolution from VRN-only driver-card events to VIN after VU/master data import
  4. Discuss final storage model:
    • canonical eventhub.event
    • source-record table linked to EventSource/package
    • deduplication policy for VU vs driver-card duplicates