# EventHub Acquisition Service

Spring Boot + Apache Camel project skeleton for acquiring normalized EventHub point events from multiple providers/sources.

The current version intentionally focuses on **acquisition**. It stores source records as imported and does not merge or deduplicate equivalent events from different providers/sources. It does, however, keep a non-unique `eventSignatureHash` as a later merge/gap-filling hint. Later query/read models can merge sources when a preferred/main source contains gaps.

The included PostgreSQL schema is a small acquisition-stage store so the project can be run and tested end-to-end.

## Architecture

```text
source-specific Camel input route
  -> source-specific mapper
  -> EventHubEventDto
  -> common EventHub acquisition route
  -> validation
  -> package-key creation from tenant + EventSource + source group + import scope + event family
  -> aggregation / batching
  -> chronological sorting inside the batch
  -> acquisition package handoff
```

## Namespace

```text
at.procon.eventhub
```

## Main model decisions

### 1. One event = one time point

`EventHubEventDto` has exactly one timestamp:

```text
occurredAt
```

There is no generic `duration`, `endTime`, `validFrom`, or `validTo`. If a source row represents an interval, the mapper may emit separate point events, for example `DRIVE START` and `DRIVE END`.

### 2. Tenant is package-level

`tenantKey` identifies the owner/client/account for the package. It is required for acquisition grouping and future master-data resolution.

### 3. EventSource identifies the technical source

```json
{
  "providerKey": "TACHOGRAPH",
  "sourceKind": "VEHICLE_UNIT",
  "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
  "sourceInstanceKey": "main-tachograph-db",
  "tenantProviderSettingKey": "kralowetz-tachograph-prod",
  "externalFleetKey": null
}
```

Examples:

```text
TACHOGRAPH / VEHICLE_UNIT
TACHOGRAPH / DRIVER_CARD
YELLOWFOX / TELEMATICS_PLATFORM / YELLOWFOX_D8
FLEETBOARD / TELEMATICS_PLATFORM / FLEETBOARD_POSITION
```

`EventSource` is acquisition context. A VU event, a driver-card event and a YellowFox D8 event may describe the same real-world event, but this acquisition service keeps them as separate acquired source records. Cross-source merging/gap filling is intentionally left for a later query/read model.

### 4. No cross-source deduplication during acquisition

The acquisition layer stores every source record independently. It uses `sourceRecordKeyHash` only for idempotency of the same source event, so the same input package can be retried safely.

It does **not** deduplicate VU vs driver-card vs YellowFox records. This is intentional because later queries may need to combine sources: for example, use tachograph data as the main source, but fill gaps from YellowFox or another provider.

The acquisition table also stores a non-unique `eventSignatureHash`. This is a semantic merge hint, not a unique key. It intentionally excludes `EventSource` and `externalSourceEventId`, so VU, driver-card and YellowFox records that look like the same real-world event can share a signature while still being stored separately. Later query/projection logic can use this signature for source comparison, gap filling, and merged timelines. The signature prefers nation-scoped driver card and vehicle registration when available, then VIN or source entity id as fallback, so it remains useful before final master-data resolution.
Therefore the current model preserves:

```text
tenantKey
eventSource
sourceGroup
importScope
externalSourceEventId
source-side driver/vehicle references
eventDetails
payload
```

### 5. SourceGroup captures tachograph organisation or YellowFox fleet

`sourceGroup` is package-level source grouping information.

For tachograph it can be a source organisation:

```json
"sourceGroup": {
  "type": "ORGANISATION",
  "sourceEntityId": "147",
  "code": "147",
  "name": "Kralowetz"
}
```

For YellowFox it can be a fleet:

```json
"sourceGroup": {
  "type": "FLEET",
  "sourceEntityId": "7",
  "code": "7",
  "name": "YellowFox Fleet 7"
}
```

The YellowFox fleet belongs to the same tenant/customer, but it is not forced to be an organisation. It can later be mapped to a tenant organisation if needed.

### 6. ImportScope captures organisation and time filtering

`importScope` describes what was selected from the source system.

Full DB import:

```json
"importScope": {
  "type": "TENANT_ALL",
  "rootSourceOrganisation": null,
  "includeChildren": false,
  "occurredFrom": null,
  "occurredTo": null
}
```

Organisation subtree + time-window import:

```json
"importScope": {
  "type": "SOURCE_ORGANISATION_SUBTREE",
  "rootSourceOrganisation": {
    "type": "ORGANISATION",
    "sourceEntityId": "147",
    "code": "147",
    "name": "Kralowetz"
  },
  "includeChildren": true,
  "occurredFrom": "2026-04-28T00:00:00+02:00",
  "occurredTo": "2026-04-29T00:00:00+02:00"
}
```

`occurredFrom` is inclusive and `occurredTo` is exclusive. Both may be `null` for complete source DB import.

### 7. Source-side master references, no incoming internal IDs

The incoming DTO does not require internal `driverId` or `vehicleId`, because in normal ingestion those ids are not known yet.
Driver reference with nation-scoped driver card:

```json
"driverRef": {
  "sourceEntityId": "driver-100",
  "driverCard": {
    "nation": "AT",
    "number": "D123456789"
  },
  "sourceOrganisation": {
    "type": "ORGANISATION",
    "sourceEntityId": "57",
    "code": "57",
    "name": "Sub Org 57"
  }
}
```

Vehicle reference with optional VIN and nation-scoped VRN:

```json
"vehicleRef": {
  "sourceEntityId": "vehicle-200",
  "vin": "WDB9634031L123456",
  "vehicleRegistration": {
    "nation": "AT",
    "number": "W-12345"
  },
  "sourceOrganisation": {
    "type": "ORGANISATION",
    "sourceEntityId": "57",
    "code": "57",
    "name": "Sub Org 57"
  }
}
```

VIN is optional. Driver-card-only events can carry only the nation-scoped VRN/registration:

```json
"vehicleRef": {
  "sourceEntityId": null,
  "vin": null,
  "vehicleRegistration": {
    "nation": "AT",
    "number": "W-12345"
  },
  "sourceOrganisation": null
}
```

This allows late resolution when VU/master data later connects the VRN to a VIN.

### 8. Generic normalized eventDetails

Reusable event-specific properties are stored in:

```json
"eventDetails": {
  "type": "DRIVER_ACTIVITY",
  "attributes": {
    "cardSlot": "DRIVER",
    "cardStatus": "INSERTED",
    "drivingStatus": "SINGLE"
  }
}
```

Raw provider values stay in `payload`.
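The `eventSignatureHash` fallback order from decision 4 can be illustrated with a small sketch. This is not the project's implementation: the class `EventSignatureSketch`, the simplified reference records, the set of hashed fields (tenant, instant, domain, type, lifecycle, driver key, vehicle key) and the choice of SHA-256 are all assumptions made for illustration. Only the preference order (nation-scoped card or registration first, then VIN or source entity id) and the exclusion of `EventSource` and `externalSourceEventId` come from the text above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.OffsetDateTime;
import java.util.HexFormat;

/** Hypothetical helper; names and hashed fields are illustrative assumptions. */
final class EventSignatureSketch {

    /** Simplified stand-ins for the source-side references described above. */
    record DriverRef(String sourceEntityId, String cardNation, String cardNumber) {}
    record VehicleRef(String sourceEntityId, String vin, String regNation, String regNumber) {}

    /** Prefer the nation-scoped driver card, fall back to the source entity id. */
    static String driverKey(DriverRef d) {
        if (d == null) return "";
        if (d.cardNation() != null && d.cardNumber() != null) {
            return "CARD:" + d.cardNation() + ":" + d.cardNumber();
        }
        return d.sourceEntityId() != null ? "SRC:" + d.sourceEntityId() : "";
    }

    /** Prefer the nation-scoped registration, then VIN, then the source entity id. */
    static String vehicleKey(VehicleRef v) {
        if (v == null) return "";
        if (v.regNation() != null && v.regNumber() != null) {
            return "VRN:" + v.regNation() + ":" + v.regNumber();
        }
        if (v.vin() != null) return "VIN:" + v.vin();
        return v.sourceEntityId() != null ? "SRC:" + v.sourceEntityId() : "";
    }

    /** EventSource and externalSourceEventId are deliberately not part of the input. */
    static String signature(String tenantKey, OffsetDateTime occurredAt, String eventDomain,
                            String eventType, String lifecycle, DriverRef driver, VehicleRef vehicle) {
        String canonical = String.join("|",
                tenantKey,
                occurredAt.toInstant().toString(), // normalise away the UTC offset
                eventDomain,
                eventType,
                String.valueOf(lifecycle),
                driverKey(driver),
                vehicleKey(vehicle));
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(canonical.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With this ordering, a VU record carrying VIN and VRN and a driver-card record carrying only the VRN produce the same signature for the same driver, time and event type, while both rows stay separate in the acquisition store.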
## Package-level acquisition request

For external/manual ingestion, the preferred request shape is:

```json
{
  "package": {
    "tenantKey": "kralowetz",
    "eventSource": {
      "providerKey": "TACHOGRAPH",
      "sourceKind": "VEHICLE_UNIT",
      "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod"
    },
    "sourceGroup": {
      "type": "ORGANISATION",
      "sourceEntityId": "147",
      "code": "147",
      "name": "Kralowetz"
    },
    "importScope": {
      "type": "SOURCE_ORGANISATION_SUBTREE",
      "rootSourceOrganisation": {
        "type": "ORGANISATION",
        "sourceEntityId": "147",
        "code": "147",
        "name": "Kralowetz"
      },
      "includeChildren": true,
      "occurredFrom": "2026-04-28T00:00:00+02:00",
      "occurredTo": "2026-04-29T00:00:00+02:00"
    },
    "eventFamily": "DRIVER_ACTIVITY",
    "businessDate": "2026-04-28",
    "externalPackageId": "TACHOGRAPH:ORG-147-SUBTREE:DRIVER_ACTIVITY:2026-04-28"
  },
  "events": [
    {
      "externalSourceEventId": "TACHOGRAPH:VEHICLE_UNIT:activity:456:start",
      "driverRef": {
        "sourceEntityId": "driver-100",
        "driverCard": { "nation": "AT", "number": "D123456789" },
        "sourceOrganisation": { "type": "ORGANISATION", "sourceEntityId": "57" }
      },
      "vehicleRef": {
        "sourceEntityId": "vehicle-200",
        "vin": "WDB9634031L123456",
        "vehicleRegistration": { "nation": "AT", "number": "W-12345" },
        "sourceOrganisation": { "type": "ORGANISATION", "sourceEntityId": "57" }
      },
      "occurredAt": "2026-04-28T08:00:00+02:00",
      "eventDomain": "DRIVER_ACTIVITY",
      "eventType": "DRIVE",
      "lifecycle": "START",
      "eventDetails": {
        "type": "DRIVER_ACTIVITY",
        "attributes": {
          "cardSlot": "DRIVER",
          "cardStatus": "INSERTED",
          "drivingStatus": "SINGLE"
        }
      },
      "payload": {
        "raw": {
          "activity": 3,
          "cardSlot": 0,
          "cardStatus": 0,
          "drivingStatus": 0
        }
      }
    }
  ]
}
```

## Routes

### Source-specific input routes

```text
direct:yellowfox-d8-booking-input
direct:telematics-position-input
direct:tachograph-activity-input
direct:eventhub-package-input
direct:eventhub-manual-input
```

### Common route

```text
direct:eventhub-normalized-input
  -> validate EventHubEventDto
  -> create package key from tenant + EventSource + sourceGroup + importScope + eventFamily
  -> seda:eventhub-batch-input
  -> aggregate by eventhub.packageKey
  -> sort by occurredAt inside the batch
  -> EventHubIngestionService.ingest(...)
```

## REST endpoints

```text
POST /api/eventhub/acquisition/yellowfox/d8-bookings
POST /api/eventhub/acquisition/telematics/positions
POST /api/eventhub/acquisition/tachograph/activities
POST /api/eventhub/acquisition/packages
POST /api/eventhub/acquisition/events
```

## Example: tachograph driver-card activity with VRN only

```bash
curl -X POST http://localhost:8080/api/eventhub/acquisition/tachograph/activities \
  -H "Content-Type: application/json" \
  -d '[
    {
      "tenantKey": "kralowetz",
      "sourceKind": "DRIVER_CARD",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod",
      "externalSourceEventId": "TACHOGRAPH:DRIVER_CARD:activity:789:start",
      "driverRef": {
        "sourceEntityId": "driver-100",
        "driverCard": { "nation": "AT", "number": "D123456789" },
        "sourceOrganisation": { "type": "ORGANISATION", "sourceEntityId": "57" }
      },
      "vehicleRef": {
        "sourceEntityId": null,
        "vin": null,
        "vehicleRegistration": { "nation": "AT", "number": "W-12345" },
        "sourceOrganisation": null
      },
      "occurredAt": "2026-04-28T08:00:00+02:00",
      "activityType": "DRIVE",
      "lifecycle": "START",
      "cardSlot": "DRIVER",
      "cardStatus": "INSERTED",
      "drivingStatus": "SINGLE",
      "payload": {
        "raw": {
          "activity": 3,
          "cardSlot": 0,
          "cardStatus": 0,
          "drivingStatus": 0
        }
      }
    }
  ]'
```

The mapper creates a default `TENANT_ALL` one-day import scope for this convenience endpoint. For real tachograph import jobs with organisation subtree/full DB scope, use the package-level request or add dedicated SQL extraction job routes.
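How such a one-day default scope could be derived is sketched below. The source does not say how the day boundaries are chosen, so this sketch assumes they are taken from the event's own UTC offset; `ImportWindowSketch`, `ImportWindow` and `oneDayWindow` are hypothetical names. The `contains` check follows the documented rule that `occurredFrom` is inclusive, `occurredTo` is exclusive, and `null` bounds mean an unbounded import.

```java
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper illustrating the half-open import window. */
final class ImportWindowSketch {

    /** Time part of an import scope: [occurredFrom, occurredTo), null means unbounded. */
    record ImportWindow(OffsetDateTime occurredFrom, OffsetDateTime occurredTo) {

        boolean contains(OffsetDateTime occurredAt) {
            boolean notBeforeStart = occurredFrom == null || !occurredAt.isBefore(occurredFrom);
            boolean beforeEnd = occurredTo == null || occurredAt.isBefore(occurredTo);
            return notBeforeStart && beforeEnd;
        }
    }

    /** Assumption: the day is cut at local midnight in the event's own offset. */
    static ImportWindow oneDayWindow(OffsetDateTime occurredAt) {
        OffsetDateTime from = occurredAt.truncatedTo(ChronoUnit.DAYS);
        return new ImportWindow(from, from.plusDays(1));
    }
}
```

For the event in the request above, `oneDayWindow` yields the window from `2026-04-28T00:00:00+02:00` (inclusive) to `2026-04-29T00:00:00+02:00` (exclusive), which matches the bounds used in the package-level example.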
## Example: full tachograph DB import package

```json
{
  "package": {
    "tenantKey": "kralowetz",
    "eventSource": {
      "providerKey": "TACHOGRAPH",
      "sourceKind": "VEHICLE_UNIT",
      "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod"
    },
    "sourceGroup": null,
    "importScope": {
      "type": "TENANT_ALL",
      "rootSourceOrganisation": null,
      "includeChildren": false,
      "occurredFrom": null,
      "occurredTo": null
    },
    "eventFamily": "DRIVER_ACTIVITY",
    "businessDate": null,
    "externalPackageId": "TACHOGRAPH:ALL:DRIVER_ACTIVITY:FULL"
  },
  "events": []
}
```

## Start PostgreSQL

```bash
docker compose up -d
```

## Run the service

```bash
mvn spring-boot:run
```

## Check acquisition packages

```sql
select p.received_at,
       p.tenant_key,
       s.provider_key,
       s.source_kind,
       s.source_key,
       p.source_group_type,
       p.source_group_entity_id,
       p.import_scope_type,
       p.root_source_org_entity_id,
       p.occurred_from,
       p.occurred_to,
       p.event_family,
       p.business_date,
       p.status,
       p.event_count
from eventhub.data_package p
join eventhub.event_source s on s.id = p.event_source_id
order by p.received_at desc;
```

## Check acquired events

```sql
select occurred_at,
       driver_source_entity_id,
       driver_card_nation,
       driver_card_number,
       driver_source_org_entity_id,
       vehicle_source_entity_id,
       vehicle_vin,
       vehicle_registration_nation,
       vehicle_registration_number,
       vehicle_source_org_entity_id,
       event_domain,
       event_type,
       lifecycle,
       event_details,
       payload
from eventhub.acquired_event
order by occurred_at desc;
```

## Next implementation steps

1. Add source-specific SQL extraction routes for the tachograph DB event families:
   - activities from CardActivity/VUActivity
   - card insert/withdraw from CardVehiclesUsed/IWCycle
   - positions from places/GNSS/border/load-unload sources
   - border crossings
   - load/unload
   - specific conditions: out-of-scope and ferry/train
   - speeding events

2. Each SQL extraction route should accept `ImportScopeDto`:
   - optional source organisation root + include children
   - optional occurredFrom/occurredTo
   - null time bounds mean complete DB/history import

3. Add master-data resolution later:
   - driver by tenant + driver card nation/number + occurredAt
   - vehicle by tenant + VIN or tenant + registration nation/number + occurredAt
   - late resolution from VRN-only driver-card events to VIN after VU/master data import

4. Discuss query/read models later:
   - how to merge acquired events from all sources at query time
   - source priority per event family when the main source contains gaps
   - how to expose source provenance when multiple sources describe the same real-world event
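As a starting point for the query/read-model discussion in step 4, the sketch below shows one possible way to merge acquired events at query time: group records by `eventSignatureHash` and keep the record from the highest-priority source, so lower-priority sources only fill gaps. Nothing here exists in the project yet. `SourcePrioritySketch`, the reduced `AcquiredEvent` record and the idea of expressing priority as an ordered list of `sourceKey` values are assumptions for illustration.

```java
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical query-time merge; not part of the acquisition service. */
final class SourcePrioritySketch {

    /** Reduced view of an acquired event, just enough for the merge. */
    record AcquiredEvent(String eventSignatureHash, String sourceKey, OffsetDateTime occurredAt) {}

    /**
     * Keeps one record per signature, taken from the source that appears
     * earliest in sourcePriority, and returns the result in chronological order.
     */
    static List<AcquiredEvent> mergeBySignature(List<AcquiredEvent> events, List<String> sourcePriority) {
        Map<String, AcquiredEvent> best = new LinkedHashMap<>();
        for (AcquiredEvent e : events) {
            best.merge(e.eventSignatureHash(), e, (current, candidate) ->
                    rank(candidate, sourcePriority) < rank(current, sourcePriority) ? candidate : current);
        }
        List<AcquiredEvent> merged = new ArrayList<>(best.values());
        merged.sort(Comparator.comparing(AcquiredEvent::occurredAt));
        return merged;
    }

    /** Sources missing from the priority list rank last. */
    private static int rank(AcquiredEvent e, List<String> sourcePriority) {
        int index = sourcePriority.indexOf(e.sourceKey());
        return index < 0 ? Integer.MAX_VALUE : index;
    }
}
```

Because the signature is only a merge hint and not a unique key, a real read model would probably also need to handle near-matches (for example small timestamp differences between providers) and expose which source each merged record came from.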