EventHub Acquisition Service
Spring Boot + Apache Camel project skeleton for acquiring normalized EventHub point events from multiple providers/sources.
The current version intentionally focuses on acquisition. It stores source records as imported and does not merge or deduplicate equivalent events from different providers/sources. It does, however, keep a non-unique eventSignatureHash as a later merge/gap-filling hint. Later query/read models can merge sources when a preferred/main source contains gaps. The included PostgreSQL schema is a small acquisition-stage store so the project can be run and tested end-to-end.
Architecture
source-specific Camel input route
-> source-specific mapper
-> EventHubEventDto
-> common EventHub acquisition route
-> validation
-> package-key creation from tenant + EventSource + source group + import scope + event family
-> aggregation / batching
-> chronological sorting inside the batch
-> acquisition package handoff
Namespace
at.procon.eventhub
Main model decisions
1. One event = one time point
EventHubEventDto has exactly one timestamp:
occurredAt
There is no generic duration, endTime, validFrom, or validTo. If a source row represents an interval, the mapper may emit separate point events, for example DRIVE START and DRIVE END.
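The interval-to-point-events rule can be illustrated with a minimal sketch. `PointEvent` and `IntervalSplitter` are hypothetical names used only for this example; `PointEvent` is a cut-down stand-in for EventHubEventDto, not the project's class.

```java
import java.time.OffsetDateTime;
import java.util.List;

// Hypothetical, simplified stand-in for EventHubEventDto: only the fields needed here.
record PointEvent(String eventType, String lifecycle, OffsetDateTime occurredAt) {}

final class IntervalSplitter {
    private IntervalSplitter() {}

    // Turns one source interval row into two point events, each with a single occurredAt.
    static List<PointEvent> split(String eventType, OffsetDateTime from, OffsetDateTime to) {
        if (to.isBefore(from)) {
            throw new IllegalArgumentException("interval end is before its start");
        }
        return List.of(
                new PointEvent(eventType, "START", from),
                new PointEvent(eventType, "END", to));
    }
}
```

A source row "DRIVE from 08:00 to 09:30" would then yield DRIVE START at 08:00 and DRIVE END at 09:30, and neither event carries a duration.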
2. Tenant is package-level
tenantKey identifies the owner/client/account for the package. It is required for acquisition grouping and future master-data resolution.
3. EventSource identifies the technical source
{
"providerKey": "TACHOGRAPH",
"sourceKind": "VEHICLE_UNIT",
"sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
"sourceInstanceKey": "main-tachograph-db",
"tenantProviderSettingKey": "kralowetz-tachograph-prod",
"externalFleetKey": null
}
Examples:
TACHOGRAPH / VEHICLE_UNIT
TACHOGRAPH / DRIVER_CARD
YELLOWFOX / TELEMATICS_PLATFORM / YELLOWFOX_D8
FLEETBOARD / TELEMATICS_PLATFORM / FLEETBOARD_POSITION
EventSource is acquisition context. A vehicle unit (VU) event, a driver-card event and a YellowFox D8 event may describe the same real-world event, but this acquisition service keeps them as separate acquired source records. Cross-source merging/gap filling is intentionally left for a later query/read model.
4. No cross-source deduplication during acquisition
The acquisition layer stores every source record independently. It uses sourceRecordKeyHash only for idempotency of the same source event, so the same input package can be retried safely. It does not deduplicate VU vs driver-card vs YellowFox records.
This is intentional because later queries may need to combine sources: for example, use tachograph data as the main source, but fill gaps from YellowFox or another provider.
The acquisition table also stores a non-unique eventSignatureHash. This is a semantic merge hint, not a unique key. It intentionally excludes EventSource and externalSourceEventId, so VU, driver-card and YellowFox records that look like the same real-world event can share a signature while still being stored separately. Later query/projection logic can use this signature for source comparison, gap filling, and merged timelines. The signature prefers nation-scoped driver card and vehicle registration when available, then VIN or source entity id as fallback, so it remains useful before final master-data resolution.
Therefore the current model preserves:
tenantKey
eventSource
sourceGroup
importScope
externalSourceEventId
source-side driver/vehicle references
eventDetails
payload
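The difference between the two hashes described in this section can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the class name `AcquisitionHashes`, the exact fields fed into each hash, the `|` separator, and the choice of SHA-256 are all hypothetical. Only the rules stated above are taken from this README: the idempotency key is source-specific, while the signature excludes EventSource and externalSourceEventId and prefers nation-scoped identifiers.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.util.HexFormat;

// Illustrative only: field selection and hash algorithm are assumptions.
final class AcquisitionHashes {
    private AcquisitionHashes() {}

    // Idempotency key: identifies one record of one technical source, so it includes the source.
    static String sourceRecordKeyHash(String tenantKey, String sourceKey,
                                      String sourceInstanceKey, String externalSourceEventId) {
        return sha256(String.join("|", tenantKey, sourceKey, sourceInstanceKey, externalSourceEventId));
    }

    // Merge hint: deliberately leaves out EventSource and externalSourceEventId.
    static String eventSignatureHash(String tenantKey, String driverIdentity, String vehicleIdentity,
                                     Instant occurredAt, String eventDomain, String eventType,
                                     String lifecycle) {
        return sha256(String.join("|", tenantKey, driverIdentity, vehicleIdentity,
                occurredAt.toString(), eventDomain, eventType, lifecycle));
    }

    // Prefers the nation-scoped driver card, then falls back to the source entity id.
    static String driverIdentity(String cardNation, String cardNumber, String sourceEntityId) {
        if (cardNation != null && cardNumber != null) {
            return "CARD:" + cardNation + ":" + cardNumber;
        }
        return "ENTITY:" + sourceEntityId;
    }

    // Prefers the nation-scoped registration, then the VIN, then the source entity id.
    static String vehicleIdentity(String regNation, String regNumber, String vin, String sourceEntityId) {
        if (regNation != null && regNumber != null) {
            return "VRN:" + regNation + ":" + regNumber;
        }
        if (vin != null) {
            return "VIN:" + vin;
        }
        return "ENTITY:" + sourceEntityId;
    }

    private static String sha256(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

With these assumptions, a VU record and a driver-card record for the same driver, registration, and instant get different sourceRecordKeyHash values (so both are stored) but the same eventSignatureHash (so a later read model can pair them).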
5. SourceGroup captures tachograph organisation or YellowFox fleet
sourceGroup is package-level source grouping information.
For tachograph it can be a source organisation:
"sourceGroup": {
"type": "ORGANISATION",
"sourceEntityId": "147",
"code": "147",
"name": "Kralowetz"
}
For YellowFox it can be a fleet:
"sourceGroup": {
"type": "FLEET",
"sourceEntityId": "7",
"code": "7",
"name": "YellowFox Fleet 7"
}
The YellowFox fleet belongs to the same tenant/customer, but it is not forced to be an organisation. It can later be mapped to a tenant organisation if needed.
6. ImportScope captures organisation and time filtering
importScope describes what was selected from the source system.
Full DB import:
"importScope": {
"type": "TENANT_ALL",
"rootSourceOrganisation": null,
"includeChildren": false,
"occurredFrom": null,
"occurredTo": null
}
Organisation subtree + time-window import:
"importScope": {
"type": "SOURCE_ORGANISATION_SUBTREE",
"rootSourceOrganisation": {
"type": "ORGANISATION",
"sourceEntityId": "147",
"code": "147",
"name": "Kralowetz"
},
"includeChildren": true,
"occurredFrom": "2026-04-28T00:00:00+02:00",
"occurredTo": "2026-04-29T00:00:00+02:00"
}
occurredFrom is inclusive and occurredTo is exclusive. Both may be null for a complete source DB import.
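The half-open window semantics can be written down as a small check. `TimeWindow` is a hypothetical helper for illustration only; it treats a null bound as "unbounded on that side", which is how the null case above is read here.

```java
import java.time.OffsetDateTime;

// Hypothetical helper: models only the time part of importScope.
record TimeWindow(OffsetDateTime occurredFrom, OffsetDateTime occurredTo) {

    // occurredFrom is inclusive, occurredTo is exclusive; a null bound does not restrict.
    boolean contains(OffsetDateTime occurredAt) {
        boolean notBeforeStart = occurredFrom == null || !occurredAt.isBefore(occurredFrom);
        boolean beforeEnd = occurredTo == null || occurredAt.isBefore(occurredTo);
        return notBeforeStart && beforeEnd;
    }
}
```

Because the end is exclusive, consecutive daily windows such as 2026-04-28 and 2026-04-29 never select the same event twice: an event at exactly midnight belongs only to the later window.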
7. Source-side master references, no incoming internal IDs
The incoming DTO does not require internal driverId or vehicleId, because in normal ingestion those ids are not known yet.
Driver reference with nation-scoped driver card:
"driverRef": {
"sourceEntityId": "driver-100",
"driverCard": {
"nation": "AT",
"number": "D123456789"
},
"sourceOrganisation": {
"type": "ORGANISATION",
"sourceEntityId": "57",
"code": "57",
"name": "Sub Org 57"
}
}
Vehicle reference with optional VIN and nation-scoped vehicle registration number (VRN):
"vehicleRef": {
"sourceEntityId": "vehicle-200",
"vin": "WDB9634031L123456",
"vehicleRegistration": {
"nation": "AT",
"number": "W-12345"
},
"sourceOrganisation": {
"type": "ORGANISATION",
"sourceEntityId": "57",
"code": "57",
"name": "Sub Org 57"
}
}
VIN is optional. Driver-card-only events can carry only the nation-scoped VRN/registration:
"vehicleRef": {
"sourceEntityId": null,
"vin": null,
"vehicleRegistration": {
"nation": "AT",
"number": "W-12345"
},
"sourceOrganisation": null
}
This allows late resolution when VU/master data later connects the VRN to a VIN.
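Late resolution could look roughly like the sketch below. It is deliberately simplified and hypothetical: `Registration` and `VinLookup` are illustrative names, the lookup is an in-memory map, and it ignores the time dimension (a registration can move between vehicles over time, so a real resolver would also need occurredAt).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical value type for a nation-scoped registration.
record Registration(String nation, String number) {}

// Hypothetical in-memory lookup; ignores validity periods for brevity.
final class VinLookup {
    private final Map<Registration, String> vinByRegistration = new HashMap<>();

    // Called when VU or master data reveals which VIN belongs to a registration.
    void learn(Registration registration, String vin) {
        vinByRegistration.put(registration, vin);
    }

    // Returns the VIN if it is already known, otherwise an empty result.
    Optional<String> resolve(Registration registration) {
        return Optional.ofNullable(vinByRegistration.get(registration));
    }
}
```

A VRN-only driver-card event stays unresolved until a matching registration has been learned; after that, the same lookup returns the VIN without the acquired record having to change.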
8. Generic normalized eventDetails
Reusable event-specific properties are stored in:
"eventDetails": {
"type": "DRIVER_ACTIVITY",
"attributes": {
"cardSlot": "DRIVER",
"cardStatus": "INSERTED",
"drivingStatus": "SINGLE"
}
}
Raw provider values stay in payload.
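The split between normalized eventDetails and the raw payload can be sketched as a small mapping step. `DriverActivityDetailsMapper` is a hypothetical name, and the code-to-label pairs are an assumption inferred from the examples in this README, where raw code 0 appears next to DRIVER, INSERTED, and SINGLE. No other raw codes are documented here, so the sketch leaves them unmapped instead of guessing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical mapper: derives normalized attributes, leaves the raw map untouched.
final class DriverActivityDetailsMapper {
    private DriverActivityDetailsMapper() {}

    static Map<String, String> toAttributes(Map<String, Integer> raw) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("cardSlot", label(raw.get("cardSlot"), "DRIVER"));
        attributes.put("cardStatus", label(raw.get("cardStatus"), "INSERTED"));
        attributes.put("drivingStatus", label(raw.get("drivingStatus"), "SINGLE"));
        return attributes;
    }

    // Only code 0 is inferred from the README examples; anything else stays unmapped.
    private static String label(Integer rawCode, String labelForZero) {
        return rawCode != null && rawCode == 0 ? labelForZero : "UNMAPPED";
    }
}
```

The raw map would still be stored unchanged under payload, so a later, more complete mapping can be applied without re-importing.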
Package-level acquisition request
For external/manual ingestion, the preferred request shape is:
{
"package": {
"tenantKey": "kralowetz",
"eventSource": {
"providerKey": "TACHOGRAPH",
"sourceKind": "VEHICLE_UNIT",
"sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
"sourceInstanceKey": "main-tachograph-db",
"tenantProviderSettingKey": "kralowetz-tachograph-prod"
},
"sourceGroup": {
"type": "ORGANISATION",
"sourceEntityId": "147",
"code": "147",
"name": "Kralowetz"
},
"importScope": {
"type": "SOURCE_ORGANISATION_SUBTREE",
"rootSourceOrganisation": {
"type": "ORGANISATION",
"sourceEntityId": "147",
"code": "147",
"name": "Kralowetz"
},
"includeChildren": true,
"occurredFrom": "2026-04-28T00:00:00+02:00",
"occurredTo": "2026-04-29T00:00:00+02:00"
},
"eventFamily": "DRIVER_ACTIVITY",
"businessDate": "2026-04-28",
"externalPackageId": "TACHOGRAPH:ORG-147-SUBTREE:DRIVER_ACTIVITY:2026-04-28"
},
"events": [
{
"externalSourceEventId": "TACHOGRAPH:VEHICLE_UNIT:activity:456:start",
"driverRef": {
"sourceEntityId": "driver-100",
"driverCard": {
"nation": "AT",
"number": "D123456789"
},
"sourceOrganisation": {
"type": "ORGANISATION",
"sourceEntityId": "57"
}
},
"vehicleRef": {
"sourceEntityId": "vehicle-200",
"vin": "WDB9634031L123456",
"vehicleRegistration": {
"nation": "AT",
"number": "W-12345"
},
"sourceOrganisation": {
"type": "ORGANISATION",
"sourceEntityId": "57"
}
},
"occurredAt": "2026-04-28T08:00:00+02:00",
"eventDomain": "DRIVER_ACTIVITY",
"eventType": "DRIVE",
"lifecycle": "START",
"eventDetails": {
"type": "DRIVER_ACTIVITY",
"attributes": {
"cardSlot": "DRIVER",
"cardStatus": "INSERTED",
"drivingStatus": "SINGLE"
}
},
"payload": {
"raw": {
"activity": 3,
"cardSlot": 0,
"cardStatus": 0,
"drivingStatus": 0
}
}
}
]
}
Routes
Source-specific input routes
direct:yellowfox-d8-booking-input
direct:telematics-position-input
direct:tachograph-activity-input
direct:eventhub-package-input
direct:eventhub-manual-input
Common route
direct:eventhub-normalized-input
-> validate EventHubEventDto
-> create package key from tenant + EventSource + sourceGroup + importScope + eventFamily
-> seda:eventhub-batch-input
-> aggregate by eventhub.packageKey
-> sort by occurredAt inside the batch
-> EventHubIngestionService.ingest(...)
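The grouping and sorting steps of the common route can be sketched in plain Java, outside Camel. This is an illustration only: `NormalizedEvent` is a hypothetical flattened stand-in for EventHubEventDto, the `|`-joined key format is an assumption, and the real route does this with Camel aggregation keyed by eventhub.packageKey.

```java
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flattened event: one field per package-key component plus the timestamp.
record NormalizedEvent(String tenantKey, String sourceKey, String sourceGroupId,
                       String importScopeType, String eventFamily, OffsetDateTime occurredAt) {

    // Package key from tenant + EventSource + sourceGroup + importScope + eventFamily.
    String packageKey() {
        return String.join("|", tenantKey, sourceKey, String.valueOf(sourceGroupId),
                importScopeType, eventFamily);
    }
}

final class Batcher {
    private Batcher() {}

    // Groups events by package key, then sorts each batch chronologically.
    static Map<String, List<NormalizedEvent>> batch(List<NormalizedEvent> events) {
        Map<String, List<NormalizedEvent>> batches = new LinkedHashMap<>();
        for (NormalizedEvent event : events) {
            batches.computeIfAbsent(event.packageKey(), key -> new ArrayList<>()).add(event);
        }
        for (List<NormalizedEvent> batch : batches.values()) {
            batch.sort(Comparator.comparing(NormalizedEvent::occurredAt));
        }
        return batches;
    }
}
```

Events from different sources or event families never share a batch, and the ordering guarantee applies only within one batch, not across batches.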
REST endpoints
POST /api/eventhub/acquisition/yellowfox/d8-bookings
POST /api/eventhub/acquisition/telematics/positions
POST /api/eventhub/acquisition/tachograph/activities
POST /api/eventhub/acquisition/packages
POST /api/eventhub/acquisition/events
Example: tachograph driver-card activity with VRN only
curl -X POST http://localhost:8080/api/eventhub/acquisition/tachograph/activities \
-H "Content-Type: application/json" \
-d '[
{
"tenantKey": "kralowetz",
"sourceKind": "DRIVER_CARD",
"sourceInstanceKey": "main-tachograph-db",
"tenantProviderSettingKey": "kralowetz-tachograph-prod",
"externalSourceEventId": "TACHOGRAPH:DRIVER_CARD:activity:789:start",
"driverRef": {
"sourceEntityId": "driver-100",
"driverCard": {
"nation": "AT",
"number": "D123456789"
},
"sourceOrganisation": {
"type": "ORGANISATION",
"sourceEntityId": "57"
}
},
"vehicleRef": {
"sourceEntityId": null,
"vin": null,
"vehicleRegistration": {
"nation": "AT",
"number": "W-12345"
},
"sourceOrganisation": null
},
"occurredAt": "2026-04-28T08:00:00+02:00",
"activityType": "DRIVE",
"lifecycle": "START",
"cardSlot": "DRIVER",
"cardStatus": "INSERTED",
"drivingStatus": "SINGLE",
"payload": {
"raw": {
"activity": 3,
"cardSlot": 0,
"cardStatus": 0,
"drivingStatus": 0
}
}
}
]'
The mapper creates a default TENANT_ALL one-day import scope for this convenience endpoint. For real tachograph import jobs with an organisation-subtree or full-DB scope, use the package-level request or add dedicated SQL extraction job routes.
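One way such a default scope could be derived is sketched below. How the actual mapper picks the day is not stated in this README; the sketch assumes the calendar day of the event's own occurredAt, in that timestamp's offset. `DefaultScope` is a hypothetical name.

```java
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical record holding only the scope type and the time window.
record DefaultScope(String type, OffsetDateTime occurredFrom, OffsetDateTime occurredTo) {

    // Assumption: the window is the calendar day of occurredAt, in occurredAt's own offset.
    static DefaultScope oneDayAround(OffsetDateTime occurredAt) {
        OffsetDateTime dayStart = occurredAt.truncatedTo(ChronoUnit.DAYS);
        return new DefaultScope("TENANT_ALL", dayStart, dayStart.plusDays(1));
    }
}
```

For the example event at 2026-04-28T08:00:00+02:00, this gives a window from 2026-04-28T00:00+02:00 (inclusive) to 2026-04-29T00:00+02:00 (exclusive).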
Example: full tachograph DB import package
{
"package": {
"tenantKey": "kralowetz",
"eventSource": {
"providerKey": "TACHOGRAPH",
"sourceKind": "VEHICLE_UNIT",
"sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
"sourceInstanceKey": "main-tachograph-db",
"tenantProviderSettingKey": "kralowetz-tachograph-prod"
},
"sourceGroup": null,
"importScope": {
"type": "TENANT_ALL",
"rootSourceOrganisation": null,
"includeChildren": false,
"occurredFrom": null,
"occurredTo": null
},
"eventFamily": "DRIVER_ACTIVITY",
"businessDate": null,
"externalPackageId": "TACHOGRAPH:ALL:DRIVER_ACTIVITY:FULL"
},
"events": []
}
Start PostgreSQL
docker compose up -d
Run the service
mvn spring-boot:run
Check acquisition packages
select p.received_at,
p.tenant_key,
s.provider_key,
s.source_kind,
s.source_key,
p.source_group_type,
p.source_group_entity_id,
p.import_scope_type,
p.root_source_org_entity_id,
p.occurred_from,
p.occurred_to,
p.event_family,
p.business_date,
p.status,
p.event_count
from eventhub.data_package p
join eventhub.event_source s on s.id = p.event_source_id
order by p.received_at desc;
Check acquired events
select occurred_at,
driver_source_entity_id,
driver_card_nation,
driver_card_number,
driver_source_org_entity_id,
vehicle_source_entity_id,
vehicle_vin,
vehicle_registration_nation,
vehicle_registration_number,
vehicle_source_org_entity_id,
event_domain,
event_type,
lifecycle,
event_details,
payload
from eventhub.acquired_event
order by occurred_at desc;
Next implementation steps
- Add source-specific SQL extraction routes for the tachograph DB event families:
  - activities from CardActivity/VUActivity
  - card insert/withdraw from CardVehiclesUsed/IWCycle
  - positions from places/GNSS/border/load-unload sources
  - border crossings
  - load/unload
  - specific conditions: out-of-scope and ferry/train
  - speeding events
- Each SQL extraction route should accept ImportScopeDto:
  - optional source organisation root + include children
  - optional occurredFrom/occurredTo
  - null time bounds mean complete DB/history import
- Add master-data resolution later:
  - driver by tenant + driver card nation/number + occurredAt
  - vehicle by tenant + VIN or tenant + registration nation/number + occurredAt
  - late resolution from VRN-only driver-card events to VIN after VU/master data import
- Discuss query/read models later:
  - how to merge acquired events from all sources at query time
  - source priority per event family when the main source contains gaps
  - how to expose source provenance when multiple sources describe the same real-world event