# EventHub Acquisition Service

Spring Boot + Apache Camel project skeleton for acquiring normalized EventHub point events from multiple providers/sources.

The current version intentionally focuses on **acquisition**. It stores source records as imported and does not merge or deduplicate equivalent events from different providers/sources. It does, however, keep a non-unique `eventSignatureHash` as a later merge/gap-filling hint. Later query/read models can merge sources when a preferred/main source contains gaps. The included PostgreSQL schema is a small acquisition-stage store so the project can be run and tested end-to-end.

## Architecture

```text
source-specific Camel input route
  -> source-specific mapper
  -> EventHubEventDto
  -> common EventHub acquisition route
  -> validation
  -> package-key creation from tenant + EventSource + source group + import scope + event family
  -> aggregation / batching
  -> chronological sorting inside the batch
  -> acquisition package handoff
```

## Namespace

```text
at.procon.eventhub
```

## Main model decisions

### 1. One event = one time point

`EventHubEventDto` has exactly one timestamp:

```text
occurredAt
```

There is no generic `duration`, `endTime`, `validFrom`, or `validTo`. If a source row represents an interval, the mapper may emit separate point events, for example `DRIVE START` and `DRIVE END`.

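As a minimal sketch of that splitting rule, a mapper could turn one interval row into two point events. The `SourceInterval` and `PointEvent` records below are hypothetical, simplified stand-ins for illustration; they are not the project's actual source row or `EventHubEventDto`.

```java
import java.time.OffsetDateTime;
import java.util.List;

// Hypothetical source row that carries an interval (assumption, not project code).
record SourceInterval(String activityType, OffsetDateTime start, OffsetDateTime end) {}

// Simplified stand-in for EventHubEventDto: exactly one timestamp per event.
record PointEvent(String eventType, String lifecycle, OffsetDateTime occurredAt) {}

class IntervalSplitter {

    // Emits a START event at the interval start and an END event at the interval end.
    static List<PointEvent> split(SourceInterval row) {
        return List.of(
                new PointEvent(row.activityType(), "START", row.start()),
                new PointEvent(row.activityType(), "END", row.end()));
    }

    public static void main(String[] args) {
        SourceInterval drive = new SourceInterval(
                "DRIVE",
                OffsetDateTime.parse("2026-04-28T08:00:00+02:00"),
                OffsetDateTime.parse("2026-04-28T09:30:00+02:00"));
        split(drive).forEach(System.out::println);
    }
}
```

Keeping the interval out of the DTO means a missing end (for example, an activity still in progress) simply results in no `END` event yet, rather than a half-filled record.
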
### 2. Tenant is package-level

`tenantKey` identifies the owner/client/account for the package. It is required for acquisition grouping and future master-data resolution.

### 3. EventSource identifies the technical source

```json
{
  "providerKey": "TACHOGRAPH",
  "sourceKind": "VEHICLE_UNIT",
  "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
  "sourceInstanceKey": "main-tachograph-db",
  "tenantProviderSettingKey": "kralowetz-tachograph-prod",
  "externalFleetKey": null
}
```

Examples:

```text
TACHOGRAPH / VEHICLE_UNIT
TACHOGRAPH / DRIVER_CARD
YELLOWFOX / TELEMATICS_PLATFORM / YELLOWFOX_D8
FLEETBOARD / TELEMATICS_PLATFORM / FLEETBOARD_POSITION
```

`EventSource` is acquisition context. A VU event, a driver-card event and a YellowFox D8 event may describe the same real-world event, but this acquisition service keeps them as separate acquired source records. Cross-source merging/gap filling is intentionally left for a later query/read model.

### 4. No cross-source deduplication during acquisition

The acquisition layer stores every source record independently. It uses `sourceRecordKeyHash` only for idempotency of the same source event, so the same input package can be retried safely. It does **not** deduplicate VU vs driver-card vs YellowFox records.

This is intentional because later queries may need to combine sources: for example, use tachograph data as the main source, but fill gaps from YellowFox or another provider.

The acquisition table also stores a non-unique `eventSignatureHash`. This is a semantic merge hint, not a unique key. It intentionally excludes `EventSource` and `externalSourceEventId`, so VU, driver-card and YellowFox records that look like the same real-world event can share a signature while still being stored separately. Later query/projection logic can use this signature for source comparison, gap filling, and merged timelines. The signature prefers nation-scoped driver card and vehicle registration when available, then VIN or source entity id as fallback, so it remains useful before final master-data resolution.

Therefore the current model preserves:

```text
tenantKey
eventSource
sourceGroup
importScope
externalSourceEventId
source-side driver/vehicle references
eventDetails
payload
```

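The difference between the two hashes can be sketched as follows. This is an illustration under stated assumptions, not the service's actual hashing code: the exact field lists, the `|` separator, the use of SHA-256, and the `DriverKey`/`VehicleKey` records are all hypothetical. Only the preference order (nation-scoped card and registration first, then VIN or source entity id) and the exclusion of `EventSource` and `externalSourceEventId` from the signature come from the description above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.OffsetDateTime;
import java.util.HexFormat;

// Simplified, hypothetical stand-ins for the source-side references.
record DriverKey(String sourceEntityId, String cardNation, String cardNumber) {}
record VehicleKey(String sourceEntityId, String vin, String regNation, String regNumber) {}

class AcquisitionHashes {

    // SHA-256 hex digest of the parts joined with '|'; null parts become empty strings.
    static String sha256(String... parts) {
        StringBuilder joined = new StringBuilder();
        for (String part : parts) {
            joined.append(part == null ? "" : part).append('|');
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(joined.toString().getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Assumed idempotency key: includes the source and its own event id.
    static String sourceRecordKeyHash(String tenantKey, String sourceKey,
                                      String sourceInstanceKey, String externalSourceEventId) {
        return sha256(tenantKey, sourceKey, sourceInstanceKey, externalSourceEventId);
    }

    // Prefers the nation-scoped driver card, then the source entity id.
    static String driverPart(DriverKey d) {
        if (d == null) return "";
        if (d.cardNumber() != null) return "CARD:" + d.cardNation() + ":" + d.cardNumber();
        return d.sourceEntityId() != null ? "ID:" + d.sourceEntityId() : "";
    }

    // Prefers the nation-scoped registration, then VIN, then the source entity id.
    static String vehiclePart(VehicleKey v) {
        if (v == null) return "";
        if (v.regNumber() != null) return "VRN:" + v.regNation() + ":" + v.regNumber();
        if (v.vin() != null) return "VIN:" + v.vin();
        return v.sourceEntityId() != null ? "ID:" + v.sourceEntityId() : "";
    }

    // Merge hint: no EventSource and no externalSourceEventId in the input.
    static String eventSignatureHash(String tenantKey, OffsetDateTime occurredAt, String eventDomain,
                                     String eventType, String lifecycle,
                                     DriverKey driver, VehicleKey vehicle) {
        return sha256(tenantKey, occurredAt.toInstant().toString(), eventDomain, eventType, lifecycle,
                driverPart(driver), vehiclePart(vehicle));
    }

    public static void main(String[] args) {
        OffsetDateTime at = OffsetDateTime.parse("2026-04-28T08:00:00+02:00");
        DriverKey driver = new DriverKey("driver-100", "AT", "D123456789");
        String vu = eventSignatureHash("kralowetz", at, "DRIVER_ACTIVITY", "DRIVE", "START",
                driver, new VehicleKey("vehicle-200", "WDB9634031L123456", "AT", "W-12345"));
        String card = eventSignatureHash("kralowetz", at, "DRIVER_ACTIVITY", "DRIVE", "START",
                driver, new VehicleKey(null, null, "AT", "W-12345"));
        System.out.println("same signature: " + vu.equals(card));
    }
}
```

In this sketch a VU record (with VIN) and a driver-card record (VRN only) produce the same signature because both resolve to the registration, while their `sourceRecordKeyHash` values differ because the source and external event id differ.
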
### 5. SourceGroup captures tachograph organisation or YellowFox fleet

`sourceGroup` is package-level source grouping information.

For tachograph it can be a source organisation:

```json
"sourceGroup": {
  "type": "ORGANISATION",
  "sourceEntityId": "147",
  "code": "147",
  "name": "Kralowetz"
}
```

For YellowFox it can be a fleet:

```json
"sourceGroup": {
  "type": "FLEET",
  "sourceEntityId": "7",
  "code": "7",
  "name": "YellowFox Fleet 7"
}
```

The YellowFox fleet belongs to the same tenant/customer, but it is not forced to be an organisation. It can later be mapped to a tenant organisation if needed.

### 6. ImportScope captures organisation and time filtering

`importScope` describes what was selected from the source system.

Full DB import:

```json
"importScope": {
  "type": "TENANT_ALL",
  "rootSourceOrganisation": null,
  "includeChildren": false,
  "occurredFrom": null,
  "occurredTo": null
}
```

Organisation subtree + time-window import:

```json
"importScope": {
  "type": "SOURCE_ORGANISATION_SUBTREE",
  "rootSourceOrganisation": {
    "type": "ORGANISATION",
    "sourceEntityId": "147",
    "code": "147",
    "name": "Kralowetz"
  },
  "includeChildren": true,
  "occurredFrom": "2026-04-28T00:00:00+02:00",
  "occurredTo": "2026-04-29T00:00:00+02:00"
}
```

`occurredFrom` is inclusive and `occurredTo` is exclusive. Both may be `null` for a complete source DB import.

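The half-open window rule can be expressed as a small predicate. This is a minimal sketch; the `ImportWindow` class and its `contains` method are hypothetical names used for illustration and are not taken from the project.

```java
import java.time.OffsetDateTime;

class ImportWindow {

    // occurredFrom is inclusive, occurredTo is exclusive; a null bound means "unbounded".
    static boolean contains(OffsetDateTime occurredFrom, OffsetDateTime occurredTo,
                            OffsetDateTime occurredAt) {
        boolean afterStart = occurredFrom == null || !occurredAt.isBefore(occurredFrom);
        boolean beforeEnd = occurredTo == null || occurredAt.isBefore(occurredTo);
        return afterStart && beforeEnd;
    }

    public static void main(String[] args) {
        OffsetDateTime from = OffsetDateTime.parse("2026-04-28T00:00:00+02:00");
        OffsetDateTime to = OffsetDateTime.parse("2026-04-29T00:00:00+02:00");
        System.out.println(contains(from, to, from)); // lower bound is included
        System.out.println(contains(from, to, to));   // upper bound is excluded
    }
}
```

Because `isBefore` compares instants, an event reported in another offset (for example UTC) is still matched against the window correctly, and consecutive one-day windows never overlap or leave a gap at midnight.
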
### 7. Source-side master references, no incoming internal IDs

The incoming DTO does not require internal `driverId` or `vehicleId`, because in normal ingestion those ids are not known yet.

Driver reference with nation-scoped driver card:

```json
"driverRef": {
  "sourceEntityId": "driver-100",
  "driverCard": {
    "nation": "AT",
    "number": "D123456789"
  },
  "sourceOrganisation": {
    "type": "ORGANISATION",
    "sourceEntityId": "57",
    "code": "57",
    "name": "Sub Org 57"
  }
}
```

Vehicle reference with optional VIN and nation-scoped VRN:

```json
"vehicleRef": {
  "sourceEntityId": "vehicle-200",
  "vin": "WDB9634031L123456",
  "vehicleRegistration": {
    "nation": "AT",
    "number": "W-12345"
  },
  "sourceOrganisation": {
    "type": "ORGANISATION",
    "sourceEntityId": "57",
    "code": "57",
    "name": "Sub Org 57"
  }
}
```

VIN is optional. Driver-card-only events can carry only the nation-scoped VRN/registration:

```json
"vehicleRef": {
  "sourceEntityId": null,
  "vin": null,
  "vehicleRegistration": {
    "nation": "AT",
    "number": "W-12345"
  },
  "sourceOrganisation": null
}
```

This allows late resolution when VU/master data later connects the VRN to a VIN.

### 8. Generic normalized eventDetails

Reusable event-specific properties are stored in:

```json
"eventDetails": {
  "type": "DRIVER_ACTIVITY",
  "attributes": {
    "cardSlot": "DRIVER",
    "cardStatus": "INSERTED",
    "drivingStatus": "SINGLE"
  }
}
```

Raw provider values stay in `payload`.

## Package-level acquisition request

For external/manual ingestion, the preferred request shape is:

```json
{
  "package": {
    "tenantKey": "kralowetz",
    "eventSource": {
      "providerKey": "TACHOGRAPH",
      "sourceKind": "VEHICLE_UNIT",
      "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod"
    },
    "sourceGroup": {
      "type": "ORGANISATION",
      "sourceEntityId": "147",
      "code": "147",
      "name": "Kralowetz"
    },
    "importScope": {
      "type": "SOURCE_ORGANISATION_SUBTREE",
      "rootSourceOrganisation": {
        "type": "ORGANISATION",
        "sourceEntityId": "147",
        "code": "147",
        "name": "Kralowetz"
      },
      "includeChildren": true,
      "occurredFrom": "2026-04-28T00:00:00+02:00",
      "occurredTo": "2026-04-29T00:00:00+02:00"
    },
    "eventFamily": "DRIVER_ACTIVITY",
    "businessDate": "2026-04-28",
    "externalPackageId": "TACHOGRAPH:ORG-147-SUBTREE:DRIVER_ACTIVITY:2026-04-28"
  },
  "events": [
    {
      "externalSourceEventId": "TACHOGRAPH:VEHICLE_UNIT:activity:456:start",
      "driverRef": {
        "sourceEntityId": "driver-100",
        "driverCard": {
          "nation": "AT",
          "number": "D123456789"
        },
        "sourceOrganisation": {
          "type": "ORGANISATION",
          "sourceEntityId": "57"
        }
      },
      "vehicleRef": {
        "sourceEntityId": "vehicle-200",
        "vin": "WDB9634031L123456",
        "vehicleRegistration": {
          "nation": "AT",
          "number": "W-12345"
        },
        "sourceOrganisation": {
          "type": "ORGANISATION",
          "sourceEntityId": "57"
        }
      },
      "occurredAt": "2026-04-28T08:00:00+02:00",
      "eventDomain": "DRIVER_ACTIVITY",
      "eventType": "DRIVE",
      "lifecycle": "START",
      "eventDetails": {
        "type": "DRIVER_ACTIVITY",
        "attributes": {
          "cardSlot": "DRIVER",
          "cardStatus": "INSERTED",
          "drivingStatus": "SINGLE"
        }
      },
      "payload": {
        "raw": {
          "activity": 3,
          "cardSlot": 0,
          "cardStatus": 0,
          "drivingStatus": 0
        }
      }
    }
  ]
}
```

## Routes

### Source-specific input routes

```text
direct:yellowfox-d8-booking-input
direct:telematics-position-input
direct:tachograph-activity-input
direct:eventhub-package-input
direct:eventhub-manual-input
```

### Common route

```text
direct:eventhub-normalized-input
  -> validate EventHubEventDto
  -> create package key from tenant + EventSource + sourceGroup + importScope + eventFamily
  -> seda:eventhub-batch-input
  -> aggregate by eventhub.packageKey
  -> sort by occurredAt inside the batch
  -> EventHubIngestionService.ingest(...)
```

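The batching steps of the common route can be illustrated in plain Java, without Camel. This is a sketch of the idea only: `NormalizedEvent`, its flattened string fields, the `|`-joined key format, and `BatchSketch` are hypothetical and do not reflect how the actual route or `eventhub.packageKey` is implemented.

```java
import java.time.OffsetDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a validated event plus its package context (assumption).
record NormalizedEvent(String tenantKey, String sourceKey, String sourceGroupId,
                       String importScopeId, String eventFamily, OffsetDateTime occurredAt) {

    // Package key built from the five parts named in the route description.
    // A null part (for example, no source group) is rendered as the text "null".
    String packageKey() {
        return String.join("|", tenantKey, sourceKey, sourceGroupId, importScopeId, eventFamily);
    }
}

class BatchSketch {

    // Groups events by package key, then sorts each batch chronologically.
    static Map<String, List<NormalizedEvent>> batch(List<NormalizedEvent> events) {
        Map<String, List<NormalizedEvent>> batches = new LinkedHashMap<>();
        for (NormalizedEvent event : events) {
            batches.computeIfAbsent(event.packageKey(), key -> new ArrayList<>()).add(event);
        }
        for (List<NormalizedEvent> batchEvents : batches.values()) {
            // Compare instants so that differing UTC offsets do not affect the order.
            batchEvents.sort(Comparator.comparing((NormalizedEvent e) -> e.occurredAt().toInstant()));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<NormalizedEvent> events = List.of(
                new NormalizedEvent("kralowetz", "TACHOGRAPH_VEHICLE_UNIT", "147", "SUBTREE-147",
                        "DRIVER_ACTIVITY", OffsetDateTime.parse("2026-04-28T09:30:00+02:00")),
                new NormalizedEvent("kralowetz", "TACHOGRAPH_VEHICLE_UNIT", "147", "SUBTREE-147",
                        "DRIVER_ACTIVITY", OffsetDateTime.parse("2026-04-28T08:00:00+02:00")));
        batch(events).forEach((key, batchEvents) -> System.out.println(key + " -> " + batchEvents));
    }
}
```

Because `EventSource` is part of the key, events from different sources never end up in the same batch, which matches the decision to keep source records separate during acquisition.
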
## REST endpoints

```text
POST /api/eventhub/acquisition/yellowfox/d8-bookings
POST /api/eventhub/acquisition/telematics/positions
POST /api/eventhub/acquisition/tachograph/activities
POST /api/eventhub/acquisition/packages
POST /api/eventhub/acquisition/events
```

## Example: tachograph driver-card activity with VRN only

```bash
curl -X POST http://localhost:8080/api/eventhub/acquisition/tachograph/activities \
  -H "Content-Type: application/json" \
  -d '[
    {
      "tenantKey": "kralowetz",
      "sourceKind": "DRIVER_CARD",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod",
      "externalSourceEventId": "TACHOGRAPH:DRIVER_CARD:activity:789:start",
      "driverRef": {
        "sourceEntityId": "driver-100",
        "driverCard": {
          "nation": "AT",
          "number": "D123456789"
        },
        "sourceOrganisation": {
          "type": "ORGANISATION",
          "sourceEntityId": "57"
        }
      },
      "vehicleRef": {
        "sourceEntityId": null,
        "vin": null,
        "vehicleRegistration": {
          "nation": "AT",
          "number": "W-12345"
        },
        "sourceOrganisation": null
      },
      "occurredAt": "2026-04-28T08:00:00+02:00",
      "activityType": "DRIVE",
      "lifecycle": "START",
      "cardSlot": "DRIVER",
      "cardStatus": "INSERTED",
      "drivingStatus": "SINGLE",
      "payload": {
        "raw": {
          "activity": 3,
          "cardSlot": 0,
          "cardStatus": 0,
          "drivingStatus": 0
        }
      }
    }
  ]'
```

The mapper creates a default `TENANT_ALL` one-day import scope for this convenience endpoint. For real tachograph import jobs with organisation subtree/full DB scope, use the package-level request or add dedicated SQL extraction job routes.

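One plausible way to derive such a default one-day scope is sketched below. How the mapper actually picks the day is not stated here, so this is an assumption: the sketch takes the calendar day of `occurredAt` in the event's own UTC offset. `OneDayScope` and `around` are hypothetical names.

```java
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical, reduced import scope: type plus a half-open time window.
record OneDayScope(String type, OffsetDateTime occurredFrom, OffsetDateTime occurredTo) {

    // Assumed rule: the window is the calendar day of occurredAt in its own offset.
    static OneDayScope around(OffsetDateTime occurredAt) {
        OffsetDateTime from = occurredAt.truncatedTo(ChronoUnit.DAYS);
        return new OneDayScope("TENANT_ALL", from, from.plusDays(1));
    }
}

class DefaultScopeDemo {
    public static void main(String[] args) {
        System.out.println(OneDayScope.around(OffsetDateTime.parse("2026-04-28T08:00:00+02:00")));
    }
}
```

For the event in the request above, this yields a window from `2026-04-28T00:00+02:00` (inclusive) to `2026-04-29T00:00+02:00` (exclusive).
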
## Example: full tachograph DB import package

```json
{
  "package": {
    "tenantKey": "kralowetz",
    "eventSource": {
      "providerKey": "TACHOGRAPH",
      "sourceKind": "VEHICLE_UNIT",
      "sourceKey": "TACHOGRAPH_VEHICLE_UNIT",
      "sourceInstanceKey": "main-tachograph-db",
      "tenantProviderSettingKey": "kralowetz-tachograph-prod"
    },
    "sourceGroup": null,
    "importScope": {
      "type": "TENANT_ALL",
      "rootSourceOrganisation": null,
      "includeChildren": false,
      "occurredFrom": null,
      "occurredTo": null
    },
    "eventFamily": "DRIVER_ACTIVITY",
    "businessDate": null,
    "externalPackageId": "TACHOGRAPH:ALL:DRIVER_ACTIVITY:FULL"
  },
  "events": []
}
```

## Start PostgreSQL

```bash
docker compose up -d
```

## Run the service

```bash
mvn spring-boot:run
```

## Check acquisition packages

```sql
select p.received_at,
       p.tenant_key,
       s.provider_key,
       s.source_kind,
       s.source_key,
       p.source_group_type,
       p.source_group_entity_id,
       p.import_scope_type,
       p.root_source_org_entity_id,
       p.occurred_from,
       p.occurred_to,
       p.event_family,
       p.business_date,
       p.status,
       p.event_count
from eventhub.data_package p
join eventhub.event_source s on s.id = p.event_source_id
order by p.received_at desc;
```

## Check acquired events

```sql
select occurred_at,
       driver_source_entity_id,
       driver_card_nation,
       driver_card_number,
       driver_source_org_entity_id,
       vehicle_source_entity_id,
       vehicle_vin,
       vehicle_registration_nation,
       vehicle_registration_number,
       vehicle_source_org_entity_id,
       event_domain,
       event_type,
       lifecycle,
       event_details,
       payload
from eventhub.acquired_event
order by occurred_at desc;
```

## Next implementation steps

1. Add source-specific SQL extraction routes for the tachograph DB event families:
   - activities from CardActivity/VUActivity
   - card insert/withdraw from CardVehiclesUsed/IWCycle
   - positions from places/GNSS/border/load-unload sources
   - border crossings
   - load/unload
   - specific conditions: out-of-scope and ferry/train
   - speeding events
2. Each SQL extraction route should accept `ImportScopeDto`:
   - optional source organisation root + include children
   - optional occurredFrom/occurredTo
   - null time bounds mean complete DB/history import
3. Add master-data resolution later:
   - driver by tenant + driver card nation/number + occurredAt
   - vehicle by tenant + VIN or tenant + registration nation/number + occurredAt
   - late resolution from VRN-only driver-card events to VIN after VU/master data import
4. Discuss query/read models later:
   - how to merge acquired events from all sources at query time
   - source priority per event family when the main source contains gaps
   - how to expose source provenance when multiple sources describe the same real-world event