# Phase 4.1 adapter extensions

## Added adapters

### TED package adapter

- Source type: `TED_PACKAGE`
- Root access: `PUBLIC`, no owner tenant
- Root document type: `TED_PACKAGE`
- Child source type: `PACKAGE_CHILD`
- Child relation: `EXTRACTED_FROM`

The adapter imports the package artifact and its XML members into the generic `DOC` model.
It complements the existing legacy TED package processing path rather than replacing it: the later legacy TED parsing step can still match the same canonical child documents by dedup hash and enrich them into proper `TED_NOTICE` projections.
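
The root/child layout listed above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the `Doc` class, the `import_ted_package` function, the in-memory `members` mapping, and the use of SHA-256 as the content hash are all assumptions made for the example. It also assumes that children inherit the root's access scope and that their document type stays unset until the legacy TED parsing step enriches them.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for one entry in the generic DOC model."""
    name: str
    source_type: str
    document_type: Optional[str]
    visibility: str
    owner_tenant: Optional[str]
    content_hash: str
    relation_to_parent: Optional[str] = None
    children: List["Doc"] = field(default_factory=list)


def sha256_hex(payload: bytes) -> str:
    # Assumption: the dedup hash is a SHA-256 digest of the raw payload.
    return hashlib.sha256(payload).hexdigest()


def import_ted_package(package_name: str, package_bytes: bytes,
                       members: Dict[str, bytes]) -> Doc:
    """Build a public root document plus one child per XML member."""
    root = Doc(
        name=package_name,
        source_type="TED_PACKAGE",
        document_type="TED_PACKAGE",
        visibility="PUBLIC",
        owner_tenant=None,  # public root, no owner tenant
        content_hash=sha256_hex(package_bytes),
    )
    for member_name, payload in sorted(members.items()):
        if not member_name.lower().endswith(".xml"):
            continue  # only XML members are imported as children
        root.children.append(Doc(
            name=member_name,
            source_type="PACKAGE_CHILD",
            document_type=None,  # assumption: set later by the legacy TED step
            visibility=root.visibility,      # assumption: inherited from root
            owner_tenant=root.owner_tenant,  # assumption: inherited from root
            content_hash=sha256_hex(payload),
            relation_to_parent="EXTRACTED_FROM",
        ))
    return root
```

Because each child carries its own content hash, a later processing step could look up the same child by that hash instead of creating a second document for the same XML payload.
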

### Mail/document adapter

- Source type: `MAIL`
- Root document type: `MIME_MESSAGE`
- Child relation: `ATTACHMENT_OF`
- Access: configurable via `mail-default-owner-tenant-key` and `mail-default-visibility`

The adapter stores the message body as the semantic root text and imports attachments as child documents. ZIP attachments can optionally be expanded recursively.
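
A rough sketch of this flow, using only the Python standard library (`email`, `zipfile`), is shown below. The function names, the dictionary layout, the `max_depth` guard, and the choice to flatten expanded ZIP members directly under the root are assumptions for illustration; the real adapter may model expanded archives differently. The `default_owner_tenant_key` and `default_visibility` parameters stand in for the two configuration keys named above.

```python
import io
import zipfile
from email import message_from_bytes, policy


def expand_zip(name, payload, depth=0, max_depth=3):
    """Yield (path, bytes) for each file in a ZIP, recursing into nested ZIPs."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            data = archive.read(info)
            path = f"{name}/{info.filename}"
            # The depth guard is an assumption: it bounds recursion on nested archives.
            if depth < max_depth and zipfile.is_zipfile(io.BytesIO(data)):
                yield from expand_zip(path, data, depth + 1, max_depth)
            else:
                yield path, data


def import_mail(raw, *, default_owner_tenant_key, default_visibility,
                expand_zips=False):
    """Return a root MIME_MESSAGE document with attachments as children."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    root = {
        "source_type": "MAIL",
        "document_type": "MIME_MESSAGE",
        "owner_tenant": default_owner_tenant_key,
        "visibility": default_visibility,
        "text": body.get_content() if body is not None else "",
        "children": [],
    }
    for part in msg.iter_attachments():
        name = part.get_filename() or "unnamed"
        payload = part.get_payload(decode=True) or b""
        if expand_zips and zipfile.is_zipfile(io.BytesIO(payload)):
            files = list(expand_zip(name, payload))
        else:
            files = [(name, payload)]
        for path, data in files:
            root["children"].append(
                {"name": path, "relation": "ATTACHMENT_OF", "payload": data})
    return root
```

With `expand_zips=False` a ZIP attachment is kept as a single child; with `expand_zips=True` its members (including those of nested ZIPs, up to `max_depth`) become children with path-like names such as `docs.zip/nested.zip/b.txt`.
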

## Deduplication

Phase 4 deduplication by content hash is refined so that the same payload is deduplicated only within the same access scope (`visibility` + `owner tenant`).
This prevents private documents from different tenants from being accidentally merged into one canonical document.
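
One way to picture the scoped rule is a lookup keyed by the content hash together with the access scope. The sketch below is illustrative only: `ScopedDedupIndex`, the integer document IDs, the `PRIVATE` visibility value, and SHA-256 as the hash function are assumptions, not names taken from the codebase.

```python
import hashlib
from typing import Dict, Optional, Tuple


class ScopedDedupIndex:
    """Hypothetical in-memory index: one canonical document per (hash, scope)."""

    def __init__(self) -> None:
        self._canonical: Dict[Tuple[str, str, Optional[str]], int] = {}
        self._next_id = 1

    def resolve(self, payload: bytes, visibility: str,
                owner_tenant: Optional[str]) -> int:
        """Return the canonical document ID, creating one if the scope is new."""
        content_hash = hashlib.sha256(payload).hexdigest()
        # The access scope is part of the key, so equal payloads in
        # different scopes never share a canonical document.
        key = (content_hash, visibility, owner_tenant)
        if key not in self._canonical:
            self._canonical[key] = self._next_id
            self._next_id += 1
        return self._canonical[key]


index = ScopedDedupIndex()
payload = b"same bytes"
doc_a = index.resolve(payload, "PRIVATE", "tenant-a")
doc_b = index.resolve(payload, "PRIVATE", "tenant-b")
doc_a_again = index.resolve(payload, "PRIVATE", "tenant-a")
```

Here `doc_a` and `doc_a_again` refer to the same canonical document, while `doc_b` gets its own even though the payload is byte-identical.
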