# Phase 4.1 adapter extensions
## Added adapters
### TED package adapter
- Source type: `TED_PACKAGE`
- Root access: `PUBLIC`, no owner tenant
- Root document type: `TED_PACKAGE`
- Child source type: `PACKAGE_CHILD`
- Child relation: `EXTRACTED_FROM`

The adapter imports the package artifact plus its XML members into the generic `DOC` model.
It does not replace the existing legacy TED package processing path. It complements it: the later legacy TED parsing step can still enrich the same canonical child documents into proper `TED_NOTICE` projections, matching them by dedup hash.
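
The shape of the imported document tree can be sketched as follows. This is only an illustration under stated assumptions: the `Doc` class, the `import_ted_package` function, the use of SHA-256 for the dedup hash, and the choice to pass members as an in-memory mapping are all hypothetical and not taken from the actual adapter. The child document type is left unset because it is not specified here.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for an entry in the generic DOC model."""
    name: str
    source_type: str
    document_type: Optional[str]
    visibility: str
    owner_tenant: Optional[str]
    dedup_hash: str
    relation_to_parent: Optional[str] = None
    children: List["Doc"] = field(default_factory=list)


def content_hash(payload: bytes) -> str:
    # Assumption: SHA-256 over the raw bytes; the real hash function is not stated.
    return hashlib.sha256(payload).hexdigest()


def import_ted_package(package_name: str, package_bytes: bytes,
                       members: Dict[str, bytes]) -> Doc:
    """Build a public root document with one child per XML member."""
    root = Doc(
        name=package_name,
        source_type="TED_PACKAGE",
        document_type="TED_PACKAGE",
        visibility="PUBLIC",
        owner_tenant=None,  # root access is PUBLIC, no owner tenant
        dedup_hash=content_hash(package_bytes),
    )
    for member_name in sorted(members):
        if not member_name.lower().endswith(".xml"):
            continue  # only XML members are imported as children
        root.children.append(Doc(
            name=member_name,
            source_type="PACKAGE_CHILD",
            document_type=None,   # not specified; left unset in this sketch
            visibility="PUBLIC",  # assumption: children inherit the root's access
            owner_tenant=None,
            dedup_hash=content_hash(members[member_name]),
            relation_to_parent="EXTRACTED_FROM",
        ))
    return root
```

Because each child carries its own content hash, a later parsing step could look up the same canonical document by that hash and attach a `TED_NOTICE` projection to it instead of creating a second copy.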
### Mail/document adapter
- Source type: `MAIL`
- Root document type: `MIME_MESSAGE`
- Child relation: `ATTACHMENT_OF`
- Access: configurable via `mail-default-owner-tenant-key` and `mail-default-visibility`

The adapter stores the message body as the semantic root text and imports attachments as child documents. ZIP attachments can optionally be expanded recursively.
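
A minimal sketch of this flow, using Python's standard `email` and `zipfile` modules, might look like the following. The function names, the dictionary layout, the `max_depth` guard, and the `EXTRACTED_FROM` relation for ZIP entries are assumptions made for illustration. The `owner_tenant_key` and `visibility` parameters stand in for the values that `mail-default-owner-tenant-key` and `mail-default-visibility` would supply.

```python
import io
import zipfile
from email import message_from_bytes, policy
from typing import Optional


def _make_child(name: str, payload: bytes, relation: str,
                expand_zip: bool, depth: int) -> dict:
    child = {"name": name, "relation": relation,
             "size": len(payload), "children": []}
    # Expand ZIP payloads recursively, bounded by depth (hypothetical guard).
    if expand_zip and depth > 0 and zipfile.is_zipfile(io.BytesIO(payload)):
        with zipfile.ZipFile(io.BytesIO(payload)) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                # Assumption: ZIP entries relate to their archive via EXTRACTED_FROM.
                child["children"].append(_make_child(
                    info.filename, zf.read(info), "EXTRACTED_FROM",
                    expand_zip, depth - 1))
    return child


def import_mail(raw: bytes, owner_tenant_key: Optional[str], visibility: str,
                expand_zip: bool = False, max_depth: int = 3) -> dict:
    """Parse a raw MIME message into a root document with attachment children."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    root = {
        "source_type": "MAIL",
        "document_type": "MIME_MESSAGE",
        "owner_tenant": owner_tenant_key,
        "visibility": visibility,
        # The message body becomes the semantic root text.
        "text": body.get_content() if body is not None else "",
        "children": [],
    }
    for part in msg.iter_attachments():
        name = part.get_filename() or "unnamed"
        payload = part.get_payload(decode=True) or b""
        root["children"].append(_make_child(
            name, payload, "ATTACHMENT_OF", expand_zip, max_depth))
    return root
```

With `expand_zip=False` a ZIP attachment stays a single child document. With `expand_zip=True` its entries appear as nested children, and nested archives are followed until `max_depth` is reached.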
## Deduplication
Phase 4 deduplication by content hash is refined so the same payload is only deduplicated within the same access scope (`visibility` + `owner tenant`).
This prevents private documents from different tenants from being merged into one canonical document accidentally.
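
One way to picture the refined rule is a lookup keyed on the content hash together with the access scope. The sketch below is an assumption-laden illustration, not the actual implementation: the `ScopedDedupIndex` class, the integer document ids, the SHA-256 hash, and the `PRIVATE` visibility value are hypothetical.

```python
import hashlib
from typing import Dict, Optional, Tuple

# (content hash, visibility, owner tenant)
ScopeKey = Tuple[str, str, Optional[str]]


class ScopedDedupIndex:
    """Maps (hash, visibility, owner tenant) to a canonical document id."""

    def __init__(self) -> None:
        self._canonical: Dict[ScopeKey, int] = {}
        self._next_id = 1

    def resolve(self, payload: bytes, visibility: str,
                owner_tenant: Optional[str]) -> int:
        # Assumption: SHA-256 content hash; the real hash function is not stated.
        key = (hashlib.sha256(payload).hexdigest(), visibility, owner_tenant)
        if key not in self._canonical:
            # First sighting of this payload in this access scope.
            self._canonical[key] = self._next_id
            self._next_id += 1
        return self._canonical[key]
```

Under this scheme the same bytes uploaded privately by two tenants resolve to two different canonical documents, while a repeated upload within one scope resolves to the existing one.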