Phase 4.1 adapter extensions
Added adapters
TED package adapter
- Source type: TED_PACKAGE
- Root access: PUBLIC, no owner tenant
- Root document type: TED_PACKAGE
- Child source type: PACKAGE_CHILD
- Child relation: EXTRACTED_FROM
The adapter imports the package artifact plus its XML members into the generic DOC model.
It complements the existing legacy TED package processing path rather than replacing it: the later legacy TED parsing step can still find the same canonical child documents by dedup hash and enrich them into proper TED_NOTICE projections.
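The interplay between the adapter and the legacy parsing step can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Doc` class, the in-memory `index`, the use of SHA-256 as the dedup hash, the child document type `"GENERIC"`, and the function names are all assumptions made for the example. Because the TED root is PUBLIC with no owner tenant, the sketch keys the index by content hash alone.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Doc:
    # Hypothetical stand-in for a canonical document in the generic DOC model.
    source_type: str
    doc_type: str
    dedup_hash: str
    relation: Optional[str] = None
    projections: List[str] = field(default_factory=list)


def sha256_hex(data: bytes) -> str:
    # Assumption: the dedup hash is a SHA-256 over the raw payload.
    return hashlib.sha256(data).hexdigest()


def import_package(package_bytes: bytes,
                   members: Dict[str, bytes],
                   index: Dict[str, Doc]) -> Doc:
    """Register the package artifact as root and its XML members as children."""
    root = Doc("TED_PACKAGE", "TED_PACKAGE", sha256_hex(package_bytes))
    index[root.dedup_hash] = root
    for name, data in members.items():
        if not name.lower().endswith(".xml"):
            continue  # only XML members are imported
        child = Doc("PACKAGE_CHILD", "GENERIC", sha256_hex(data),
                    relation="EXTRACTED_FROM")
        # setdefault keeps an already registered canonical document.
        index.setdefault(child.dedup_hash, child)
    return root


def legacy_enrich(xml_bytes: bytes, index: Dict[str, Doc]) -> Optional[Doc]:
    """Legacy step: find the canonical child by hash and add a projection."""
    doc = index.get(sha256_hex(xml_bytes))
    if doc is not None and "TED_NOTICE" not in doc.projections:
        doc.projections.append("TED_NOTICE")
    return doc
```

The point of the sketch is that both paths compute the same hash over the same XML bytes, so the legacy step attaches its projection to the document the adapter already created instead of producing a second one.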
Mail/document adapter
- Source type: MAIL
- Root document type: MIME_MESSAGE
- Child relation: ATTACHMENT_OF
- Access: configurable via mail-default-owner-tenant-key and mail-default-visibility
The adapter stores the message body as the semantic root text and imports attachments as child documents. ZIP attachments can optionally be expanded recursively.
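A rough sketch of this behaviour, using only the Python standard library, might look like the following. The `Doc` class, the `expand_zips` and `max_depth` parameters, the `"GENERIC"` child document type, and the choice of EXTRACTED_FROM as the relation for ZIP members are assumptions for illustration; the source only states MIME_MESSAGE for the root and ATTACHMENT_OF for attachments.

```python
import io
import zipfile
from dataclasses import dataclass, field
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List, Optional


@dataclass
class Doc:
    # Hypothetical document node; not the project's actual model class.
    name: str
    doc_type: str
    relation: Optional[str] = None
    text: str = ""
    children: List["Doc"] = field(default_factory=list)


def import_attachment(name: str, data: bytes, relation: str,
                      expand_zips: bool, max_depth: int) -> Doc:
    doc = Doc(name=name, doc_type="GENERIC", relation=relation)
    # Expand ZIP payloads only when enabled; max_depth bounds the recursion.
    if expand_zips and max_depth > 0 and zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                # Assumption: ZIP members relate to the ZIP via EXTRACTED_FROM.
                doc.children.append(import_attachment(
                    info.filename, zf.read(info), "EXTRACTED_FROM",
                    expand_zips, max_depth - 1))
    return doc


def import_mail(raw: bytes, expand_zips: bool = False,
                max_depth: int = 3) -> Doc:
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    root = Doc(name=msg["Subject"] or "(no subject)",
               doc_type="MIME_MESSAGE",
               text=body.get_content() if body is not None else "")
    for part in msg.iter_attachments():
        data = part.get_payload(decode=True) or b""
        root.children.append(import_attachment(
            part.get_filename() or "unnamed", data, "ATTACHMENT_OF",
            expand_zips, max_depth))
    return root


def build_sample_mail() -> bytes:
    """Build a small message with one ZIP attachment for trying the sketch."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("inner.txt", "zip member")
    msg = EmailMessage()
    msg["Subject"] = "demo"
    msg.set_content("hello body")
    msg.add_attachment(buf.getvalue(), maintype="application",
                       subtype="zip", filename="a.zip")
    return msg.as_bytes()
```

Calling `import_mail(build_sample_mail(), expand_zips=True)` yields a root with one attachment child whose own child is the ZIP member; with `expand_zips=False` the ZIP stays an opaque attachment. A depth limit of some kind is advisable whenever archives are expanded recursively, since nested archives can otherwise grow without bound.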
Deduplication
Phase 4 deduplication by content hash is refined so the same payload is only deduplicated within the same access scope (visibility + owner tenant).
This prevents private documents from different tenants from being merged into one canonical document accidentally.
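One way to picture the scoped deduplication is a lookup key that combines the content hash with the access scope. The sketch below is illustrative only: `dedup_key`, `CanonicalStore`, the SHA-256 hash, and the `"PRIVATE"` visibility value are assumptions, not names taken from the project.

```python
import hashlib
from typing import Dict, Optional, Tuple

DedupKey = Tuple[str, str, Optional[str]]


def dedup_key(payload: bytes, visibility: str,
              owner_tenant: Optional[str]) -> DedupKey:
    # Assumption: SHA-256 as content hash; scope = visibility + owner tenant.
    content_hash = hashlib.sha256(payload).hexdigest()
    return (content_hash, visibility, owner_tenant)


class CanonicalStore:
    """Hypothetical in-memory store mapping scoped keys to document ids."""

    def __init__(self) -> None:
        self._by_key: Dict[DedupKey, int] = {}

    def resolve(self, payload: bytes, visibility: str,
                owner_tenant: Optional[str] = None) -> int:
        key = dedup_key(payload, visibility, owner_tenant)
        if key not in self._by_key:
            # First occurrence in this scope: allocate a new canonical id.
            self._by_key[key] = len(self._by_key) + 1
        return self._by_key[key]
```

With this key, the same bytes uploaded privately by two different tenants resolve to two canonical documents, while a repeated upload within one tenant, or a repeated PUBLIC document without an owner tenant, still resolves to the existing one.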