Phase 3 - TED projection model
Goal
Move TED from being the implicit root data model to being a typed projection on top of the generic canonical document model.
New persistence model
Generic root
- DOC.doc_document
- DOC.doc_content
- DOC.doc_text_representation
- DOC.doc_embedding
TED-specific projection
- TED.ted_notice_projection
- TED.ted_notice_lot
- TED.ted_notice_organization
Relationship model
- one generic DOC.doc_document
- zero or one TED.ted_notice_projection
- zero to many TED.ted_notice_lot
- zero to many TED.ted_notice_organization
The projection also keeps an optional back-reference to the legacy TED.procurement_document row to
support incremental migration and validation.
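The cardinalities above can be sketched as plain data classes. This is only an illustration, not the actual schema: the field names (document_id, lot_number, name, legacy_procurement_document_id) are hypothetical, and attaching lots and organizations to the projection rather than directly to the root is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TedNoticeLot:
    lot_number: str  # hypothetical field


@dataclass
class TedNoticeOrganization:
    name: str  # hypothetical field


@dataclass
class TedNoticeProjection:
    # Optional back-reference to the legacy TED.procurement_document row.
    legacy_procurement_document_id: Optional[int] = None
    # Zero to many lots and organizations (assumed to hang off the projection).
    lots: List[TedNoticeLot] = field(default_factory=list)
    organizations: List[TedNoticeOrganization] = field(default_factory=list)


@dataclass
class DocDocument:
    document_id: int  # hypothetical field
    # Zero or one TED projection per generic document.
    ted_projection: Optional[TedNoticeProjection] = None


# A generic document without a TED projection, e.g. a future non-TED type.
plain = DocDocument(document_id=1)

# A TED notice: one root, one projection, several lots and organizations.
notice = DocDocument(
    document_id=2,
    ted_projection=TedNoticeProjection(
        legacy_procurement_document_id=42,
        lots=[TedNoticeLot("LOT-0001"), TedNoticeLot("LOT-0002")],
        organizations=[TedNoticeOrganization("Example Buyer")],
    ),
)
```

The point of the shape is that DocDocument is valid on its own; the TED data is an optional attachment rather than the root.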
Runtime behavior
When a new TED XML document is imported:
- it is parsed into the existing legacy ProcurementDocument
- the generic DOC root is ensured/refreshed
- the primary text representation is ensured
- if the generic vectorization pipeline is enabled, a pending embedding is ensured
- the TED structured projection tables are refreshed from the parsed legacy document
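The import steps above could be sketched roughly as follows. Everything here is a hypothetical stand-in: InMemoryStore replaces the DOC and TED tables, parse_legacy replaces the real legacy parser, and the dictionary keys are invented for illustration. The sketch only aims to show the ordering and the "ensure" semantics, meaning that repeating an import does not create duplicates.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the DOC and TED tables."""
    documents: Dict[str, dict] = field(default_factory=dict)
    text_representations: Dict[str, str] = field(default_factory=dict)
    pending_embeddings: List[str] = field(default_factory=list)
    ted_projections: Dict[str, dict] = field(default_factory=dict)


def parse_legacy(xml: str) -> dict:
    # Placeholder for the existing parser that yields a ProcurementDocument.
    return {"text": xml.strip(), "lots": [], "organizations": []}


def import_ted_xml(store: InMemoryStore, key: str, xml: str,
                   vectorization_enabled: bool) -> None:
    # 1. Parse into the legacy document shape.
    legacy = parse_legacy(xml)
    # 2. Ensure/refresh the generic DOC root.
    store.documents[key] = {"type": "TED"}
    # 3. Ensure the primary text representation (created only if missing).
    store.text_representations.setdefault(key, legacy["text"])
    # 4. Ensure a pending embedding, only when vectorization is enabled.
    if vectorization_enabled and key not in store.pending_embeddings:
        store.pending_embeddings.append(key)
    # 5. Refresh the TED projection from the parsed legacy document.
    store.ted_projections[key] = {
        "lots": list(legacy["lots"]),
        "organizations": list(legacy["organizations"]),
    }
```

Calling import_ted_xml twice with the same key leaves a single pending embedding and a single projection entry, which is the behavior the "ensured" wording suggests.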
Why this phase matters
This is the first phase where TED is explicitly modeled as a document type projection instead of the platform's canonical root entity. That makes the next steps possible:
- generic semantic search across multiple document types
- future non-TED projections
- migration of TED structured search to the new projection tables