Phase 0 – Architecture Foundation
New project identity
- Project name: Procon Document Intelligence Platform
- Short name: DIP
- Base namespace: at.procon.dip
- Legacy namespace kept during transition: at.procon.ted
Why this naming
The application is no longer only a TED notice processor. The new name reflects the broader goal: import arbitrary document types, derive canonical searchable text, vectorize it, and run semantic search over those representations.
Phase 0 decisions implemented in code
- New Spring Boot entry point under at.procon.dip
- Legacy TED runtime kept through explicit package scanning
- Generic vocabulary introduced via enums in at.procon.dip.domain.document
- Tenant introduced as a first-class value object in at.procon.dip.domain.tenant
- Ownership and access are explicitly separated through DocumentAccessContext
- Canonical document metadata and ingestion descriptors support both:
- tenant-owned documents
- public documents without tenant ownership
- Extension-point interfaces introduced for ingestion, classification, extraction, normalization, and vectorization
- Target schema split documented as:
- DOC for generic document model
- TED for TED-specific projections
- Migration strategy formalized as phased additive migration:
- additive schema
- dual write
- backfill
- cutover
- retire legacy
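
The phased additive migration listed above can be pictured as an ordered state machine. The following is only a minimal sketch: the enum MigrationPhase and its helper methods are hypothetical names, and the rules about when each schema receives writes are assumptions made for illustration, not behavior confirmed by the codebase.

```java
import java.util.Optional;

/** Hypothetical enum; the constants mirror the five phases listed above. */
enum MigrationPhase {
    ADDITIVE_SCHEMA,
    DUAL_WRITE,
    BACKFILL,
    CUTOVER,
    RETIRE_LEGACY;

    /** Returns the phase that follows this one, or empty after the last phase. */
    Optional<MigrationPhase> next() {
        MigrationPhase[] all = values();
        int i = ordinal() + 1;
        return i < all.length ? Optional.of(all[i]) : Optional.empty();
    }

    /** Assumption: the legacy TED tables keep receiving writes until cutover. */
    boolean writesLegacy() {
        return compareTo(CUTOVER) < 0;
    }

    /** Assumption: the generic DOC tables receive writes from dual write onward. */
    boolean writesGeneric() {
        return compareTo(DUAL_WRITE) >= 0;
    }
}
```

Under these assumptions, both schemas are written only during DUAL_WRITE and BACKFILL, which is what makes each step additive and reversible until cutover.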
Planned package areas
- at.procon.dip.architecture
- at.procon.dip.domain.access
- at.procon.dip.domain.document
- at.procon.dip.domain.tenant
- at.procon.dip.ingestion.spi
- at.procon.dip.classification.spi
- at.procon.dip.extraction.spi
- at.procon.dip.normalization.spi
- at.procon.dip.vectorization.spi
- at.procon.dip.search.spi
- at.procon.dip.processing.spi
- at.procon.dip.migration
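
To illustrate what an extension point in one of the .spi packages might look like, here is a minimal sketch for the extraction area. The actual interface signatures are not shown in this document, so TextExtractor, PlainTextExtractor, and ExtractorRegistry are hypothetical names, and the MIME-type-based lookup is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

/** Hypothetical extension point; the real SPI contract may differ. */
interface TextExtractor {
    /** Whether this extractor can handle the given MIME type. */
    boolean supports(String mimeType);

    /** Derives searchable text from the raw document bytes. */
    String extract(byte[] content);
}

/** Example plug-in that handles plain text only. */
final class PlainTextExtractor implements TextExtractor {
    @Override
    public boolean supports(String mimeType) {
        return "text/plain".equals(mimeType);
    }

    @Override
    public String extract(byte[] content) {
        return new String(content, StandardCharsets.UTF_8);
    }
}

/** Picks the first registered extractor that supports a MIME type. */
final class ExtractorRegistry {
    private final List<TextExtractor> extractors;

    ExtractorRegistry(List<TextExtractor> extractors) {
        this.extractors = List.copyOf(extractors);
    }

    Optional<TextExtractor> find(String mimeType) {
        return extractors.stream().filter(e -> e.supports(mimeType)).findFirst();
    }
}
```

The point of the design is that a new document type is supported by registering another implementation, without touching the TED-specific code.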
Ownership and visibility decision
A tenant represents the owner of a document, but ownership is optional.
A public TED notice therefore does not need a fake tenant. Instead, the canonical model uses:
- optional ownerTenant
- mandatory DocumentVisibility
Examples:
- TED notice: ownerTenant = null, visibility = PUBLIC
- customer-private document: ownerTenant = tenantA, visibility = TENANT
- explicitly shared document: ownerTenant = tenantA, visibility = SHARED
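
The three examples above could be modeled roughly as follows. This is a sketch under assumptions: DocumentAccessContext and DocumentVisibility are names from this document, but their exact shape is not shown here, so the fields, the TenantId record, and the factory methods are hypothetical. The enum may also contain more values than the three that appear in the examples.

```java
import java.util.Objects;
import java.util.Optional;

/** Only the values named in the examples; the real enum may have more. */
enum DocumentVisibility { PUBLIC, TENANT, SHARED }

/** Hypothetical tenant identifier value object. */
record TenantId(String value) {
    TenantId {
        Objects.requireNonNull(value, "value");
    }
}

/** Sketch: the owner is optional, the visibility is mandatory. */
record DocumentAccessContext(Optional<TenantId> ownerTenant, DocumentVisibility visibility) {
    DocumentAccessContext {
        Objects.requireNonNull(ownerTenant, "ownerTenant (use Optional.empty() for no owner)");
        Objects.requireNonNull(visibility, "visibility");
    }

    /** A public document such as a TED notice: no owner, no fake tenant. */
    static DocumentAccessContext publicDocument() {
        return new DocumentAccessContext(Optional.empty(), DocumentVisibility.PUBLIC);
    }

    /** A tenant-owned document with an explicit visibility. */
    static DocumentAccessContext ownedBy(TenantId owner, DocumentVisibility visibility) {
        return new DocumentAccessContext(Optional.of(owner), visibility);
    }
}
```

For instance, DocumentAccessContext.publicDocument() corresponds to the TED notice case, while DocumentAccessContext.ownedBy(new TenantId("tenantA"), DocumentVisibility.SHARED) corresponds to the explicitly shared document.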
Phase 1 now realizes this persistence direction through the additive DOC schema. The resulting backbone uses:
- DOC.doc_document.owner_tenant_id nullable
- DOC.doc_document.visibility not null
The complete Phase 1 persistence details are documented in docs/architecture/PHASE1_GENERIC_PERSISTENCE_MODEL.md.
Non-goals of Phase 0
- No database schema migration yet
- No runtime behavior changes in TED processing
- No replacement of ProcurementDocument yet
- No semantic search refactoring yet
Result
The codebase now has a stable generalized namespace and contract surface for future phases without requiring a disruptive rewrite.