# Phase 0 – Architecture Foundation

## New project identity

- **Project name:** Procon Document Intelligence Platform
- **Short name:** DIP
- **Base namespace:** `at.procon.dip`
- **Legacy namespace kept during transition:** `at.procon.ted`

## Why this naming

The application is no longer only a TED notice processor. The new name reflects the broader goal: import arbitrary document types, derive canonical searchable text, vectorize it, and run semantic search over those representations.

## Phase 0 decisions implemented in code

1. New Spring Boot entry point under `at.procon.dip`
2. Legacy TED runtime kept through explicit package scanning
3. Generic vocabulary introduced via enums in `at.procon.dip.domain.document`
4. Tenant introduced as a first-class value object in `at.procon.dip.domain.tenant`
5. Ownership and access are explicitly separated through `DocumentAccessContext`
6. Canonical document metadata and ingestion descriptors support both:
   - tenant-owned documents
   - public documents without tenant ownership
7. Extension-point interfaces introduced for ingestion, classification, extraction, normalization, and vectorization
8. Target schema split documented as:
   - `DOC` for generic document model
   - `TED` for TED-specific projections
9. Migration strategy formalized as phased additive migration:
   - additive schema
   - dual write
   - backfill
   - cutover
   - retire legacy

## Planned package areas

- `at.procon.dip.architecture`
- `at.procon.dip.domain.access`
- `at.procon.dip.domain.document`
- `at.procon.dip.domain.tenant`
- `at.procon.dip.ingestion.spi`
- `at.procon.dip.classification.spi`
- `at.procon.dip.extraction.spi`
- `at.procon.dip.normalization.spi`
- `at.procon.dip.vectorization.spi`
- `at.procon.dip.search.spi`
- `at.procon.dip.processing.spi`
- `at.procon.dip.migration`

## Ownership and visibility decision

A tenant represents the owner of a document, but ownership is optional. A public TED notice therefore does not need a fake tenant.
Instead, the canonical model uses:

- optional `ownerTenant`
- mandatory `DocumentVisibility`

Examples:

- TED notice: `ownerTenant = null`, `visibility = PUBLIC`
- customer-private document: `ownerTenant = tenantA`, `visibility = TENANT`
- explicitly shared document: `ownerTenant = tenantA`, `visibility = SHARED`

Phase 1 now realizes this persistence direction through the additive `DOC` schema. The resulting backbone uses:

- `DOC.doc_document.owner_tenant_id` nullable
- `DOC.doc_document.visibility` not null

The complete Phase 1 persistence details are documented in `docs/architecture/PHASE1_GENERIC_PERSISTENCE_MODEL.md`.

## Non-goals of Phase 0

- No database schema migration yet
- No runtime behavior changes in TED processing
- No replacement of `ProcurementDocument` yet
- No semantic search refactoring yet

## Result

The codebase now has a stable generalized namespace and contract surface for future phases without requiring a disruptive rewrite.
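
### Illustrative sketch: optional owner, mandatory visibility

To make the ownership and visibility decision described earlier more concrete, the following is a minimal Java sketch, not the actual DIP code. Only `DocumentVisibility`, `ownerTenant`, and the values `PUBLIC`, `TENANT`, and `SHARED` come from this document. The names `OwnershipSketch`, `TenantRef`, `DocumentDescriptor`, `publicDocument`, and `tenantDocument` are hypothetical, and the rule that non-public documents must have an owner is an assumption made for illustration. The sketch uses records and therefore assumes Java 16 or later.

```java
import java.util.Objects;

// Hypothetical sketch; not the real at.procon.dip domain classes.
final class OwnershipSketch {

    // Visibility values taken from the examples in this document.
    enum DocumentVisibility { PUBLIC, TENANT, SHARED }

    // Hypothetical tenant value object identified by a simple key.
    record TenantRef(String key) {
        TenantRef {
            Objects.requireNonNull(key, "key");
        }
    }

    // ownerTenant may be null (no fake tenant needed); visibility may not.
    record DocumentDescriptor(TenantRef ownerTenant, DocumentVisibility visibility) {
        DocumentDescriptor {
            Objects.requireNonNull(visibility, "visibility");
            // Assumed rule: TENANT and SHARED documents need an owner.
            if (visibility != DocumentVisibility.PUBLIC && ownerTenant == null) {
                throw new IllegalArgumentException(
                        visibility + " documents require an ownerTenant");
            }
        }

        boolean isTenantOwned() {
            return ownerTenant != null;
        }
    }

    // A public document such as a TED notice: no owner, PUBLIC visibility.
    static DocumentDescriptor publicDocument() {
        return new DocumentDescriptor(null, DocumentVisibility.PUBLIC);
    }

    // A customer-private document owned by the given tenant.
    static DocumentDescriptor tenantDocument(TenantRef owner) {
        return new DocumentDescriptor(owner, DocumentVisibility.TENANT);
    }

    public static void main(String[] args) {
        DocumentDescriptor tedNotice = publicDocument();
        DocumentDescriptor privateDoc = tenantDocument(new TenantRef("tenantA"));
        System.out.println(tedNotice.visibility() + " owned=" + tedNotice.isTenantOwned());
        System.out.println(privateDoc.visibility() + " owned=" + privateDoc.isTenantOwned());
    }
}
```

Keeping the owner nullable and the visibility non-null in the value object mirrors the nullable `owner_tenant_id` and not-null `visibility` columns mentioned for `DOC.doc_document`. Whether the real model enforces the owner rule in a constructor, in `DocumentAccessContext`, or at the database level is not stated here.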