# Parallel embedding subsystem plan (NV1–NV3)
This plan assumes the old vectorization subsystem remains in place temporarily, while a new generic embedding subsystem is built in parallel.
## Principles
- Build the new subsystem under `at.procon.dip.embedding.*`.
- Do not shape it around old TED-specific or legacy vectorization services.
- Operate on `DocumentTextRepresentation` and `DocumentEmbedding` as the core abstractions.
- Keep the new subsystem configurable and provider-based.
- Migrate and cut over later.
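As a sketch of the intended shape, the provider abstraction might look like the following. The names mirror the NV1 deliverables, but every signature here is an assumption, not the final API:

```java
// Hypothetical sketch of the core abstractions under at.procon.dip.embedding.
// Record components and method signatures are illustrative assumptions.
import java.util.List;

record EmbeddingRequest(String modelId, List<String> texts) {}

record EmbeddingProviderResult(String modelId, List<float[]> vectors) {}

interface EmbeddingProvider {
    // Stable identifier used by the registry/config resolver.
    String id();

    // Embed a batch of arbitrary texts with the requested model.
    EmbeddingProviderResult embed(EmbeddingRequest request);
}
```

Keeping the request/result types provider-agnostic is what lets an external HTTP provider and a mock provider plug into the same registry.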
## NV1 — provider/model/query foundation

### Goal
Create a standalone embedding foundation that can:
- resolve configured providers
- resolve configured models
- embed arbitrary text lists
- embed search queries
- support deterministic testing
### Deliverables
- `EmbeddingProperties`
- `EmbeddingUseCase`
- `EmbeddingRequest`
- `EmbeddingProviderResult`
- `EmbeddingModelDescriptor`
- `ResolvedEmbeddingProviderConfig`
- `EmbeddingProvider`
- `ExternalHttpEmbeddingProvider`
- `MockEmbeddingProvider`
- `EmbeddingProviderRegistry`
- `EmbeddingModelRegistry`
- `EmbeddingProviderConfigResolver`
- `EmbeddingExecutionService`
- `QueryEmbeddingService`
- startup validation of provider/model wiring
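For the deterministic-testing requirement, a `MockEmbeddingProvider` could derive each vector purely from the input text, for example by seeding a simple pseudo-random sequence with the text's hash. This is an illustrative sketch, not the planned implementation:

```java
// Illustrative MockEmbeddingProvider: the same text always yields the same
// vector, so tests are fully deterministic. Dimensions are configurable.
import java.util.List;
import java.util.stream.Collectors;

class MockEmbeddingProvider {
    private final int dimensions;

    MockEmbeddingProvider(int dimensions) { this.dimensions = dimensions; }

    List<float[]> embed(List<String> texts) {
        return texts.stream().map(this::embedOne).collect(Collectors.toList());
    }

    private float[] embedOne(String text) {
        float[] v = new float[dimensions];
        int seed = text.hashCode();
        for (int i = 0; i < dimensions; i++) {
            // Simple LCG step: deterministic per input text, no I/O needed.
            seed = seed * 1103515245 + 12345;
            v[i] = (seed % 1000) / 1000.0f;
        }
        return v;
    }
}
```

A deterministic mock also makes the NV1 startup-validation and registry tests independent of any external embedding service.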
### Notes
- No cutover to the old vectorization path yet.
- No persistence/job orchestration yet.
- New subsystem should be safe to include in the app while disabled by default.
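One way to keep the subsystem disabled by default is a feature flag in the application configuration. Every key below is purely illustrative:

```yaml
# Hypothetical configuration shape; all property names are assumptions.
dip:
  embedding:
    enabled: false          # new subsystem ships switched off by default
    default-model: mock-small
    providers:
      mock:
        type: mock
        dimensions: 384
```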
## NV2 — persistence and job orchestration

### Goal
Make the new subsystem able to create and process embedding jobs against `DocumentTextRepresentation`.
### Deliverables
- `EmbeddingJob` entity/repository/service
- retry/backoff policy
- default `EmbeddingSelectionPolicy`
- representation-level embedding execution
- `DocumentEmbedding` persistence updates through the new subsystem
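The retry/backoff policy could be as simple as capped exponential backoff per job; the attempt cap and base delay below are placeholder values, not part of the plan:

```java
// Illustrative retry policy for embedding jobs: delay doubles on each failed
// attempt (base, 2*base, 4*base, ...) up to a fixed attempt limit.
import java.time.Duration;

class RetryPolicy {
    private final int maxAttempts;
    private final Duration baseDelay;

    RetryPolicy(int maxAttempts, Duration baseDelay) {
        this.maxAttempts = maxAttempts;
        this.baseDelay = baseDelay;
    }

    boolean canRetry(int attempt) { return attempt < maxAttempts; }

    Duration nextDelay(int attempt) {
        // Cap the shift to avoid overflow on pathological attempt counts.
        return baseDelay.multipliedBy(1L << Math.min(attempt, 20));
    }
}
```

Storing the attempt count and next-run timestamp on the `EmbeddingJob` entity would let a scheduler pick up due jobs without extra state.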
## NV3 — generic semantic search engine

### Goal
Add semantic search into the generic search platform using only the new subsystem.
### Deliverables
- `PgVectorSemanticSearchEngine`
- `DocumentSemanticSearchRepository`
- query embedding through `QueryEmbeddingService`
- chunk-aware retrieval and collapse
- fusion with lexical search
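For fusing semantic and lexical result lists, reciprocal rank fusion (RRF) is one common option; whether the plan intends RRF specifically is an assumption. A minimal sketch:

```java
// Reciprocal rank fusion: score(d) = sum over result lists of 1/(k + rank(d)),
// with rank starting at 1. Documents ranked well in several lists rise to the top.
import java.util.*;

class RankFusion {
    static List<String> fuse(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        fused.sort(Comparator.comparingDouble((String d) -> scores.get(d)).reversed());
        return fused;
    }
}
```

RRF needs only ranks, not comparable scores, which sidesteps calibrating pgvector distances against lexical relevance scores.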
## Migration philosophy
Because the app is still in development, prefer:
- migrate documents and text representations first
- re-embed through the new subsystem
- only preserve old raw vector data if there is a strong operational reason
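The re-embedding step might look like the following; every type and name here is a placeholder standing in for the real entities and services:

```java
// Hypothetical backfill sketch: re-embed each text representation through the
// new subsystem rather than copying old raw vectors across.
import java.util.List;

record TextRep(long id, String text) {}

record Embedding(long textRepId, float[] vector) {}

interface Embedder { float[] embed(String text); }

class Backfill {
    static List<Embedding> reembed(List<TextRep> reps, Embedder embedder) {
        return reps.stream()
                .map(r -> new Embedding(r.id(), embedder.embed(r.text())))
                .toList();
    }
}
```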
## Recommended implementation order

1. NV1 foundation
2. NV1 tests with mock provider
3. NV2 jobs and selection policy
4. NV3 semantic search
5. migration/backfill
6. cutover