# Parallel embedding subsystem plan (NV1–NV3)

This plan assumes the old vectorization subsystem remains in place temporarily while a new generic embedding subsystem is built in parallel.

## Principles

- Build the new subsystem under `at.procon.dip.embedding.*`.
- Do not shape it around old TED-specific or legacy vectorization services.
- Operate on `DocumentTextRepresentation` and `DocumentEmbedding` as the core abstractions.
- Keep the new subsystem configurable and provider-based.
- Migrate and cut over later.

## NV1 — provider/model/query foundation

### Goal
Create a standalone embedding foundation that can:
- resolve configured providers
- resolve configured models
- embed arbitrary text lists
- embed search queries
- support deterministic testing

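
The "deterministic testing" goal above could be met by a mock provider that derives vectors from a hash of the input. The following is a minimal sketch only: the `embed(modelKey, texts)` signature, the constructor argument, and the hashing scheme are assumptions for illustration, since this plan names `EmbeddingProvider` and `MockEmbeddingProvider` but does not define their contracts. It assumes a recent Java version.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal provider contract; the real signature is not specified in this plan. */
interface EmbeddingProvider {
    List<float[]> embed(String modelKey, List<String> texts);
}

/** Deterministic provider for tests: the same (model, text) pair always yields the same vector. */
final class MockEmbeddingProvider implements EmbeddingProvider {
    private final int dimensions;

    MockEmbeddingProvider(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.dimensions = dimensions;
    }

    @Override
    public List<float[]> embed(String modelKey, List<String> texts) {
        List<float[]> vectors = new ArrayList<>(texts.size());
        for (String text : texts) {
            vectors.add(hashToVector(modelKey + "\n" + text));
        }
        return vectors;
    }

    private float[] hashToVector(String input) {
        byte[] digest = sha256(input);
        float[] vector = new float[dimensions];
        for (int i = 0; i < dimensions; i++) {
            // Map each signed digest byte to roughly [-1, 1), cycling through the 32 bytes.
            vector[i] = digest[i % digest.length] / 128.0f;
        }
        return vector;
    }

    private static byte[] sha256(String input) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }
}
```

Because the vectors depend only on the model key and the text, tests can assert on exact values without network access. The vectors carry no semantic meaning, so this kind of mock is suitable for wiring and persistence tests, not for relevance tests.
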
### Deliverables
- `EmbeddingProperties`
- `EmbeddingUseCase`
- `EmbeddingRequest`
- `EmbeddingProviderResult`
- `EmbeddingModelDescriptor`
- `ResolvedEmbeddingProviderConfig`
- `EmbeddingProvider`
- `ExternalHttpEmbeddingProvider`
- `MockEmbeddingProvider`
- `EmbeddingProviderRegistry`
- `EmbeddingModelRegistry`
- `EmbeddingProviderConfigResolver`
- `EmbeddingExecutionService`
- `QueryEmbeddingService`
- startup validation of provider/model wiring

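
The last deliverable, startup validation of provider/model wiring, might look roughly like the sketch below. The fields of `EmbeddingModelDescriptor` and the `EmbeddingWiringValidator` class are hypothetical; the plan lists the descriptor by name only and does not say where validation lives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical shape: the plan names this type but does not define its fields. */
record EmbeddingModelDescriptor(String modelKey, String providerKey, int dimensions) {}

/** Hypothetical helper that checks every configured model against the registered providers. */
final class EmbeddingWiringValidator {

    private EmbeddingWiringValidator() {}

    /** Returns human-readable problems; an empty list means the wiring is consistent. */
    static List<String> validate(Set<String> providerKeys,
                                 Map<String, EmbeddingModelDescriptor> models) {
        List<String> problems = new ArrayList<>();
        // TreeMap gives a stable iteration order, so messages are reproducible.
        for (Map.Entry<String, EmbeddingModelDescriptor> entry : new TreeMap<>(models).entrySet()) {
            String key = entry.getKey();
            EmbeddingModelDescriptor model = entry.getValue();
            if (!key.equals(model.modelKey())) {
                problems.add("model '" + key + "' is registered under a different key than '"
                        + model.modelKey() + "'");
            }
            if (!providerKeys.contains(model.providerKey())) {
                problems.add("model '" + key + "' references unknown provider '"
                        + model.providerKey() + "'");
            }
            if (model.dimensions() <= 0) {
                problems.add("model '" + key + "' has non-positive dimensions");
            }
        }
        return problems;
    }

    /** Fails fast at startup if any problem was found. */
    static void validateOrFail(Set<String> providerKeys,
                               Map<String, EmbeddingModelDescriptor> models) {
        List<String> problems = validate(providerKeys, models);
        if (!problems.isEmpty()) {
            throw new IllegalStateException("Invalid embedding configuration: " + problems);
        }
    }
}
```

Collecting all problems before failing reports every misconfiguration in one startup attempt rather than one per restart.
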
### Notes
- No cutover from the old vectorization path yet.
- No persistence/job orchestration yet.
- The new subsystem should be safe to include in the app while disabled by default.

## NV2 — persistence and job orchestration

### Goal
Make the new subsystem able to create and process embedding jobs against `DocumentTextRepresentation`.

### Deliverables
- `EmbeddingJob` entity/repository/service
- retry / backoff policy
- default `EmbeddingSelectionPolicy`
- representation-level embedding execution
- `DocumentEmbedding` persistence updates through the new subsystem

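
The plan does not say which retry / backoff strategy the jobs should use. As one possible reading, the sketch below uses capped exponential backoff; the `EmbeddingRetryPolicy` name, its fields, and the doubling factor are all assumptions for illustration.

```java
import java.time.Duration;

/** Hypothetical policy: exponential backoff with a cap and a maximum attempt count. */
record EmbeddingRetryPolicy(int maxAttempts, Duration initialDelay, Duration maxDelay) {

    EmbeddingRetryPolicy {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        if (initialDelay.isNegative() || initialDelay.isZero()) {
            throw new IllegalArgumentException("initialDelay must be positive");
        }
        if (maxDelay.compareTo(initialDelay) < 0) {
            throw new IllegalArgumentException("maxDelay must not be shorter than initialDelay");
        }
    }

    /** True if a job that has already failed {@code failedAttempts} times may run again. */
    boolean shouldRetry(int failedAttempts) {
        return failedAttempts < maxAttempts;
    }

    /** Delay before the next attempt: initialDelay * 2^(failedAttempts - 1), capped at maxDelay. */
    Duration delayAfter(int failedAttempts) {
        if (failedAttempts < 1) {
            throw new IllegalArgumentException("failedAttempts must be at least 1");
        }
        Duration delay = initialDelay;
        // Stop doubling once the cap is reached, which also avoids overflow for large counts.
        for (int i = 1; i < failedAttempts && delay.compareTo(maxDelay) < 0; i++) {
            delay = delay.multipliedBy(2);
        }
        return delay.compareTo(maxDelay) > 0 ? maxDelay : delay;
    }
}
```

A job service could store the failed-attempt count on the `EmbeddingJob` and use `delayAfter` to compute the next eligible run time. Adding random jitter is a common refinement when many jobs fail at once against the same provider.
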
## NV3 — generic semantic search engine

### Goal
Add semantic search into the generic search platform using only the new subsystem.

### Deliverables
- `PgVectorSemanticSearchEngine`
- `DocumentSemanticSearchRepository`
- query embedding through `QueryEmbeddingService`
- chunk-aware retrieval and collapse
- fusion with lexical search

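
The plan does not specify how chunk hits are collapsed or how semantic and lexical results are fused. The sketch below shows one plausible combination: collapse chunks to their document by best chunk score, then merge the two rankings with reciprocal rank fusion (RRF), a common fusion technique that is swapped in here as an example rather than taken from the plan. `ChunkHit`, `SearchFusion`, and the constant `k` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical hit shape: one scored chunk belonging to a document. */
record ChunkHit(String documentId, int chunkIndex, double score) {}

final class SearchFusion {

    private SearchFusion() {}

    /** Collapses chunk hits to one entry per document (best chunk score), best first. */
    static List<String> collapseToDocuments(List<ChunkHit> hits) {
        Map<String, Double> best = new HashMap<>();
        for (ChunkHit hit : hits) {
            best.merge(hit.documentId(), hit.score(), Math::max);
        }
        return rankByScore(best);
    }

    /** Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank), rank starting at 1. */
    static List<String> reciprocalRankFusion(List<List<String>> rankedLists, int k) {
        Map<String, Double> fused = new HashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                fused.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return rankByScore(fused);
    }

    private static List<String> rankByScore(Map<String, Double> scores) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        // Sort by score descending, then by id so ties are resolved deterministically.
        entries.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        List<String> ids = new ArrayList<>(entries.size());
        for (Map.Entry<String, Double> entry : entries) {
            ids.add(entry.getKey());
        }
        return ids;
    }
}
```

RRF works on ranks only, so it does not require vector similarity scores and lexical scores to be on a comparable scale. If score-based weighting is preferred instead, the two score ranges would need to be normalized first.
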
## Migration philosophy

Because the app is still in development, prefer:

1. migrate documents and text representations first
2. re-embed through the new subsystem
3. only preserve old raw vector data if there is a strong operational reason

## Recommended implementation order

1. NV1 foundation
2. NV1 tests with mock provider
3. NV2 jobs and selection policy
4. NV3 semantic search
5. migration/backfill
6. cutover