Parallel embedding subsystem plan (NV1–NV3)

This plan assumes the old vectorization subsystem remains in place temporarily, while a new generic embedding subsystem is built in parallel.

Principles

Build the new subsystem under at.procon.dip.embedding.*.
Do not shape it around old TED-specific or legacy vectorization services.
Operate on DocumentTextRepresentation and DocumentEmbedding as the core abstractions.
Keep the new subsystem configurable and provider-based.
Migrate and cut over later.

NV1 — provider/model/query foundation

Goal

Create a standalone embedding foundation that can:

resolve configured providers
resolve configured models
embed arbitrary text lists
embed search queries
support deterministic testing

Deliverables

EmbeddingProperties
EmbeddingUseCase
EmbeddingRequest
EmbeddingProviderResult
EmbeddingModelDescriptor
ResolvedEmbeddingProviderConfig
EmbeddingProvider
ExternalHttpEmbeddingProvider
MockEmbeddingProvider
EmbeddingProviderRegistry
EmbeddingModelRegistry
EmbeddingProviderConfigResolver
EmbeddingExecutionService
QueryEmbeddingService
startup validation of provider/model wiring
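
To illustrate the "support deterministic testing" goal, here is a minimal sketch of what a provider contract and a mock provider could look like. The plan only names the types, so the method signature `embed(List<String>)`, the `float[]` vector type, and the hash-based vector generation are assumptions made for illustration, not the project's actual design.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical provider contract: the plan names EmbeddingProvider but not its methods.
interface EmbeddingProvider {
    List<float[]> embed(List<String> texts);
}

// Assumed mock behaviour: derive each vector from a SHA-256 digest of the text,
// so the same input always produces the same output without any network call.
final class MockEmbeddingProvider implements EmbeddingProvider {
    private final int dimensions;

    MockEmbeddingProvider(int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        this.dimensions = dimensions;
    }

    @Override
    public List<float[]> embed(List<String> texts) {
        List<float[]> vectors = new ArrayList<>(texts.size());
        for (String text : texts) {
            vectors.add(embedOne(text));
        }
        return vectors;
    }

    private float[] embedOne(String text) {
        byte[] digest = sha256(text);
        float[] vector = new float[dimensions];
        for (int i = 0; i < dimensions; i++) {
            // Cycle through the digest bytes and map each one to the range [-1, 1].
            int unsigned = digest[i % digest.length] & 0xFF;
            vector[i] = (unsigned / 127.5f) - 1.0f;
        }
        return vector;
    }

    private static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

Vectors produced this way carry no semantic meaning. They are only useful for checking wiring, dimensions, and repeatability in NV1 tests.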

Notes

No cutover from the old vectorization path yet.
No persistence/job orchestration yet.
The new subsystem should be safe to include in the app while disabled by default.

NV2 — persistence and job orchestration

Goal

Enable the new subsystem to create and process embedding jobs against DocumentTextRepresentation.

Deliverables

EmbeddingJob entity/repository/service
retry / backoff policy
default EmbeddingSelectionPolicy
representation-level embedding execution
DocumentEmbedding persistence updates through the new subsystem
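
The retry / backoff deliverable could be sketched as a small value object that decides whether a failed job is retried and how long to wait first. The plan does not specify a strategy, so capped exponential backoff, the class name `EmbeddingRetryPolicy`, and its parameters are assumptions chosen for illustration.

```java
import java.time.Duration;

// Hypothetical policy: capped exponential backoff (base, 2x base, 4x base, ... up to maxDelay).
final class EmbeddingRetryPolicy {
    private final int maxAttempts;
    private final Duration baseDelay;
    private final Duration maxDelay;

    EmbeddingRetryPolicy(int maxAttempts, Duration baseDelay, Duration maxDelay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        if (baseDelay.isNegative() || maxDelay.isNegative()) {
            throw new IllegalArgumentException("delays must not be negative");
        }
        this.maxAttempts = maxAttempts;
        this.baseDelay = baseDelay;
        this.maxDelay = maxDelay;
    }

    // A job may be retried while it has failed fewer times than maxAttempts.
    boolean shouldRetry(int failedAttempts) {
        return failedAttempts < maxAttempts;
    }

    // Delay to wait after the given number of failures (1-based).
    Duration delayAfterFailure(int failedAttempts) {
        if (failedAttempts < 1) {
            throw new IllegalArgumentException("failedAttempts must be at least 1");
        }
        // Bound the exponent so the shift below cannot overflow a long.
        int exponent = Math.min(failedAttempts - 1, 30);
        long factor = 1L << exponent;
        long baseMillis = baseDelay.toMillis();
        long maxMillis = maxDelay.toMillis();
        // Compare by division to avoid overflow in baseMillis * factor.
        if (baseMillis > maxMillis / factor) {
            return maxDelay;
        }
        return Duration.ofMillis(baseMillis * factor);
    }
}
```

With a base delay of 1 second and a cap of 60 seconds, the delays grow as 1 s, 2 s, 4 s, and so on until they reach the cap. A job service could store the resulting next-attempt time on the EmbeddingJob row, although that field is not named in the plan.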

NV3 — generic semantic search engine

Goal

Add semantic search to the generic search platform using only the new subsystem.

Deliverables

PgVectorSemanticSearchEngine
DocumentSemanticSearchRepository
query embedding through QueryEmbeddingService
chunk-aware retrieval and collapse
fusion with lexical search
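
The last two deliverables can be sketched in isolation from the database. The plan does not say how chunks are collapsed or how results are fused, so the example below assumes "best chunk wins" for the collapse and uses reciprocal rank fusion (RRF) as a stand-in fusion technique. The names `SemanticFusionSketch` and `ChunkHit` and the constant `k = 60` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SemanticFusionSketch {

    // Hypothetical shape of one chunk-level hit returned by a vector query.
    static final class ChunkHit {
        final String documentId;
        final int chunkIndex;
        final double similarity;

        ChunkHit(String documentId, int chunkIndex, double similarity) {
            this.documentId = documentId;
            this.chunkIndex = chunkIndex;
            this.similarity = similarity;
        }
    }

    // Collapse chunk hits to one entry per document, keeping the best chunk's similarity.
    static List<String> collapseToDocuments(List<ChunkHit> hits) {
        Map<String, Double> best = new HashMap<>();
        for (ChunkHit hit : hits) {
            best.merge(hit.documentId, hit.similarity, Math::max);
        }
        return sortByScoreDescending(best);
    }

    // Reciprocal rank fusion: each ranking contributes 1 / (k + rank), with rank starting at 1.
    static List<String> reciprocalRankFusion(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        return sortByScoreDescending(scores);
    }

    // Highest score first; ties are broken by id so the order is stable across runs.
    private static List<String> sortByScoreDescending(Map<String, Double> scores) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return ids;
    }
}
```

A caller would first collapse the semantic chunk hits into a document ranking, then pass that ranking together with the lexical ranking to `reciprocalRankFusion`. RRF only needs ranks, which avoids having to normalize vector similarities against lexical scores. A weighted score blend would be an equally plausible choice.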

Migration philosophy

Because the app is still in development, prefer:

migrate documents and text representations first
re-embed through the new subsystem
only preserve old raw vector data if there is a strong operational reason

Recommended implementation order

NV1 foundation
NV1 tests with mock provider
NV2 jobs and selection policy
NV3 semantic search
migration/backfill
cutover