# Runtime split Patch B
Patch B builds on Patch A and makes the NEW runtime actually process embedding jobs.
## What changes
### 1. New embedding job scheduler
Adds:
- `EmbeddingJobScheduler`
- `EmbeddingJobSchedulingConfiguration`

Behavior:
- enabled only in `NEW` runtime mode
- active only when `dip.embedding.jobs.enabled=true`
- periodically calls:
  - `RepresentationEmbeddingOrchestrator.processNextReadyBatch()`
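
The gating and polling behavior above can be sketched in plain Java. This is a minimal illustration, not the patch's implementation: the real `EmbeddingJobScheduler` presumably relies on Spring scheduling and `@ConditionalOnRuntimeMode`, whereas this sketch uses a `ScheduledExecutorService`. `BatchProcessor`, `SketchRuntimeMode` (including its `LEGACY` value), and `EmbeddingJobSchedulerSketch` are hypothetical names; only `processNextReadyBatch()` comes from the notes.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the orchestrator; only the method name is from the patch notes.
interface BatchProcessor {
    void processNextReadyBatch();
}

// Hypothetical stand-in for RuntimeMode from Patch A; LEGACY is an assumed second value.
enum SketchRuntimeMode { LEGACY, NEW }

final class EmbeddingJobSchedulerSketch implements AutoCloseable {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final boolean active;

    EmbeddingJobSchedulerSketch(SketchRuntimeMode mode, boolean jobsEnabled,
                                long delayMs, BatchProcessor processor) {
        this.active = shouldRun(mode, jobsEnabled);
        if (active) {
            // Fixed delay: the next run starts delayMs after the previous one finishes.
            executor.scheduleWithFixedDelay(() -> {
                try {
                    processor.processNextReadyBatch();
                } catch (RuntimeException e) {
                    // Catch here so one failed batch does not cancel later runs.
                    System.err.println("batch failed: " + e.getMessage());
                }
            }, delayMs, delayMs, TimeUnit.MILLISECONDS);
        }
    }

    // Both conditions from the notes must hold: NEW mode and jobs enabled.
    static boolean shouldRun(SketchRuntimeMode mode, boolean jobsEnabled) {
        return mode == SketchRuntimeMode.NEW && jobsEnabled;
    }

    boolean isActive() {
        return active;
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A fixed delay (rather than a fixed rate) is chosen here so that a slow batch cannot overlap with the next one; whether the patch does the same is not stated.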
### 2. Generic import hands off to the new embedding job path
`GenericDocumentImportService` is updated so that in `NEW` mode it:
- resolves `dip.embedding.default-document-model`
- ensures the model is registered in `DOC.doc_embedding_model`
- creates embedding jobs through:
  - `RepresentationEmbeddingOrchestrator.enqueueRepresentation(...)`

It no longer creates legacy-style pending embeddings as the primary handoff for the NEW runtime path.
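
The three handoff steps can be sketched as follows. This is an in-memory illustration under stated assumptions: `ModelRegistrySketch` stands in for the `DOC.doc_embedding_model` table, `OrchestratorSketch` for the orchestrator, and the `(representationId, modelKey)` parameters of `enqueueRepresentation` are guesses, since the notes elide the real signature.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for the DOC.doc_embedding_model table.
final class ModelRegistrySketch {
    private final Set<String> registered = new LinkedHashSet<>();

    // Idempotent: registering an already known model changes nothing.
    void ensureRegistered(String modelKey) {
        registered.add(modelKey);
    }

    boolean isRegistered(String modelKey) {
        return registered.contains(modelKey);
    }
}

// Hypothetical stand-in for the orchestrator; the parameter list is assumed.
final class OrchestratorSketch {
    final List<String> queued = new ArrayList<>();

    void enqueueRepresentation(long representationId, String modelKey) {
        queued.add(representationId + ":" + modelKey);
    }
}

final class ImportHandoffSketch {
    private final String defaultDocumentModel; // value of dip.embedding.default-document-model
    private final ModelRegistrySketch registry;
    private final OrchestratorSketch orchestrator;

    ImportHandoffSketch(String defaultDocumentModel, ModelRegistrySketch registry,
                        OrchestratorSketch orchestrator) {
        this.defaultDocumentModel = defaultDocumentModel;
        this.registry = registry;
        this.orchestrator = orchestrator;
    }

    void handOff(List<Long> representationIds) {
        // Step 1: resolve the configured default model, failing fast if it is missing.
        if (defaultDocumentModel == null || defaultDocumentModel.isBlank()) {
            throw new IllegalStateException("dip.embedding.default-document-model is not set");
        }
        // Step 2: make sure the model is registered before any job refers to it.
        registry.ensureRegistered(defaultDocumentModel);
        // Step 3: create one embedding job per representation.
        for (long id : representationIds) {
            orchestrator.enqueueRepresentation(id, defaultDocumentModel);
        }
    }
}
```

Registering the model before enqueueing keeps every queued job pointing at a known model; failing fast on a missing property is a design choice of this sketch, not something the notes specify.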
## Notes
- This patch assumes Patch A has already introduced:
  - `RuntimeMode`
  - `RuntimeModeProperties`
  - `@ConditionalOnRuntimeMode`
- This patch does not yet remove the legacy vectorization runtime.
  That remains the job of subsequent cutover steps.
## Expected runtime behavior in NEW mode
- `GenericDocumentImportService` persists new generic representations
- selected representations are queued into `DOC.doc_embedding_job`
- the scheduler picks up pending jobs via `processNextReadyBatch()`
- vectors are persisted through the new embedding subsystem
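
The queue-then-process flow can be pictured with a small in-memory model. Everything here is an assumption made for illustration: the real jobs live in `DOC.doc_embedding_job`, and the notes do not state a batch size, an ordering, or a return value for `processNextReadyBatch()`. `EmbeddingJobQueueSketch` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory model of the job table; FIFO order and batch size are assumed.
final class EmbeddingJobQueueSketch {
    private final Deque<Long> pending = new ArrayDeque<>();
    private final List<Long> embedded = new ArrayList<>();
    private final int batchSize;

    EmbeddingJobQueueSketch(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // Import side: queue a representation for embedding.
    void enqueue(long representationId) {
        pending.addLast(representationId);
    }

    // Scheduler side: take up to batchSize pending jobs, oldest first,
    // and record them as embedded. Returns how many jobs were processed.
    int processNextReadyBatch() {
        int processed = 0;
        while (processed < batchSize && !pending.isEmpty()) {
            embedded.add(pending.removeFirst());
            processed++;
        }
        return processed;
    }

    int pendingCount() {
        return pending.size();
    }

    List<Long> embedded() {
        return List.copyOf(embedded);
    }
}
```

With a batch size of 2 and three queued representations, two scheduler ticks drain the queue and a third tick finds nothing to do.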
## New config
Example:
```yaml
dip:
  runtime:
    mode: NEW
  embedding:
    enabled: true
    jobs:
      enabled: true
      scheduler-delay-ms: 5000
```