# Native XML Column with XPath Queries

## ✅ Implemented

The `xml_document` column is now a native PostgreSQL `xml` type, so it can be queried directly with PostgreSQL's XPath functions (XPath 1.0).

## Hibernate Configuration

```java
// Hibernate 6+: JdbcTypeCode is in org.hibernate.annotations, SqlTypes in org.hibernate.type
@Column(name = "xml_document", nullable = false)
@JdbcTypeCode(SqlTypes.SQLXML)
private String xmlDocument;
```

## XPath Query Examples

### 1. **Simple XPath query (without namespaces)**

```sql
-- Extract all document IDs (matches only ID elements that are in no namespace)
SELECT
    id,
    xpath('//ID/text()', xml_document) as document_ids
FROM ted.procurement_document;
```

### 2. **XPath with namespaces (eForms UBL)**

eForms documents use XML namespaces. Every prefix used in an XPath expression must be mapped to its namespace URI in the third argument of `xpath()`:

```sql
-- Extract titles (with namespace mapping)
SELECT
    id,
    xpath(
        '//cbc:Title/text()',
        xml_document,
        ARRAY[
            ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'],
            ARRAY['cac', 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2']
        ]
    ) as titles
FROM ted.procurement_document;
```
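To try out a namespaced expression locally before running it against the database, the same XPath can be evaluated with the JDK's `javax.xml.xpath` API. The following is only a sketch: the class name `XPathNamespaceCheck` and the shortened sample document are assumptions for illustration, not part of the project. It shows that the prefix in the query (`cbc`) only has to map to the right URI, regardless of the prefix used in the document, and that an unprefixed name does not match a namespaced element.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class XPathNamespaceCheck {

    static final String CBC =
        "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2";

    // Hypothetical, heavily shortened sample; real eForms notices are much larger.
    // The document uses the prefix "b", the query below uses "cbc".
    static final String SAMPLE =
        "<Notice xmlns:b=\"" + CBC + "\"><b:Title>Road works</b:Title></Notice>";

    /** Evaluates an XPath 1.0 expression with the prefix "cbc" bound to the CBC namespace. */
    static String evaluate(String expression, String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed XPath names to match
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "cbc".equals(prefix) ? CBC : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) { return null; }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });
            // Returns the string value of the first matching node, or "" if nothing matches.
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("//cbc:Title/text()", SAMPLE));        // Road works
        System.out.println(evaluate("//Title/text()", SAMPLE).isEmpty());  // true
    }
}
```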

### 3. **Extract the buyer name**

```sql
SELECT
    id,
    publication_id,
    xpath(
        '//cac:ContractingParty/cac:Party/cac:PartyName/cbc:Name/text()',
        xml_document,
        ARRAY[
            ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'],
            ARRAY['cac', 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2']
        ]
    ) as buyer_names
FROM ted.procurement_document;
```

### 4. **Extract CPV codes**

```sql
SELECT
    id,
    xpath(
        '//cac:ProcurementProject/cac:MainCommodityClassification/cbc:ItemClassificationCode/text()',
        xml_document,
        ARRAY[
            ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'],
            ARRAY['cac', 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2']
        ]
    ) as cpv_codes
FROM ted.procurement_document;
```

### 5. **Filter by XML content**

```sql
-- Find all documents that contain a specific CPV code
SELECT
    id,
    publication_id,
    buyer_name
FROM ted.procurement_document
WHERE xpath_exists(
    '//cac:ProcurementProject/cac:MainCommodityClassification/cbc:ItemClassificationCode[text()="45000000"]',
    xml_document,
    ARRAY[
        ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'],
        ARRAY['cac', 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2']
    ]
);
```

### 6. **Extract the estimated value**

```sql
SELECT
    id,
    xpath(
        '//cac:ProcurementProject/cac:RequestedTenderTotal/cbc:EstimatedOverallContractAmount/text()',
        xml_document,
        ARRAY[
            ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'],
            ARRAY['cac', 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2']
        ]
    ) as estimated_values
FROM ted.procurement_document;
```

## JPA/Hibernate Native Queries

You can also use XPath in Spring Data JPA repositories through native queries:

### Repository example

```java
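import java.util.List;
import java.util.UUID;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;

@Repository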
public interface ProcurementDocumentRepository extends JpaRepository<ProcurementDocument, UUID> {

    /**
     * Finds all documents whose title contains the given text.
     * Named parameters are not substituted inside SQL string literals, so the
     * XPath only extracts the titles and the comparison happens in SQL.
     */
    @Query(value = """
        SELECT d.* FROM ted.procurement_document d
        WHERE EXISTS (
            SELECT 1
            FROM unnest(xpath(
                '//cbc:Title/text()',
                d.xml_document,
                ARRAY[
                    ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2']
                ]
            )) AS t(title)
            WHERE strpos(CAST(t.title AS text), :searchText) > 0
        )
        """, nativeQuery = true)
    List<ProcurementDocument> findByTitleContaining(@Param("searchText") String searchText);

    /**
     * Extracts the CPV codes of one document via XPath.
     * Uses CAST(... AS text) rather than ::text, because some Hibernate
     * versions can confuse the colons with a named parameter.
     */
    @Query(value = """
        SELECT
            CAST(unnest(xpath(
                '//cac:ProcurementProject/cac:MainCommodityClassification/cbc:ItemClassificationCode/text()',
                xml_document,
                ARRAY[
                    ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'],
                    ARRAY['cac', 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2']
                ]
            )) AS text) AS cpv_code
        FROM ted.procurement_document
        WHERE id = :documentId
        """, nativeQuery = true)
    List<String> extractCpvCodes(@Param("documentId") UUID documentId);

    /**
     * Finds documents by CPV code.
     * The parameter is compared in SQL because it would not be bound
     * inside the quoted XPath expression.
     */
    @Query(value = """
        SELECT d.* FROM ted.procurement_document d
        WHERE EXISTS (
            SELECT 1
            FROM unnest(xpath(
                '//cbc:ItemClassificationCode/text()',
                d.xml_document,
                ARRAY[
                    ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2']
                ]
            )) AS c(code)
            WHERE CAST(c.code AS text) = :cpvCode
        )
        """, nativeQuery = true)
    List<ProcurementDocument> findByCpvCode(@Param("cpvCode") String cpvCode);
}
```
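Named parameters are not substituted inside a quoted SQL string, so a user-supplied value cannot simply be placed inside the XPath literal of a native query. If the predicate should stay inside the XPath expression, one option is to build the whole expression in Java and bind it as a single parameter. XPath 1.0 has no escape character for quotes, so the value has to be wrapped carefully. The helper below is a minimal sketch under that assumption; `XPathLiterals`, `literal` and `titleContains` are hypothetical names, not existing project code.

```java
final class XPathLiterals {

    private XPathLiterals() {
    }

    /** Renders a Java string as an XPath 1.0 string literal (or a concat() call). */
    static String literal(String value) {
        if (!value.contains("\"")) {
            return "\"" + value + "\"";
        }
        if (!value.contains("'")) {
            return "'" + value + "'";
        }
        // Both quote types occur: split on the double quotes and re-join with concat().
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = value.split("\"", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", '\"', ");
            }
            sb.append('"').append(parts[i]).append('"');
        }
        return sb.append(')').toString();
    }

    /** Builds the complete XPath expression for a title search. */
    static String titleContains(String searchText) {
        return "//cbc:Title[contains(text(), " + literal(searchText) + ")]";
    }

    public static void main(String[] args) {
        System.out.println(titleContains("road"));
        System.out.println(literal("a\"b'c"));
    }
}
```

The resulting string would then be passed as one bound parameter, for example as `xpath_exists(:xpathExpr, xml_document, ...)` in a hypothetical repository method, so that the value never becomes part of the SQL text itself.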

## PostgreSQL XML-Funktionen

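Other useful XML functions:

### `xml_is_well_formed()`
```sql
-- Takes text, not xml: mainly useful for checking raw input before casting it to xml.
-- Values already stored in an xml column always pass this check.
SELECT xml_is_well_formed(xml_document::text) FROM ted.procurement_document;
```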

### `xpath_exists()` - checks whether a path matches
```sql
SELECT xpath_exists('//cbc:Title', xml_document, ...) FROM ted.procurement_document;
```

### `unnest()` - expands an array into rows
```sql
SELECT
    id,
    unnest(xpath('//cbc:Title/text()', xml_document, ...))::text as title
FROM ted.procurement_document;
```

## Common eForms Namespaces

```sql
ARRAY[
    ARRAY['cbc', 'urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'],
    ARRAY['cac', 'urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2'],
    ARRAY['ext', 'urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2'],
    ARRAY['efac', 'http://data.europa.eu/p27/eforms-ubl-extension-aggregate-components/1'],
    ARRAY['efbc', 'http://data.europa.eu/p27/eforms-ubl-extension-basic-components/1']
]
```
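Because the same namespace array is repeated in every query, it can be convenient to generate the SQL fragment once from a map of prefixes. The sketch below assumes the mappings are trusted constants (it is not meant for user input); `EformsNamespaces` and `toSqlArray` are hypothetical names introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class EformsNamespaces {

    /** The prefix-to-URI mappings listed above, in the same order. */
    static final Map<String, String> COMMON = new LinkedHashMap<>();

    static {
        COMMON.put("cbc", "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2");
        COMMON.put("cac", "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2");
        COMMON.put("ext", "urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2");
        COMMON.put("efac", "http://data.europa.eu/p27/eforms-ubl-extension-aggregate-components/1");
        COMMON.put("efbc", "http://data.europa.eu/p27/eforms-ubl-extension-basic-components/1");
    }

    private EformsNamespaces() {
    }

    /** Renders the mappings as a PostgreSQL two-dimensional array literal expression. */
    static String toSqlArray(Map<String, String> namespaces) {
        return namespaces.entrySet().stream()
            .map(e -> "ARRAY['" + quote(e.getKey()) + "', '" + quote(e.getValue()) + "']")
            .collect(Collectors.joining(", ", "ARRAY[", "]"));
    }

    // Doubles single quotes so the value is a valid SQL string literal.
    private static String quote(String s) {
        return s.replace("'", "''");
    }

    public static void main(String[] args) {
        System.out.println(toSqlArray(COMMON));
    }
}
```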

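## Performance Tips

1. **Indexing**: For frequent XPath queries you can create expression indexes. The `xml` type has no comparison operators, so cast the `xpath()` result to `text[]` before indexing it. A query only uses the index if it repeats the same expression, for example with the `@>` operator:
```sql
CREATE INDEX idx_doc_title ON ted.procurement_document
    USING GIN ((xpath('//cbc:Title/text()', xml_document, ...)::text[]));
```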

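2. **Materialized views**: For complex XPath queries, precompute the extracted values and refresh the view with `REFRESH MATERIALIZED VIEW` after new documents are loaded: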
```sql
CREATE MATERIALIZED VIEW ted.document_titles AS
SELECT
    id,
    unnest(xpath('//cbc:Title/text()', xml_document, ...))::text as title
FROM ted.procurement_document;
```

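## Advantages of the Native XML Column

✅ Native XPath queries
✅ Well-formedness is checked on insert (no validation against an XSD schema)
✅ Large documents are compressed transparently (TOAST), as with `text`
✅ PostgreSQL XML functions available
✅ Structured queries on XML elements
✅ Expression indexes possible (on `xpath()` results cast to `text[]`)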

## Hibernate Now Works Correctly

With `@JdbcTypeCode(SqlTypes.SQLXML)`, Hibernate knows that it has to bind the value as `SQLXML` for INSERT/UPDATE.

This prevents the error:
```
ERROR: column "xml_document" is of type xml but expression is of type character varying
```