Using the pgvector vector extension in JPA + Hibernate 6
Asked Answered
V

1

1

What would be the best way to support the pgvector vector extension in a PostgreSQL database using Spring-Boot 3 with Hibernate 6, so I can create vector embeddings using a JPA entity?

CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(5));

While I did found solutions for previous Hibernate versions, they don't work for Hibernate 6, as it requires jakarta.persistence.* packages instead of javax.persistence.*. I tried using a custom converter class, but that didn't work out (and I am not sure that is even the correct direction):

Item.java

import com.fasterxml.jackson.annotation.JsonInclude;
import jakarta.persistence.*;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.TypeAlias;

import java.util.List;

@Data
@NoArgsConstructor
@Entity
@Table(name = "items")
@JsonInclude(JsonInclude.Include.NON_NULL)
@TypeAlias("json")
public class Item {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Convert(converter = VectorConverter.class)
    @Column(columnDefinition = "vector(5)")
    private List<Double> embedding;
}

VectorConverter.java:


import jakarta.persistence.AttributeConverter;
import jakarta.persistence.Converter;

import java.util.ArrayList;
import java.util.List;

@Converter
public class VectorConverter implements AttributeConverter<List<Double>, String> {

    @Override
    public String convertToDatabaseColumn(List<Double> attribute) {
        if (attribute == null || attribute.isEmpty()) {
            return "'[]'";
        }

        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append("[");

        for (int i = 0; i < attribute.size(); i++) {
            stringBuilder.append(attribute.get(i));
            if (i < attribute.size() - 1) {
                stringBuilder.append(",");
            }
        }

        stringBuilder.append("]");
        return stringBuilder.toString();
    }

    public List<Double> convertToEntityAttribute(String dbData) {
        if (dbData == null || dbData.isEmpty()) {
            return new ArrayList<>();
        }

        String[] values = dbData.substring(1, dbData.length() - 1).split(",");

        List<Double> embeddingList = new ArrayList<>();
        for (String value : values) {
            embeddingList.add(Double.valueOf(value.trim()));
        }

        return embeddingList;
    }
}

Causing following error:

  Hint: You will need to rewrite or cast the expression.
  Position: 39] [insert into items (embedding) values (?)]; SQL [insert into items (embedding) values (?)]] with root cause

org.postgresql.util.PSQLException: ERROR: column "embedding" is of type vector but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.
  Position: 39
Vaughan answered 13/9, 2023 at 17:40 Comment(3)
I guess you need to implement a custom UserType to anbale the conversion between the PostgreSQLvector type and List<Double>.Boatbill
Jakarta.peristence libraries are pretty much a straight replacement for javax.persistence ones; you just need to change the imports in your classes and recompile. What solutions exactly from previous Hibernate versions don't work on the latest - seems like they just need to change their imports and recompile to work unless they are using deprecated/removed Hibernate internal apiCyprinid
Did you solve the problem? Can you please share your progress?Cloyd
D
1

Since 6.4 I think, Hibernate has released a vector module : https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#vector-module

You import the hibernate-vector lib (you won't need converter anymore) and then this should work :

@Column( name = "the_vector" )
@JdbcTypeCode(SqlTypes.VECTOR)
@Array(length = 3)
private float[] theVector;
Door answered 23/3 at 21:28 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.