What would be the best way to support the pgvector vector extension in a PostgreSQL database using Spring-Boot 3 with Hibernate 6, so I can create vector embeddings using a JPA entity?
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(5));
While I did found solutions for previous Hibernate versions, they don't work for Hibernate 6, as it requires jakarta.persistence.* packages instead of javax.persistence.*. I tried using a custom converter class, but that didn't work out (and I am not sure that is even the correct direction):
Item.java
import com.fasterxml.jackson.annotation.JsonInclude;
import jakarta.persistence.*;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.TypeAlias;
import java.util.List;
@Data
@NoArgsConstructor
@Entity
@Table(name = "items")
@JsonInclude(JsonInclude.Include.NON_NULL)
@TypeAlias("json")
public class Item {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@Convert(converter = VectorConverter.class)
@Column(columnDefinition = "vector(5)")
private List<Double> embedding;
}
VectorConverter.java:
import jakarta.persistence.AttributeConverter;
import jakarta.persistence.Converter;
import java.util.ArrayList;
import java.util.List;
@Converter
public class VectorConverter implements AttributeConverter<List<Double>, String> {
@Override
public String convertToDatabaseColumn(List<Double> attribute) {
if (attribute == null || attribute.isEmpty()) {
return "'[]'";
}
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.append("[");
for (int i = 0; i < attribute.size(); i++) {
stringBuilder.append(attribute.get(i));
if (i < attribute.size() - 1) {
stringBuilder.append(",");
}
}
stringBuilder.append("]");
return stringBuilder.toString();
}
public List<Double> convertToEntityAttribute(String dbData) {
if (dbData == null || dbData.isEmpty()) {
return new ArrayList<>();
}
String[] values = dbData.substring(1, dbData.length() - 1).split(",");
List<Double> embeddingList = new ArrayList<>();
for (String value : values) {
embeddingList.add(Double.valueOf(value.trim()));
}
return embeddingList;
}
}
Causing following error:
Hint: You will need to rewrite or cast the expression.
Position: 39] [insert into items (embedding) values (?)]; SQL [insert into items (embedding) values (?)]] with root cause
org.postgresql.util.PSQLException: ERROR: column "embedding" is of type vector but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 39
UserType
to anbale the conversion between the PostgreSQLvector
type andList<Double>
. – Boatbill