JPA with Postgres anomalously writes text in @Lob column as a number
Asked Answered
S

2

7

I am trying to use a @Lob column with a Java String type to map its content to TEXT inside Postgres. Here is the relevant entity:

@Entity(name="metadata")
public class Metadata {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    @Column(name = "id")
    private Long id;

    @Column(name = "created_on")
    @ColumnDefault(value="CURRENT_TIMESTAMP")
    @Generated(GenerationTime.INSERT)
    private LocalDateTime createdOn;

    @Lob
    @Column(name = "content")
    private String content;

    @Column(name = "draft")
    private Boolean draft;

    @OneToMany(cascade = javax.persistence.CascadeType.ALL, fetch = FetchType.LAZY, mappedBy = "metadata")
    private List<Attachment> attachments;

    public void addAttachment(Attachment attachment) {
        if (attachments == null) {
            attachments = new ArrayList<>();
        }
        attachments.add(attachment);
        attachment.setMetadata(this);
    }

    // getters and setters
}

I have code which creates a new Metadata entity based on use input. I verify manually in IntelliJ debug mode that this entity has the content set to its intended value (which happens to be a JSON string). However, when I check Postgres after running the code, I see this data:

my_db=> select * from metadata;
 id | content |       created_on        | draft
----+---------+-------------------------+-------
  1 | 49289   | 2021-04-26 14:21:25.733 | t
(1 row)

Note carefully that the strange values 49289 is appearing where we would expect to see a JSON string. Note that I also verified from the command line that the correct table is what was created:

CREATE TABLE scarfon_attachment (
    id bigint NOT NULL,
    contents text,
    filename character varying(255),
    scarfon_id bigint NOT NULL
);

All the other columns in the entity/table are working as expected. What could be the problem with the @Lob annotation. For reference, I am running a fairly old version of Postgres (9.2), but it is not that ancient.

Szeged answered 26/4, 2021 at 6:29 Comment(3)
The @Lob annotation uses the dreaded "Large Objects" instead of a proper string. Annotate the column with @String - a text column is not a "Lob"Bulldog
@a_horse_with_no_name But will @String tell the driver to instruct Postgres to use TEXT? Forgive me, you certainly know more about this than I do.Szeged
A text column can be used in exactly the same way as a varchar column. So yes, JPA should treat it correctlyBulldog
S
4

My first doubt here owed to many sources suggesting multiple ways for creating a TEXT column. For example, this Baeldung post suggests using @Lob in addition to use a definition with the @Column annotation.

As it turns out, @Lob is not interpreted the same by all databases. In the case of Postgres, just using @Lob alone will result in Postgres storing the column contents in a different table, with the column annotated with @Lob just storing an ID for each entry in that table. While it has been suggested here that also specifying the correct type via the @Type annotation can remedy this problem, I decided to go with the second suggestion by the Baledung post, which is using @Column:

@Lob
@Column(columnDefinition="TEXT")
private String content;

This worked fine, and the resulting Postgres table had a TEXT definition as expected. The only potential issue with the above might center around portability to other SQL databases which perhaps do not support a TEXT type, or maybe support some alternative. I did not test beyond Postgres and H2, but in both cases the above was working without issues.

Szeged answered 16/5, 2021 at 3:56 Comment(5)
I had to also remove @Lob to make it workJunk
I had a similar problem and also had to remove @Lob because it was mapping to a Long. Leaving only @Column(columnDefinition="TEXT") works.Etter
This is not correct, @Lob is not required at all for a postgres database, and if you use it, it will always create a (not referentially guaranteed) reference to the system table pg_largeobject instead. IF done as described here, the column will still contain a reference to that table, but as a string instead (which is worse then as a long). More details are in the comments to the other, correct answer https://mcmap.net/q/1538587/-jpa-with-postgres-anomalously-writes-text-in-lob-column-as-a-numberInfiltration
See for example this post: geeknet.me/hibernate-postgresql-the-lob-annotation This is correct and provides a solution, and describes the problem a bit more in depth. Unfortunately this stackoverflow answer ranks higher in googleInfiltration
To be more precise, when I wrote "@Lob is not required at all for a postgres database", I was just referring to the case as asked (i.e. text data). For other data, postgres provides a set of server-side functions (postgresql.org/docs/current/lo-funcs.html#LO-FUNCS-TABLE). @Lob in combination with the postgres driver uses these functions, but this will result in the opaque oid int4 column instead of actual data in the source tableInfiltration
R
0

When data is stored in a LOB column, it is not stored directly in the regular row data like other columns. Instead, it is stored in a separate location in the database, and the row only contains a reference (usually a numeric identifier) to that separate location.

The reason you see numbers like 1124 instead of the real TEXT code when you query the database directly is that the actual TEXT content is stored in a separate location, and the number 1124 is a reference (or pointer) to that location. This is a way to manage and optimize the storage of large data that would not fit well within the regular row data structure.

When you retrieve the data using the application, the JPA will handle the process of retrieving the LOB data and mapping it back to the layout field as a string containing the actual TEXT.

Reinhart answered 18/7, 2023 at 10:25 Comment(15)
@NotTheDr01ds, Why do you think it's AI?Reinhart
Thanks for the reply - After digging in on it a bit more, I'm less inclined to believe that it might be, but are you rewriting the answer using any tool? That could throw things off. Also, see here for why I don't go into details on the specifics of your question ;-). Thanks!Plasmosome
@NotTheDr01ds, From my point of view, why offer something or ask if you are not going to answer the answers - this is not right. It's better then not to start a dialogue at all :)Reinhart
Under normal site conditions, we wouldn't be. We'd simply be flagging answers that we thought were AI-assisted. However, due to the moderator strike, we're left with this suboptimal "commenting" to give users a heads-up when we think something is AI-based. Every so often, we're wrong, which is also why it's nice to have the Moderation team be a second set of eyes on answers before taking action. I do hope things can get back to "normal" soon :-).Plasmosome
However, to be clear, when the Mod team takes action (as they've done on thousands of answers), they don't provide the rationale either, for the same reasons I mentioned.Plasmosome
@NotTheDr01ds, I noticed that this forum is generally engaged in lawlessness without explaining the reasons - for example, they banned me from creating questions "just because". So for many people, StackOverFlow is a forum that does not follow its own rules ...Reinhart
I don't know the exact circumstances, but question bans are typically automatic based on hitting a certain number of "poorly received" (by downvotes or close votes) questions. Yes, the algorithm that handles this is, by necessity, not made public by SE, or users would try to evade it. There's usually no "they" involved in "they banned" - It's the way the site works. I'm not sure we can say that the system isn't following its own rules ;-)Plasmosome
@NotTheDr01ds, Then analyze my questions and you will see that they are compiled in accordance with the requirements of the forum "how to ask questions correctly". HOWEVER, there are A LOT of "Explain, I don't understand how it works" questions WITHOUT stating what the person did and such questions are NOT BANNED. Explain why? For example, the most banal question style is https://mcmap.net/q/12868/-what-does-if-__name__-quot-__main__-quot-do/…Reinhart
Often we never really know for sure why people downvote (or not). I mean, obvious you got a downvote for your "minuses fart" edit on your question, but only a single one before it was rolled back by a moderator. The others were likely due to a lack of clarity, but I don't know for sure. There was also one in there that looked like a homework question the way it was phrased - I don't think it was, but some people do downvote things that look like homework.Plasmosome
The one you linked to, IMHO, is a very good question. It's obviously one that a lot of people search on, with it having been seen 4.4 million times. Sure, it's basic to most people now, but every Python beginner comes across it at some point.Plasmosome
@NotTheDr01ds, The fact is that this question was created in VIOLATION of the rules of this forum - the person himself did not look for anything, but immediately went to ask a question. There are thousands of such questions! Another thing is that this question is popular, but it violates the rules of the forum. However, my question does NOT violate forum rules. From this we can conclude that the forum moderators "sit on two chairs" and act according to double standards - like it, then leave it, if you don't like it - ban it. (continued in next comment)Reinhart
@NotTheDr01ds, Consequently, the forum moderators still LEAVED such moments (such as my questions) without due attention, and the moderators themselves, it turns out, violate their own rules. What are we talking about now? No one is going to deal with this situation, I'm still banned from creating new questions. Everything is simple here - either you unban me (because the questions are asked according to the rules of the forum) or everything remains as it is and the moderators continue to violate their own rules. As the saying goes "less words, more action".Reinhart
This is the correct answer unfortunately. The accepted answer is wrong and will produce a column in the original table of type "text", but containing a OID (as string) which references the system table pg_largeobject. I know this because we have exactly the same code as in the original question in our service, and I just analyzed that to find out what is stated in this answer. There is no good documentation on this. In my eyes it's a bug in Spring Data/Hibernate, and this wrong answer here even leads to a wrong resolution. I may post a blog post to hve something to refer to.Infiltration
To summarise: https://mcmap.net/q/1538587/-jpa-with-postgres-anomalously-writes-text-in-lob-column-as-a-number is the wrong answer, and what's stated in this answer is correct. It may or may note have been created by an AI, but it is nevertheless correct.Infiltration
Original code from our codebase (2024): @Lob @Column(columnDefinition = "text") private String payload; @Lob() @Column(columnDefinition = "text") private String errorMessage; Columns are TEXT according to the schema, contents are numbers (for example, 301352'), and these numbers (as strings) refer to the system table pg_largeobjectInfiltration

© 2022 - 2024 — McMap. All rights reserved.