How to use ora_hash on a column of datatype xmltype

I would like to use ORA_HASH on an XMLType column. Are there any workarounds or straightforward solutions?

I am using Oracle 11g R2, with binary XML as the storage option for the XMLType column.

The statement I used to create the table is:

create table samplebinary ( indexid number(19,0) , xmlcolumn xmltype not null) xmltype column xmlcolumn store as binary xml;

Faintheart answered 23/4, 2014 at 7:11 Comment(2)
Why do you want to use it - does it have to be ora_hash, or just a hash? Would getting the hash of the first 4k of the content suffice?Thimbleweed
No, I want to make sure that the entire XML document hasn't been changed.Faintheart

As you already know, ora_hash doesn't accept LONG or LOB values. You could pass in the first 4k or 32k of the XML content, but if you need to make sure that the entire XML document hasn't changed, that won't be sufficient. And as Ben mentioned, ora_hash has a maximum bucket value of 4294967295 (a 32-bit result), so collisions are far more likely than with SHA-1 or MD5. As the documentation says, ora_hash 'is useful for operations such as analyzing a subset of data and generating a random sample'.
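For comparison, the partial approach might look like the sketch below. It assumes the samplebinary table from the question and only covers the first 4000 characters, so a change further into the document goes unnoticed. In SQL, dbms_lob.substr may also raise an error if those characters occupy more than 4000 bytes (for example with multibyte content).

```sql
-- Sketch only: hashes just the first 4000 characters of each document.
-- The table alias (t) is required to call XMLType methods on a column.
select t.indexid,
       ora_hash(dbms_lob.substr(t.xmlcolumn.getclobval(), 4000, 1)) as partial_hash
from samplebinary t;
```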

You can use the dbms_crypto package to hash the whole XMLType value, as a CLOB extracted with the getClobVal function, with a wrapper function to make it simpler to use:

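-- Wrapper so the SHA-1 hash can be called directly from SQL
create or replace function my_hash(xml xmltype) return raw is
begin
  -- Serialize the whole document to a CLOB and hash it
  return dbms_crypto.hash(src=>xml.getclobval(), typ=>dbms_crypto.hash_sh1);
end;
/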

You can then pass in your XMLType, as a value or as a column as part of a select:

select my_hash(xml) from t42;

MY_HASH(XML)                                 
---------------------------------------------
494C4E7688963BCF312B709B33CD1B5CCA7C0289     
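To check later whether a document has changed, one option is to store the hash and compare against it. The following is a rough sketch against the question's samplebinary table; the samplebinary_hash table is hypothetical, and the sketch assumes indexid is unique and that the owner of my_hash has been granted execute on dbms_crypto.

```sql
-- Hypothetical side table holding the last known hash for each row.
create table samplebinary_hash (
  indexid  number(19,0) primary key,
  xml_hash raw(20)               -- a SHA-1 digest is 20 bytes
);

-- Record the current hash of every document.
insert into samplebinary_hash (indexid, xml_hash)
select t.indexid, my_hash(t.xmlcolumn)
from samplebinary t;

-- Later: list the rows whose XML no longer matches the stored hash.
select t.indexid
from samplebinary t
join samplebinary_hash h on h.indexid = t.indexid
where my_hash(t.xmlcolumn) <> h.xml_hash;
```

Because the hash is computed from the serialized CLOB, the same logical document can hash differently under different storage options, so stored hashes are only comparable while the storage stays the same.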
Thimbleweed answered 23/4, 2014 at 7:33 Comment(11)
ORA_HASH() won't tell you if 32k has changed anyway. There are so few values that it's extremely easy to get a clash.Air
I am getting the same hash value for all the XML fields, in spite of the fact that they are all different.Faintheart
@NishanthLawrence - I don't see that behaviour, any difference in the XML content creates a different hash value. (Collisions are still possible, but very unlikely). Without seeing your data or how you've implemented and called this I'm not sure what might be wrong. Is any of your XML short enough to add a couple of examples to the question?Thimbleweed
@Alex Poole I forgot to mention that I am using binary XML as the storage option.Faintheart
@NishanthLawrence - shouldn't matter, that's the default in 11g anyway. (Assuming you mean an XMLType column in a table, not a table of xmltype, which is different - maybe add your table DDL to the question to clarify?). Is the actual content still character-based though?Thimbleweed
@NishanthLawrence - I've tried it with an XMLType column that is explicitly stored as binary XML (which is the default, as I said); and with a table of XMLType. I get different hash values from different XML docs both ways. So still don't know what's different for you. Can't do an SQL Fiddle unfortunately as that doesn't support dbms_crypto.Thimbleweed
@Alex Poole I am using a column of XMLType only. I asked because the default storage option for XMLType on my system is CLOB, and if I leave it unchanged I get unique hash values for each XML document.Faintheart
@NishanthLawrence - in 11gR2 the default is binary... oh no, you're right, that's only from 11.2.0.2, so if you're on 11.2.0.1 it would still be CLOB. But still, I get unique hash values whether I explicitly use CLOB or binary XML storage.Thimbleweed
@AlexPoole I have edited the question and added my DDLFaintheart
@NishanthLawrence - works for me like that. I get a different hash depending on whether the column is stored as CLOB or binary, but that makes sense looking at exactly what is stored (an XSI section is added to the binary version). Sounds like you might have found a bug, perhaps, if you are on 11.2.0.1? You could try hash_md5 instead but I doubt that would make any difference, if it's hashing the wrong thing somehow.Thimbleweed
@AlexPoole At last I got it to work by passing the value returned by Extract on the XML column to the my_hash function. Thanks a lot for the my_hash function.Faintheart
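The workaround described in the last comment might look something like the sketch below. The exact Extract call was not posted, so the '/*' XPath (which selects the root element) is an assumption.

```sql
-- Sketch only: hash the XMLType returned by extract() instead of the
-- binary XML column itself. The XPath '/*' is an assumed example.
select t.indexid,
       my_hash(t.xmlcolumn.extract('/*')) as xml_hash
from samplebinary t;
```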
