Which index should I use on a binary datatype column in MySQL?

I am writing a simple tool to detect duplicate files (i.e. files with identical contents). The mechanism is to generate a hash for each file using the SHA-512 algorithm and store these hashes in a MySQL database. I store the hashes in a binary(64) unique not null column. Each row holds a unique binary hash, which is used to check whether a file is a duplicate.
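For illustration, the hashing step described above could look roughly like this in Python. This is only a sketch: the function name `sha512_of_file` and the chunk size are assumptions for the example, not part of the actual tool.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the raw 64-byte SHA-512 digest of the file at `path`.

    `sha512_of_file` and `chunk_size` are hypothetical names for this sketch.
    """
    h = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    # digest() returns raw bytes (64 of them), the size of a binary(64) column.
    return h.digest()
```

Two files with identical contents produce the same digest, so comparing digests is enough to flag a likely duplicate.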

My questions are:

  1. Can I use indexes on a binary column? My table's default collation is latin1.

  2. Which indexing mechanism should I use for high performance, B-tree or hash? I need to update or add hundreds of rows per second.

  3. What other things should I take care of to get the best performance?

Willhite asked 29/5, 2013 at 6:32
  1. Can I use indexes on a binary column? My table's default collation is latin1.

    Yes, you can. Collation is only relevant for character datatypes, not binary datatypes: it defines how characters are compared and ordered, whereas binary values are compared byte by byte. Also, be aware that latin1 is a character encoding, not a collation.

  2. Which indexing mechanism should I use for high performance, B-tree or hash? I need to update or add hundreds of rows per second.

    Note that hash indexes are only available with the MEMORY and NDB storage engines, so you may not even have a choice.

    In any event, either would typically be able to meet your performance criteria. For this particular application, though, I see no benefit from using a B-tree (which keeps its keys ordered), because you only ever look up exact values, whereas a hash index would give better performance for that. Therefore, if you have the choice, you may as well use a hash index.

    See "Comparison of B-Tree and Hash Indexes" in the MySQL Reference Manual for more information.

  3. What other things should I take care of to get the best performance?

    Depends on your definition of "best performance" and your environment. In general, remember Knuth's maxim "premature optimisation is the root of all evil": that is, only optimise when you know that there will be a problem with the simplest approach.
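To make the "simplest approach" concrete, here is a minimal sketch of duplicate detection that relies on nothing more than a unique index on the digest column. Note the assumptions: it uses Python's built-in `sqlite3` module as an in-memory stand-in for MySQL so that it runs without a server, and the table name `file_hash` and the helpers `make_store` and `add_if_new` are made up for the example. SQLite backs the primary key with a B-tree, which is also the only index type you get on an InnoDB table.

```python
import hashlib
import sqlite3


def make_store():
    """Create an in-memory table whose only column is the unique digest.

    SQLite is a stand-in here; in MySQL the column would be BINARY(64).
    """
    conn = sqlite3.connect(":memory:")
    # PRIMARY KEY creates a unique index on the digest.
    conn.execute("CREATE TABLE file_hash (digest BLOB NOT NULL PRIMARY KEY)")
    return conn


def add_if_new(conn, digest):
    """Insert `digest`; return True if it was new, False if already stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO file_hash (digest) VALUES (?)", (digest,))
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the row, so this digest is a duplicate.
        return False


conn = make_store()
digest = hashlib.sha512(b"example file contents").digest()
first = add_if_new(conn, digest)   # digest not seen before
second = add_if_new(conn, digest)  # same digest again
print(first, second)
```

Letting the unique index reject duplicates means each file costs a single indexed write, with no separate lookup query beforehand.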

Pendleton answered 29/5, 2013 at 6:43
I am using the InnoDB storage engine for the hash store table, so the hash indexing mechanism will not be available for it. I think B-tree indexing will not be bad. (Willhite)
