We have a legacy database which is a sql server db (2005, and 2008).
All of the primary keys in the tables are UniqueIdentifiers.
The tables currently have no clustered index created on them and we are running into performance issues on tables with only 750k records. This is the first database i've worked on with unique identifiers as the sole primary key and I've never seen sql server be this slow with returning data.
I don't want to create a clustered index on the uniqueidentifier as they are not sequential and will therefore slow the apps down when it comes to inserting data.
We cannot remove the uniqueidentifier as that is used for remote site record identity management purposes.
I had thought about adding a big integer identity column to the tables and creating the clustered index on this column and including the unique identifier column.
i.e.
int identity - First column to maintain insert speeds unique identifier - To ensure the application keeps working as expected.
The goal is to improve the identity query and joined table query performance.
Q1: Will this improve the query performance of the db or will it slow it down?
Q2: Is there an alternative to this that I haven't listed?
Thanks Pete
Edit: The performance issues are on retrieving data quickly through select statements, especially if a few of the more "transactional / changing" tables are joined together.
Edit 2: The joins between tables are generally all between the primary key and foreign keys, for tables that have foreign keys they are included in the non-clustered index to provide a more covering index.
The tables all have no other values which would provide a good clustered index.
I'm leaning more towards adding an additional identity column on each of the high load tables and then including the current Guid PK column within the clustered index to provide the best query performance.
Edit 3: I would estimate that 80% of the queries are performed on primary and foreign keys alone through the data access mechanism. Generally our data model has lazy loaded objects which perform the query when accessed, these queries use the objects id and the PK column. We have a large amount of user driven data exclusion / inclusion queries which use the foreign key columns as a filter based on the criteria of for type X exclude the following id's. The remaining 20% is where clauses on Enum (int) or date range columns, very few text based queries are performed in the system.
Where possible I have already added covering indexes to cover the heaviest queries, but as yet i'm still dissapointed by the performance. As bluefooted says the data is being stored as a heap.