I have about 1.5 million files I need to insert records for in the database. Each record is inserted with a key that includes the name of the file.
The catch: The files are not uniquely identified currently.
So, what we'd like to do is, for each file:
- Insert a record. One of the fields in the record should include an amazon S3 key which should include the ID of the newly inserted record.
- Rename the file to include the ID so that it matches the format of the key.
The best thing I can think to do is:
- Run an individual insert command that returns the ID of the added row.
- Add that back as a property to the individual business object I'm looping through.
- Generate an update statement that updates the S3 key to include the ID
- Output the file, concatenate the ID into the end the file name.
As I can tell, that looks to be :
- 1.5 million insert statements
- individual SqlCommand executions and read because we need the ID back),
- 1.5 million times setting a property on an object.
- 1.5 million update statements generated and executed
- Perhaps could make this a one giant concatenated update statement to do them all at once; not sure if that helps
- 1.5 million file copies.
I can't get around the actual file part, but for the rest, is there a better strategy I'm not seeing?
IDENTITY_INSERT
, in c# code builde insert query with all columns with ID(for example 1 to 1 500 000) After disableIDENTITY_INSERT
– PyridoxineGuid()
? so the DB will have a unique ID and so will the files name. – Glimpse