I doubt the original poster cares anymore, but I came up with this while trying to produce bulk dummy data. Hopefully it is useful for someone else.
/* == Seed Data ==*/
if(OBJECT_ID('tempdb..#dummyData') is not null)
DROP TABLE #dummyData
CREATE TABLE #dummyData ([id] INT IDENTITY PRIMARY KEY, [value] nvarchar(256));
INSERT #dummyData ([value]) VALUES
('Itßaly Terrazzo Sacra Nero 24x24 Honed'),
('Nero Marquina 1x3 Herringbone'),
('Myorka Blue 2x8'),
('Chrysler Driftwood Hickory Sea 12mil Wear Layer Glue Down 6.3x48.4'),
('Nero Marquina 1x3 Herringbone'),
('Broadway Lvt Ash 9x72 Rigid Core ä Click- 6.0mm/28mil Wear Layer'),
('Minetta Concreto Pearl 2.5mm/28mil Wear Layer Glue Down 18x36'),
('Speak Leather Black 24 x 24'),
('Accent Montana White 12 x 36 Glossy'),
('Baroque Crackled 1/2 X 6 Pencil Capri'),
('Banza Green Cielo 8" Hex'),
('Masia $550 Torello Nero Brillo 3/4x12 Pencil'),
('Baroque 1.000.95€ Crackled 1 3/4 X 6 Chair Rail Blanco');
/* == Main Loop ==*/
DECLARE @insertNum int = 1000000
DECLARE @top int = 0
DECLARE @seedCount int = (SELECT COUNT(*) FROM #dummyData)
DECLARE @currentCount int = @seedCount
WHILE (@currentCount < @insertNum)
BEGIN
    SET @top = @insertNum - @currentCount
    IF (@top < @insertNum)
        INSERT INTO #dummyData ([value])
        SELECT TOP(@top) [value]
        FROM #dummyData
    SET @currentCount = (SELECT COUNT(*) FROM #dummyData)
END
/* == Update Step ==*/
UPDATE #dummyData
SET value = cast(id as nvarchar) + value
WHERE id > @seedCount
Seed Data: Set up as needed -- manually or from another source (note: you must seed at least one row). A temp table is used here for demo purposes.
Main Loop: Set @insertNum to the desired number of rows.
Update Step: Skip this if you don't need to transform the data -- I could have skipped it by prefixing the new id inside the loop, but that was slower by ~2 seconds.
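For reference, the in-loop alternative would replace the loop's INSERT and drop the UPDATE entirely. Something like this sketch (untested here; using @currentCount as the batch offset is my own choice -- it works because identity ranges from successive batches never overlap):

```sql
-- In-loop alternative: prefix each new row with a batch-unique
-- number instead of running the UPDATE afterwards.
-- @currentCount equals the current max id, so the prefixes
-- @currentCount+1 .. @currentCount+@top never repeat across batches.
INSERT INTO #dummyData ([value])
SELECT TOP(@top)
       CAST(@currentCount + ROW_NUMBER() OVER (ORDER BY id) AS nvarchar(12)) + [value]
FROM #dummyData
```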
This produced 1,000,000 unique values for parse testing. The id > @seedCount filter in the UPDATE left the original seed rows unaltered -- a control group of sorts.
Example result
SELECT *
FROM #dummyData
WHERE ID between 10 and 15
ORDER BY id
id | value
---|------
10 | Baroque Crackled 1/2 X 6 Pencil Capri
11 | Banza Green Cielo 8" Hex
12 | Masia $550 Torello Nero Brillo 3/4x12 Pencil
13 | Baroque 1.000.95€ Crackled 1 3/4 X 6 Chair Rail Blanco
14 | 14Itßaly Terrazzo Sacra Nero 24x24 Honed
15 | 15Nero Marquina 1x3 Herringbone
The optimizer is fairly smart -- prefixing the id onto the value (rows 14 and up) made every row unique. Without that step, the optimizer recognizes the 13 repeated strings and delivers far faster performance than you could ever expect from genuinely unique data.
Performance: Although it is a while loop, each pass inserts the table back into itself, so the row count doubles per iteration, and the TOP(@top) cap stops it exactly at the number you set. A seed of 1 row reaches 1,000,000 in 20 passes (2^20 >= 1,000,000).
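The pass count is easy to verify with a scalar version of the same doubling (a standalone sketch, no table involved):

```sql
DECLARE @target int = 1000000
DECLARE @rowCount int = 1, @pass int = 0   -- 1-row seed
WHILE (@rowCount < @target)
BEGIN
    -- each pass copies every existing row, capped at the target
    SET @rowCount = CASE WHEN @rowCount * 2 > @target
                         THEN @target ELSE @rowCount * 2 END
    SET @pass += 1
END
SELECT @pass AS passes, @rowCount AS total_rows
```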
I set up two dummy-data tests and ran each through both a CTE and this while loop, generating 1,000,000 rows each; the while loop was quicker in both cases.
From my machine (client results, average of 10 runs):

Query                   | CTE      | While
------------------------|----------|---------
This Example*           | 14619 ms | 6825 ms
random value generation | 11831 ms | 6393 ms
*Perhaps I could have joined the seed data to the CTE in a more performant way.
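For context, a common tally-table CTE for this job looks roughly like the following (a sketch of the general pattern only, not the exact query benchmarked above -- the modulo join back to the seed rows is illustrative):

```sql
-- Tally pattern: cross join system views to manufacture row
-- numbers, then map each number onto a seed row round-robin.
;WITH tally AS (
    SELECT TOP (1000000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects a
    CROSS JOIN sys.all_objects b
)
SELECT t.n AS id,
       CAST(t.n AS nvarchar(12)) + d.[value] AS [value]
FROM tally t
JOIN #dummyData d
  ON d.id = (t.n - 1) % 13 + 1   -- 13 = number of seed rows
```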