Create Random Data with SQL Server

This article will show you how to create data randomly. You will learn how to:

  • create a random number between x and y.
  • create a random date.
  • create a random letter.
  • create a random word.
  • pick a random element from a list.
  • create a random email or phone number.
  • insert data in tables using random data.

What is the purpose of creating random data?

The purpose of random data creation is to test the performance of your SQL queries.

Example: you want to create a database containing information about your customers. You currently have only a few customers, but you would like to know how your SQL scripts will behave when you have hundreds of thousands of customers in your database. As you do not have this amount of customer data yet, you need to create it artificially.
To do that, you need to create a script that:

  • creates rows different from each other (people have different names, different birth dates,...).
  • creates the data automatically. Indeed, you do not want to write hundred thousands insert statements by yourself.

To fullfil the first condition, you need to be able to create data randomly, so that at each insertion of data, the data generated is different.

Create a random number between x and y with SQL

Goal:

Create a stored procedure that returns a random integer between x and y. Example: create a random number between 5 and 8. This stored procedure takes x and y as input parameters. Its name will be “getRandomNumber”. Its output will be an integer between x and y.

Syntax:

create procedure getRandomNumber
(
@randomNumber bigint output,
@minNumber bigint,
@maxNumber bigint
)
as
begin
declare @rand float
select @rand = rand()

-- Use the following method to have one 50% chance to use floor,
-- 50% to choose ceiling
if ( cast(floor( 1000 * @rand ) as int) % 2 = 0)
begin
select @randomNumber = floor( (@maxNumber - @minNumber) * @rand + @minNumber )
end
else
begin
select @randomNumber = ceiling( (@maxNumber - @minNumber) * @rand + @minNumber )
end
end

How to use the stored procedure:

declare @random_number bigint
exec getRandomNumber @random_number output,1,3
select @random_number

Explanation:

select @rand = rand() : rand() is a function returning a random number between 0 and 1 with 14 digits after the decimal point. Example: rand() can return 0,789224689617. Note that it almost never returns exactly 0 or 1. Indeed, the probability to get 0,00000000000000 or 1,00000000000000 is 2 out of 100 000 000 000 000 (2 chances out of 100 000 billion).

The explanations below are a bit tricky. You can skip them and go to the next chapter if you are confused.

select @randomNumber = floor( (@maxNumber - @minNumber) * @rand + @minNumber ): let us take an example. We want a number between 1 and 4:
(@maxNumber - @minNumber) is then equal to 4-1=3.
As @rand will be a value between 0 and 1 (but almost never 0 or 1), then (@maxNumber - @minNumber) * @rand will be between 3*0 and 3*1, which is between 0 and 3 (but almost never 0 or 3, remember). For example, it can return 0.1332362736, 1.9348384738 or 2.122837284. Using these 3 examples, (@maxNumber - @minNumber) * @rand + @minNumber (@minNumber is 1 in our example) would be equal to 0.1332362736+1=1.1332362736 or 1.9348384738+1=2.9348384738 or 2.122837284+1=3.122837284. The function floor is rounding down this result:
floor(1.1332362736) is 1. floor(2.9348384738) is 2. floor(3.122837284) is 3. As a result, notice that we almost never get 4. Indeed, only floor(4.000000000) gives 4. Even floor(3.999999999) gives 3.

select @randomNumber = ceiling( (@maxNumber - @minNumber) * @rand + @minNumber): ceiling is rounding up the number inside the parenthesis. Looking at our example:
ceiling(1.1332362736) is 2. ceiling(2.9348384738) is 3. ceiling(3.122837284) is 4. As a result, ceiling( (@maxNumber - @minNumber) * @rand + @minNumber (@minNumberr is 1 in our example) is either 2, 3 or 4. Notice that we almost never get 1.

if ( cast(floor( 1000 * @rand ) as int) % 2 = 0)
begin
select @randomNumber = floor( (@maxNumber - @minNumber) * @rand + @minNumber )end
else begin
select @randomNumber = ceiling( (@maxNumber - @minNumber) * @rand + @minNumber )
end
: you saw above that the method with floor gives 1,2 or 3, while the method with ceiling gives 2, 3 or 4. As a result, the only way to get 1, 2, 3 or 4 is to combine the two methods.

To combine the two, we use randomly (1 chance out of 2, just like tossing a coin) either floor or ceiling.

To "toss a coin", we go through the following steps:

1) We multiply rand() by 1000. This will give a number 1 and 1000, but almost never 1 or 1000 (example: 139,218262931716).

2) We use floor( 1000 * @rand ) to get an integer (example: 139). As rand() is of type float, floor( 1000 * @rand ) is of also of type float. To get a result of type integer, we use cast(floor( 1000 * @rand ) as int) , which converts the result in a result of type integer.

3) Then we check if this result is even or odd. This is done using “% 2”. Example : 5%2 is the rest of the division of 5 by 2. In this case 5=2*2+1 so the rest is 1. 4%2 is 0 as 4=2*2+0. The result is always 0 for even numbers and always different from 0 for odd number. As a number has one chance out of two to be odd or even, we get our one chance out of two to have the condition in the if statement equal to true.

Create a random date with SQL

Goal:

Create a stored procedure that returns a random date between x and y. Example: create a random date between March 23rd 1980 and February 23rd 2015. The stored procedure takes x and y as input parameters. Its name will be “getRandomDate”. Its output will be a date between x and y.

Syntax:

create procedure getRandomDate
(
@randomDate date output,
@minDate date,
@maxDate date
)
as
begin
declare @maxDayDifference int,
@randomDayDifference int
-- Get the difference between the earliest date and the latest date
select @maxDayDifference=datediff(day,@minDate,@maxDate)
-- Get a random number from 0 to this difference

exec getRandomNumber @randomDayDifference output, 0, @maxDayDifference
-- Add the random number to the earliest date

select @randomDate = dateadd(day,@randomDayDifference,@minDate)
end

How to use the stored procedure:

declare @random_date date
exec getRandomDate @random_date output,'20120718','20120720'
select @random_date

Explanation:

select @maxDayDifference=datediff(day,@minDate,@maxDate): datediff is a function calculating the difference between two dates. “day” means you want to get the difference in days. In this example, you want to get the difference between the date @minDate and the date @maxDate.

select @randomDate = dateadd(day,@randomDayDifference,@minDate): The function dateadd adds an amount of time indicated in input to a date also indicated in input. Example: datedadd(day,2,’20120718’) adds two days to July 18th, 2012. The result is July 20th, 2012. “day” means the amount of time is given in days. In this example, we add an amount of days equal to @randomDayDifference to @minDate.
In our example, we want a random date between July 18th, 2012 and July 20th, 2012. To do that, we add a random number of days to the earliest date, which is July 18th, 2012. We have to be careful though not to add too many days, otherwise the resulting date would be after July 20th. Therefore, the random number must be smaller or equal to the difference between the two dates.

Create a random letter with SQL

Goal:

Create a random letter or a number between 0 and 1 that can be included in a string. A parameter will indicate if the stored procedure returns a lowercase letter (then the input will be equal to ‘L’), an uppercase letter (the input will be ‘U’) or a number (‘N’). In all cases, the stored procedure will return a parameter of type string.

Syntax:

create procedure getRandomCharacter
(
@character char(1) output,
@case_type char(1)
)
as
begin
declare @validCharactersLower varchar(100),
@validCharactersUpper varchar(100),
@validCharactersNumbers varchar(100),
@characterChosen int,
@validCharacterLength int

set @validCharactersLower = 'abcdefghijklmnopqrstuvwxyz'
set @validCharactersUpper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
set @validCharactersNumbers = '0123456789'


-- Lower case
if @case_type = 'L'
begin
select @validCharacterLength = len(@validCharactersLower)
exec getRandomNumber @characterChosen output,1,@validCharacterLength
select @character = substring(@validCharactersLower, @characterChosen, 1)
end

-- Upper case
if @case_type = 'U'
begin
select @validCharacterLength = len(@validCharactersUpper)
exec getRandomNumber @characterChosen output,1,@validCharacterLength
select @character = substring(@validCharactersUpper, @characterChosen, 1)
end

-- Numbers
if @case_type = 'N'
begin
exec getRandomNumber @characterChosen output,1,10
select @character = substring(@validCharactersNumbers, @characterChosen, 1)
end
end

How to use the stored procedure:

declare @random_character char(1)
exec getRandomCharacter @random_character output,'L'
select @random_character
exec getRandomCharacter @random_character output,'U'
select @random_character
exec getRandomCharacter @random_character output,'N'
select @random_character

Explanation:

set @validCharactersLower = 'abcdefghijklmnopqrstuvwxyz': we put in the parameter @validCharactersLower all the letters of the alphabet. We will then generate a random number. This random number will tell us which of these letters we are going to pick. If we pick the random number 5, we will choose the fifth character from the left of the string @validCharactersLower, which is “e”.

set @validCharactersUpper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ': same technique as above, except we want to generate an uppercase letter.

set @validCharactersNumbers = '0123456789': same technique as above, except that we want to generate a number (useful to generate random phone or fax numbers).

if @case_type = 'L': this if statement helps you to generate a different type of output based on the input parameter @case_type. If @case_type is equal to “L”, then you will generate a lowercase character.

select @validCharacterLength = len(@validCharactersLower): get the length of the string @validCharactersLower. We will use it later in the stored procedure to create a random number between 1 and the last position of the string @validCharactersLower (the length of a string is the last position of a string).

select @character = SUBSTRING(@validCharactersLower, @characterChosen, 1): substring gets a part of a string’s value:
- the part starts at a position given in input.
- a certain number of characters is taken from this string.

In this example, we start looking at a position from the left equal to @characterChosen in the string @validCharactersLower, and we take only 1 letter.
Example: @validCharactersLower='abcdefghijklmnopqrstuvwxyz'
If @characterChosen is equal to 4, then SUBSTRING(@validCharactersLower, @characterChosen, 1) is equal to “d” (fourth character from the left when you look at @validCharactersLower).

Create a random word with SQL

Goal:

Create a stored procedure that returns a random word which length is between x and y. For example, create a random word which length is between 2 and 3.

The stored procedure takes x and y as input parameters. It takes also as input a parameter indicating if the letters in the word will be upper case (then the input parameter value should be U), lower case (L) or if the first letter is uppercase and the other letters are lowercase (UL). It is also possible to get a string containing a number having between x and y digits (useful to build random phone or fax numbers). The stored procedure’s name will be “getRandomWord”.

Syntax:

create procedure getRandomWord
(
@randomWord varchar(50) output,
@minLength int,
@maxLength int,
@caseType char(2)
)
as
begin
declare @chosenLength int,
@currentCharacter char(1),
@iterator int
-- Get a random number that will represent the length of the word
exec getRandomNumber @chosenLength output, @minLength,@maxLength
select @randomWord = ''
select @currentCharacter = ''
select @iterator = 1
-- First letter upper case, next letters lower case
if @caseType = 'UL'
begin
exec getRandomCharacter @currentCharacter output,'U'
select @randomWord = @randomWord + @currentCharacter
while @iterator < (@chosenLength)
begin
exec getRandomCharacter @currentCharacter output,'L'
select @randomWord = @randomWord + @currentCharacter
select @iterator = @iterator + 1
end
end
if @caseType = 'U' or @caseType = 'L' or @caseType = 'N'
begin
while @iterator <= (@chosenLength)
begin
exec getRandomCharacter @currentCharacter output,@caseType
select @randomWord = @randomWord + @currentCharacter
select @iterator = @iterator + 1
end
end
end

How to use the stored procedure:

declare @random_word varchar(50)
exec getRandomWord @random_word output,1,3,'L'
select @random_word
exec getRandomWord @random_word output,4,5,'UL'
select @random_word
exec getRandomWord @random_word output,2,2,'U'
select @random_word
exec getRandomWord @random_word output,2,10,'N'
select @random_word

Explanation:

while @iterator < (@chosenLength) begin … end: this means that we execute all operations between begin and end several times. We execute these operations as long as the number @iterator is strictly smaller than @chosenLength. This will help to generate a word equal to the length calculated randomly (@chosenLength).

SELECT @randomWord = @randomWord + @currentCharacter: we add to the string @randomWord the next letter generated randomly. We are doing this a number of times equal to @chosenLength. As a result, we produce a string of length equal to @chosenLength.

select @iterator = @iterator + 1: we add 1 to the value of the number @iterator. We do this each time we execute what is between begin and end. When the value of the number @iterator becomes equal to @chosenLength, then we stop executing what is between begin and end.

Pick a random element from a list

Goals:

Pick a random element from a predefined list. We will create two stored procedures:

  • one to pick a state randomly among 50 american states.
  • another to pick a status among the following statuses: “active”, “pending” or “inactive”.

Syntax:

Random state:

create procedure getRandomState(@stateName varchar(40) output)
as
begin
declare @poolOfValues table ( id int, value varchar(40) )
declare @idChosen int
insert into @poolOfValues values (1,'Alabama')
insert into @poolOfValues values (2,'Alaska')
insert into @poolOfValues values (3,'Arizona')
insert into @poolOfValues values (4,'Arkansas')
insert into @poolOfValues values (5,'California')
insert into @poolOfValues values (6,'Colorado')
insert into @poolOfValues values (7,'Connecticut')
insert into @poolOfValues values (8,'Delaware')
insert into @poolOfValues values (9,'Florida')
insert into @poolOfValues values (10,'Georgia')
insert into @poolOfValues values (11,'Hawaii')
insert into @poolOfValues values (12,'Idaho')
insert into @poolOfValues values (13,'Illinois')
insert into @poolOfValues values (14,'Indiana')
insert into @poolOfValues values (15,'Iowa')
insert into @poolOfValues values (16,'Kansas')
insert into @poolOfValues values (17,'Kentucky')
insert into @poolOfValues values (18,'Louisiana')
insert into @poolOfValues values (19,'Maine')
insert into @poolOfValues values (20,'Maryland')
insert into @poolOfValues values (21,'Massachusetts')
insert into @poolOfValues values (22,'Michigan')
insert into @poolOfValues values (23,'Minnesota')
insert into @poolOfValues values (24,'Mississippi')
insert into @poolOfValues values (25,'Missouri')
insert into @poolOfValues values (26,'Montana')
insert into @poolOfValues values (27,'Nebraska')
insert into @poolOfValues values (28,'Nevada')
insert into @poolOfValues values (29,'New Hampshire')
insert into @poolOfValues values (30,'New Jersey')
insert into @poolOfValues values (31,'New Mexico')
insert into @poolOfValues values (32,'New York')
insert into @poolOfValues values (33,'North Carolina')
insert into @poolOfValues values (34,' North Dakota')
insert into @poolOfValues values (35,'Ohio')
insert into @poolOfValues values (36,'Oklahoma')
insert into @poolOfValues values (37,'Oregon')
insert into @poolOfValues values (38,'Pennsylvania')
insert into @poolOfValues values (39,'Rhode Island')
insert into @poolOfValues values (40,'South Carolina')
insert into @poolOfValues values (41,'South Dakota')
insert into @poolOfValues values (42,'Tennessee')
insert into @poolOfValues values (43,'Texas')
insert into @poolOfValues values (44,'Utah')
insert into @poolOfValues values (45,'Vermont')
insert into @poolOfValues values (46,'Virginia')
insert into @poolOfValues values (47,'Washington')
insert into @poolOfValues values (48,'West Virginia')
insert into @poolOfValues values (49,'Wisconsin')
insert into @poolOfValues values (50,'Wyoming')
-- Get a random number between 1 and 50
exec getRandomNumber @idChosen output,1,50
-- Get the state corresponding to this random number
select @stateName = value from @poolOfValues
where id = @idChosen
return
end

Random status:

create procedure getRandomStatus(@statusName varchar(20) output)
as
begin
declare @poolOfValues table ( id int, value varchar(40) )
declare @idChosen int
insert into @poolOfValues values (1,'active')
insert into @poolOfValues values (2,'pending')
insert into @poolOfValues values (3,'inactive')
exec getRandomNumber @idChosen output,1,3
select @statusName = value from @poolOfValues
where id = @idChosen
return
end

How to use the stored procedures:

declare @random_state varchar(40)
exec getRandomState @random_state output
select @random_state
declare @randomStatus varchar(20)
exec getRandomStatus @randomStatus output
select @randomStatus

Explanation:

declare @poolOfValues table ( id int, value varchar(40) ):
this creates a table variable named @poolOfValues. This table variable will have two columns: “id” and “value”. It behaves like a table: you can read information from this table using a select statement, insert information in this table,...
“id” is a column of type int, and “value” a column of type varchar(40).

insert into @poolOfValues values (1,'Alabama'): this inserts a row in the table variable called @poolOfValues. Note that the syntax is the same as for an insert in a regular table, except that we add a “@” before the table variable name.
To pick a random state, we insert the possible values in a table variable, giving to each value a unique ID. Then we generate a random number that is not bigger than the number of possible values (50). Then we select from the table variable the line which unique ID is equal to the generated random number.

Create a random email, a random phone number with SQL

Goal:

Create a procedure generating a random email (getRandomEmail) and a stored procedure creating a random phone number (getRandomPhone).

Syntax:

create procedure getRandomEmail(@randomEmail varchar(50) output )
as
begin
declare @randomFirstName varchar(6),
@randomLastName varchar(6),
@randomDomain varchar(5),
@randomExtension varchar(3)
exec getRandomWord @randomFirstName output,4,6,'L'
exec getRandomWord @randomLastName output,4,6,'L'
exec getRandomWord @randomDomain output,4,5,'L'
exec getRandomWord @randomExtension output,3,3,'L'
select @randomEmail =
@randomFirstName+'.'[email protected]+
'@'[email protected]+'.'[email protected]
end

create procedure getRandomPhone(@randomPhone varchar(20) output)
as
begin
declare @countryCode varchar(3),
@region varchar(3),
@serialNumber varchar(6)
exec getRandomWord @countryCode output,3,3,'N'
exec getRandomWord @region output,3,3,'N'
exec getRandomWord @serialNumber output,6,6,'N'
select @randomPhone =
'('+ @countryCode+') '[email protected]+
'-'[email protected]
end

How to use the stored procedures:

declare @random_email varchar(50)
exec getRandomEmail @random_email output
select @random_email
declare @randomPhone varchar(20)
exec getRandomPhone @randomPhone output
select @randomPhone

Use your stored procedures to insert data in tables:

Goal:

We want to use the stored procedures created above to:

  • Insert random data (only one row for the moment) into the table salesperson.
  • Insert random data (only one row for the moment) in the table clients.
  • Insert random data in the table client_status. Note that for the client inserted, 3 lines will be inserted in the client_status table with a client ID corresponding to the row inserted during the previous step.

Syntax:

Table creation:

create table client_status
(
client_id bigint,
status_date date,
client_status varchar(10)
)
create table clients
(
ID bigint identity not null,
first_name varchar(30),
second_name varchar(30),
phone_number varchar(20),
state varchar(40),
email varchar(30),
salesperson_id bigint
)
create table salesperson
(
ID bigint identity not null,
total_sales decimal(20,0),
first_name varchar(30),
second_name varchar(30),
state varchar(40),
status varchar(20)
)

Data insertion:

declare @first_name varchar(30),
@second_name varchar(30),
@phone_number varchar(20),
@state varchar(40),
@status varchar(30),
@total_sales decimal(7,2)
-- Generate salesperson data randomly for one salesperson
exec getRandomWord @first_name output,4,5,'UL'
exec getRandomWord @second_name output,5,7,'UL'
exec getRandomPhone @phone_number output
exec getRandomState @state output
exec getRandomStatus @status output
exec getRandomNumber @total_sales output,5000,99900
-- Insert the random data in the saleperson table
insert into salesperson(total_sales,first_name,second_name,state,status)
values
(
@total_sales,
@first_name,
@second_name,
@state,
@status
)
declare @first_name_clients varchar(30),
@second_name_clients varchar(30),
@phone_number_clients varchar(20),
@state_clients varchar(40),
@email_clients varchar(30),
@salesperson_id bigint,
@max_salesperson_id bigint,
@client_id bigint,
@client_status varchar(30),
@status_date date
-- Generate customer data randomly for one salesperson
exec getRandomWord @first_name_clients output,4,5,'UL'
exec getRandomWord @second_name_clients output,5,7,'UL'
exec getRandomPhone @phone_number_clients output
exec getRandomState @state_clients output
exec getRandomEmail @email_clients output
-- Get the maximum id from the salesperson table
select @max_salesperson_id = max(ID) from salesperson
-- Insert the random data in the saleperson table
insert into clients(first_name,second_name,phone_number,state,email,salesperson_id)
values
(
@first_name_clients,
@second_name_clients,
@phone_number_clients,
@state_clients,
@email_clients,
@max_salesperson_id
)
-- Get the id of the last client inserted
select @client_id = SCOPE_IDENTITY()
-- Generate a random status and a random date
exec getRandomStatus @client_status output
exec getRandomDate @status_date output,'20121001','20100918'
-- Insert 3 statuses for the last inserted client
insert into client_status
values (
@client_id,
@status_date,
@client_status
)
exec getRandomStatus @client_status output
exec getRandomDate @status_date output,'20121001','20100918'
insert into client_status
values (
@client_id,
@status_date,
@client_status
)
exec getRandomStatus @client_status output
exec getRandomDate @status_date output,'20121001','20100918'
insert into client_status
values (
@client_id,
@status_date,
@client_status
)

Explanations:

select @max_salesperson_id = max(ID) from salesperson: in this article we insert only one row in the table salesperson. Therefore, max(ID) will be the ID of the row we have just inserted.

select @client_id = SCOPE_IDENTITY(): we want to insert a row in the status table. We want this row in the status table to correspond to the ID of the client we just inserted. The row will contain information about the status of the inserted client. To do that, we need to know what is the ID of the client we just inserted (we don’t know this value, as it is generated automatically when inserting the row in clients). This is why we use SCOPE_IDENTITY(), which gives use the ID generated for the last row that was inserted.

You have now learned how to generate random data and insert this data in tables. The next step is to execute the same actions in a loop to create a massive amount of data. This will be the subject of the next article.

 
Please help this site survive, share on:



Author:
Last udpated: September 10, 2017
Created: March 24, 2015

Something not clear? Ask your question here:

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>