Featured image of post SQL Database Testing: Generating Sample Data for SQL DBs

SQL Database Testing: Generating Sample Data for SQL DBs

Using Faker, tSQLt, pgTAP for MSSQL, MySql and Postgres

Introduction

If you’ve ever worked with SQL databases, you know how important realistic test data is. Whether you’re testing stored procedures, triggers, or query performance, generating high-quality fake data is essential.

This article explores various tools for SQL database data generation, including:

  • Faker – Python-based fake data generator
  • tSQLt – Unit testing for SQL Server
  • pgTAP – Unit testing for PostgreSQL
  • dbForge Data Generator – GUI-based data seeding
  • SQL Data Generator – Commercial tool for test data

Why Use Fake Data in SQL Databases?

FeatureDescription
Unit TestingEnsures stored procedures and queries return expected results.
Performance TestingSimulate real-world load with thousands of records.
Database SeedingPopulate dev/test databases without real data.
Data PrivacyAvoid using sensitive production data in testing.
RepeatabilityEnsure tests produce consistent results.

Generating Fake Data with Faker (Python)

Faker is a Python library for generating realistic fake data.

1. Installing Faker

1
pip install faker

2. Generating Fake SQL Data

1
2
3
4
5
6
from faker import Faker

fake = Faker()

for _ in range(10):
    print(f"INSERT INTO users (name, email, address) VALUES ('{fake.name()}', '{fake.email()}', '{fake.address()}');")

3. Inserting Faker Data into PostgreSQL

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
import psycopg2

conn = psycopg2.connect("dbname=testdb user=postgres password=secret")
cur = conn.cursor()

fake = Faker()

for _ in range(100):
    cur.execute(
        "INSERT INTO users (name, email, address) VALUES (%s, %s, %s);",
        (fake.name(), fake.email(), fake.address())
    )

conn.commit()
cur.close()
conn.close()

Generating Fake Data in SQL Server Using tSQLt

tSQLt is a unit testing framework for SQL Server.

4. Installing tSQLt

1
2
3
4
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

5. Creating a Fake Table in SQL Server

1
2
3
EXEC tSQLt.FakeTable 'dbo.Users';

INSERT INTO dbo.Users (Name, Email) VALUES ('John Doe', 'john@example.com');

6. Running a Unit Test in SQL Server

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
CREATE PROCEDURE TestUserTable AS
BEGIN
    EXEC tSQLt.FakeTable 'dbo.Users';
    
    INSERT INTO dbo.Users (Name, Email) VALUES ('Alice Smith', 'alice@example.com');

    DECLARE @count INT;
    SELECT @count = COUNT(*) FROM dbo.Users;

    EXEC tSQLt.AssertEquals 1, @count;
END;

Generating Fake Data in PostgreSQL Using pgTAP

pgTAP is a unit testing framework for PostgreSQL.

7. Installing pgTAP

1
CREATE EXTENSION IF NOT EXISTS pgtap;

8. Running a Unit Test in PostgreSQL

1
2
3
4
5
6
SELECT plan(2);

SELECT has_column('users', 'email', 'Users table has an email column');
SELECT results_eq('SELECT count(*) FROM users WHERE active = TRUE', ARRAY[10], 'There are 10 active users');

SELECT finish();

Comparing Data Generation Tools

ToolDatabase SupportEase of UseBest For
FakerAll DatabasesEasyQuick fake data generation
tSQLtSQL ServerModerateUnit testing stored procedures
pgTAPPostgreSQLModeratePostgreSQL unit testing
dbForgeSQL Server, MySQLEasy (GUI)Large-scale database seeding
SQL Data GeneratorSQL ServerModerateCommercial test data tool

Alternative Approaches: Pros & Cons

ApproachProsCons
Using Static DataSimple & predictableHard to scale
Hand-Coded Test DataControl over test casesTime-consuming
Using FakerFast & repeatableRequires Python
Using tSQLtIntegrated with SQL ServerSQL Server only
Using pgTAPNative PostgreSQL solutionPostgreSQL only

  • Faker + SQLAlchemy: Use Faker to populate test databases in Python.
  • tSQLt + SQL Server: Automate stored procedure testing in SQL Server.
  • pgTAP + PostgreSQL: Ensure data integrity and performance testing in PostgreSQL.

Key Ideas

  • Faker is a great tool for generating fake SQL data in any database.
  • tSQLt and pgTAP allow for native SQL-based unit testing.
  • Database seeding is critical for unit tests, performance testing, and development.
  • Choose the right tool based on your database and testing needs.

References

  1. Faker GitHub Repository
  2. Faker Documentation
  3. tSQLt Official Site
  4. pgTAP Documentation
  5. SQL Server Unit Testing
  6. dbForge Data Generator