Featured image of post SQL-Exploring CET Analytic(Window) Functions, Common Table Expressions (CTEs)

SQL-Exploring CET Analytic(Window) Functions, Common Table Expressions (CTEs)

Explored in Oracle, MSSQL , MySql and Postgres

1. Analytic Functions in SQL

Analytic functions (also called window functions) perform calculations across a specific range of rows (a window) related to the current row. Unlike aggregate functions (which collapse multiple rows into one), analytic functions retain individual row results while still providing aggregated insights.

πŸ”Ή Key Features of Analytic Functions

  • They operate over a defined window of rows.
  • They do not group the result set into a single row.
  • They require an OVER() clause.

πŸ”Ή Common Analytic Functions

FunctionDescription
ROW_NUMBER()Assigns a unique row number to each row in a partition.
RANK()Assigns a rank to each row, allowing for ties (skips numbers for ties).
DENSE_RANK()Similar to RANK(), but without skipping numbers for ties.
NTILE(n)Distributes rows into n roughly equal buckets.
LAG(column, n)Returns the value of the column from n rows before the current row.
LEAD(column, n)Returns the value of the column from n rows after the current row.
SUM() OVER()Calculates a running total.
AVG() OVER()Computes a moving average.

πŸ”Ή Example: Using ROW_NUMBER()

Let’s say we have a sales table:

idcustomeramount
1Alice100
2Bob200
3Alice150
4Bob250
5Charlie300

To rank each customer’s purchases:

1
2
3
4
5
SELECT 
    customer, 
    amount, 
    ROW_NUMBER() OVER(PARTITION BY customer ORDER BY amount DESC) AS row_num
FROM sales;

Breakdown:

  • PARTITION BY customer: Resets the row numbering for each customer.
  • ORDER BY amount DESC: Orders within each partition by highest amount.
  • ROW_NUMBER(): Assigns a unique row number per partition.

Result:

customeramountrow_num
Alice1501
Alice1002
Bob2501
Bob2002
Charlie3001

2. Common Table Expressions (CTEs)

A Common Table Expression (CTE) is a temporary result set that exists only within the execution of a single query. It helps improve readability and modularity.

πŸ”Ή Why Use CTEs?

  • Makes complex queries easier to read.
  • Can be referenced multiple times within the same query.
  • Helps avoid subquery repetition.

πŸ”Ή Basic Syntax

1
2
3
4
WITH cte_name AS (
    -- Your query here
)
SELECT * FROM cte_name;

πŸ”Ή Example: Using a CTE

Consider a sales table where we want to calculate the total sales per customer, and then filter those with sales above $200.

1
2
3
4
5
6
7
8
WITH total_sales AS (
    SELECT customer, SUM(amount) AS total_amount
    FROM sales
    GROUP BY customer
)
SELECT * 
FROM total_sales
WHERE total_amount > 200;

Breakdown:

  1. The CTE (total_sales) first computes SUM(amount) per customer.
  2. The main query filters results where total_amount > 200.

πŸ’‘ Alternative (Without CTEs): We’d have to repeat the aggregation:

1
2
3
4
5
6
SELECT * FROM (
    SELECT customer, SUM(amount) AS total_amount
    FROM sales
    GROUP BY customer
) t
WHERE total_amount > 200;

CTEs make queries much cleaner! 😎


Combining Analytic Functions and CTEs

Sometimes, CTEs and analytic functions work well together. Let’s rank sales using a CTE:

1
2
3
4
5
6
7
8
WITH ranked_sales AS (
    SELECT 
        customer, 
        amount, 
        ROW_NUMBER() OVER(PARTITION BY customer ORDER BY amount DESC) AS rank
    FROM sales
)
SELECT * FROM ranked_sales WHERE rank = 1;

πŸ”Ή Purpose: This finds the highest sale per customer.



3. Temporary Table

A Temporary Table is a physical table stored in tempdb (SQL Server) or a temporary schema (MySQL, PostgreSQL). It can store intermediate results and be referenced multiple times within a session.

Characteristics of Temporary Tables

  • Scope:
    • #TempTable (local) exists only in the current session.
    • ##GlobalTempTable (global) can be accessed by multiple sessions.
  • Performance: Stored in memory/disk, better for large data sets.
  • Reusability: Can be modified after creation (INSERT, UPDATE, DELETE).
  • Indexing: Can have indexes for performance optimization.
  • Syntax:
    1
    2
    3
    4
    5
    6
    7
    8
    
    CREATE TABLE #temp_table (
        id INT,
        name VARCHAR(50)
    );
    
    INSERT INTO #temp_table VALUES (1, 'Alice');
    
    SELECT * FROM #temp_table;
    

Example: Using a Temporary Table to Store Sales Data

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
-- Create a temporary table
CREATE TABLE #TempSales (
    customer VARCHAR(50),
    total_amount DECIMAL(10,2)
);

-- Insert aggregated data into temp table
INSERT INTO #TempSales
SELECT customer, SUM(amount) 
FROM sales
GROUP BY customer;

-- Query from temp table multiple times
SELECT * FROM #TempSales WHERE total_amount > 500;
SELECT COUNT(*) FROM #TempSales;

-- Drop the temp table when done
DROP TABLE #TempSales;

πŸ”Ή What Happens?

  • #TempSales is physically created in memory (or disk if large).
  • The data persists until the session ends or the table is dropped.
  • You can run multiple queries against it, unlike a CTE.

** Key Differences Between CTEs and Temp Tables**

FeatureCTETemporary Table
ScopeExists only for the duration of the queryExists for the session or until explicitly dropped
PersistenceNot stored physicallyStored in tempdb (SQL Server) or temporary schema
PerformanceBest for small, in-memory calculationsBetter for large datasets that need indexing
ReusabilityCannot be reused outside the queryCan be queried multiple times
ModifiabilityRead-onlyCan be modified (INSERT, UPDATE, DELETE)
IndexingNo indexesCan have indexes

** When to Use What?**

ScenarioBest Choice
Breaking down complex queries for better readabilityβœ… CTE
One-time use within a single queryβœ… CTE
Need to reuse data in multiple queriesβœ… Temp Table
Working with large datasets that need indexingβœ… Temp Table
Needing to modify data after creationβœ… Temp Table

** 4. Analytic Function Syntax in Different Databases**

** Oracle**

Oracle has the most mature support for analytic functions, available since Oracle 8i.

Basic Example: Ranking Orders by Amount

1
2
3
4
SELECT 
    customer_id, order_id, amount,
    RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS order_rank
FROM orders;

Running Total Example

1
2
3
4
SELECT 
    customer_id, order_id, amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_id) AS running_total
FROM orders;

πŸ”Ή Oracle-Specific Features

  • Supports advanced functions like CUME_DIST() and PERCENT_RANK().
  • Performance-optimized window functions with Oracle’s query optimizer.
  • Can use RANGE and ROWS in the OVER() clause.

** SQL Server (MSSQL)**

SQL Server introduced analytic functions in SQL Server 2012, making it one of the later adopters.

Basic Example: Ranking Employees by Salary

1
2
3
4
SELECT 
    employee_id, department_id, salary,
    RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank_in_department
FROM employees;

Lag and Lead Functions Example

1
2
3
4
5
SELECT 
    employee_id, salary, 
    LAG(salary, 1, 0) OVER (ORDER BY employee_id) AS previous_salary,
    LEAD(salary, 1, 0) OVER (ORDER BY employee_id) AS next_salary
FROM employees;

πŸ”Ή SQL Server-Specific Features

  • Full support for analytic functions (since 2012).
  • Performance tuning required for large datasets.
  • Limited statistical functions (compared to Oracle/PostgreSQL).

** PostgreSQL**

PostgreSQL introduced full analytic function support in version 8.4 and provides more flexibility than SQL Server.

Basic Example: Ranking Products by Sales

1
2
3
4
SELECT 
    product_id, category_id, sales,
    RANK() OVER (PARTITION BY category_id ORDER BY sales DESC) AS rank
FROM sales;

Running Sum Example

1
2
3
4
SELECT 
    customer_id, amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS running_sum
FROM transactions;

πŸ”Ή PostgreSQL-Specific Features

  • Supports advanced statistical window functions (CUME_DIST(), PERCENT_RANK()).
  • Supports custom aggregate window functions (not available in Oracle or SQL Server).
  • Better performance for large datasets (compared to SQL Server).

** MySQL (8.0+)**

MySQL introduced window functions in MySQL 8.0 (before that, they were unsupported).

Basic Example: Ranking Customers

1
2
3
4
SELECT 
    customer_id, order_id, amount,
    RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rank
FROM orders;

Running Total Example

1
2
3
4
SELECT 
    customer_id, order_id, amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_id) AS running_total
FROM orders;

πŸ”Ή MySQL-Specific Limitations

  • Lacks statistical functions (CUME_DIST(), PERCENT_RANK()).
  • Performance is not optimized for large datasets.
  • No custom aggregate window functions (unlike PostgreSQL).

πŸš€ Key Differences Across Databases

FeatureOracleSQL Server (MSSQL)PostgreSQLMySQL (8.0+)
First to support window functions?βœ… Yes (Oracle 8i)❌ No (SQL Server 2012)βœ… Yes (PostgreSQL 8.4)❌ No (MySQL 8.0)
Ranking functions (RANK(), DENSE_RANK(), ROW_NUMBER())βœ… Yesβœ… Yesβœ… Yesβœ… Yes
Lag/Lead functions (LAG(), LEAD())βœ… Yesβœ… Yesβœ… Yesβœ… Yes
Running totals (SUM() OVER(), AVG() OVER())βœ… Yesβœ… Yesβœ… Yesβœ… Yes
Statistical functions (CUME_DIST(), PERCENT_RANK())βœ… Yesβœ… Yesβœ… Yes❌ No
Custom aggregate window functions❌ No❌ Noβœ… Yes❌ No