Performance Tuning: Monitoring and Improving Database Performance
Performance tuning is an ongoing process of monitoring, analyzing, and optimizing your PostgreSQL database. A well-tuned database can handle significantly more load and provide faster response times. Let's explore the tools and techniques for performance tuning.
Performance Monitoring Tools
Built-in Statistics Views
PostgreSQL provides several system catalogs and views for monitoring performance.
-- Check table statistics
SELECT
schemaname,
relname AS tablename,
seq_scan,
seq_tup_read,
idx_scan,
idx_tup_fetch,
n_tup_ins,
n_tup_upd,
n_tup_del,
n_live_tup,
n_dead_tup
FROM pg_stat_user_tables
ORDER BY seq_scan DESC;
-- Check index usage
SELECT
schemaname,
relname AS tablename,
indexrelname AS indexname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;
-- Check slow queries (requires the pg_stat_statements extension; on PostgreSQL 12 and older the timing columns are named total_time/mean_time instead of total_exec_time/mean_exec_time)
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
rows,
100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
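If pg_stat_statements has not been set up yet, it must be preloaded and the extension created before the query above returns anything; a minimal sketch (the preload setting only takes effect after a server restart):
-- Load the module at server start, then create the extension in the database
-- (if other libraries are already preloaded, append to the existing list instead)
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
-- ... restart PostgreSQL ...
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;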
Query Performance Analysis
-- Analyze query execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2023-01-01'
GROUP BY u.id, u.name
HAVING COUNT(o.id) > 5;
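Keep in mind that EXPLAIN ANALYZE actually executes the statement. When profiling a data-modifying query, a sketch like the following wraps it in a transaction that is rolled back (the status value and date filter are illustrative, assuming the orders columns from the earlier examples):
-- EXPLAIN ANALYZE runs the statement for real, so roll back any writes
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders SET status = 'archived' WHERE order_date < '2022-01-01';
ROLLBACK;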
-- Check for sequential scans
SELECT
schemaname,
tablename,
attname,
n_distinct,
correlation
FROM pg_stats
WHERE schemaname = 'public'
ORDER BY tablename, attname;
Identifying Performance Bottlenecks
Slow Queries
-- Find queries that take the most time
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
stddev_exec_time,
rows
FROM pg_stat_statements
WHERE mean_exec_time > 100 -- queries averaging more than 100 ms
ORDER BY total_exec_time DESC
LIMIT 10;
-- Find queries with high I/O
SELECT
query,
shared_blks_read,
shared_blks_hit,
shared_blks_written,
shared_blks_dirtied
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;
Table and Index Issues
-- Find tables with many sequential scans
SELECT
schemaname,
relname AS tablename,
seq_scan,
seq_tup_read,
idx_scan,
CASE
WHEN seq_scan + COALESCE(idx_scan, 0) = 0 THEN 0
ELSE ROUND(100.0 * seq_scan / (seq_scan + COALESCE(idx_scan, 0)), 2)
END as seq_scan_percent
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_scan_percent DESC;
-- Find unused indexes
SELECT
schemaname,
relname AS tablename,
indexrelname AS indexname,
idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, tablename;
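Before dropping anything, confirm the index does not back a primary key or unique constraint and that the statistics cover a representative workload period. A cautious sketch for removing a confirmed-unused index without blocking writes (the index name is a hypothetical candidate):
-- DROP INDEX CONCURRENTLY avoids locking out writes; it cannot run inside a transaction block
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_old_status; -- hypothetical index name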
-- Find bloated tables
SELECT
schemaname,
relname AS tablename,
n_live_tup,
n_dead_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) as dead_percent
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY dead_percent DESC;
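The dead-tuple counters above are estimates. For a precise bloat measurement, the pgstattuple contrib extension can be used; it scans the whole table, so run it off-peak. For example:
-- pgstattuple reports exact live/dead tuple counts and free space for a table
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('users');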
Query Optimization Techniques
Index Optimization
-- Create composite indexes for common query patterns
CREATE INDEX idx_orders_user_date ON orders(user_id, order_date DESC);
-- Create partial indexes for filtered queries
CREATE INDEX idx_users_active_email ON users(email) WHERE active = true;
-- Create expression indexes for function calls
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
-- Create covering indexes to avoid table lookups
CREATE INDEX idx_orders_covering ON orders(user_id, order_date)
INCLUDE (amount, status);
Query Rewriting
-- Bad: OR across different columns can prevent a single index from being used efficiently
SELECT * FROM users
WHERE email = 'user@example.com' OR phone = '+1234567890';
-- Good: UNION lets each branch use its own index (verify with EXPLAIN that it wins for your data)
SELECT * FROM users WHERE email = 'user@example.com'
UNION
SELECT * FROM users WHERE phone = '+1234567890';
-- Bad: NOT IN with a subquery (returns no rows at all if the subquery yields any NULLs, and often plans poorly)
SELECT * FROM users
WHERE id NOT IN (SELECT user_id FROM orders);
-- Good: Use NOT EXISTS
SELECT * FROM users u
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
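An equivalent anti-join written as a LEFT JOIN is another option the planner handles well; which form wins depends on the data, so compare plans with EXPLAIN:
-- Alternative: LEFT JOIN anti-join
SELECT u.* FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE o.user_id IS NULL;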
Partitioning for Large Tables
-- Create partitioned table
CREATE TABLE orders_partitioned (
id SERIAL,
user_id INTEGER,
amount DECIMAL(10,2),
order_date DATE
) PARTITION BY RANGE (order_date);
-- Create partitions
CREATE TABLE orders_2023_01 PARTITION OF orders_partitioned
FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
CREATE TABLE orders_2023_02 PARTITION OF orders_partitioned
FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
-- Query only relevant partition
SELECT * FROM orders_partitioned
WHERE order_date >= '2023-01-15' AND order_date < '2023-01-16';
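Two details worth adding in practice: a default partition catches rows that fall outside the defined ranges, and an index created on the partitioned parent is automatically created on every partition (PostgreSQL 11 and later). A minimal sketch:
-- Catch-all partition for out-of-range dates
CREATE TABLE orders_default PARTITION OF orders_partitioned DEFAULT;
-- Index on the parent cascades to all partitions
CREATE INDEX idx_orders_part_user ON orders_partitioned (user_id);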
Configuration Tuning
Memory Settings
-- Check current settings
SHOW shared_buffers;
SHOW work_mem;
SHOW maintenance_work_mem;
SHOW effective_cache_size;
-- Recommended settings for different scenarios
-- For dedicated database server (8GB RAM):
-- shared_buffers = 2GB (25% of RAM)
-- work_mem = 16MB
-- maintenance_work_mem = 256MB
-- effective_cache_size = 6GB (75% of RAM)
-- For shared server:
-- shared_buffers = 256MB
-- work_mem = 4MB
-- maintenance_work_mem = 64MB
-- effective_cache_size = 1GB
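These parameters can be changed without editing postgresql.conf by hand; a sketch using ALTER SYSTEM with the dedicated-server values above (work_mem and effective_cache_size apply after a reload, while shared_buffers only takes effect after a restart):
ALTER SYSTEM SET work_mem = '16MB';
ALTER SYSTEM SET effective_cache_size = '6GB';
SELECT pg_reload_conf(); -- applies reloadable settings
-- shared_buffers requires a full server restart to change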
Checkpoint and WAL Settings
-- Checkpoint settings
SHOW checkpoint_completion_target;
SHOW wal_buffers;
SHOW max_wal_size;
-- Optimize for write-heavy workloads
-- checkpoint_completion_target = 0.9
-- wal_buffers = 16MB
-- max_wal_size = 2GB
Connection and Concurrency
-- Connection settings
SHOW max_connections;
SHOW shared_preload_libraries;
-- For high-concurrency applications
-- max_connections = 200
-- shared_preload_libraries = 'pg_stat_statements'
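To see how close the server is to the configured limit, the current connection count can be compared against max_connections, for example:
-- Current connections vs. the configured maximum
SELECT count(*) AS current_connections,
current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;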
Maintenance Tasks
Regular Maintenance
-- Update table statistics
ANALYZE users;
ANALYZE orders;
-- Update statistics for every table in the current database
ANALYZE VERBOSE;
-- Rebuild indexes
REINDEX INDEX idx_users_email;
-- Rebuild all indexes in the database (takes exclusive locks; run during a maintenance window)
REINDEX DATABASE mydatabase;
-- Vacuum tables to reclaim space
VACUUM users;
VACUUM ANALYZE users;
-- Full vacuum (locks table)
VACUUM FULL users;
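Manual VACUUM should complement, not replace, autovacuum; it helps to confirm autovacuum is enabled and to review its thresholds, e.g.:
-- Review autovacuum configuration
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%';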
Automated Maintenance
-- Create maintenance function
-- Note: VACUUM cannot run inside a function (functions always execute within a
-- transaction block), so this routine refreshes statistics and records which
-- tables still need vacuuming; the vacuuming itself is left to autovacuum or a
-- separately scheduled job
CREATE OR REPLACE FUNCTION maintenance_routine()
RETURNS void AS $$
BEGIN
-- Update statistics
ANALYZE;
-- Record tables with more than 10% dead tuples so they can be vacuumed separately
INSERT INTO maintenance_log (action, completed_at)
SELECT 'needs_vacuum: ' || schemaname || '.' || relname, NOW()
FROM pg_stat_user_tables
WHERE n_dead_tup > n_live_tup * 0.1;
-- Log maintenance completion
INSERT INTO maintenance_log (action, completed_at)
VALUES ('routine_maintenance', NOW());
END;
$$ LANGUAGE plpgsql;
-- Schedule with pg_cron (if available)
SELECT cron.schedule('daily-maintenance', '0 2 * * *', 'SELECT maintenance_routine();');
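Because VACUUM cannot run inside the function above, a separate pg_cron job can cover it; each pg_cron job runs in its own session, where VACUUM is allowed. A sketch:
-- Nightly database-wide vacuum, scheduled outside the maintenance function
SELECT cron.schedule('nightly-vacuum', '0 3 * * *', 'VACUUM ANALYZE;');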
Performance Monitoring Dashboard
Create Performance Views
-- Create a comprehensive performance view
CREATE VIEW performance_dashboard AS
WITH table_stats AS (
SELECT
schemaname,
relname AS tablename,
n_live_tup,
n_dead_tup,
seq_scan,
idx_scan,
CASE
WHEN seq_scan + COALESCE(idx_scan, 0) = 0 THEN 0
ELSE ROUND(100.0 * seq_scan / (seq_scan + COALESCE(idx_scan, 0)), 2)
END as seq_scan_percent
FROM pg_stat_user_tables
),
index_stats AS (
SELECT
schemaname,
relname AS tablename,
COUNT(*) as index_count,
SUM(CASE WHEN idx_scan = 0 THEN 1 ELSE 0 END) as unused_indexes
FROM pg_stat_user_indexes
GROUP BY schemaname, tablename
),
slow_queries AS (
SELECT
query,
calls,
mean_exec_time,
total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 100
ORDER BY total_exec_time DESC
LIMIT 5
)
SELECT
'Table Statistics' as category,
schemaname,
tablename,
n_live_tup as live_rows,
n_dead_tup as dead_rows,
seq_scan_percent
FROM table_stats
WHERE seq_scan_percent > 50
UNION ALL
SELECT
'Index Issues' as category,
schemaname,
tablename,
index_count as total_indexes,
unused_indexes as unused_count,
0 as seq_scan_percent
FROM index_stats
WHERE unused_indexes > 0
UNION ALL
-- Slow statements from pg_stat_statements (columns reused: live_rows = calls,
-- seq_scan_percent = mean execution time in ms)
SELECT
'Slow Queries' as category,
NULL,
left(query, 60),
calls,
0,
round(mean_exec_time::numeric, 2)
FROM slow_queries;
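The view can then be queried directly or fed into a monitoring tool; keep in mind it reflects cumulative counters since the last statistics reset:
-- Inspect current problem areas
SELECT * FROM performance_dashboard ORDER BY category;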
Real-time Monitoring
-- Monitor active queries
SELECT
pid,
usename,
application_name,
client_addr,
state,
query_start,
query
FROM pg_stat_activity
WHERE state = 'active'
AND query NOT LIKE '%pg_stat_activity%';
-- Monitor locks
SELECT
l.pid,
l.mode,
l.granted,
t.relname as table_name,
a.usename,
a.query
FROM pg_locks l
JOIN pg_class t ON l.relation = t.oid
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE t.relname IS NOT NULL;
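When a session is stuck waiting, pg_blocking_pids() shows which backends hold the conflicting locks, and pg_cancel_backend()/pg_terminate_backend() can stop the offender; a cautious sketch (the pid shown is a placeholder):
-- Who is blocking whom
SELECT pid, pg_blocking_pids(pid) AS blocked_by, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
-- Cancel just the running query, or terminate the whole backend as a last resort
-- SELECT pg_cancel_backend(12345);    -- hypothetical pid
-- SELECT pg_terminate_backend(12345); -- hypothetical pid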
Best Practices
1. Regular Monitoring
-- Set up automated monitoring queries
-- Check for queries that have been running for more than 5 minutes
SELECT
pid,
usename,
application_name,
EXTRACT(EPOCH FROM (NOW() - query_start)) as duration_seconds,
query
FROM pg_stat_activity
WHERE state = 'active'
AND query_start < NOW() - INTERVAL '5 minutes';
-- Check for table bloat
SELECT
schemaname,
relname AS tablename,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) as dead_percent
FROM pg_stat_user_tables
WHERE n_dead_tup > n_live_tup * 0.1;
2. Proactive Optimization
-- Create indexes based on query patterns
-- Monitor slow queries and create appropriate indexes
-- Use covering indexes for frequently accessed columns
-- Consider partitioning for large tables
-- Example: Create covering index for common query
CREATE INDEX idx_orders_user_covering ON orders(user_id, order_date)
INCLUDE (amount, status);
3. Configuration Management
-- Document configuration changes
-- Test changes in staging environment
-- Monitor impact of changes
-- Use connection pooling for high-concurrency applications
-- Example: PgBouncer configuration
-- [databases]
-- mydb = host=localhost port=5432 dbname=mydb
--
-- [pgbouncer]
-- pool_mode = transaction
-- max_client_conn = 1000
-- default_pool_size = 20
Performance tuning is an iterative process. Start by identifying the biggest bottlenecks, implement optimizations, measure the impact, and continue refining. Regular monitoring and maintenance are key to maintaining optimal performance.