Table of Contents
- 1 The Problem: Case Sensitivity Creates User Experience Nightmares
- 2 The Naive Solution: LOWER() Everywhere
- 3 The Effective Solution: citext Type
- 4 How citext Works Under the Hood
- 5 Real-World Examples
- 6 Advanced Patterns
- 7 Performance Considerations
- 8 When NOT to Use citext
- 9 Common Pitfalls
- 10 Migration Example: Converting an Existing Application
- 11 Testing citext Behavior
- 12 Summary
- 13 Further Reading
The Problem: Case Sensitivity Creates User Experience Nightmares
You’ve built a user registration system. Alice signs up with [email protected]. A week later, she tries to log in but types [email protected]. Your login fails because PostgreSQL’s TEXT type is case-sensitive:
-- Alice's account
INSERT INTO users (email, password_hash)
VALUES ('[email protected]', '$2b$12$...');
-- Alice tries to log in
SELECT * FROM users
WHERE email = '[email protected]';
-- Returns 0 rows! Authentication fails.
This is frustrating for users and creates duplicate account problems:
-- Alice creates a "new" account with different casing
INSERT INTO users (email, password_hash)
VALUES ('[email protected]', '$2b$12$...');
-- ✓ Success! Now Alice has two accounts.
The Naive Solution: LOWER() Everywhere
The common workaround is to use LOWER() in queries:
-- Store everything in lowercase
INSERT INTO users (email, password_hash)
VALUES (LOWER('[email protected]'), '$2b$12$...');
-- Query with LOWER()
SELECT * FROM users
WHERE LOWER(email) = LOWER('[email protected]');
Problems with this approach:
Problem 1: You Must Remember LOWER() Everywhere
One forgotten LOWER() and you have a bug:
-- ❌ Developer forgets LOWER() in a new feature SELECT * FROM users WHERE email = '[email protected]'; -- Returns nothing -- ❌ Background job doesn't use LOWER() UPDATE users SET last_login = NOW() WHERE email = user_input; -- Case-sensitive match fails
Problem 2: Can’t Use Standard Indexes
A regular index on email won’t help queries with LOWER():
CREATE INDEX idx_users_email ON users(email); -- This query can't use the index! SELECT * FROM users WHERE LOWER(email) = '[email protected]'; -- Seq Scan on users (SLOW!)
You need a functional index instead:
CREATE INDEX idx_users_email_lower ON users(LOWER(email)); -- Now this query can use the index SELECT * FROM users WHERE LOWER(email) = '[email protected]'; -- Index Scan using idx_users_email_lower (FAST!)
Problem 3: Unique Constraints Don’t Work as Expected
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT UNIQUE, -- This is case-sensitive!
password_hash TEXT
);
INSERT INTO users VALUES (1, '[email protected]', 'hash1');
INSERT INTO users VALUES (2, '[email protected]', 'hash2');
-- ✓ Both succeed! You now have duplicate emails.
To fix this, you need a partial unique index:
CREATE UNIQUE INDEX idx_users_email_unique_lower ON users(LOWER(email));
Problem 4: Code Complexity and Maintenance Burden
Every developer needs to remember:
- Always use
LOWER()in INSERT/UPDATE - Always use
LOWER()in WHERE clauses - Always use
LOWER()in JOIN conditions - Create functional indexes on
LOWER(column) - Create unique indexes on
LOWER(column)
This is error-prone and creates a maintenance nightmare.
The Effective Solution: citext Type
PostgreSQL provides the citext (case-insensitive text) extension that handles all of this automatically:
-- Enable the extension (once per database)
CREATE EXTENSION IF NOT EXISTS citext;
-- Create table with citext
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email citext UNIQUE,
password_hash TEXT
);
Now case-insensitivity works automatically:
-- Insert with mixed case INSERT INTO users VALUES (1, '[email protected]', 'hash1'); -- Query with different case - it works! SELECT * FROM users WHERE email = '[email protected]'; -- Returns Alice's row ✓ -- Try to insert duplicate with different case INSERT INTO users VALUES (2, '[email protected]', 'hash2'); -- ERROR: duplicate key value violates unique constraint "users_email_key" -- DETAIL: Key (email)=([email protected]) already exists.
Magic! The database handles case-insensitivity at the type level.
How citext Works Under the Hood
citext is not a native type—it’s a domain built on top of TEXT with custom comparison operators:
-- These are all equivalent when comparing citext values 'Alice'::citext = 'alice'::citext -- true 'Alice'::citext = 'ALICE'::citext -- true 'Alice'::citext = 'aLiCe'::citext -- true
Comparison Operators
All standard operators work case-insensitively:
SELECT
'alice'::citext = 'ALICE'::citext as equals, -- true
'alice'::citext != 'bob'::citext as not_equals, -- true
'alice'::citext < 'BOB'::citext as less_than, -- true (lexicographic)
'charlie'::citext > 'BOB'::citext as greater_than, -- true
'alice'::citext LIKE 'AL%' as pattern_match; -- true
LIKE and ILIKE
-- With citext, LIKE is already case-insensitive! SELECT email FROM users WHERE email LIKE 'alice%'; -- Matches: '[email protected]', '[email protected]', '[email protected]' -- ILIKE still works but is redundant SELECT email FROM users WHERE email ILIKE 'alice%'; -- Same results
Real-World Examples
Example 1: User Authentication System
CREATE TABLE users (
id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
email citext UNIQUE NOT NULL,
username citext UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Create index (standard B-tree works perfectly!)
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_username ON users(username);
Registration:
-- User registers with mixed case
INSERT INTO users (email, username, password_hash)
VALUES ('[email protected]', 'AliceSmith', '$2b$12$...')
RETURNING id;
-- ✓ Success: id = 1
-- Someone tries to register with different casing
INSERT INTO users (email, username, password_hash)
VALUES ('[email protected]', 'alicesmith', '$2b$12$...');
-- ✗ ERROR: duplicate key value violates unique constraint
Login (email or username):
-- Login with different case variations SELECT id, password_hash FROM users WHERE email = '[email protected]' OR username = 'alicesmith'; -- Returns Alice's record ✓ SELECT id, password_hash FROM users WHERE email = '[email protected]' OR username = 'AliceSmith'; -- Returns Alice's record ✓
Example 2: Product SKU Codes
SKUs are often case-insensitive in practice:
CREATE TABLE products (
id INTEGER PRIMARY KEY,
sku citext UNIQUE NOT NULL,
name TEXT NOT NULL,
price NUMERIC(10, 2) NOT NULL
);
INSERT INTO products VALUES
(1, 'WIDGET-001', 'Basic Widget', 19.99),
(2, 'GADGET-002', 'Premium Gadget', 49.99);
Queries work regardless of case:
-- Customer searches with lowercase SELECT * FROM products WHERE sku = 'widget-001'; -- Returns: WIDGET-001 ✓ -- Warehouse system uses uppercase SELECT * FROM products WHERE sku = 'WIDGET-001'; -- Returns: WIDGET-001 ✓ -- API uses mixed case SELECT * FROM products WHERE sku = 'Widget-001'; -- Returns: WIDGET-001 ✓
Duplicate prevention:
-- Try to insert duplicate with different case INSERT INTO products VALUES (3, 'widget-001', 'Duplicate', 29.99); -- ✗ ERROR: duplicate key value violates unique constraint
Example 3: Tags and Categories
CREATE TABLE tags (
id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name citext UNIQUE NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Users create tags with various casings
INSERT INTO tags (name) VALUES ('PostgreSQL');
INSERT INTO tags (name) VALUES ('Database');
INSERT INTO tags (name) VALUES ('Tutorial');
Finding or creating tags:
-- Function to get or create tag (case-insensitive lookup)
CREATE FUNCTION get_or_create_tag(tag_name citext)
RETURNS INTEGER AS $$
DECLARE
tag_id INTEGER;
BEGIN
-- Try to find existing tag (case-insensitive)
SELECT id INTO tag_id FROM tags WHERE name = tag_name;
IF tag_id IS NULL THEN
-- Create new tag
INSERT INTO tags (name) VALUES (tag_name) RETURNING id INTO tag_id;
END IF;
RETURN tag_id;
END;
$$ LANGUAGE plpgsql;
-- These all return the same tag ID
SELECT get_or_create_tag('postgresql'); -- Returns 1
SELECT get_or_create_tag('PostgreSQL'); -- Returns 1
SELECT get_or_create_tag('POSTGRESQL'); -- Returns 1
SELECT get_or_create_tag('PoStGrEsQl'); -- Returns 1
Example 4: Domain Names and Hostnames
Domain names are case-insensitive by specification:
CREATE TABLE websites (
id INTEGER PRIMARY KEY,
domain citext UNIQUE NOT NULL,
owner_id INTEGER NOT NULL,
ssl_enabled BOOLEAN DEFAULT false
);
INSERT INTO websites VALUES
(1, 'Example.com', 100, true),
(2, 'GitHub.com', 101, true);
Lookups work naturally:
-- User types domain in various ways SELECT * FROM websites WHERE domain = 'example.com'; -- ✓ Found SELECT * FROM websites WHERE domain = 'Example.com'; -- ✓ Found SELECT * FROM websites WHERE domain = 'EXAMPLE.COM'; -- ✓ Found SELECT * FROM websites WHERE domain = 'ExAmPlE.CoM'; -- ✓ Found
Example 5: File Paths (Case-Insensitive Filesystems)
For systems running on Windows or macOS (case-insensitive filesystems):
CREATE TABLE file_metadata (
id INTEGER PRIMARY KEY,
path citext UNIQUE NOT NULL,
size_bytes BIGINT NOT NULL,
modified_at TIMESTAMPTZ NOT NULL
);
-- Store file with one casing
INSERT INTO file_metadata VALUES
(1, '/Users/Alice/Documents/Report.pdf', 1048576, NOW());
-- Find file with different casing
SELECT * FROM file_metadata
WHERE path = '/users/alice/documents/report.pdf';
-- ✓ Found (matches filesystem behavior on Mac/Windows)
Advanced Patterns
Pattern 1: Partial citext for Mixed-Case Storage with Case-Insensitive Queries
You want to preserve the original case but query case-insensitively:
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email_display TEXT NOT NULL, -- Original case preserved
email_search citext UNIQUE NOT NULL -- For searching
);
-- Trigger to keep them in sync
CREATE FUNCTION sync_email() RETURNS TRIGGER AS $$
BEGIN
NEW.email_search := NEW.email_display;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER sync_email_trigger
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_email();
-- Insert with original casing
INSERT INTO users (id, email_display)
VALUES (1, '[email protected]');
-- email_search is automatically set to '[email protected]' (as citext)
-- Query case-insensitively, get original casing
SELECT email_display FROM users WHERE email_search = '[email protected]';
-- Returns: '[email protected]' ✓
Pattern 2: Case-Insensitive JOINs
CREATE TABLE customers (
id INTEGER PRIMARY KEY,
email citext UNIQUE
);
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_email citext NOT NULL,
total NUMERIC(10, 2)
);
-- JOIN works case-insensitively
SELECT c.id, o.total
FROM customers c
JOIN orders o ON c.email = o.customer_email;
-- Matches regardless of case differences in the data
Pattern 3: Migrating Existing TEXT Columns
-- Start with a TEXT column
CREATE TABLE products (
id INTEGER PRIMARY KEY,
sku TEXT UNIQUE
);
-- Enable citext
CREATE EXTENSION IF NOT EXISTS citext;
-- Migrate to citext
BEGIN;
-- Drop old unique constraint
ALTER TABLE products DROP CONSTRAINT products_sku_key;
-- Change column type
ALTER TABLE products ALTER COLUMN sku TYPE citext;
-- Add unique constraint back
ALTER TABLE products ADD CONSTRAINT products_sku_key UNIQUE (sku);
COMMIT;
Pattern 4: Case-Insensitive Arrays
CREATE TABLE articles (
id INTEGER PRIMARY KEY,
title TEXT,
tags citext[] NOT NULL DEFAULT '{}'
);
-- Insert with mixed case tags
INSERT INTO articles VALUES
(1, 'PostgreSQL Tips', ARRAY['Database', 'SQL', 'Tutorial']::citext[]);
-- Search for tag (case-insensitive)
SELECT * FROM articles WHERE 'database'::citext = ANY(tags);
-- ✓ Found (matches 'Database')
SELECT * FROM articles WHERE 'TUTORIAL'::citext = ANY(tags);
-- ✓ Found (matches 'Tutorial')
Performance Considerations
Index Performance
citext uses the same B-tree index structure as TEXT:
-- Standard B-tree index works perfectly CREATE INDEX idx_users_email ON users(email); -- Query uses index efficiently EXPLAIN SELECT * FROM users WHERE email = '[email protected]'; -- Index Scan using idx_users_email
Performance is nearly identical to TEXT + functional index, but with simpler code.
Storage Overhead
citext has zero storage overhead compared to TEXT—it’s stored identically:
SELECT pg_column_size('Hello'::text); -- 9 bytes
SELECT pg_column_size('Hello'::citext); -- 9 bytes (same!)
Comparison Speed
citext comparisons are slightly slower than TEXT because they perform case-folding:
-- Benchmark: 1 million comparisons -- TEXT: ~50ms -- citext: ~65ms -- Overhead: ~30% slower
But this is negligible because:
- Comparisons are very fast (microseconds)
- You avoid full table scans from forgotten
LOWER() - You eliminate bugs from inconsistent case handling
When NOT to Use citext
Case 1: Case Matters for Business Logic
If case differences are meaningful, don’t use citext:
-- ❌ WRONG: Password hashes should be case-sensitive!
CREATE TABLE users (
email citext,
password_hash citext -- DON'T DO THIS!
);
-- ✅ CORRECT: Only case-insensitive columns use citext
CREATE TABLE users (
email citext,
password_hash TEXT -- Keep case-sensitive
);
Case 2: Binary Data or Exact Matching Required
-- ❌ WRONG: API keys should be exact matches
CREATE TABLE api_keys (
key citext PRIMARY KEY -- Don't do this
);
-- ✅ CORRECT: Use TEXT for exact matching
CREATE TABLE api_keys (
key TEXT PRIMARY KEY
);
Case 3: International Characters
citext uses database locale for case-folding, which can have issues with some Unicode characters:
-- Turkish 'i' problem SELECT 'i'::citext = 'İ'::citext; -- May not work as expected in some locales
For full Unicode case-insensitivity, consider using ICU collations (PostgreSQL 12+):
CREATE COLLATION case_insensitive (
provider = icu,
locale = 'und-u-ks-level2',
deterministic = false
);
CREATE TABLE users (
email TEXT COLLATE case_insensitive
);
Common Pitfalls
Pitfall 1: Mixing TEXT and citext in Comparisons
CREATE TABLE users (email citext);
INSERT INTO users VALUES ('[email protected]');
-- ❌ This works but is implicit casting
SELECT * FROM users WHERE email = '[email protected]'; -- TEXT literal
-- ✅ Better: explicit cast when needed
SELECT * FROM users WHERE email = '[email protected]'::citext;
In practice, PostgreSQL handles this well, but explicit casts make intent clear.
Pitfall 2: Forgetting citext in Function Parameters
-- ❌ WRONG: TEXT parameter loses case-insensitivity
CREATE FUNCTION find_user(user_email TEXT) RETURNS INTEGER AS $$
SELECT id FROM users WHERE email = user_email;
$$ LANGUAGE SQL;
-- ✅ CORRECT: Use citext parameter
CREATE FUNCTION find_user(user_email citext) RETURNS INTEGER AS $$
SELECT id FROM users WHERE email = user_email;
$$ LANGUAGE SQL;
Pitfall 3: Not Understanding How GROUP BY Works
CREATE TABLE events (
id INTEGER,
tag citext
);
INSERT INTO events VALUES
(1, 'PostgreSQL'),
(2, 'postgresql'),
(3, 'POSTGRESQL');
-- GROUP BY with citext
SELECT tag, COUNT(*) FROM events GROUP BY tag;
-- Returns:
-- tag | count
-- -------------|------
-- PostgreSQL | 3 (all grouped together!)
This might be surprising if you expected separate groups. It’s actually the desired behavior for case-insensitive grouping.
Migration Example: Converting an Existing Application
Let’s migrate a real application from TEXT + LOWER() to citext:
Before (TEXT with LOWER()):
-- Schema
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email TEXT UNIQUE NOT NULL
);
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Application code
def find_user_by_email(email):
return db.query(
"SELECT * FROM users WHERE LOWER(email) = LOWER(%s)",
(email,)
)
def create_user(email, password):
db.execute(
"INSERT INTO users (email) VALUES (LOWER(%s))",
(email,)
)
After (citext):
-- Migration
CREATE EXTENSION IF NOT EXISTS citext;
BEGIN;
ALTER TABLE users DROP CONSTRAINT users_email_key;
DROP INDEX idx_users_email_lower;
ALTER TABLE users ALTER COLUMN email TYPE citext;
ALTER TABLE users ADD CONSTRAINT users_email_key UNIQUE (email);
CREATE INDEX idx_users_email ON users(email);
COMMIT;
-- Simplified application code
def find_user_by_email(email):
return db.query(
"SELECT * FROM users WHERE email = %s", # No LOWER()!
(email,)
)
def create_user(email, password):
db.execute(
"INSERT INTO users (email) VALUES (%s)", # No LOWER()!
(email,)
)
Benefits:
- Removed all
LOWER()calls - Simpler queries
- Standard indexes instead of functional indexes
- Less error-prone
- Same performance
Testing citext Behavior
Here’s a complete test suite:
-- Setup
CREATE EXTENSION IF NOT EXISTS citext;
CREATE TABLE test_citext (value citext);
-- Test 1: Equality
INSERT INTO test_citext VALUES ('Hello');
SELECT value FROM test_citext WHERE value = 'hello'; -- ✓ Returns 'Hello'
SELECT value FROM test_citext WHERE value = 'HELLO'; -- ✓ Returns 'Hello'
SELECT value FROM test_citext WHERE value = 'HeLLo'; -- ✓ Returns 'Hello'
-- Test 2: Uniqueness
CREATE TABLE test_unique (id INTEGER, email citext UNIQUE);
INSERT INTO test_unique VALUES (1, '[email protected]');
INSERT INTO test_unique VALUES (2, '[email protected]'); -- ✗ ERROR: duplicate
-- Test 3: Ordering
CREATE TABLE test_order (name citext);
INSERT INTO test_order VALUES ('Charlie'), ('alice'), ('BOB');
SELECT name FROM test_order ORDER BY name;
-- Returns: alice, BOB, Charlie (case-insensitive sort)
-- Test 4: Pattern Matching
SELECT 'Alice'::citext LIKE 'ali%'; -- ✓ true
SELECT 'Alice'::citext LIKE 'ALI%'; -- ✓ true
SELECT 'Alice'::citext ~ '^ali'; -- ✗ false (regex is still case-sensitive!)
SELECT 'Alice'::citext ~* '^ali'; -- ✓ true (use ~* for case-insensitive regex)
-- Cleanup
DROP TABLE test_citext, test_unique, test_order;
Summary
Key Takeaways:
- Use
citextfor identifiers where case shouldn’t matter (emails, usernames, SKUs) - citext makes UNIQUE constraints work intuitively
- Standard B-tree indexes work without needing functional indexes
- No need to remember
LOWER()in every query - Near-zero performance overhead
- Case is preserved for display but ignored for comparison
- Don’t use citext for case-sensitive data (passwords, API keys)
- Be careful with Unicode edge cases in some locales
The Bottom Line:
citext eliminates an entire class of bugs by making case-insensitivity a type-level guarantee rather than a convention that developers must remember. It’s simpler, safer, and just as fast as the LOWER() pattern.
Further Reading
Next in the series: Item 3 – Use EXCLUSION Constraints for Temporal Data

I build softwares that solve problems. I also love writing/documenting things I learn/want to learn.