
log file sync - Diagnose Oracle Commit Performance Issues

Wait Event Class: Commit

Parameters: buffer# (the log buffer position that must be flushed)

log file sync is the wait a foreground session incurs from the moment it issues a COMMIT (or ROLLBACK) until the Log Writer (LGWR) background process has written all redo generated by that transaction to the online redo log files on disk.

  1. Session calls COMMIT
  2. Session posts LGWR to write redo
  3. Session waits on log file sync
  4. LGWR writes redo to disk (itself waiting on log file parallel write)
  5. LGWR signals the session that the write is complete
  6. Session’s log file sync wait ends; the commit returns to the application

The critical insight: log file sync duration equals LGWR’s write latency plus any LGWR scheduling delay. If LGWR is slow — because redo logs are on slow storage, because the OS is not scheduling LGWR promptly, or because there are too many concurrent commits — every committed transaction feels that latency.

  • Every single committed transaction waits on this event
  • In high-frequency OLTP systems, log file sync is commonly in the top 3 wait events
  • A 5 ms average wait means every commit takes at least 5 ms — capping throughput at ~200 commits/second per session
  • High log file sync waits directly translate to poor application response time for write-heavy workloads
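The per-session ceiling in the point above can be read straight from V$SYSTEM_EVENT, whose average_wait column is in centiseconds. A rough sketch (assumes the instance has been up long enough for the average to be meaningful):

```sql
-- Back-of-the-envelope: a session committing serially can complete at most
-- one commit per average log file sync wait, i.e. 1000 / avg_ms per second.
SELECT
ROUND(average_wait * 10, 2) AS avg_wait_ms,
ROUND(1000 / NULLIF(average_wait * 10, 0)) AS max_serial_commits_per_sec
FROM v$system_event
WHERE event = 'log file sync';
```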

Average wait time guidelines:

  • < 1 ms: Excellent — redo logs on NVMe or very fast SAN with write cache
  • 1–5 ms: Good — typical well-tuned SAN or SSD
  • 5–10 ms: Investigate — may be storage contention, slow disk, or misconfiguration
  • 10–20 ms: Problem — application performance is significantly impacted
  • > 20 ms: Critical — redo log I/O is severely degraded

This event is expected in any database that commits. It becomes problematic when:

  • Average latency is consistently above 5–10 ms
  • The total time in log file sync represents > 10% of DB time in an OLTP system
  • Application response times spike in correlation with commit-heavy operations
  • log file sync average latency diverges significantly from log file parallel write average latency (indicating LGWR scheduling delays)

-- System-level commit wait overview
SELECT
event,
total_waits,
total_timeouts,
ROUND(time_waited / 100, 2) AS total_secs,
ROUND(average_wait * 10, 2) AS avg_wait_ms,
ROUND(time_waited * 100.0 /
NULLIF((SELECT SUM(time_waited)
FROM v$system_event
WHERE wait_class NOT IN ('Idle')), 0), 2) AS pct_db_time
FROM v$system_event
WHERE event IN ('log file sync', 'log file parallel write')
ORDER BY time_waited DESC;
-- Compare sync vs parallel write to detect LGWR scheduling gaps
-- If log file sync avg >> log file parallel write avg, LGWR is being starved of CPU
SELECT
e1.event AS sync_event,
ROUND(e1.average_wait * 10, 2) AS sync_avg_ms,
e2.event AS write_event,
ROUND(e2.average_wait * 10, 2) AS write_avg_ms,
ROUND((e1.average_wait - e2.average_wait) * 10, 2) AS lgwr_sched_delay_ms
FROM v$system_event e1
CROSS JOIN v$system_event e2
WHERE e1.event = 'log file sync'
AND e2.event = 'log file parallel write';
-- Active sessions waiting for log file sync right now
SELECT
sw.sid,
sw.serial#,
s.username,
s.program,
s.machine,
s.sql_id,
sw.p1 AS log_buffer_position,
sw.seconds_in_wait AS secs_waiting,
s.last_call_et AS session_last_active_secs
FROM v$session_wait sw
JOIN v$session s ON sw.sid = s.sid
WHERE sw.event = 'log file sync'
ORDER BY sw.seconds_in_wait DESC;
-- Online redo log groups, sizes, and current status
SELECT
l.group#,
l.members,
ROUND(l.bytes / 1024 / 1024, 0) AS size_mb,
l.status,
l.archived,
l.sequence#,
TO_CHAR(l.first_time, 'YYYY-MM-DD HH24:MI:SS') AS first_time
FROM v$log l
ORDER BY l.group#;
-- Redo log member file locations
SELECT
lf.group#,
lf.member,
lf.status,
lf.type
FROM v$logfile lf
ORDER BY lf.group#, lf.member;
-- Log switch frequency (should be every 15-30+ minutes in OLTP)
SELECT
TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
COUNT(*) AS switches_per_hour,
ROUND(60.0 / COUNT(*), 1) AS avg_minutes_between_switches
FROM v$log_history
WHERE first_time > SYSDATE - 7
GROUP BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER BY 1 DESC
FETCH FIRST 48 ROWS ONLY;
-- Commits per second from V$SYSSTAT
SELECT
name,
value AS total_commits,
ROUND(value / (SELECT (SYSDATE - startup_time) * 86400
FROM v$instance), 2) AS commits_per_sec
FROM v$sysstat
WHERE name IN ('user commits', 'user rollbacks');
-- ASH: Which programs/users are causing the most log file sync waits?
SELECT
s.program,
s.username,
COUNT(*) AS ash_samples,
COUNT(*) AS estimated_wait_secs, -- V$ASH samples ~once per second (use * 10 only for DBA_HIST)
ROUND(COUNT(*) * 100.0 /
SUM(COUNT(*)) OVER (), 2) AS pct_of_sync_waits
FROM v$active_session_history ash
JOIN v$session s ON ash.session_id = s.sid
WHERE ash.event = 'log file sync'
AND ash.sample_time > SYSDATE - 1/24
GROUP BY s.program, s.username
ORDER BY ash_samples DESC
FETCH FIRST 15 ROWS ONLY;

1. Slow Redo Log Storage

Redo log members on spinning disks, NFS mounts, or shared SAN volumes with high contention exhibit high per-write latency. Since every commit blocks on this write, storage speed is the primary determinant of commit throughput.

Key point: Redo logs must be written sequentially and synchronously before the commit returns. This is non-negotiable for durability. The only solution is faster storage or reducing the frequency of commits.

Best practice: Place redo log members on the fastest available storage — dedicated SSDs, NVMe, or a high-tier SAN LUN that is not shared with data files. In ASM, use a separate disk group for redo logs.

2. Excessive Commit Frequency — Committing After Every Row

Applications that commit inside a row-by-row loop generate one log file sync wait per row. For a million-row insert, this means 1 million individual commits, each paying the full latency penalty.

Example pattern:

-- BAD: Commit per row
FOR each_record IN source_cursor LOOP
INSERT INTO target_table VALUES (each_record);
COMMIT; -- 1 commit per row = 1 log file sync per row
END LOOP;

This is the single most common cause of high aggregate log file sync time. The fix is trivial: batch commits.

3. Redo Log Files Too Small — Frequent Log Switches

Small redo log files cause frequent log switches. At each switch, LGWR cannot reuse the next group until its checkpoint has completed and, in ARCHIVELOG mode, until it has been archived. When switches outpace checkpointing and archiving, sessions stall on log file switch (checkpoint incomplete) and log file switch (archiving needed) waits, and log file sync times climb with them.

Target: redo log switches no more frequently than every 15–30 minutes during peak load.

4. LGWR CPU Starvation and Scheduling Delays

In heavily loaded systems, LGWR may not get CPU promptly after being posted by a committing session. The gap between when LGWR is woken and when it actually runs shows up as the difference between the log file sync average and the log file parallel write average. In virtualized environments, vCPU scheduling can introduce significant delays.
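To see whether LGWR itself is the holdup, its session can be located through V$PROCESS (the PNAME column exists from 11g onward) and inspected directly. A sketch:

```sql
-- What is LGWR doing right now? Frequent non-idle waits, or long gaps
-- between runs, point to I/O or scheduling problems respectively.
SELECT s.sid, s.state, s.event, s.seconds_in_wait
FROM v$session s
JOIN v$process p ON s.paddr = p.addr
WHERE p.pname = 'LGWR';
```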

5. Slow or Unequal Redo Log Members

Each log group must have all its members written before LGWR considers the write complete. If one member is on fast storage and another is on slow storage, every write waits for the slowest member.

Detection: Check V$LOGFILE for member locations; compare performance of different mount points.
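V$EVENT_HISTOGRAM offers another detection angle: a bimodal latency distribution for log file parallel write (mostly sub-millisecond writes plus a separate cluster of slow ones) is a classic signature of one lagging member. A sketch:

```sql
-- Distribution of LGWR write latencies; each bucket counts waits
-- shorter than wait_time_milli milliseconds
SELECT wait_time_milli, wait_count
FROM v$event_histogram
WHERE event = 'log file parallel write'
ORDER BY wait_time_milli;
```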

6. Asynchronous I/O Disabled

When asynchronous I/O is not available (e.g., due to OS configuration or file system type), LGWR blocks on each write. With async I/O enabled, LGWR can issue writes to multiple log members concurrently. Check V$IOSTAT_FILE and the disk_asynch_io parameter.


Move Redo Log Members to Faster Storage

-- Step 1: Identify current log file locations
SELECT group#, member FROM v$logfile ORDER BY group#, member;
-- Step 2: Add new members on faster storage to each group
ALTER DATABASE ADD LOGFILE MEMBER '/fast_disk/redo/redo01b.log' TO GROUP 1;
ALTER DATABASE ADD LOGFILE MEMBER '/fast_disk/redo/redo02b.log' TO GROUP 2;
ALTER DATABASE ADD LOGFILE MEMBER '/fast_disk/redo/redo03b.log' TO GROUP 3;
-- Step 3: Perform log switches to move away from old members
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM CHECKPOINT;
-- Step 4: Drop old slow members (only after group is not CURRENT/ACTIVE)
ALTER DATABASE DROP LOGFILE MEMBER '/slow_disk/redo/redo01a.log';
ALTER DATABASE DROP LOGFILE MEMBER '/slow_disk/redo/redo02a.log';
ALTER DATABASE DROP LOGFILE MEMBER '/slow_disk/redo/redo03a.log';
Batch Commits Instead of Committing Per Row

-- BAD pattern (commit per row):
BEGIN
FOR i IN 1..1000000 LOOP
INSERT INTO target_table VALUES (i, 'data');
COMMIT;
END LOOP;
END;
/
-- GOOD pattern (commit every N rows):
DECLARE
l_commit_interval NUMBER := 10000;
BEGIN
FOR i IN 1..1000000 LOOP
INSERT INTO target_table VALUES (i, 'data');
IF MOD(i, l_commit_interval) = 0 THEN
COMMIT;
END IF;
END LOOP;
COMMIT; -- Final commit for remainder
END;
/
-- BEST pattern for bulk loads (single commit):
INSERT /*+ APPEND */ INTO target_table
SELECT * FROM source_table;
COMMIT;

Resize Redo Log Files to Reduce Switch Frequency

-- Step 1: Check current sizes and switch frequency
SELECT group#, ROUND(bytes/1024/1024) AS size_mb FROM v$log;
-- Calculate target size: aim for ~20 minutes between switches
-- optimal_size_mb = redo_mb_per_minute * 20
-- Redo generated per AWR interval (default interval is 60 minutes;
-- divide by the interval length in minutes to get redo per minute):
SELECT
ROUND(AVG(redo_size_mb), 2) AS avg_redo_per_interval_mb
FROM (
SELECT
snap_id,
(e_redo - b_redo) / 1024 / 1024 AS redo_size_mb
FROM (
SELECT
snap_id,
SUM(value) AS e_redo,
LAG(SUM(value)) OVER (ORDER BY snap_id) AS b_redo
FROM dba_hist_sysstat
WHERE stat_name = 'redo size'
AND snap_id BETWEEN &begin_snap AND &end_snap
GROUP BY snap_id
)
);
-- Step 2: Add new larger log groups
ALTER DATABASE ADD LOGFILE GROUP 4
('/fast_disk/redo/redo04a.log', '/mirror/redo/redo04b.log')
SIZE 1G;
ALTER DATABASE ADD LOGFILE GROUP 5
('/fast_disk/redo/redo05a.log', '/mirror/redo/redo05b.log')
SIZE 1G;
-- Step 3: Drop old groups (must not be CURRENT or ACTIVE)
-- Force switches until old groups are INACTIVE
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM CHECKPOINT;
-- Check status before dropping
SELECT group#, status FROM v$log;
ALTER DATABASE DROP LOGFILE GROUP 1;
ALTER DATABASE DROP LOGFILE GROUP 2;

Enable Asynchronous I/O

-- Check current async I/O setting
SHOW PARAMETER disk_asynch_io;
-- Enable if disabled (requires OS support)
ALTER SYSTEM SET disk_asynch_io = TRUE SCOPE=SPFILE;
-- Requires database restart
-- Verify LGWR is using async I/O
SELECT * FROM v$iostat_file
WHERE file_type = 'Log File';

1. Place Redo Logs on Dedicated Fast Storage

Never share redo log LUNs with data files, temp files, or archive logs. Use a dedicated high-priority storage pool. In virtualized environments, use thick-provisioned disks with write-back cache enabled at the array level.

2. Monitor Commit Rate and log file sync Latency Continuously

-- Create a baseline metric to track over time
SELECT
TO_CHAR(s.begin_interval_time, 'YYYY-MM-DD HH24:MI') AS snap_time,
ROUND((e2.time_waited_micro - e1.time_waited_micro) /
NULLIF((e2.total_waits - e1.total_waits), 0) / 1000, 3) AS avg_ms_per_commit
FROM dba_hist_system_event e1
JOIN dba_hist_system_event e2 ON e2.snap_id = e1.snap_id + 1
AND e2.event_name = e1.event_name
AND e2.dbid = e1.dbid
AND e2.instance_number = e1.instance_number
JOIN dba_hist_snapshot s ON s.snap_id = e2.snap_id
AND s.dbid = e2.dbid
AND s.instance_number = e2.instance_number
WHERE e1.event_name = 'log file sync'
ORDER BY s.begin_interval_time DESC
FETCH FIRST 48 ROWS ONLY;

3. Consider COMMIT_WRITE for Async Commits (Non-Critical Data)

For workloads where absolute durability is not required for every commit (e.g., session state data, non-financial logging), Oracle supports asynchronous commit — the commit returns immediately without waiting for LGWR. Use with care.

-- Session-level async commit (use cautiously)
-- Note: COMMIT_WRITE is deprecated from 11g in favor of COMMIT_LOGGING / COMMIT_WAIT
ALTER SESSION SET commit_write = 'BATCH,NOWAIT';
-- Or per-statement via the COMMIT WRITE clause (SQL syntax, not a hint):
COMMIT WRITE BATCH NOWAIT;
-- Reset to default (synchronous) when done
ALTER SESSION SET commit_write = 'IMMEDIATE,WAIT';

For bulk loads where data can be reloaded from source if needed, use NOLOGGING to dramatically reduce redo generation (and thus log file sync waits):

-- Direct path insert with NOLOGGING
ALTER TABLE target_table NOLOGGING;
INSERT /*+ APPEND */ INTO target_table SELECT * FROM source_table;
COMMIT;
ALTER TABLE target_table LOGGING;
-- Note: Take a backup after NOLOGGING operations