Understanding LGWR, Log File Sync Waits and Commit ... - E2SN [PDF]

Understanding LGWR, Log File Sync Waits and Commit Performance Tanel Põder http://blog.tanelpoder.com http://tech.e2sn.com

Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

1

Intro: About me • Tanel Põder • http://tech.e2sn.com - My company and technical Oracle stuff • http://blog.tanelpoder.com - Personal Blog (more tech stuff) • [email protected] - Questions & enquiries

• Consulting, Training, Seminars • [email protected] • http://tech.e2sn.com/oracle-training-seminars • Online seminars coming soon!


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

2

Topics • How does commit & log file sync work • Overview

• Reasons for “too long” log file sync waits

• How to measure where’s the problem?


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

3

Reasons for log file sync waits • Commits wait for log file sync by default • User commits • There’s an user commits statistic in v$sesstat

• DDL • Resulting recursive transactions commit

• Recursive data dictionary DML

• Rollbacks wait too! • User rollbacks • User/application issued a rollback command

• Transaction rollbacks • We had an internal rollback (because of some failure) • Space allocation/ASSM problems, cancelled queries, killed sessions


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

4

Commit & log file sync flow – idealistic overview Time

1) User issues a COMMIT 5) LGWR posts the FG proc.

2) Foreground proc posts LGWR

FG proc.

3) LGWR issues the physical write syscall

LGWR

6) Commit complete

4) The physcical write syscall completes

IO

log file parallel write log file sync Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

5

Log file sync performance -> disk IO speed Time

The physical write IO (log file parallel write) takes most of the time

FG proc.

LGWR IO

Most of the log file sync time was spent waiting on log file parallel write The other components, scheduling latency, IPC were small


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

6

Log file sync performance -> scheduling latency Time

1) User issues a COMMIT

2) LGWR waits in CPU runqueue

6) Foreground proc gets posted, gets onto CPU runqueue

FG proc.

LGWR IO

3) LGWR submits the IO, goes to sleep


© 2009-2010 www.e2sn.com

4) IO completes, OS puts LGWR to CPU runqueue WWW.E2SN.COM

5) LGWR gets onto CPU, posts foreground proc.

7

Log file sync flow 1. Foreground process (FG) posts LGWR and goes to sleep • The “log file sync” wait starts • Posting is done via a semaphore operation on Unix/Linux

2. LGWR wakes up, gets onto CPU • Issues the IO request(s) • LGWR goes to sleep, waiting for “log file parallel write” wait

3. Hardware completes the IO and OS wakes up LGWR • LGWR gets onto CPU • Marks “log file parallel write” event complete and posts the FG

4. Foreground process is woken up by LGWR post • Foreground process gets onto CPU and completes the “log file sync” wait Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

8

Measuring LGWR "speed" SQL> @snapper out 1 10 1096 -- Session Snapper v2.01 by Tanel Poder ( http://www.tanelpoder.com ) --------------------------------------------------------------------------------SID, USERNAME , TYPE, STATISTIC , DELTA, HDELTA/SEC, %TIME, GRAPH --------------------------------------------------------------------------------1096, (LGWR) , STAT, messages sent , 12 , 12, 1096, (LGWR) , STAT, messages received , 10 , 10, 1096, (LGWR) , STAT, background timeouts , 1 , 1, 1096, (LGWR) , STAT, physical write total IO requests , 40, 40, 1096, (LGWR) , STAT, physical write total multi block request, 38, 38, 1096, (LGWR) , STAT, physical write total bytes, 2884608 , 2.88M, 1096, (LGWR) , STAT, calls to kcmgcs , 20 , 20, 1096, (LGWR) , STAT, redo wastage , 4548 , 4.55k, 1096, (LGWR) , STAT, redo writes , 10 , 10, 1096, (LGWR) , STAT, redo blocks written , 2817 , 2.82k, 1096, (LGWR) , STAT, redo write time , 25 , 25, 1096, (LGWR) , WAIT, LGWR wait on LNS , 1040575 , 1.04s, 104.1%, |@@@@@@@@@@| 1096, (LGWR) , WAIT, log file parallel write , 273837 , 273.84ms, 27.4%,|@@@ | 1096, (LGWR) , WAIT, events in waitclass Other , 1035172 , 1.04s , 103.5%,|@@@@@@@@@@| -- End of snap 1, end=2010-03-23 12:46:04, seconds=1


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

9

LGWR and Asynch IO oracle@linux01:~$ strace -cp `pgrep -f lgwr` Process 12457 attached - interrupt to quit ^CProcess 12457 detached % time seconds usecs/call calls errors This is --------what the log-------------- ----------- ----------file263 parallel write38wait 100.00 0.010000 3 0.00 0.000000 event0measures213 – AIO 0.00 0.000000 0 8 reaping duration 0.00 0.000000 0 701 0.00 0.000000 0 41 0.00 0.000000 0 41 This system call is2not 0.00 0.000000 0 instrumented by37wait 0.00 0.000000 0 ------ ----------- -----------Interface! --------- --------100.00 0.010000 1081 3


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

syscall -------------semtimedop times getrusage gettimeofday io_getevents io_submit semop semctl -------------total

10

Warning • The (background processes) IO instrumentation has quite a few bugs • Different IO modes, sync, asynd, direct, buffered etc

• On some versions, the log file parallel write (and db file parallel write) aren’t properly instrumented • Version dependent, for example 11.2.0.1 on Linux • When filesystemio_options = NONE, IO syscall waits are instrumented ok (but you don’t wan to use this option) • When filesystem_io_options = ASYNC, IO reaping waits are all very short • However there’s unaccounted time in LGWR’s wait profile

• When filesystem_io_options = SETALL, IO reaping waits are instrumented properly Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

11

Redo, commit related latches and “tuning” • Redo related latches • redo allocation latches • Protect allocating space in log buffer / RBA ranges in redolog stream

• redo copy latches • Used only for keeping track of whether anyone’s copying data into redo log buffer • …so that LGWR would know to wait for these memory copies to complete before it tries to write buffers to disk • LGWR will wait for LGWR wait for redo copy wait event in such cases • Used to be tuned by _log_simultaneous_copies

• Should we “tune” any of these? • No, we should fix only problems which exist • In other words, if wait interface doesn’t show anyone waiting for them, then don’t bother “tuning” them!


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

12

Instrumentation • Wait Events: • log file sync • log file parallel write • log file single write

• Performance Counters (V$SESSTAT, V$SYSSTAT) • • • • •


redo size redo writing time user commits user rollbacks transaction rollbacks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

13

Wait event: log buffer space • (Not a commit problem) • LGWR is too slow flushing redo log buffer contents to disk • Either because too slow IO subsystem • Or LGWR not getting enough (quality) CPU time • Sometimes pops up due large (unplanned) transactions

• Of course, it can also be because of a too small log buffer • Which is not the case anymore in modern days • Log buffer is usually multiple MB due how it is allocated from SGA • You shouldn’t even set the log_buffer parameter in 10g+


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

14

Wait event: log file single write • Single block redo IO is used mostly for logfile header block reading/writing • Log switch is the main cause • Archiving as well as it updates log header • Who wait: LGWR & ARCH

• Example of what LGWR does during a log switch: WAIT #0: nam='log file sequential read' ela= 12607 log#=0 block#=1 WAIT #0: nam='log file sequential read' ela= 21225 log#=1 block#=1 WAIT #0: nam='control file sequential read' ela= 358 file#=0 WAIT #0: nam='log file single write' ela= 470 log#=0 block#=1 WAIT #0: nam='log file single write' ela= 227 log#=1 block#=1


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

15

LGWR trace warnings… • Starting from 10.2.0.3 (or was it 10.2.0.4) LGWR is trying to be helpful and dump warnings when the actual log write IO takes too long: • New parameter: _side_channel_batch_timeout_ms • timeout before shipping out the batched side channelmessages in milliseconds

• LGWR trace file: *** 2010-03-10 11:36:06.759 Warning: log write time 690ms, size 19KB *** 2010-03-10 11:37:23.778 Warning: log write time 52710ms, size 0KB *** 2010-03-10 11:37:27.302 Warning: log write time 3520ms, size 144KB Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

16

Log file sync in Statspack / AWR • How much of the end-to-end response time goes to log file sync? • How big it is compared to the full response time of the end user? • Log file sync may take 20% of your DB Time… • …but the DB Time itself may take only 10% of the total end user response time!


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

17

Log file sync in Statspack / AWR • If log file sync waits take a significant part of the response time, look into the “Avg wait (ms)” column: Top 5 Timed Events Avg %Total ~~~~~~~~~~~~~~~~~~ wait Call Event Waits Time (s) (ms) Time ------------------------------------- ------------ ----------- ------ -----PL/SQL lock timer 57,159 8,754 153 68.8 db file sequential read 61,258 1,562 25 12.3 log file sync 5,873 1,522 259 12.0 CPU time 463 3.6 direct path write 235,606 155 1 1.2 ............

Event Waits ------------------------ -----------log file parallel write 13,292 db file parallel write 3,470 db file sequential read 257


© 2009-2010 www.e2sn.com

%Tim Total Wait out Time (s) ---- ---------0 108 0 39 0 4

WWW.E2SN.COM

Avg %Total wait Waits Call (ms) /txn Time ------ -------- -----8 0.9 .8 11 0.2 .3 16 0.0 .0

18

Better breakdown of wait times • V$EVENT_HISTOGRAM • Instead of a single wait time average, breaks wait times into buckets SQL> select event, wait_time_milli,wait_count 2 from v$event_histogram 3 where event = 'log file parallel write'; EVENT WAIT_TIME_MILLI WAIT_COUNT ------------------------- --------------- ---------log file parallel write 1 22677 log file parallel write 2 424 log file parallel write 4 141 log file parallel write 8 340 log file parallel write 16 1401 log file parallel write 32 812 log file parallel write 64 391 log file parallel write 128 21 log file parallel write 256 6


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

19

Identifying scheduling latency Easy on Solaris • prstat –m • Works since Solaris 8

• Timed_os_statistics • OS Wait-cpu (latency) time in v$sesstat

On other OS’es, it can’t be directly measured with standard tools • Indirectly, you can look into system wide average CPU runqueue length and assume that LGWR was also queueing • Not a too systematic approach, huh?


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

20

Measure scheduling latency (Solaris) • Reading thread-level microstate accounting data with prstat: • • • • • • •

USR SYS TRP TFL/DFL LCK SLP LAT

# prstat -mLp 5124 PID USERNAME USR 5124 oracle 4.2 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0 5124 oracle 0.0


© 2009-2010 www.e2sn.com

-

% % % % % % %

SYS 1.6 0.6 0.6 0.6 0.6 0.0 0.0 0.0 0.0

Time spent on CPU in user mode Time spent on CPU in kernel mode Time spent processing system traps (CPU traps) Time spent processing text/data page faults Time spent waiting for user locks Time spent sleeping (other than user locks) CPU scheduling latency

TRP 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

TFL 0.6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

DFL 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

LCK 0.0 99 99 99 99 0.0 100 100 100

SLP 94 0.0 0.0 0.7 0.0 100 0.0 0.0 0.0

WWW.E2SN.COM

LAT VCX ICX SCL SIG PROCESS/LWPID 0.1 412 20 3K 0 oracle/1 0.0 44 3 90 0 oracle/11 0.0 44 0 90 0 oracle/7 0.0 46 0 90 0 oracle/9 0.0 44 1 90 0 oracle/5 0.0 24 0 7 0 oracle/2 0.0 1 0 5 0 oracle/3 0.0 1 0 5 0 oracle/10 0.0 1 0 5 0 oracle/8

21

Bugs, problems • No instrumentation • On some version/platform/IO configuration the wait interface doesn’t record log file parallel write waits at all • The same goes for db file parallel writes • (I’ve noticed it on 9.2-10.2.0.x on Solaris for example)

• For LGWR you can use V$SESSTAT redo write time statistic instead • It’s in centiseconds

• 1-second log file sync bug • Most log file syncs took ~1 second to complete • The posts sent back by LGWR were missed by foreground process • Thus the FG always waited until the 1 second log file sync wait timeout happened


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

22

Tuning • No need for tuning! • Log buffer is quite large by default • All memory remaining in a granule after the allocation for fixed SGA is given to log buffer

• Oracle used to have a single redo log buffer until v9.0 • Redo allocation latch could become the ultimate contention point

• Since 9.2, Oracle can have the log buffer split into multiple buffers • Each protected by a separate redo allocation latch

• From 10g, Oracle can keep lots of small private redo strands in shared pool • Each protected by a separate redo allocation latch • @rs.sql – Show redo strands available


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

23

Evil tuning • If you don’t care about the D in ACID (and want to occasionally lose data for fun), then: • 10gR1: • commit_logging

transaction commit log write behavior

• 10gR2: • commit_write • commit_wait

transaction commit log write behavior transaction commit log wait behavior

• Old undocumented stuff • _wait_for_sync

wait for sync on commit MUST BE ALWAYS TRUE

• Old: Put redologs to /tmp (on Solaris) or in-memory disks (/dev/shm) for duration of a migration/upgrade • If your OS / server crashes, you’ll need to restore from a backup! Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

24

Optimizations for working around bad applications • Commit optimization • In PL/SQL since Oracle 9i • The log file sync is deferred until the end of the PL/SQL call! SQL> exec while true loop update t set a=a+1 ; commit ; end loop;

• No log file sync waits • log buffer space / log file switch completion waits more likely!


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

25

LGWR configuration – CPU • Prevent priority decay • Put LGWR into fixed priority scheduling class (FX60 on Solaris) • LGWR should get onto CPU faster when waking up • LGWR isn’t thrown off CPU as likely

• If LGWR is still experiencing significant scheduling latency • You can put LGWR into a higher priority class

• You should not put LGWR into the highest real-time class • Real time is tricky – your process can monopolize a CPU for itself • You don’t want to make LGWR pre-empt the OS kernel! • Note that Oracle sets some processes into higher priority by default: • _high_priority_processes Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

LMS*|VKTM 26

LGWR configuration - IO • Reduce the amount of work and waiting a “log file parallel write” has to do • Unbuffered concurrent IO • Verify with truss/strace whether proper flags are used • (O_DIRECT, O_DIO, O_CIO etc)

• Or use raw devices • ASM is essentially a raw device

• Or ODM for some cases

• And optimize the whole IO hardware stack, of course! • Note shat mid-large size storage arrays do have write cache built in • So, moving redo log files to SSD may not give any advantage! • Verify what’s your current “log file parallel write” latency using v$event_histogram Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

27

log file sync magic tuning super-secret!!!

COMMIT LESS!!! Commit when your business transaction ends, not after every single update!


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

28

Summary • Application: Commit less! • Ideally only when your logical business transaction ends

• Troubleshooting: Measure log file sync at session level detail • CPU: If waits for log file sync are significant - see whether LGWR gets: • Enough (quality) CPU time • Onto the CPU fast enough

• IO: See how much LGWR waits for log file parallel write event • What’s the log file parallel write completion time • V$EVENT_HISTOGRAM for better detail Enterprise 2.0 Service Networks

© 2009-2010 www.e2sn.com

WWW.E2SN.COM

29

Thanks! • Download slides from: • http://tech.e2sn.com/oracle-slides-and-whitepapers

• Download Snapper from: • http://tech.e2sn.com/oracle-scripts-and-tools/session-snapper

• Blog: • http://blog.tanelpoder.com

• Email: • [email protected]


© 2009-2010 www.e2sn.com

WWW.E2SN.COM

30