Brendan's blog
In this post, I'll provide an example of a USE-based metric list for the Solaris family of operating systems.
I'm writing this for later Solaris 10, Oracle Solaris 11, and illumos-based systems: SmartOS, OmniOS. This
is primarily intended for system administrators of the physical systems (not tenants of cloud or zone instances;
for those users, see my SmartOS performance checklist).
Physical Resources
component | type | metric
CPU | utilization | per-cpu: mpstat 1, usr + sys; system-wide: vmstat 1, us + sy; per-process: prstat -c 1 (CPU == recent), prstat -mLc 1 (USR + SYS); per-kernel-thread: lockstat -Ii rate, DTrace profile stack()
CPU | saturation | system-wide: uptime, load averages; vmstat 1, r; DTrace dispqlen.d (DTT) for a better vmstat r; per-process: prstat -mLc 1, LAT
CPU | errors | fmadm faulty; cpustat (CPC) for whatever error counters are supported (eg, thermal throttling)
Memory capacity | utilization | system-wide: vmstat 1, free (main memory), swap (virtual memory); per-process: prstat -c, RSS (main memory), SIZE (virtual memory)
Memory capacity | saturation | system-wide: vmstat 1, sr (bad now), w (was very bad); vmstat -p 1, api (anon page ins == pain), apo; per-process: prstat -mLc 1, DFL;
dtrace.org/blogs/brendan/2012/03/01/the-use-method-solaris-performance-checklist/ 1/7
12/14/13 Brendan's blog The USE Method: Solaris Performance Checklist
Memory interconnect | utilization | cpustat (CPC) for memory busses, tput / max; or CPI greater than, say, 5; CPC may also have local vs remote counters
Memory interconnect | saturation | cpustat (CPC) for stall cycles
Memory interconnect | errors | cpustat (CPC) for whatever is available
I/O interconnect | utilization | busstat (SPARC only); cpustat for tput / max if available; inference via known tput from iostat/nicstat/
I/O interconnect | saturation | cpustat (CPC) for stall cycles
I/O interconnect | errors | cpustat (CPC) for whatever is available
CPU utilization: a single hot CPU can be caused by a single hot thread, or mapped hardware interrupt.
Relief of the bottleneck usually involves tuning to use more CPUs in parallel.
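As a rough sketch of how a single hot CPU could be spotted from mpstat output, the function below sums the usr and sys columns per CPU and prints any sample at or above a threshold. The function name hot_cpus and the default threshold of 90 are my own illustrative choices, not part of any tool; the columns are located by header name, on the assumption that the header line begins with CPU as it does on Solaris.

```shell
# hot_cpus [THRESHOLD]: read mpstat output on stdin and print each sample
# where a CPU's usr + sys is at or above THRESHOLD percent.
# The name and the default threshold (90) are illustrative assumptions.
hot_cpus() {
    awk -v limit="${1:-90}" '
        # Header line: remember which columns hold usr and sys.
        $1 == "CPU" {
            for (i = 1; i <= NF; i++) {
                if ($i == "usr") usr = i
                if ($i == "sys") sys = i
            }
            next
        }
        # Data line: only evaluated once a header has been seen.
        usr && sys {
            busy = $usr + $sys
            if (busy >= limit)
                printf "cpu %s busy %d%%\n", $1, busy
        }
    '
}
```

Usage would be along the lines of `mpstat 1 5 | hot_cpus 90`. Keep in mind that the first mpstat sample is the summary since boot, so it may be worth ignoring.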
lockstat and plockstat are DTrace-based since Solaris 10 FCS.
vmstat r: this is coarse as it is only updated once per second.
CPC == CPU Performance Counters (aka Performance Instrumentation Counters (PICs), or
Performance Monitoring Events), read via programmable registers on each CPU, by cpustat(1M) or
the DTrace cpc provider. These have traditionally been hard to work with due to differences
between CPUs, but are getting much easier with the PAPI standard. Still, expect to spend some
quality time (days) with the processor vendor manuals (what cpustat -h tells you to read), and to
post-process cpustat with awk or perl. See my short talk (video) about CPC (2010). (Many years
ago, I made a toolkit including CPC scripts, CacheKit, that was too much work to maintain.)
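To give a flavor of the awk post-processing mentioned above, here is a minimal sketch that turns cpustat interval lines into a CPI figure. It assumes the usual time, cpu, event, pic0, pic1 column layout, and it assumes that pic0 was programmed to count cycles and pic1 to count instructions. Which event names provide those counts depends on the processor, so they are deliberately left out. The function name cpi_from_cpustat is hypothetical.

```shell
# cpi_from_cpustat: read cpustat-style lines (time cpu event pic0 pic1) on
# stdin and print cycles per instruction for each per-interval "tick" line.
# Assumption: pic0 counted cycles and pic1 counted instructions.
cpi_from_cpustat() {
    awk '
        # Skip the header and any non-tick lines; avoid dividing by zero.
        $3 == "tick" && $5 > 0 {
            printf "%s cpu%s CPI %.2f\n", $1, $2, $4 / $5
        }
    '
}
```

The output of a cpustat invocation with suitable events for your CPU would be piped into this function, giving one CPI value per CPU per interval.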
Memory capacity utilization: interpreting vmstat's free has been tricky across different Solaris
versions (we documented it in the Perf & Tools book), due to different ways it was calculated, and
tunables that affect when the system will kick off the page scanner. It'll also typically shrink as the
kernel uses unused memory for caching (ZFS ARC).
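Since free on its own is hard to interpret, one option is to read it alongside the scan rate. The sketch below pulls the free and sr columns out of vmstat output by header name and prints only the samples where the page scanner was running (sr above zero). The function name mem_pressure is my own, and the header detection assumes the Solaris layout, where the column-name line starts with r.

```shell
# mem_pressure: read vmstat output on stdin and print free memory (KB) and
# scan rate for every sample where the page scanner was active (sr > 0).
# Assumption: the column-name header line starts with "r", as on Solaris.
mem_pressure() {
    awk '
        # Column-name header: locate free and sr by name.
        $1 == "r" {
            for (i = 1; i <= NF; i++) {
                if ($i == "free") free = i
                if ($i == "sr")   sr = i
            }
            next
        }
        # Data lines start with a number; the "kthr memory ..." line does not.
        free && sr && $1 ~ /^[0-9]+$/ && $sr > 0 {
            printf "free %s KB, scan rate %s\n", $free, $sr
        }
    '
}
```

For example, `vmstat 1 | mem_pressure` would stay silent while the scanner is idle and print a line for each interval in which it ran.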
Be aware that kstat can report bad data (so can any tool); there isn't really a test suite for kstat data,
and engineers can add new code paths and forget to add the counters.
DTT == DTraceToolkit scripts, DTB == DTrace book scripts.
CPI == Cycles Per Instruction (others use IPC == Instructions Per Cycle).
I/O interconnect: this includes the CPU to I/O controller busses, the I/O controller(s), and device
busses (eg, PCIe).
Software Resources
component | type | metric
Kernel mutex | utilization | lockstat -H (held time); DTrace lockstat provider
Kernel mutex | saturation | lockstat -C (contention); DTrace lockstat provider; spinning shows up with dtrace -n 'profile-997 { @[stack()] = count(); }'
Kernel mutex | errors | lockstat -E, eg recursive mutex enter (other errors can cause kernel
lockstat/plockstat often drop events due to load; I often roll my own to avoid this using the DTrace
lockstat/plockstat provider (examples in the DTrace book).
File descriptor utilization: while other OSes have a system-wide limit, Solaris doesn't (at least at the
moment, this could change; see my writeup about it).
What's Next
See the USE Method for the follow-up strategies after identifying a possible bottleneck. If you complete this
checklist but still have a performance issue, move on to other strategies: drill-down analysis and latency
analysis.
Also see my USE method performance checklists for SmartOS, Linux, Mac OS X, and FreeBSD.
2 Responses
1. Written by Kebabbert
on March 8, 2012 at 1:48 am
Wow! Great list!!!! Thank you for sharing this info. :o)
Brendan,
I have been using dtrace to successfully identifying bottlenecks on SUN servers (particularly Global
zones with 20-30 containers). I am sure, the USE method takes it further in quickly isolating
performance problems when dealing with resource contention. I like the template that USE method
provides and we can always expand on these.
-Harsha Nippani
Copyright 2013 Brendan Gregg, all rights reserved