Alex Zhitnitsky
Table of Contents
Introduction...............................................................................................................Page 2
Chapter 1...................................................................................................................Page 3
Keep it Actionable: Whats the Purpose of Using Exceptions?
Lets break some myths. Exceptions are mostly ignored (and not that exceptional). This
chapter covers practical tips for producing meaningful and actionable exceptions.
Chapter 2...................................................................................................................Page 10
Source Code Crunch: Lessons from Analyzing over 600,000 Java projects
An overview of exception handling in over 600,000 Java projects. In this chapter youll see
data about how exceptions are actually used (and misused).
Chapter 3...................................................................................................................Page 17
Production Data Crunch: 1,000 Java Applications, 1 Billion Logged Errors
Now its time to see what happens in production. This chapter introduces the Pareto
logging principle showing that 97% of logged errors originate from 3% of unique events.
Chapter 4...................................................................................................................Page 23
Know Your Enemy: The Top 10 Exceptions Types in Production
In this chapter we dive into the most common exceptions in production, their frequency,
and possible solution strategies. The infamous NullPointerException is obviously #1.
Final Thoughts..........................................................................................................Page 30
Introduction
Chapter 1
Real exceptions on the other hand, are things you werent aware of when writing the
code, like an OutOfMemoryException, or even a NullPointerException that messes
things up unexpectedly. Issues that require you to take action to resolve them.
Exceptions may be thrown by the code run inside the future, but they are contained
and dont leak outside. The possibility of failure is made explicit by the Failure(t)
branch and its very easy to follow code execution.
In the new Java 8 CompletableFuture feature (of which we also recently wrote), we
can use completeExceptionally() although its not as pretty.
A common solution is the DB library using a base exception, say, DBException, from
which library exceptions inherit. This allows the library user to catch all library errors
with one try block. But what about the system errors that may have caused the library to err? The common solution is to wrap any exception happening inside it. So if
its unable to resolve a DNS address, which is more of a system error then a library
error, it will catch it and rethrow this higher level exception which the user of the library should know to catch. Try-catch nightmare, with a hint of nested exceptions
wrapping other exceptions.
If we put Actors into the mix, the control flow even gets messier. Async programming
with exceptions is a mess. It can kill an Actor, restart it, a message will be sent to
some other Actor with the original error and you lose the stack.
Performance
Stability
And Hide real exceptions that you want to see and act on
The solution is doing the hard work of pruning away the noise and creating control
flows that make more sense. Another creative solution is changing the log levels, if
its not an actionable exception, dont log it as an error. Thats only a cosmetic solution but might get you to 80% of the work.
Ultimately, logs and dashboards are only cosmetics, theres a need to fix the issue at
its core and avoid unactionable exceptions altogether.
To check out the current state of exceptions and logged errors in your application,
attach the Takipi agent and youll have a complete understanding of how code
behaves in your production environment (and how to fix it) in a matter of minutes.
Check it out.
Final Thoughts
The bottom line is, do you have an Exception that doesnt result in code changes?
You shouldnt even be wasting time looking at it.
Chapter 2
10
The top 10 exception types in catch clauses, source: Analysis of Exception Handling
Patterns in Java
Well well, what do we have here? The research found that checked exceptions
account for almost three times the number of unchecked exceptions in Java
projects. Cant upset the compiler here.
Another insight is that developers often catch checked exceptions at the top level,
using the Throwable and Exception classes.
To learn more about how checked exceptions are handled, the researchers examined
the Exception and Throwable handlers. 78% of the methods that caught Exception
did not catch any of its subclasses, same as 84% of Throwable.
Meaningless catch clauses.
Next up, lets find out whats going on inside these catch clauses.
Maybe theres hope.
11
Top operations in checked exception catch clauses, source: Analysis of Exception Handling Patterns in Java
We see that log statements and e.printStackTrace() are at the top, making them the
top operations used in checked exception catch blocks, which helps debug the situation and understand what happened.
Sneaking up on them are the notorious empty catch blocks. Joshua Bloch describes
in Eective Java what would ideally happen, To capture the failure, the detail
message of an exception should contain the values of all parameters and fields that
contributed to the exceptions. Empty catch blocks are defeating this purpose.
All rights reserved to Takipi 2016
12
Another common use case is throwing an unchecked exception that replaces the
checked exception.
Mario Fusco summed it up pretty good on his twitter feed:
Top operations used in exception handling (Github), source: Analysis of Exception Handling Patterns in Java
13
20% of the total (6,172,462) catch blocks are empty. This is quite bad. Connecting
the dots with the fact that exceptions which are higher at the hierarchy are used
more frequently than specific types, the researchers arrived to the conclusion that
most participants seemed to give a low priority to exception handling as a task, or
included exceptions in their code only when the language forced them to handle
checked exceptions.
Eventually, the product quality suers.
14
15
Final Thoughts
Exceptions should be reserved to exceptional situations, but other things happen
in practice. Checked exceptions become unchecked, empty catch blocks are all over
the place, control flow is mixed with error flow, theres lots of noise and critical data
goes missing. Its a mess.
This was a main motivation to us for building Takipi, a Java agent that monitors JVMs
in production and takes care of filling in the blanks with everything you need to know
about how exceptions behave (and how to solve them).
16
Chapter 3
17
Log Reality
Heres a dose of reality, you will only log what you think will break an application, and
youre constrained by how much you can log without incurring unnecessary
overhead on your application. This is why debugging through logging doesnt work in
production and why most application issues go undetected.
Lets assume you do manage to find all the relevant log events, thats not the end of
the story. The data you need isnt usually not in there, and leaves you adding
additional logging statements, creating a new build, testing, deploying and hoping
the error happens again. Ouch.
NullPointerException
NumberFormatException
IllegalArgumentException
RuntimeException
IllegalStateException
NoSuchMethodException
ClassCastException
Exception
ParseException
InvocationTargetException
All rights reserved to Takipi 2016
18
So there you have it, the pesky NullPointerException is to blame for all thats broken
in log files. Ironically, checking for null was the first feedback I got in my first code
review back in 2004 when I was a java developer.
Right, here are some numbers from a randomly selected enterprise production
application over the past 30 days:
25 JVMs
29,965,285 errors
~8.7TB of storage
NumberFormatException
NoSuchMethodException
Custom Exception
StringIndexOutOfBoundsException
IndexOutOfBoundsException
IllegalArgumentException
IllegalStateException
RuntimeException
Custom Exception
Custom Exception
19
Log management software licenses to parse, transmit, index and store this
data over your network
20
The easiest way to solve operational noise is to fix application errors versus ignore
them. Not only will this dramatically improve the operational insight of your teams,
youll help them detect more issues and troubleshoot much faster because theyll actually see the things that hurt your applications and business.
The Solution
If you want to identify and fix the top 10 errors in your application, download Takipi
for free, stick it on a few production JVMs, wait a few hours, sort the errors captured
by frequency and in one-click Takipi will show you the exact source code, object and
variable values that caused each of them. Your developers in a few hours should be
able to make the needed fixes and Bob will be your uncle.
The next time you do a code deployment in production Takipi will instantly notify you
of new errors which were introduced and you can repeat this process. Heres two
ways we use Takipi at Takipi to detect new errors in our SaaS platform:
Slack Real-time Notifications which inform our team of every new error introduced in
production as soon as its thrown, and a one-click link to the exact root cause
(source code, objects & variable values that caused the error).
21
Email Deployment Digest Report showing the top 5 new errors introduced with direct links to the exact root cause.
Final Thoughts
We see time and time again that the top few logged errors in production are pulling
away most of the time and logging resources. The damage these top few events
cause, each happening millions of times, is disproportionate to the time and eort it
takes to solve them.
22
Chapter 4
23
To pull out the data, we crunched anonymized stats from over a 1,000 applications
monitored by Takipis error analysis micro-agent, and checked what were the top 10
exception types for each company. Then we combined all the data and came up with
the overall top 10 list.
Every production environment is dierent, R&D teams use dierent 3rd party libraries,
and also have custom exception types of their own. Looking at the bigger picture,
the standard exceptions stand out and some interesting patterns become visible.
24
True story
25
1.
2.
26
1.
2.
Rethrows:
One famous story around checked vs. unchecked and the last use case we
described here comes from Amazons AWS SDK which ONLY throws unchecked
exceptions and refuses to use checked exceptions.
Takipi RuntimeExceptionMonster
27
28
1.
You need an exception and just too lazy to specify what it actually is.
2.
Conclusion
The world of Java exceptions is indeed quite colorful, and its amazing to see how
much impact the top 10 exceptions have on our logs. 97% of all logged errors come
from 10 unique exceptions.
29
Final Thoughts
We hope that youve found this guide useful. The research and insights originate from
our teams experience building Takipi and supporting thousands of Java applications
in production. The only monitoring tool that provides full state, source, and stack for
every error in production.
Comments or questions? Please get in touch through hello@takipi.com
Takipi detects 100% of errors, caught and uncaught exceptions, HTTP and log errors.
30