The Pragmatics Of Java Debugging
Mar. 1, 2001 12:00 AM
Essential to the development of complex systems are tools that help the developer locate, analyze, and fix problems. Debuggers provide support for this by letting a developer inspect the internal state of a program at runtime, as well as suspend and resume execution statement by statement.
The originators of the Java programming language defined a debugging architecture, but since its conception Java has advanced into new areas of deployment topologies and optimization technologies that present a further set of problems. This article covers some of the background behind these issues as well as the activity in the Java community to provide solutions. Examples of debugging solutions are drawn from the IBM VisualAge for Java integrated development environment (IDE), although the issues are applicable to other environments as well.
Dynamic interpretation of the bytecodes into machine instructions gives Java the advantages of portability and operating system neutrality, but it comes at the price of reduced performance.
When a program gets compiled into bytecodes, the compiler focuses on making the bytecodes executable by the target JVM. However, if the program is going to be debugged, the developer needs to be able to trace the bytecodes back to their source. Java is a very dynamic language, so the physical layout of a class is not determined at compile-time but is actually generated by the JVM when it loads the class. Therefore all class, method, and field names are preserved in the class file, which also includes the name of the source file and a line number table that maps ranges of bytecodes to their corresponding line in the source file.
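As a small illustration of the symbolic information described above, the following sketch reads the class name, method name, source file name, and line number that the JVM recovers for a stack frame. The class name LineTableDemo and the helper currentFrame are hypothetical names chosen for this example. It assumes the class was compiled with javac's default settings, which keep the source file name and line number table; if that information were stripped, the file name would be null and the line number negative.

```java
// Hypothetical demo class; the names are assumptions made for illustration.
class LineTableDemo {

    // Returns the stack frame for this method itself. Element 0 of a new
    // Throwable's stack trace is the frame in which the Throwable was created.
    static StackTraceElement currentFrame() {
        return new Throwable().getStackTrace()[0];
    }

    public static void main(String[] args) {
        StackTraceElement frame = currentFrame();
        // Class and method names are preserved in the class file.
        System.out.println("class:  " + frame.getClassName());
        System.out.println("method: " + frame.getMethodName());
        // File name and line number come from the debug information
        // that the compiler stores alongside the bytecodes.
        System.out.println("file:   " + frame.getFileName());
        System.out.println("line:   " + frame.getLineNumber());
    }
}
```

A debugger relies on the same mapping, in the opposite direction, when it turns a breakpoint on a source line into a bytecode location.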
The JVM provides an API for setting breakpoints and manipulating the stack frame. This is the API that the Java Development Kit (JDK) debugger uses, and it is also surfaced in the package sun.tools.debug. The JDK debugger is a command-line debugger that, when executed (with the jdb command), lets the developer perform basic tasks such as inserting breakpoints on statement lines, catching thrown exceptions when they're raised, and printing and advancing the stack trace. The jdb program is limited in its functionality due to its command-line interface, but there are a number of good debuggers that have graphical interfaces to let the developer view and monitor the program while debugging it. The IBM Distributed Debugger and Symantec Visual Cafe are two such debuggers, as is the integrated debugger that comes with the VisualAge for Java IDE.
To reduce the size of the class file, the javac compiler normally removes symbolic information not required at runtime, such as method argument and local variable names. The debug API then returns generated names such as arg_1 instead of the original argument name given by the developer. If you need to debug a program with the original names, compile it with the javac option -g.
Writing a Program to Aid Debugging
anObject.setSomething(anotherObject.getSomething() + yetAnotherObject.getSomethingElse() );
In the debugger, you can't see the result of getSomething() or getSomethingElse(). The code could be rewritten as:
Object something = anotherObject.getSomething();
When you single-step through this code, you can see the result of each method call in turn. The debugger API lets you insert breakpoints at the start of each statement, so writing code that doesn't combine many expressions into a single statement also provides more target points to insert a potential breakpoint.
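The rewrite described above can be sketched in full as follows. The classes Source and Target, the int value type, and the method combine are hypothetical stand-ins for the objects in the one-line statement; the article does not say what types they really are.

```java
// Hypothetical classes standing in for anObject, anotherObject,
// and yetAnotherObject; the int type is an assumption.
class StepFriendlyDemo {

    static class Source {
        private final int value;
        Source(int value) { this.value = value; }
        int getSomething() { return value; }
        int getSomethingElse() { return value * 2; }
    }

    static class Target {
        private int something;
        void setSomething(int something) { this.something = something; }
        int getSomething() { return something; }
    }

    // Same effect as
    //   anObject.setSomething(anotherObject.getSomething()
    //       + yetAnotherObject.getSomethingElse());
    // but each intermediate result lands in a local variable on its own line,
    // so a debugger can stop on it and display it.
    static int combine(Source anotherObject, Source yetAnotherObject, Target anObject) {
        int something = anotherObject.getSomething();
        int somethingElse = yetAnotherObject.getSomethingElse();
        int sum = something + somethingElse;
        anObject.setSomething(sum);
        return anObject.getSomething();
    }

    public static void main(String[] args) {
        System.out.println(combine(new Source(3), new Source(4), new Target()));
    }
}
```

Each of the four statements in combine is a separate place to set a breakpoint. To see the local variable names in a debugger, the class would typically need to be compiled with the -g option.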
Debugging a Program Running Inside a JVM
When the JVM raises an exception or hits a breakpoint, the debugger visually shows a stack trace of the program so developers can inspect the contents of the program variables. The Distributed Debugger also shows the source code if it's available for the debugger to use (see Figure 1).
When the debugger has halted program execution and you've inspected the program state, you can use the step functions to continue execution statement by statement, or just resume the program until it reaches the next uncaught exception or breakpoint. This lets you understand the dynamics of the program execution and locate errors. Once you've located a problem you might see the errant code and want to change some of the source and test the change. To do so typically requires that you exit the program, recompile the .java source file, and rerun the program with the new source. In an environment with fast turnaround time where you're trying to quickly test and fix code, this stop-change-compile-start cycle can be time-consuming.
In addition, if the program is being debugged in a server environment, replacing the server class files might be difficult. If the server isn't designed for development it might place locks on jar files, and there may be no way to reload a modified class, requiring you to restart the server, then re-create the error condition. Servers may take a long time to restart and, if the server is a production server, downtime may be difficult to obtain. Some servlet engines, such as WebSphere, can reload a servlet, but if the servlet uses other classes such as JavaBeans or EJBs, and you modify them, you may still need to restart the server.
Debugging a Program Running Inside an IDE
Developers can also execute an ad hoc piece of Java code or modify program variables from within a running program. When the program is suspended in the debugger, the developer can select any method in the call stack and tell the debugger to drop to this frame. This is rather like a rewind button and is useful if the debugger gets invoked by an exception and the developer needs to go back a few method calls and retrace some steps to see the root cause of a problem. NullPointerException is a good example of where the exception is usually thrown too late; the problem isn't the method call on a null variable; it's the earlier method call that was supposed to set the variable to a non-null value that needs debugging. Figure 2 shows the integrated debugger that comes with VisualAge for Java.
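The NullPointerException case can be illustrated with a minimal sketch. All names here (LateNullDemo, Customer, loadCustomer, nameLength) are hypothetical. The mistake is in loadCustomer, which never sets the name, but the exception is reported in nameLength, which is where a debugger would first stop.

```java
// Hypothetical example; the class and method names are assumptions.
class LateNullDemo {

    static class Customer {
        private String name; // stays null unless setName is called
        void setName(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Root cause: this method was supposed to call setName but does not.
    static Customer loadCustomer() {
        Customer customer = new Customer();
        return customer;
    }

    // Symptom: the exception is raised here, after loadCustomer has returned.
    static int nameLength(Customer customer) {
        return customer.getName().length();
    }

    // Returns the name of the method in which the exception surfaced.
    static String describeFailure() {
        try {
            nameLength(loadCustomer());
            return "no exception";
        } catch (NullPointerException e) {
            return e.getStackTrace()[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println("Exception surfaced in: " + describeFailure());
    }
}
```

By the time the exception surfaces, loadCustomer is no longer on the call stack, so the stack trace never mentions it. Being able to drop back to an earlier frame and step forward again makes it easier to find out where the value should have been set.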
The VisualAge for Java debugger works on Java source running within its own virtual machine. The virtual machine is part of the development environment, which means that it can't be easily replaced. To debug code executing in a different JVM, you can use the Distributed Debugger, but this sacrifices the ability to execute ad hoc Java code, rewind program execution, change variable contents, and modify source inside a running program.
VisualAge for Java also supports this fast code-debug-fix cycle for the WebSphere environment. A stripped-down version of the WebSphere environment, also known as the WebSphere Test Environment, is shipped as part of VisualAge for Java. The WebSphere Test Environment executes as part of VisualAge for Java's internal virtual machine, which means you can test and debug server components such as servlets or EJBs using the techniques described previously. When you've completed the first pass of testing your code, you can export and deploy it in a true WebSphere environment where you can further debug it with the Distributed Debugger. In addition to hosting the WebSphere Test Environment (which emulates the true WebSphere environment inside the IDE), you can also load other environments such as Jakarta Tomcat and New Atlanta ServletExec into VisualAge for Java, which lets you do integrated development and debugging inside the IDE before deploying the server code into a production environment.
Java Debug Topology
To allow the debugger to work with both local and remote programs, it's split into two portions: the debugger user interface that the developer uses to view and control the program being debugged, and the debug engine that the interface talks to and that in turn debugs the JVM. This separation of interface from engine, which is part of the VisualAge for Java Distributed Debugger, allows multiple engines to be controlled from the same interface. This means that developers can seamlessly debug Java on their client through remote calls on a server such as an IBM OS/390 or an AS/400, or a Microsoft Windows NT server.
Translated Java Programs
A number of vendors, such as IBM and Oracle, recognized the need to let the developers debug the source they wrote rather than the translated source. In the C language, a similar problem is caused by the preprocessor, but the source line mapping is preserved through the use of the #line pragma. However, the Java Language Specification did not include a preprocessor or the equivalent of the #line pragma, so each vendor had to implement a proprietary way to preserve line numbers, typically by inserting comments into the translated Java source. Java Specification Request (JSR) 45 defines a standard line-mapping table to preserve the correspondence from the original source (e.g., JSP or SQLJ) to the translated Java source, which could then be read by the javac compiler or another program to postprocess the line number information stored in the class file.
After a class file has been modified to reflect the original source file name and line number information, standard debuggers can be used to debug the untranslated source. The difference can be seen in Figures 3 and 4. Figure 3 shows the VisualAge for Java debugger with the breakpoint in the JSP source statement, whereas Figure 4 shows the breakpoint in the Java servlet source that the JSP was translated into.
The Distributed Debugger can be opened from within the OLT, which lets you examine and profile program flow, as well as perform the traditional debugging tasks of setting breakpoints, inspecting program variables, and controlling program flow. The OLT is useful for viewing the dynamics of a distributed application, because it includes a trace server that receives trace messages from each process involved in the distributed application. The individual trace messages are assembled so you can follow the sequence of events from process to process.
Another problem that exists with debugging server programs is that when a problem occurs in the server, it's often difficult to re-create the same problem in another environment. To do so requires staging environments that must mimic the server environment in terms of hardware resources as well as runtime conditions such as server load. The reality is that some problems that occur in a production environment can never be re-created in a development environment. The Distributed Debugger can be configured so that a problem on a server can actually call back and invoke a client to let the developer debug the server environment from their console. The ability to have a fully functional debugger opened on a server program that's thrown an exception gives you a much greater view of the problem than just a textual stack trace.
Many commercial JVMs aren't able to cope with this and disable JIT compilation when they execute a program in debug mode. The IBM and Sun JDKs both disable JIT in debug mode, but the VisualAge for Java IDE does not. Since interpreted bytecodes run 10-20 times slower than machine code, debugging can become a tedious process, especially if you're debugging a complex program such as a Web application server. This is another factor that makes debugging in the WebSphere Test Environment more productive because the VisualAge for Java IDE's JVM always performs JIT optimization.
Disabling JIT can also lead to problems, because it means you're not actually debugging the same program that you'd otherwise be executing. Subtle bugs such as those caused by race conditions or nonsynchronized object lock conflicts might not appear in the slower running debugged program.
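A minimal sketch of the kind of timing-dependent bug meant here is an unsynchronized counter shared between threads. The class RaceDemo and its method names are hypothetical. How many updates are lost depends on thread scheduling and execution speed, so the unsafe total can differ from run to run and between a slower debugged run and a full-speed run; no particular output is guaranteed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example; the class and method names are assumptions.
class RaceDemo {

    static int unsafeCounter;                                  // plain field, no synchronization
    static final AtomicInteger safeCounter = new AtomicInteger();

    // Runs the given number of threads, each incrementing both counters.
    // Returns { unsafe total, safe total }.
    static int[] run(int threads, final int incrementsPerThread) {
        unsafeCounter = 0;
        safeCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    unsafeCounter++;               // read-modify-write, not atomic
                    safeCounter.incrementAndGet(); // atomic increment
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return new int[] { unsafeCounter, safeCounter.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(4, 100_000);
        System.out.println("unsafe: " + totals[0] + ", safe: " + totals[1]);
    }
}
```

The atomic counter always reaches the full total, while the plain counter may fall short by an amount that varies with timing. That variability is why such a bug can be hard to reproduce under a debugger.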
This is rather like the quantum physics problem of Schrödinger's cat. In this thought experiment the physicist is presented with a closed box that contains a cat and a canister of poison gas that's released when a radioactive atom decays. Since the atom exists in a superposition of decayed and nondecayed quantum states, the cat exists in a superposition of dead and alive states. Only by observing the system is it forced into a definite state. Is the cat dead or alive? For the physicist there's no real way of knowing because once the box is opened, the act of observing the cat may be the act that kills it. Just as with quantum mechanics where the observer of the cat affects the system being observed, so too can the debugger affect the system being debugged.
Bytecode optimization is where the optimizer applies heuristics to rewrite the bytecode output of the Java compiler to make it more efficient for the JVM to process. Techniques used are:
The output of bytecode optimizers is Java bytecodes, so they can still run within any JVM. Examples of bytecode optimizers are Dash O and Jove. Since static optimizers alter the bytecodes, the mapping from bytecodes to source code may also be affected, which could prevent debugging. In general, debugging should be performed on unoptimized code if possible.
Machine code optimization is when the Java bytecodes are translated into machine code at compile time rather than at JVM execution time. This is analogous to how languages such as C and Fortran are compiled, but with Java, the optimizer translates the bytecodes into machine code. Examples of such static optimizers are TowerJ from Tower and the IBM High Performance Java Compiler (HPJ) that comes with VisualAge for Java.
Once HPJ has optimized a program, the JVM is no longer used to execute the bytecodes since HPJ generates a self-contained executable program. This has the performance benefits of a natively compiled language, but it also means the program will run only on a platform for which the HPJ can generate an optimized program. These platforms include Windows, OS/2, AIX, AS/400, and OS/390. Because the HPJ bypasses the JVM, the standard Java debug APIs will no longer work. Instead, the Distributed Debugger, which handles both bytecodes and machine code, must be used to debug HPJ-optimized programs.
Java Platform Debug Architecture
At the highest level, someone writing a graphical debugger can hook into the Java Debug Interface (JDI) instead of the API previously described in the sun.tools.debug package. This means that the debugger will automatically work with all JVMs and platforms that Sun supports. If another company writes a different JVM, they'll also write their own implementation of the Java Debug Wire Protocol (JDWP), which means the graphical debugger will be able to debug the other company's JVM. If the tool developer writes a debugger that's not written in Java, rather than program to the JDI layer they can program to the lower layers such as JDWP. The JPDA will work well if vendors who write their own JVMs and others who write their own debuggers all program to this interface. This consistency will mean that the debug experience for a developer working in a complex server environment with heterogeneous JVMs and JDK levels will become much more pleasant than it currently is.
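As a rough sketch of what programming to the JDI layer looks like, the following lists the connectors that a debugger front end could use to launch or attach to a target JVM. The class name JdiConnectorList is hypothetical. The sketch assumes it runs on a JDK that ships the com.sun.jdi packages (the jdk.jdi module on modular JDKs); on a runtime without them it will not compile. The set of connectors reported varies by JDK and platform.

```java
import com.sun.jdi.Bootstrap;
import com.sun.jdi.connect.Connector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; assumes the JDI classes are available.
class JdiConnectorList {

    // Collects a short description of every connector the JDI implementation offers.
    static List<String> connectorNames() {
        List<String> names = new ArrayList<>();
        for (Connector connector : Bootstrap.virtualMachineManager().allConnectors()) {
            names.add(connector.name() + " (transport: " + connector.transport().name() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : connectorNames()) {
            System.out.println(name);
        }
    }
}
```

A front end written this way talks only to the interface. The connector it picks decides how the connection to the target JVM is made, which is the separation that lets one debugger work with JVMs from different vendors.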
We've provided some background into the problems of debugging Java programs, as well as some of the currently available solutions and some you should see in the near future. We welcome all feedback.