Here is a typical example of what software developers and operations people deal with all day.
The Spring Framework is one of the most popular Java Frameworks in existence. It solves the problem of configuration for software projects; any given piece of software has a number of things that need to be configured, such as where to find files and how to connect to a database. It's important, for security purposes, that this information not be baked into the source code. Practically, software developers working on the code need to connect to a database on their own computer, and when the software is deployed to a production server, we need to connect to a database on the production server.
Spring is intelligent in a number of ways, particularly, Spring starts with a model of the application it is embedded in, and based on that model, decides which order parts need to be initialized in, and when the application shuts down, it automatically shuts things down in the right order. As applications get larger, it is maddeningly hard to get things like that right -- because Spring assembles your application by configuring Java objects directly, you get a powerful configuration system for your application "out of the box", without writing any code, which would time-consuming, error-prone, and idiomatic to every application you build.
On the left is a minaturized screenshot of the log output when a Spring application failed to boot; we're showing just the first 1000 lines of output out of 4775, but the key thing is that the real error is described on lines 467 and 468, which are marked in red. The real error, it turns out, is that it is trying to connect to a database server that doesn't exist, which could very well be one of the most likely reasons why Spring applications fail to boot.
The error message case is painful, however, because starting an application that is not correctly configured for a database server is one of the most common things that can go wrong. It's likely to go wrong if a person has just downloaded the application and is trying to install it. It could go wrong at 3AM on a Saturday morning. It's obnoxious enough if it is seen by someone who is familliar with the architecture and code, but often it is seen by a fresher developer, a tester, an ops person, or an end user, who has little idea of where to look for the 0.04% of this error message is relevant -- and they're likely to be seeing it when they are facing a deadline, juggling a number of tasks, or fielding complaints from someone who can't get work done because the server is down. This kind of error message causes stress, hurts the quality of life in IT, and adds to a feeling that technology is out of control.
This type of "needle in a haystack" situation is common in intelligent systems work. For instance, somewhere in the nearly 2 gigabyte Enron Email corpus there is the evidence of a major crime, but it is buried in the middle of notifications about company software games, failing servers, seminars on how to use Microsoft Outlook, and guys chatting about what they did back in the Army.
In the Enron case, there is the difficulty of understanding not just the text, but the business domain, including topics such as commodity trading. The case of this error message is simpler, because the long error message comes out of the very intelligence of the Spring framework. The Spring framework is not primarily trying to create the database connection, but it is trying to create some other object, which requires it to create some other object, which requires some other objects, which ultimately requires the database connection. Spring knows the complete series of events, including the failures downstream caused by the initial failure (think, for example, of a nuclear meltdown that is initiated by some simple event such as the failure of a valve or an emergency generator.) The model of the application is there, and Java returns a detailed explanation of the failure in the form of a stack trace, but because Spring is incapable of looking at the application model and the stack trace together, it spews a 4775 line error message and lets the operator sort it out.
It's not fair to single out Spring for this, since the same problem arises for many common ops and programming tasks.
The other day I was compiling a C program that uses GNU autoconf, a remarkable tool that inspects the environment it is running in, and automatically finds the libraries and header files to build an application. On Linux, the main effort involved is in adding any packages that are missing. In one case, I found the error message printed by the ./configure script for one missing package was cryptic, causing me to look in the log file, which contained, among other things, the C source code for a large number of programs that configure had tried, accompanied by compiler output. Once more, debugging the problem was a matter of finding a few relevant lines, then searching on the web to find out which package supplies the needed libraries. The ./configure script displays a complete trace of what it did, which aids in debugging, yet, in 2016, we are still bombarding users with entirely too much information.
Real Semantics is a framework for building data-rich applications, and that means it interacts with many kinds of applications in its work. In a typical case, it creates a cloud server and installs a number of software applications on that server, together with packaged data, and then configures the applications to work together. Real Semantics uses Spring internally and frequently installs applications that use Spring, as well as other configuration frameworks.
Real Semantics greatly simplifies the construction of complex systems, because, working from a model, it automates many of the decisions involved in setting up and configuring software. It works quickly and repeatably, but errors still happen.
When you greatly expand the speed and scale at which people can work, you encounter new problems. Back in 2000, assembling all of the JAR files for a complex Java application was a difficult manual task -- you weren't going to include any dependencies you didn't need. Now with tools such as Maven and Gradle, it's easy to create a project that has thousands of dependencies, but then you have a whole new problem, managing the complexity of and interactions between thousands of dependencies.
The same is true for error handling. Applications managed by Real Semantics can fail in the build process or thereafter, and understanding the failure requires knowledge about Real Semantics, the application, and the environment in which it runs. Fortunately, error handling is a first-class concern for us, and we provide multiple facilities to improve error handling.
InputStream
is a useful abstraction inside Java, because it could hold data stored in a file, in memory, transferred over the network, or compressed, but a function parsing an InputStream
has no idea where it came from.