Wrong by Default

Posted

A few programming languages use a “defer” pattern for resource cleanup. This is a construct such as a defer keyword which schedules cleanup code to run at the end of the enclosing block (or function). This is available in Zig, Go and even in GCC C.

fn printFile(path: str) !void {
	var f = try open(path)
	defer f.close()
	for line in f {
		print(try line)
	}
	// File is closed here.
}

This provides a few key benefits.

  1. The cleanup is right next to the allocation. It is easy to spot and keep in sync.
  2. The cleanup automatically occurs even for early returns and error cases.
  3. The cleanup automatically occurs in the right order (reverse order of the keywords).

This is a huge improvement to manually writing and running cleanup code. Each point above is a common source of bugs that are now easy to avoid.

However, defer has one huge downside. You need to remember it.

Compare to something like C++ or Rust:

fn printFile(path: &str) -> std::io::Result<()> {
	let f = std::fs::File::open(path)?;
	for line in std::io::BufReader::new(f).lines() {
		println!("{}", line?);
	}
	Ok(())
	// File is closed here.
}

Sure, adding one line of code isn’t hard. But you can forget. How do you even know what types need to be closed, unlocked or otherwise disposed of? Automatic disposal (known as RAII) takes this off the programmers mind and avoids bugs. Everything that needs to be cleaned up is cleaned up when it is no longer used.

Java has the same problem. Many programmers love garbage collectors because they clean up after you. But the problem with most garbage collectors is that they only clean up memory! There are many other types of resources such as locks, files, threads and sockets that you need to handle yourself. Some of these can be handled by finalizers but due to the possibly long length of time before they run they are not suitable for some resources such as locks.

A good resource management solution such as RAII can handle all resources that your program uses. For the longest time the best solution for java was finally blocks.

FileInputStream inputStream = new FileInputStream("story.txt");
try {
	// Read file here...
} finally {
	inputStream.close();
}

While this is clearly a lot more boilerplate than defer and results in an unpleasant rightward shift, it is effectively the same. And most importantly you still need to remember which types need to be closed.

This has gotten slightly better in Java 7 with the try-with-resources statement.

try (FileInputStream inputStream = new FileInputStream("foo.txt")) {
	// Read file here...
}

But even still, manual action is required for correctness. However, the best part of this feature may be the java.lang.AutoCloseable interface. Finally there is a way to know which types need to be cleaned up. This machine-visible information may allow effective lints to be built. If you ever get an AutoClosable type you better use it in a try-with-resources statement or quickly pass it to someone else (and shift the burden to them). But complex cases would be difficult to reliably detect. For example if you store the type in a data structure and clean it up after it is removed. Even simply passing references to multiple places creates ambiguity on who is responsible for cleaning it up.

The moral of the story is that the best way to avoid bugs is to make them difficult to write. Ideally make bugs harder to write than correct code. defer and try-with-resources are “wrong by default” and require explicit work on behalf of the programmer to become correct. RAII is correct without the programmer doing anything extra.

This post was inspired by the defer keyword, but it’s a much more general problem than that. I remember a document in Google about the --dont-be-stupid flag. These commonly get added when a bug is found, but the fix contains an behavioural change that some client depends on, or the change needs to be carefully rolled out. After a few years most services will have dozens of these flags, all hardcoded on in the service configuration. Of course every once in a while a new instance of the service will be set up and this bug will reappear. Backwards compatibility and gradual rollouts are good, but at some point you need to update the code to be right by default.