Good technical writing is hard

A few days ago, I randomly tossed this out on Twitter without context:

Technical writing requires assuming the reader is simultaneously highly intelligent and utterly ignorant, without making them feel like you think they're utterly ignorant.
That shit is hard, yo.

Someone asked for ideas on how to achieve that goal, and it seemed like a topic worthy of discussion, so here we are.

Technical writing is not for Dummies

I would expand the statement above a bit, actually. Good technical writing requires:

assuming the reader is highly intelligent but accounting for the fact that they may well not be;
assuming the reader is utterly ignorant of the topic you're writing about without making them feel ignorant.

That is legitimately hard. And it's a reason good technical writers (aka documentation writers) are highly valuable. Let's break it down piece by piece.

When writing technical documentation (which includes documentation proper, code comments, architecture documents, technical books, academic papers, conference talks, and so on), you don't know who your audience is. Or rather, you know that your audience is going to be very heterogeneous, of many different skill levels, experience levels, and perspectives. (If not, it means your audience is very tiny. Sad panda.)

That creates a problem, because technical knowledge is highly stacked. That is, understanding thing A requires first understanding thing B, which requires understanding thing C, etc. It's basically yak shaving, but you can't avoid it. The problem is that you don't know what your audience already knows about C, D, and G, so explaining A may not even work.
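One way to picture that stacking is as a dependency graph. What follows is a minimal Python sketch, assuming the A/B/C chain from the paragraph above; the `PREREQUISITES` mapping and the `needed_before` helper are hypothetical names made up for illustration. Given what a reader already knows, it lists what still has to be explained, in an order where nothing appears before its prerequisites.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: each topic lists what must be understood first.
PREREQUISITES = {
    "A": {"B"},
    "B": {"C"},
    "C": set(),
}


def needed_before(topic, prerequisites, known=frozenset()):
    """List what must be explained to reach `topic`, prerequisites first.

    Assumption: a reader who knows a topic also knows everything beneath it,
    so the walk stops at anything in `known`.
    """
    needed = set()
    stack = [topic]
    while stack:
        current = stack.pop()
        if current in needed or current in known:
            continue
        needed.add(current)
        stack.extend(prerequisites.get(current, ()))
    # Keep only the edges between topics that still need explaining.
    subgraph = {t: set(prerequisites.get(t, ())) & needed for t in needed}
    return list(TopologicalSorter(subgraph).static_order())


if __name__ == "__main__":
    print(needed_before("A", PREREQUISITES))               # reader knows nothing
    print(needed_before("A", PREREQUISITES, known={"C"}))  # reader already knows C
```

The catch, of course, is that in real life you never get to see the `known` set. It is different for every reader.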

It's also dangerous to assume that your audience will make the logical leaps you intend them to make. Some may, but many will not. Just because 2 + 3 implies 5 for you doesn't mean it is obvious to someone encountering 3 for the first time. And precisely because the step is obvious to you, you may not notice that you skipped it, and then the audience won't make the connection.

Teaching upwards

As an example of how to address these challenges, I'm going to use my presentation "The Container is a Lie." I've given it numerous times with very positive feedback, so something is working. Both the slides and a recording of the talk are available online. You can either watch them first or flip back and forth with this post; whichever works for you.

The point of the talk is to explain what "containers" are in modern computer systems, and specifically why "container" doesn't mean "that thing Docker does with all of its millions of command line switches." The point was to go a level down from Docker to show that there were other options, including the one offered by Platform.sh, the company I worked for at the time.

Going a level down, though, meant talking about Linux namespaces, which are new and complicated and confusing to most developers, who, if they've heard of anything, have heard "Docker go brrr" and that's it. So I needed to go one level further down than that to get to a level that most of my assumed audience (developers, especially web developers) probably already at least sort of understood, which is the idea of a process. But I can't count on everyone's understanding of a process being the same, or equally complete.

This problem brings up two key guidelines for technical writing:

Know your audience. It will be varied, but you can't start with the Big Bang and move forward. You need to start from some assumed knowledge level. Make that explicit, at least to yourself. Sometimes it makes sense to make it explicit for the audience, too. Many technical books have a "Who this book is for" section that does precisely that, so the audience can self-sort into those that are likely to be able to follow and those that are not.
Explain the thing you assume people already know first.

The second point is critical, and has several advantages.

One, it's highly likely that not everyone's understanding is actually the same. Some people will understand what a process is but not what a process ID is. Some may not realize that processes are in a hierarchy. Some may not understand how virtual memory works. Odds are at least some of that section is new to somebody, so covering it all means that by the end, everyone knows, or at least has been exposed to, all the relevant bits.

Two, if you assume most of the audience already knows most of it, then most of the audience is now starting off the talk (or documentation, or book, or whatever) with a win. "Ah, I understand that" (most of which they already understood anyway). That early win puts the audience in a position to be more receptive to the more complex topics you'll be introducing later.

Three, it has everyone approaching the topic from the same angle. There may be other details that aren't relevant, or are too complex, or just not the right mental model. By establishing a common mental model early on using material most people already know, you have a solid foundation on which to build.

Once you've established a baseline of knowledge, and filled in any small but relevant gaps people may have, you can start building on it. For example, once I established that a "process" is a collection of a process ID, instruction stream, memory, etc. that the OS uses to keep track of different programs and simulate multi-tasking by lying to programs, it's a very short leap to "tell different lies to different processes," which is what Linux namespaces are. That's the bridge from one knowledge level to the next. Then I can go through and explain how different namespaces work.
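To make "process" and "namespace" a little more concrete, here is a small Python sketch (not something from the talk) that asks the kernel what it is currently telling this process. It assumes a Linux system with `/proc` mounted. The helper name `current_namespaces` is made up for illustration, and on any other system it just returns an empty dictionary.

```python
import os


def current_namespaces(proc_ns_dir="/proc/self/ns"):
    """Map each namespace type (e.g. "pid", "net") to the kernel's identifier for it.

    Assumes Linux with /proc mounted; returns {} anywhere else.
    """
    namespaces = {}
    try:
        entries = sorted(os.listdir(proc_ns_dir))
    except OSError:
        return namespaces
    for name in entries:
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            namespaces[name] = os.readlink(os.path.join(proc_ns_dir, name))
        except OSError:
            # Some entries may be unreadable without extra privileges.
            continue
    return namespaces


if __name__ == "__main__":
    print("process ID as this process sees it:", os.getpid())
    for name, identifier in current_namespaces().items():
        print(f"{name:20} {identifier}")
```

Two processes that report the same identifier for a given type share that namespace. A different identifier means the kernel is showing them different views of that resource, which is the "different lies" part.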

Once you get an understanding of namespaces, some in the audience may begin to realize how they get used for this thing we call "containers." But not everyone will. Not everyone is going to make that mental leap, or make it at the same time. It would be great if they did, but humans gonna human. That means after explaining all of the different moving parts, I explicitly talk about bringing them together. Some people will already be ahead of me on that, and that's fine. Others will only get it once I've explained it, possibly multiple times. That's why I reiterate the same point a few times.

But the connection is still made explicitly. After teasing it enough that many people will make the connection themselves, and thus feel good about having made the connection and getting there ahead of me and being so smart, I make the point explicitly. "That's what a container is. It's the combination of all the namespaces used at once, which gets a fancy name because it's convenient." Explicitly lay out that 2 + 3 = 5.

Now, armed with that knowledge, we can move on to orchestration, read-only file systems, and other concepts that make sense only once you have a solid understanding of what "container" means, which wasn't possible until you had a solid understanding of what "process" means. And from there, we can compare Docker's flavor of containers with Platform.sh's flavor of containers... all of which is a multi-layer process to get to the core point I wanted to get across with the talk: There's more to containers than Docker; Docker doesn't have a monopoly on container tech.

I just needed to start with CPU instructions to get there.

Examples, for example

Another way to drive a new idea home is through examples. In some cases examples are hard to produce because the topic is complex or it would take too long, but really, try. Most people learn by doing, at least to an extent. Almost everyone learns by relating something new to something they already know. Take advantage of that!

It may seem like pointless repetition to explain something, then give multiple examples of it. It is definitely repetition, but definitely not pointless. Some people may be fine after an abstract explanation. Others may need a concrete example. Others may need a concrete example but not that one, because it doesn't make sense to them for whatever reason. If you want to cover everyone, you need to provide multiple vectors that will each cover only some people.

It's the same mindset as accessibility, in a way. Good accessibility involves providing text and image and color and position and markup information, on the assumption that while some people may only be able to receive some of those signals nearly everyone should get at least one, and those that can get most or all of them then get an even richer signal. Good documentation, similarly, provides background, and a formal description, and an informal description, and multiple representative examples. Not everyone will understand every piece of that, but nearly everyone should understand at least some of it, and since the signal is the same in all cases, nearly everyone gets something out of it.

Which is exactly what I just did. Look back: three paragraphs, all saying "have clear examples" but from three different angles. One introduces the idea in the abstract. The second makes another pass to give more detail. The third takes yet another pass on it using an analogy. Odds are at least one of those approaches resonated with you, but it's not the same approach for everyone. But I wager nearly everyone reading this post got the point from at least one of those approaches.

As an added bonus, there are likely people who didn't think of accessibility in that way yet. However, the analogy is symmetrical. If "lots of examples means lots of signal options" makes sense to someone, then they can apply that same analogy back the other direction. Now they've learned something about accessibility while they thought they were learning about documentation. (I'm sneaky like that.)

Who's on first

A common question is who should be responsible for writing documentation. My own stance on this question has shifted over time, through being a developer, a teacher, and a developer relations person primarily responsible for documentation.

A common refrain is that the best person to document something is the person who just learned it. There's some truth to that. When you learn something, your brain literally, physically changes, and afterward it becomes harder to imagine what it's like to not know something. If it's something you've just learned, you still have echoes of your previous non-understanding so you have a brief window where you intuitively know what the hard parts were. Additionally, teaching something is one of the best ways to ensure you know it yourself, because you're forced to restate and filter it through your own understanding, which can help you find any remaining gaps in your understanding.

There are three problems with that approach, however. One, documentation is going to be written once, but read many times. It doesn't work to have everyone writing documentation of the same thing. Two, writing good technical documentation, as you have probably figured out by now, is neither simple nor easy. It's an advanced, learned skill, which most people don't have. Three, the new person is trying to learn it themselves; that's why they're reading what paltry documentation may exist! Asking them to figure it out and then also document it is putting more work on the person least in a position to put in the work.

Sadly, that last point is where that approach falls down the most. In practice, I see the "best person to document it is the newbie" line used mostly, and mostly unconsciously, as a way to avoid responsibility for doing something many see as boring. "I don't want to write docs, so, uh, guess what, you can do that for me as your Open Source contribution! Yay!" That's not Open Source; that's shirking responsibility. If you don't care if someone uses whatever your project is, fine. But if you do care, and want it to get used, then making your audience document it for you is kind of abusive.

On the flip side, "the person who wrote it understands it best, so they should be the one to document it." There's validity here, too. If the author doesn't understand it well enough to explain it, they don't understand their own work well enough. On the other hand, the same problem exists here as with putting the work on the newbie: Namely, good technical writing is a learned skill, and a good developer may not always be a good technical writer.

A developer who is a good technical writer is highly valuable. It's a skill that far more developers desperately need to invest the time learning, and employers need to give them the time to do so. Given two equally capable developers, the one who is also a good technical writer is the better developer and more valuable on a team. In fact, I'll go so far as to say that a slightly worse developer, if they're also a good technical writer, is still better to have on your team than a developer who produces great code but can't explain it well to others.

There's a third answer I'll dismiss out of hand: If it's not obvious and self-evident without documentation then you did it wrong and should try again. This is elitist claptrap. Sure, things should be built to be as self-evident and self-explanatory as possible, but often the most possible is still far short of "anyone can figure it out from just looking at it." Try to pick up a 3D graphics and animation package like Lightwave or Blender or 3D Studio Max without any prior knowledge beyond Google Docs. The problem space is just inherently in need of explanation.

Technical writing is an advanced, learned skill that takes time, training, and practice. You want and need someone who has that skill on your team, and that person or people should be primarily responsible for producing technical documentation. That person is at minimum an editor, ensuring that documentation input from others (both the development team and passing newbies) is consistent, clear, layered properly, and so on. That also means "documentation via wiki" is absolutely horrible and should never be used, ever. You do need a choke point to ensure quality, and that choke point needs to have ample time and support given to them. You need a specialist taking care of a specialist skill. And you need to defer to that specialist's expertise, just as much as you would defer to a database expert's expertise on query optimization or a networking specialist's expertise on network configuration.

That could also be the developer. If a person is both a good developer and good writer, score! They should at minimum do a first pass at the documentation, keeping in mind the layering approach above. If not, the developer still needs to be responsible for educating whoever is responsible for producing the documentation. (And that needs time allocated to it.)

Conversely, a technical writer who is also a capable practitioner in the topic is going to be a better technical writer than one who isn't, because they can develop a deeper understanding, at multiple levels, from which to explain the levels that the audience needs.

The developer-technical writer communication could take many forms. It could be a sample PR against documentation that the documentation lead improves on. It could be a chat in Slack or a Hangout. It could be a presentation with a slide deck. If the technical writer is also a developer then it could include a tour of code. What it does not mean is "here's a link to the code, figure it out." That is, again, abdication of responsibility. If you can't explain your work to someone else, the problem is you, not them. You are a less good developer if you cannot communicate well to your peers, and a technical writer is one of your peers. At least get good enough that you can explain it to someone who knows how to translate it into an audience-facing explanation.

Let specialists do their job

In short, documentation, like teaching (which it is), is a skill, and may require dedicated skilled people. They can and should be supported by the practitioners (developers), and should be open to contributions from the audience (user base), but the primary responsibility does not fall on the audience to document for themselves. Even then, the practitioners still have a responsibility to be good enough communicators that they can convey the necessary information to the technical writers.

And if, even then, it's still hard to explain or document, or it has lots of edge cases that make the written description and examples highly complicated? Then, yes, it is too complicated and it needs to be redesigned to be less complicated. If you cannot explain it, it's too complicated.