Aboutmatt.net

Blog, projects and et cetera..

Should I Store My Project Documentation in My Source Code Repository?

There are some things you’ll often hear a programmer say, and others you won’t. Most will say that source control is good, but you rarely hear one say that documentation is good. I think it is: documentation is immensely important to the success of any software project. The question that interests me is where you keep it. Should it be under some sort of revision control? If so, should it be in the same revision control system as your source code?

Who needs these documents?

After a good bit of thought I figured it might be helpful to break the question down further. Documentation is shared between all the stakeholders of a project. Those could be programmers, “Product Owners”, requirements engineers, architects, project managers, CTOs, CIOs and software division heads. They could also be other departments like Q/A, marketing, localization, services/support and release/delivery.

Given the breadth of people who are affected by these assets, it would seem important to decide on the right tools and process for handling this information. The first step is establishing what your priorities are.

  • Is the documentation a part of the product? (For example, when you provide a 3rd party library for sale.)
  • How do we get documents to and from one another without taking up each other’s time?
  • How do we keep track of changes to documents for traceability?
  • How do we make sure everyone who needs access to information, both technical and non-technical users, can get to it?
  • What tools are people going to need to work with the documents?
  • Will there be a need to provide training?

My experiences with documentation and Git

My personal experience with this is keeping high-level and low-level design documentation in a Git repository. The solution at the time was a Rakefile (Rake is a Ruby version of the make build system) that picked up the project docs (written in markdown), recreated their directory structure in an output folder and rendered each file to HTML within that folder. Things got messy when there were images and other assets that needed to come along for the ride, and linking between documents was a bit of a nightmare. We used Jenkins to build our software, so each build also produced the docs, which we could get at through a URL. That caused massive problems later, when our Jenkins set-up changed slightly and all our URLs broke.
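For illustration, here is a minimal Python sketch of that kind of build step. It is not the original Rakefile: the function names are hypothetical, the renderer is a placeholder (a real build would call a markdown library, as ours called Kramdown), and the link rewriting only handles simple inline links to local .md files. It also assumes the output folder sits outside the source folder.

```python
import html
import re
import shutil
from pathlib import Path
from typing import Callable


def render_placeholder(markdown_text: str) -> str:
    """Hypothetical stand-in for a real markdown renderer."""
    return "<pre>" + html.escape(markdown_text) + "</pre>"


# Inline links such as [text](path/file.md) or [text](file.md#anchor),
# skipping absolute http(s) URLs.
_MD_LINK = re.compile(r"\]\((?!https?://)([^)#\s]+)\.md(#[^)\s]*)?\)")


def rewrite_links(markdown_text: str) -> str:
    """Point relative links at the rendered .html files instead of .md."""
    return _MD_LINK.sub(
        lambda m: f"]({m.group(1)}.html{m.group(2) or ''})", markdown_text
    )


def build_docs(
    src: Path, out: Path, render: Callable[[str], str] = render_placeholder
) -> list[Path]:
    """Mirror src into out: render *.md to *.html, copy everything else."""
    written = []
    for path in sorted(src.rglob("*")):
        if path.is_dir():
            continue
        target = out / path.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        if path.suffix == ".md":
            target = target.with_suffix(".html")
            text = rewrite_links(path.read_text(encoding="utf-8"))
            target.write_text(render(text), encoding="utf-8")
        else:
            # Images and other assets come along unchanged.
            shutil.copy2(path, target)
        written.append(target)
    return written
```

Calling build_docs(Path("docs"), Path("site")) would mirror a docs folder into a site folder. Copying assets and rewriting links in the same pass is what keeps relative paths working in the rendered output, which is the part that gave us the most trouble.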

We went through a variety of issues with the docs. Getting access to them for editing was one, because you first had to get access to the repository. You’d sometimes end up in situations where you had to merge them (which wasn’t that bad). Mostly, though, people weren’t happy with markdown. We were using the Kramdown rendering engine, none of the side-by-side markdown editors used that renderer, and we had custom CSS, so the document people were working on didn’t render the same way on our servers.

If I had to assess the way we did those documents on that project, I would say it sort of worked but needed a lot of attention to get all the issues ironed out (which didn’t happen while I was there).

Measuring up that set-up with the goals above:

  • It was difficult to get the documents for editing.
  • We could share the documents with other departments, but they didn’t quite understand the structure.
  • There was traceability since they were in Git.
  • Since we were serving an HTML-rendered version, the other team members could see the docs.
  • You only needed a text editor for editing, but team members also used tools like:
  • We probably needed to write a document to explain our documentation process.

Patching the problem

I started to think about the problem and reasoned that it might help to use something other than straight Ruby and Kramdown to produce the rendered output, and something other than Jenkins to serve the documentation. The following is a list of projects that seemed promising.

  • Git-wiki: https://github.com/sr/git-wiki
    • Simple experimental project, “I wrote git-wiki as a quick and dirty hack, mostly to play with Sinatra”.
    • I didn’t think it was suitable, but one of the forks might have been.
  • Olelo: https://github.com/minad/olelo
    • “Wiki with git back end”
    • Fork of Git-wiki
    • Docs managed in markdown format
    • Renders docs on the fly.
    • Good inter file linking.
  • Gollum: https://github.com/gollum/gollum
    • “A simple, Git-powered Wiki with a sweet API and local front-end.”
    • The software behind GitHub’s project wikis.
    • Docs managed in markdown format.
    • Renders docs on the fly.
    • Edit documents through web interface.
    • Didn’t work for us because it wasn’t compatible with Windows.
  • Jingo: https://github.com/claudioc/jingo
    • “Node.js based Wiki”
    • Docs managed in markdown format.
    • Renders docs on the fly.
    • Good inter file linking.
    • Edit documents through web interface.
    • Never got a chance to try it.
  • mkdocs: http://www.mkdocs.org/
    • Liked the simplicity of this one.
    • Docs managed in markdown format.
    • Statically generated site.
    • Interfile linking.
    • Themes.
    • Menu with a list of all documentation.
    • Doesn’t understand branches…etc
  • flatdoc: http://ricostacruz.com/flatdoc/
    • “Flatdoc is a small JavaScript file that fetches Markdown files and renders them as full pages.”
    • JavaScript/Browser based solution.
    • Docs managed in markdown format.
    • Small and lightweight, but missing many features we needed on that project.
  • ditto: https://github.com/chutsu/ditto
    • “Lightweight Markdown Documentation System”
    • JavaScript/Browser based solution.
    • Inspired by flatdoc. Basically the same.
    • Aimed at literate-style documentation for JavaScript libraries.
    • Tied to GitHub.
  • markdoc: http://markdoc.org/
    • “Markdoc is a lightweight Markdown-based Wiki system. It’s been designed to allow you to create and manage Wiki’s as quickly and easily as possible.”
    • Docs managed in markdown format.
    • Statically generated site.
    • You need to host it yourself.
    • Doesn’t understand branches…etc
  • ikiwiki: http://ikiwiki.info/
    • “Ikiwiki is a wiki compiler. It converts Wiki pages into HTML pages suitable for publishing on a website.”
    • Understands source control.
    • Docs managed in markdown format
    • Good inter file linking because it’s a Wiki :).
    • Seems very promising, never got a chance to try it out.
  • Dokuwiki: https://www.dokuwiki.org/
    • “DokuWiki is a simple to use and highly versatile Open Source Wiki software that doesn’t require a database”
    • Support for markdown through plugins
    • Proper access control including LDAP if you wanted to use your Windows credentials.
    • Seems very promising, never got a chance to try it out.
  • Gitblit: http://gitblit.com/features.html
    • “Gitblit is an open-source, pure Java stack for managing, viewing, and serving Git repositories. It’s designed primarily as a tool for small work-groups who want to host centralized repositories.”
    • Understands Git and your source code.
    • User authentication with LDAP support.
    • Can render markdown documents.
    • Good inter file linking.

You can see that I was leaning towards wikis and even a full-on Git web interface. I think that was because I was beginning to see all of the different goals and needs a good documentation system has to meet.

Collaboration versus tools

As I studied the options more and more, I realised why it is difficult to use your code’s RCS for your docs. Keeping the docs beside the code base made it difficult for other team members to create or contribute to them. The tools for writing the documents were also quite alien to them. People are used to using things like Word or even MediaWiki for creating documents. They don’t really know markdown, Git, SVN, etc. and definitely don’t want to use Notepad (let’s face it, we can tell them to use Notepad++ or Sublime Text all we want; it’s just not important to them).

The success stories of docs beside code are usually open source projects. I believe one of the main reasons is ease of use. Open source projects are usually about the code. Documentation is something that helps to explain the design and provides examples. If there is an API then the documentation explains how to use it, and some of the documents will be partly (or entirely) generated from the code itself. So in that case the primary concern is collaboration between the different programmers working on the code base.

The trouble is, in an ordinary workplace the main aim of documents is collaboration between technical, less technical and non-technical staff.

An equally bad approach

Microsoft Office rules the commercial world. Many people are very familiar with it and know how to use Microsoft Word, which is probably why you will often find most documentation written with that tool. As a result, Word + SVN/SharePoint is a fairly common pattern in the commercial software world. There are probably plenty of places that just have a bunch of Word documents on a network drive (this is just plain bad, because there is no traceability).

These kinds of approaches have their merits, and Word has come a long way over the years. You can even add annotations and in-line comments to documents (not in real time, as far as I know). The trouble is that Git doesn’t version binary files very well. I know that SVN does have ways of working with Word documents, which is why you often see the two together, but we were using Git, so Word didn’t make sense.
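One way to see why plain text suits Git better: a one-line change to a markdown file shows up as one removed line and one added line, which is easy to review and merge. The sketch below uses Python’s standard difflib module to produce that kind of diff; the file name and contents are made up for illustration. A Word document is stored as a compressed binary file, so Git can tell you that it changed but not show the change in a readable way.

```python
import difflib


def text_diff(old: str, new: str, name: str = "design.md") -> list[str]:
    """Return a unified diff between two revisions of a text document."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )


# Two hypothetical revisions of the same markdown document.
old = "# Design\n\nThe service stores data in files.\n"
new = "# Design\n\nThe service stores data in a database.\n"

for line in text_diff(old, new):
    print(line)
```

The changed sentence appears as a "-" line followed by a "+" line, with the unchanged heading shown as context.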

Conclusion

In the end, I don’t think our experiment with keeping docs next to code worked very well. Keeping, accessing and using the documents in source control was too difficult. Having a better system on top of those docs, like one of those in the list above, probably would have gone a long way towards solving that problem, but I’m not sure. My advice is to be careful about that decision: it could turn out to be more trouble than it’s worth.

As the user bstpierre noted in the Stack Overflow question “What Part of Your Project Should be in Source Code Control?”:

"Project documentation is cumbersome to maintain in a source control system. Project docs are always ahead of the code itself, and it's not uncommon to be working on documentation for the next version while working on code for the current version. Especially if all your project docs are binary docs that you can't diff or merge."
