Including Git Revision Identifiers in LaTeX

I keep most of my LaTeX source files under the revision control system Git.

Stupid git


Git is opinionated software. It allows you to do many things, but others run counter to its dogmas. In particular, Git does not modify source files.

Other systems do. For example, you can ask Subversion to replace the placeholder $Revision$ in the committed source file with the latest revision number. This is useful for displaying revision numbers or dates in your LaTeX documents, for example.

From Git’s perspective, modification of source files is pure evil. Not only is this feature absent from Git, but the very request betokens moral corruption.

From Git’s perspective, the client programme (in this case, LaTeX) is responsible for including revision information in the output. Git humbly makes that information available. But it does not modify the source.

Stephan Hennig’s vc bundle consists of a number of scripts that extract revision information from various version control systems, including Git, and make it available as LaTeX macros. The information is taken from a separate, automatically generated input file vc.tex, so that the main LaTeX source file remains untouched by Git.

This shifts the problem to the generation of vc.tex. I’ve played around with various solutions based on Makefiles, and here is my current setup.

The vc.tex file is automatically generated and defines three LaTeX macros. Typically, it looks like this:

%%% This file is generated by Makefile.
%%% Do not edit this file!
%%%
\gdef\GITAbrHash{f61c739}
\gdef\GITAuthorDate{Fri May 13 10:34:51 2011 +0200}
\gdef\GITAuthorName{Thore Husfeldt}

The main LaTeX source includes the vc.tex file in the preamble and can now freely use these macros. For example, the revision information can be included in a footnote on the title page.

\documentclass{article}
...
\input{vc.tex}
...
\begin{document}
\maketitle
\let\thefootnote\relax
\footnotetext{Base revision~\GITAbrHash, \GITAuthorDate, \GITAuthorName.}
...

The responsibility of producing an up-to-date vc.tex rests on the Makefile:

latexfile = main

all: $(latexfile).pdf

$(latexfile).pdf : $(latexfile).tex vc.tex
	while (pdflatex $(latexfile) ; \
	grep -q "Rerun to get cross" $(latexfile).log ) do true ; \
	done

vc.tex: .git/logs/HEAD
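	printf '%s\n' '%%% This file is generated by Makefile.' > vc.tex
	printf '%s\n' '%%% Do not edit this file!' '%%%' >> vc.tex
	# One git log call per macro, so that each \gdef lands on its own line.
	git log -1 --format='\gdef\GITAbrHash{%h}' >> vc.tex
	git log -1 --format='\gdef\GITAuthorDate{%ad}' >> vc.tex
	git log -1 --format='\gdef\GITAuthorName{%an}' >> vc.tex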

The interesting rule is the last one. It runs git log to produce vc.tex. The hardest thing for me was to get the dependencies right. I think I’ve got it now. The input file vc.tex needs to be regenerated whenever it predates the last commit. As far as I understand the Git internals, the modification time of .git/logs/HEAD should give me a reliable timestamp for when the last commit happened, so I made my rule for vc.tex depend on that.
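To make the timestamp reasoning concrete, here is a minimal shell sketch of the comparison that make performs for this rule. The function name is_outdated is made up for illustration; it is not part of Git or of the vc bundle.

```shell
#!/bin/sh
# is_outdated TARGET REF (hypothetical helper):
# succeed if TARGET is missing or REF is strictly newer than TARGET,
# which mirrors how make decides whether to rerun a rule.
is_outdated() {
    # A missing target always has to be built.
    [ -e "$1" ] || return 0
    # find prints REF only if it was modified more recently than TARGET.
    [ -n "$(find "$2" -prune -newer "$1")" ]
}

# Intended use inside a Git working directory:
#   is_outdated vc.tex .git/logs/HEAD && echo "vc.tex needs regenerating"
```

One assumption to keep in mind: the path .git/logs/HEAD only exists in an ordinary checkout with the reflog enabled. In other layouts (a linked worktree, say) I believe git rev-parse --git-path logs/HEAD reports the right location, but I have not tested that here.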

Of course, it’s a cheap operation, so we could generate vc.tex anew every time we run pdflatex. But then every call to make would recompile the source (because vc.tex has changed). To avoid that, we could leave vc.tex out of the dependencies for $(latexfile).pdf. But then a commit (which modifies the revision number but not any source files) would not lead to an automatic recompile. The LaTeX document would only display new revision information whenever it is edited after that revision.
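One way out of this dilemma that I have not field-tested is the common "write only if changed" idiom: regenerate the content on every run, but touch vc.tex only when the content actually differs, so that its modification time changes only when the revision information does. A sketch, where write_if_changed is a hypothetical helper name:

```shell
#!/bin/sh
# write_if_changed FILE (hypothetical helper):
# copy standard input to FILE, but leave FILE (and hence its
# modification time) alone when the content is already identical.
write_if_changed() {
    target=$1
    tmp=$target.tmp.$$
    cat > "$tmp" || return 1
    if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
        rm -f "$tmp"            # same content: keep the old timestamp
        echo "unchanged"
    else
        mv "$tmp" "$target"     # new or different content: replace the file
        echo "updated"
    fi
}

# Intended use inside a Git working directory:
#   git log -1 --format='\gdef\GITAbrHash{%h}' | write_if_changed vc.tex
```

The vc.tex rule would then run unconditionally (for instance by depending on a phony target) and pipe its output through this helper. As far as I know, GNU make looks at the file's timestamp again after the recipe has run, so the PDF would be rebuilt only when vc.tex really changed; treat that as an assumption to verify with your version of make.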

If there’s a cleaner way of checking for “is vc.tex outdated compared to the Git repository”, please tell me.

TODO: Make the LaTeX document reflect that it corresponds to uncommitted edits after the latest revision. This should be doable by comparing the modification times of the LaTeX source files and .git/logs/HEAD. A cruder way is to let git status tell the Makefile if the working directory is “clean”.
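A rough sketch of the cruder variant: turn the output of git status into one more macro line for vc.tex. The macro name \GITDirty and the function name dirty_macro are inventions for this example, and the wording of the marker is arbitrary.

```shell
#!/bin/sh
# dirty_macro STATUS (hypothetical helper):
# print a \gdef line for the made-up macro \GITDirty. STATUS is the
# output of "git status --porcelain"; empty means a clean working tree.
dirty_macro() {
    if [ -n "$1" ]; then
        printf '%s\n' '\gdef\GITDirty{ (with uncommitted changes)}'
    else
        printf '%s\n' '\gdef\GITDirty{}'
    fi
}

# Intended use in the vc.tex recipe, inside a Git working directory.
# Untracked files (vc.tex itself, .aux, .log) are ignored on purpose:
#   dirty_macro "$(git status --porcelain --untracked-files=no)" >> vc.tex
```

For the flag to stay current, vc.tex would also have to be regenerated when a source file is edited, not only after a commit, for example by adding $(latexfile).tex to the prerequisites of vc.tex.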

UPDATE (21 Feb 2012): Since this post was written, various other approaches have appeared. (Thanks to the commenters for pointing them out.) The idea of using a post-commit hook instead of a Makefile is now on CTAN: gitinfo package.
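For comparison, the hook idea can be sketched as follows. This is my own minimal illustration that writes the same vc.tex as the Makefile rule; it is not the hook shipped with gitinfo, and the function name vc_lines is hypothetical.

```shell
#!/bin/sh
# Sketch of .git/hooks/post-commit (the file must be executable).

# vc_lines HASH DATE NAME (hypothetical helper): print the contents of vc.tex.
vc_lines() {
    printf '%s\n' '%%% This file is generated by the post-commit hook.' \
                  '%%% Do not edit this file!' \
                  '%%%'
    printf '\\gdef\\GITAbrHash{%s}\n' "$1"
    printf '\\gdef\\GITAuthorDate{%s}\n' "$2"
    printf '\\gdef\\GITAuthorName{%s}\n' "$3"
}

# Act only when installed under the name post-commit. Git runs the hook
# from the top of the working tree, so vc.tex ends up there.
case "$0" in
    */post-commit)
        vc_lines "$(git log -1 --format=%h)" \
                 "$(git log -1 --format=%ad)" \
                 "$(git log -1 --format=%an)" > vc.tex
        ;;
esac
```

Hooks are not transferred by cloning, so every co-author would have to install the file by hand. Also, the author name is written verbatim, so characters that are special to TeX would need escaping.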

17 thoughts on “Including Git Revision Identifiers in LaTeX”

  1. Rasmus Pagh

    System tweaking is fun! But what does this give you that a time + date stamp in the LaTeX output would not? :-)

    1. thorehusfeldt Post author

      Broadly, that’s a question about whether version control in general is worth the effort. (That question is certainly valid.)

      But, to answer your specific question: (1) The LaTeX timestamp (\today) records the moment of compilation. It includes no information about the state of the source file. I could check out a revision from one year ago and compile it, and the timestamp would be today’s. (2) Even if you managed to include the “age of source file” (rather than “time of compilation”) in the LaTeX output, imagine you’re collaborating with somebody. It’s much better to say “I’m looking at version f3fca92” than “I’m looking at yesterday’s version.” We could be three people, each with a different opinion of what “yesterday’s version” means.

      1. Rasmus Pagh

        Point taken. I suppose I was assuming that the compiled and printed version would always be committed to the repository, so one could infer the version from the revision history.

      2. thorehusfeldt Post author

        Joy of joys! Less than two days have passed, and I have encountered a real-life example!

        I uploaded a recent paper to the Arxiv, the same version that I sent off to a conference proceedings volume a few days ago. Both Springer and Arxiv prefer TeX source files instead of PDFs and then compile the document on the server. I submitted the same source file to both, but with a week’s difference. The resulting documents correctly display the revision identifier of the source file, claiming to be the same state of the paper’s contents, even though they were compiled at different times. In fact, they have resulted in different PDF documents, since Springer proceedings and Arxiv add various markup such as page numbers. What’s under version control is the contents.

  2. Holger

    Before I saw this post, I was intrigued by the identifier in your Arxiv submission and gazed at its wisdom. Well done!

  3. BeSlayed

    What editor are you using, by the way? I have it set up a rather different way in Emacs which seems more straightforward to me (no make).

  4. thorehusfeldt Post author

    I use Emacs myself. But an Emacs-based solution does not play well with others. It’s already a big step to ask co-authors to use a revision control system, let alone ask them to use git, whose learning curve is a wall. But ask them to switch to Emacs? No way.

  5. BeSlayed

    The reason I ask is because I don’t really need to use a Makefile, and I seem to be able to just use Emacs with `\write18{./vc}`. But I was wondering if your solution offers additional advantages over this?

    [In terms of co-authors, worrying about Git and/or Emacs would be the least of my worries. I’d be overjoyed if they’d use LaTeX.]

  6. thorehusfeldt Post author

    Sure, the write18-solution is the one recommended in the vc bundle I linked in the article body, section 2.3. It’s good. But it requires that you, and your co-authors, enable the \write18 feature in your latex distribution. Certainly doable, and to the extent that you control your compilation environment, relatively painless. I am very far from announcing that I’ve found a perfect solution to this problem, and it’s plausible that vc bundle is the way to go. I’m eager to see more solutions, in particular, field-tested ones.

    However, even if you can convince your co-authors to modify their tex installation, as soon as your paper is accepted to a conference or submitted to the arXiv, your LaTeX source is going to be compiled on a machine that you do not control. \write18 may be enabled or not, but you’re certainly not going to change the publisher’s defaults. There are ways around this, of course. But by now you have convinced your co-authors to switch to Emacs, modify their LaTeX installation, and added some more lines of LaTeX (about conditional compilation) to your common source code.

    A benefit of my solution is that the source compiles (even without using the Makefile!) to the desired output with minimal assumptions about the compilation environment.

    1. BeSlayed

      Right. I was just making sure I wasn’t missing anything by using \write18 (at least in my mono-authored papers). Using \write18 is a bit of a pain, since, as you mention, it needs to be enabled (and it took a little research to find out how to enable *only* \write18{./vc} and no other \write18 calls, because of potential security concerns), but setting up Emacs/AUCTeX to use a makefile requires more tweaking than this. I’ll keep the makefile method in mind in case I happen to have any LaTeX-coauthors.

      It would be great if there were a more general, easy-to-implement solution, as you say.

  7. Robbie Morrison

    You suggest (final paragraph) using ‘git status’ to find out whether the working directory is “clean”. But as far as I know, ‘git ls-files --modified’ (or similar) would be preferable, this being a lower-level and more stable git plumbing command.

  8. Nomen Nescio

    Have you thought of using git hooks? I think a post-pull-hook and a post-commit-hook would be all that’s needed. Anyway, thanks for the blog post.

  9. thorehusfeldt Post author

    In fact, Nomen, originally I thought that a post-commit-hook would be the obvious solution to this, and can’t quite remember why I abandoned it. I encourage you to try!

    No matter how you slice and dice it, some process has the responsibility to put the id of the last commit “into” the TeX source. You could even let post-commit-hook modify the source itself (though you’ll probably burn in hell for it). But that would immediately make your working directory unclean after each commit, confusing git’s status reports. (Also, it’ll rot your soul.)

    So, as far as I can see, even a post-commit-hook needs to write a short bit of TeX into an external file, pretty much like the vc.tex file in my solution. So I think the complexity in terms of number of files is more or less the same; the main difference is that the post-commit-hook solution would produce this file at the time of commit, while the make solution does it at the time of TeX compilation. (The make solution is actually somewhat more flexible, since you could put other interesting information, such as cleanliness of the working directory, into the final document.)

    So both solutions need something like vc.tex. They differ in which process promises to keep vc.tex up to date. If you have (and share) a makefile anyway, it makes sense to use that. Otherwise you can put it into post-commit-hook, but need to remember to add that hook to the repository, so that your co-authors will run it as well.

  10. kolesarm

    The new gitinfo package implements this with post-commit hooks

    http://www.ctan.org/pkg/gitinfo

    It seems like a simpler solution to me, since I don’t need to mess with write18 or makefiles. The downside, of course, is that I don’t know whether the latex document has been modified since the last commit.

