I keep most of my LaTeX source files under the revision control system Git.
Git is opinionated software. It allows you to do many things, but others run counter to its dogmas. In particular, Git does not modify source files.
Other systems do. For example, you can ask Subversion to replace the placeholder $Revision in the committed source file with the latest revision number. This is useful for displaying revision numbers or dates in your LaTeX documents, for example.
From Git’s perspective, modifying source files is pure evil. Not only is this feature absent from Git, but the very request betokens moral corruption.
From Git’s perspective, the client programme (in this case, LaTeX) is responsible for including revision information in the output. Git humbly makes that information available. But it does not modify the source.
Stephan Hennig’s vc bundle consists of a number of scripts that extract revision information from various version control systems, including Git, and make it available as LaTeX macros. The information is taken from a separate, automatically generated input file vc.tex, so that the main LaTeX source file remains untouched by Git.
This shifts the problem to the generation of vc.tex. I’ve played around with various solutions based on Makefiles, and here is my current setup.
The vc.tex file is automatically generated and defines three LaTeX macros. Typically, it looks like this:
%%% This file is generated by Makefile.
%%% Do not edit this file!
%%%
\gdef\GITAbrHash{f61c739}
\gdef\GITAuthorDate{Fri May 13 10:34:51 2011 +0200}
\gdef\GITAuthorName{Thore Husfeldt}
The main LaTeX source includes the vc.tex file in the preamble and can now freely use these macros. For example, the revision information can be included in a footnote on the title page.
\documentclass{article}
...
\input{vc.tex}
...
\begin{document}
\maketitle
\let\thefootnote\relax
\footnotetext{Base revision~\GITAbrHash, \GITAuthorDate, \GITAuthorName.}
...
The responsibility of producing an up-to-date vc.tex rests on the Makefile:
latexfile = main

all: $(latexfile).pdf

$(latexfile).pdf : $(latexfile).tex vc.tex
	while (pdflatex $(latexfile) ; \
	grep -q "Rerun to get cross" $(latexfile).log ) do true ; \
	done

vc.tex: .git/logs/HEAD
	echo "%%% This file is generated by Makefile." > vc.tex
	echo "%%% Do not edit this file!\n%%%" >> vc.tex
	git log -1 --format="format:\
	\\gdef\\GITAbrHash{%h}\
	\\gdef\\GITAuthorDate{%ad}\
	\\gdef\\GITAuthorName{%an}" >> vc.tex
The interesting rule is the last one. It runs git log to produce vc.tex. The hardest thing for me was to get the dependencies right. I think I’ve got it now. The input file vc.tex needs to be regenerated whenever it predates the last commit. As far as I understand the Git internals, the modification time of .git/logs/HEAD should give me a reliable timestamp for when the last commit happened, so I made my rule for vc.tex depend on that.
Of course, it’s a cheap operation, so we could generate vc.tex anew every time we run pdflatex. But then every call to make would recompile the source (because vc.tex has changed). To avoid that, we could leave vc.tex out of the dependencies for $(latexfile).pdf. But then a commit (which modifies the revision number but not any source files) would not lead to an automatic recompile. The LaTeX document would only display new revision information whenever it is edited after that revision.
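One possible way out of this dilemma, offered as a sketch rather than as part of the setup above: regenerate the content on every run, but replace vc.tex only when the content actually differs, so that its modification time stays put otherwise. The helper name update_if_changed is made up for illustration.

```shell
#!/bin/sh
# Sketch: replace a file only when its content would actually change,
# so that its modification time (which make looks at) is otherwise kept.

# update_if_changed TARGET   (the new content is read from stdin)
update_if_changed() {
    target=$1
    tmp=$target.tmp.$$
    cat > "$tmp"
    if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
        rm -f "$tmp"          # identical: leave the old file untouched
    else
        mv "$tmp" "$target"   # new or different: install the fresh copy
    fi
}

# Intended use (hypothetical, run from the repository root):
#   git log -1 --format='\gdef\GITAbrHash{%h}' | update_if_changed vc.tex
```

In the Makefile, the vc.tex rule would then have to run on every call to make (for example by depending on a phony target instead of .git/logs/HEAD). As far as I understand GNU make, the PDF would still be rebuilt only when the timestamp of vc.tex has actually moved, but I have not field-tested this.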
If there’s a cleaner way of checking for “is vc.tex outdated compared to the Git repository”, please tell me.
TODO: Make the LaTeX document reflect that it corresponds to uncommitted edits after the latest revision. This should be doable by comparing the modification times of the LaTeX source files and .git/logs/HEAD. A cruder way is to let git status tell the Makefile whether the working directory is “clean”.
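The cruder variant could look roughly like the following sketch. It assumes the machine-readable output of git status --porcelain, in which untracked files are listed with a leading "??". The macro name \GITDirty and the function name dirty_macro are hypothetical and not part of the setup above.

```shell
#!/bin/sh
# Sketch: turn the output of `git status --porcelain` into a TeX macro.
# \GITDirty is a hypothetical macro name, not one of the three macros above.

# Reads porcelain status lines on stdin and prints one \gdef line.
dirty_macro() {
    # Any line that does not start with "??" (untracked file) counts as an edit.
    if grep -v '^??' | grep -q .; then
        printf '%s\n' '\gdef\GITDirty{ (with uncommitted edits)}'
    else
        printf '%s\n' '\gdef\GITDirty{}'
    fi
}

# Intended use inside the vc.tex rule (run from the repository root):
#   git status --porcelain | dirty_macro >> vc.tex
```

The footnote could then end in \GITDirty. Note that with the dependency on .git/logs/HEAD alone, vc.tex would not be regenerated when a source file is merely edited, so the rule would need to run more often for this to be accurate.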
UPDATE (21 Feb 2012): Since this post was written, various other approaches have appeared. (Thanks to the commenters for pointing them out.) The idea of using a post-commit hook instead of a Makefile is now on CTAN: gitinfo package.
System tweaking is fun! But what does this give you that a time + date stamp in the LaTeX output would not? :-)
Broadly, that’s a question about whether version control in general is worth the effort. (That question is certainly valid.)
But, to answer your specific question: (1) Because the LaTeX timestamp (\today) records the moment of compilation. It includes no information about the state of the source file. I could check out a revision from one year ago and compile it, and the timestamp would be today’s. (2) Even if you managed to include the “age of source file” (rather than “time of compilation”) in the LaTeX output, imagine you’re collaborating with somebody. It’s much better to say “I’m looking at version f3fca92” than to say “I’m looking at yesterday’s version.” We could be three people, each with a different opinion of what “yesterday’s version” means.
Point taken. I suppose I was assuming that the compiled and printed version would always be committed to the repository, so one could infer the version from the revision history.
Joy of joys! Less than two days have passed, and I have encountered a real-life example!
I uploaded a recent paper to the arXiv, the same version that I sent off to a conference proceedings volume a few days ago. Both Springer and arXiv prefer TeX source files to PDFs and compile the document on their own servers. I submitted the same source file to both, but a week apart. The resulting documents correctly display the revision identifier of the source file, claiming to be the same state of the paper’s contents, even though they were compiled at different times. In fact, they have resulted in different PDF documents, since Springer proceedings and arXiv add various markup such as page numbers. What’s under version control is the contents.
Before I saw this post, I was intrigued by the identifier in your arXiv submission and gazed at its wisdom. Well done!
What editor are you using, by the way? I have it set up a rather different way in Emacs which seems more straightforward to me (no make).
I use Emacs myself. But an Emacs-based solution does not play well with others. It’s already a big step to ask co-authors to use a revision control system, let alone ask them to use git, whose learning curve is a wall. But ask them to switch to Emacs? No way.
The reason I ask is because I don’t really need to use a Makefile, and I seem to be able to just use Emacs with `\write18{./vc}`. But I was wondering if your solution offers additional advantages over this?
[In terms of co-authors, worrying about Git and/or Emacs would be the least of my worries. I’d be overjoyed if they’d use LaTeX.]
Sure, the write18-solution is the one recommended in the vc bundle I linked in the article body, section 2.3. It’s good. But it requires that you, and your co-authors, enable the \write18 feature in your LaTeX distribution. Certainly doable, and to the extent that you control your compilation environment, relatively painless. I am very far from announcing that I’ve found a perfect solution to this problem, and it’s plausible that the vc bundle is the way to go. I’m eager to see more solutions, in particular, field-tested ones.
However, even if you can convince your co-authors to modify their TeX installation, as soon as your paper is accepted to a conference or submitted to the arXiv, your LaTeX source is going to be compiled on a machine that you do not control. \write18 may be enabled or not, but you’re certainly not going to change the publisher’s defaults. There are ways around this, of course. But by now you have convinced your co-authors to switch to Emacs, modify their LaTeX installation, and added some more lines of LaTeX (about conditional compilation) to your common source code.
A benefit of my solution is that the source compiles (even without using the Makefile!) to the desired output with minimal assumptions about the compilation environment.
Right. I was just making sure I wasn’t missing anything by using \write18 (at least in my mono-authored papers). Using \write18 is a little bit of a pain, since, as you mention, it needs to be enabled. (It took a little research to find out how to enable *only* \write18{./vc} and no other \write18 calls, because of potential security concerns.) But setting up Emacs/AUCTeX to use a makefile requires more tweaking than this. I’ll keep the makefile method in mind in case I happen to have any LaTeX co-authors.
It would be great if there were a more general, easy-to-implement solution, as you say.
You suggest (final paragraph) using ‘git status’ to find out whether the working directory is “clean”. But as far as I know, ‘git ls-files --modified’ (or similar) would be preferable, this being a lower-level and more stable git plumbing command.
Have you thought of using git hooks? I think a post-pull-hook and a post-commit-hook would be all that’s needed. Anyway, thanks for the blog post.
In fact, Nomen, originally I thought that a post-commit-hook would be the obvious solution to this, and can’t quite remember why I abandoned it. I encourage you to try!
No matter how you slice and dice it, some process has the responsibility to put the id of the last commit “into” the TeX source. You could even let post-commit-hook modify the source itself (though you’ll probably burn in hell for it). But that would immediately make your working directory unclean after each commit, confusing git’s status reports. (Also, it’ll rot your soul.)
So, as far as I can see, even a post-commit-hook needs to write a short bit of TeX into an external file, pretty much like the vc.tex file in my solution. So I think the complexity in terms of number of files is more or less the same; the main difference is that the post-commit-hook solution would produce this file at the time of commit, while the make solution does it at the time of TeX compilation. (The latter is actually somewhat more flexible, since you could put other interesting information, such as cleanliness of the working directory, into the final document.)
So both solutions need something like vc.tex. They differ in which process promises to keep vc.tex up to date. If you have (and share) a makefile anyway, it makes sense to use that. Otherwise you can put it into post-commit-hook, but need to remember to add that hook to the repository, so that your co-authors will run it as well.
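For comparison, a hook-based variant might look like the following minimal sketch. It is an illustration under assumptions, not the gitinfo implementation: the function name vc_tex is made up, and the script is meant to be saved as an executable .git/hooks/post-commit.

```shell
#!/bin/sh
# Sketch of .git/hooks/post-commit (the file must be executable).
# Writes the same three macros as the Makefile rule, but at commit time.

# Prints the content of vc.tex for the given hash, date and author name.
vc_tex() {
    printf '%s\n' '%%% This file is generated by the post-commit hook.'
    printf '%s\n' '%%% Do not edit this file!'
    printf '\\gdef\\GITAbrHash{%s}\n'    "$1"
    printf '\\gdef\\GITAuthorDate{%s}\n' "$2"
    printf '\\gdef\\GITAuthorName{%s}\n' "$3"
}

# Only act when git is available and we are inside a work tree.
if top=$(git rev-parse --show-toplevel 2>/dev/null); then
    vc_tex "$(git log -1 --format=%h)" \
           "$(git log -1 --format=%ad)" \
           "$(git log -1 --format=%an)" > "$top/vc.tex"
fi
```

Two caveats. Files under .git/hooks are not transferred when a repository is cloned, so each co-author has to install the hook by hand. And a post-commit hook fires only on commits; to keep vc.tex current after pulling or switching branches, the same script would presumably also be installed as post-merge and post-checkout (as far as I know, git has no hook named post-pull).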
The new gitinfo package implements this with post-commit hooks
http://www.ctan.org/pkg/gitinfo
It seems like a simpler solution to me, since I don’t need to mess with write18 or makefiles. The downside, of course, is that I don’t know whether the LaTeX document has been modified since the last commit.
Another option is to have latex nose around looking for a .git subdir, so no hook needed. This is the approach taken by git://github.com/mpg/git-info.git which is what I use.