Exponential Time Algorithms at ESA 2011

The list of accepted papers for ESA 2011 is online. Below is my own quick take on which papers are about exponential time algorithms.

ESA 2011 is colocated with IPEC under the ALGO 2011 umbrella, so there will be plenty of exciting results that week.

  • Dimitrios Thilikos. Fast sub-exponential Algorithms and Compactness in Planar Graphs.
  • Gwenaël Joret, Christophe Paul, Ignasi Sau, Saket Saurabh, Stéphan Thomassé, Hitting and Harvesting Pumpkins, arXiv:1105.2704.
  • Fedor Fomin, Ioan Todinca and Yngve Villanger, Exact algorithm for the maximum induced planar subgraph problem, Todinca’s slides from Worksh. Graph. Decomp. 2010.
  • Marek Cygan, Marcin Pilipczuk, Michal Pilipczuk and Jakub Wojtaszczyk, Scheduling partially ordered jobs faster than 2^n.

I’m probably missing some results, but without online abstracts it’s hard to tell. I cannot judge a paper from its title alone. Comments and corrections are welcome, in particular links to online manuscripts.

Filter Bubble in Weekendavisen

I managed to excite the Danish weekly Weekendavisen about the societal impacts of internet personalisation, along the lines of Eli Pariser’s recent book The Filter Bubble. This resulted in a nice, meaty two-page article starting on the front page of the paper’s science section Idéer.

This is part of my ongoing effort to unleash the meme of the Algorithmic Lens in the public discourse. Two years ago I initiated an article about the algorithmic lens on the sciences in the same paper.

The Social Science Filter Bubble

The creation of such an article is a give-and-take between me and the journalist. I’d collaborated with him previously on the production of a TV programme about how Google’s PageRank algorithm works, so we had established a level of common trust. Still, we come from vastly different epistemological backgrounds.

Here’s a detail that made me smile.

I originally suggested a formulation along the following lines:

Dewey and Habermas are fine, but today the public sphere is shaped by algorithmic processes, and we’re just in the beginning of that development. And just like we turn to Newton to understand gravity, we must turn to Turing and his disciples to understand some of the forces that influence the behaviour of individuals and groups in the information society.

Plenty of dropped names, and it was rejected for being too opaque. Consequently, in the final version, the references to philosophers Dewey and Habermas remain, but Newton and Turing have been removed. So, the readership of Weekendavisen is of course expected to know the public sphere and the civil society (as rightly they should). But they are protected from having to google who the other two guys are.

This is an example of the “old” filter bubble, where media through self-selection and editorial policy shape a common frame of reference, an understanding of what a Citizen is supposed to educate herself about.

In the Brave New World to come, an algorithm would have processed my quote. It would perhaps have replaced “Dewey and Habermas” by “Philosophers” or “Some social scientists” or “Dead White Males” on a per-reader basis, based on the epistemological priors, background knowledge, and ideological preferences of the reader.

Further reading

If you came here from Weekendavisen because you googled my name (provided your personalisation settings allowed me to burst your filter bubble) and want to read more, check out thefilterbubble.com. That’s Eli Pariser’s blog, including links to his book, his TED talk video, and a list of 10 ways to pop your filter bubble.

Alternatively, check out my recent (Danish) TV production How Google works, which explains the PageRank algorithm. Now, the Good Olde Days.

Or you can hire me to give a talk; I have several ready, such as Hvordan Google virker or The Algorithmic Lens. See the full list at Popular Science talks.

Update: I also appeared on Danish public radio about this: P1 Morgen 4 Jun 2011, 7m45s, around 9:30.

Update: And even in the local paper Sydsvenskan: De digitala skygglapparna (4 July 2011) (in Swedish).

Update: And even in Svenska Dagbladet: Isolerade i nätbubblan (15 August 2011) (in Swedish).

Image source: Wikimedia Commons

Including Git Revision Identifiers in LaTeX

I keep most of my LaTeX source files under the revision control system Git.

Stupid git


Git is opinionated software. It allows you to do many things, but others run counter to its dogmas. In particular, Git does not modify source files.

Other systems do. For example, you can ask Subversion to replace the placeholder $Revision$ in the committed source file with the latest revision number. This is useful for displaying revision numbers or dates in your LaTeX documents, for example.

From Git’s perspective, modification of source files is pure evil. Not only is this feature absent from Git, but the very request betokens moral corruption.

From Git’s perspective, the client programme (in this case, LaTeX) is responsible for including revision information in the output. Git humbly makes that information available. But it does not modify the source.

Stephan Hennig’s vc bundle consists of a number of scripts that extract revision information from various version control systems, including Git, and make it available as LaTeX macros. The information is taken from a separate, automatically generated input file vc.tex, so that the main LaTeX source file remains untouched by Git.

This shifts the problem to the generation of vc.tex. I’ve played around with various solutions based on Makefiles, and here is my current setup.

The vc.tex file is automatically generated and defines three LaTeX macros. Typically, it looks like this:

%%% This file is generated by Makefile.
%%% Do not edit this file!
%%%
\gdef\GITAbrHash{f61c739}
\gdef\GITAuthorDate{Fri May 13 10:34:51 2011 +0200}
\gdef\GITAuthorName{Thore Husfeldt}

The main LaTeX source includes the vc.tex file in the preamble and can now freely use these macros. For example, the revision information can be included in a footnote on the title page.

\documentclass{article}
...
\input{vc.tex}
...
\begin{document}
\maketitle
\let\thefootnote\relax
\footnotetext{Base revision~\GITAbrHash, \GITAuthorDate, \GITAuthorName.}
...

The responsibility of producing an up-to-date vc.tex rests on the Makefile:

latexfile = main

all: $(latexfile).pdf

$(latexfile).pdf : $(latexfile).tex vc.tex
	while (pdflatex $(latexfile) ; \
	grep -q "Rerun to get cross" $(latexfile).log ) do true ; \
	done

vc.tex: .git/logs/HEAD
	echo "%%% This file is generated by Makefile." > vc.tex
	echo "%%% Do not edit this file!" >> vc.tex
	echo "%%%" >> vc.tex
	git log -1 --format="format:\
		\\gdef\\GITAbrHash{%h}\
		\\gdef\\GITAuthorDate{%ad}\
		\\gdef\\GITAuthorName{%an}" >> vc.tex

The interesting rule is the last one. It runs git log to produce vc.tex. The hardest thing for me was to get the dependencies right. I think I’ve got it now. The input file vc.tex needs to be regenerated whenever it predates the last commit. As far as I understand the Git internals, the modification time of .git/logs/HEAD should give me a reliable timestamp for when the last commit happened, so I made my rule for vc.tex depend on that.

Of course, it’s a cheap operation, so we could generate vc.tex anew every time we run pdflatex. But then every call to make would recompile the source (because vc.tex has changed). To avoid that, we could leave vc.tex out of the dependencies for $(latexfile).pdf. But then a commit (which modifies the revision number but not any source files) would not lead to an automatic recompile. The LaTeX document would only display new revision information whenever it is edited after that revision.

If there’s a cleaner way of checking for “is vc.tex outdated compared to the Git repository”, please tell me.

TODO: Make the LaTeX document reflect that it corresponds to uncommitted edits after the latest revision. This should be doable by comparing the modification times of the LaTeX source files and .git/logs/HEAD. A cruder way is to let git status tell the Makefile whether the working directory is “clean”.
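The cruder variant of this TODO can be sketched in a few lines of Python instead of Make. This is only an illustration, not part of the setup described here: the names vc_tex and current_revision are hypothetical, the trailing “+” that marks uncommitted edits is an invented convention, and the sketch assumes that git status --porcelain --untracked-files=no prints nothing exactly when no tracked file has been modified.

```python
import subprocess


def vc_tex(abbrev_hash, author_date, author_name, dirty=False):
    """Return the text of vc.tex for the given revision fields."""
    # Hypothetical convention: a trailing "+" marks uncommitted edits.
    suffix = "+" if dirty else ""
    lines = [
        "%%% This file is generated automatically.",
        "%%% Do not edit this file!",
        "%%%",
        r"\gdef\GITAbrHash{" + abbrev_hash + suffix + "}",
        r"\gdef\GITAuthorDate{" + author_date + "}",
        r"\gdef\GITAuthorName{" + author_name + "}",
    ]
    return "\n".join(lines) + "\n"


def git(*args):
    """Run a git command in the current directory and return its output."""
    return subprocess.check_output(("git",) + args).decode("utf-8")


def current_revision():
    """Return (hash, date, name, dirty) for the last commit; needs a repository."""
    abbrev_hash, author_date, author_name = git(
        "log", "-1", "--format=%h%n%ad%n%an").splitlines()
    # Untracked files (such as vc.tex itself) are ignored on purpose.
    dirty = git("status", "--porcelain", "--untracked-files=no").strip() != ""
    return abbrev_hash, author_date, author_name, dirty
```

Inside a repository, writing vc_tex(*current_revision()) to vc.tex would then play the role of the last Makefile rule; the dependency question discussed above remains the same.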

UPDATE (21 Feb 2012): Since this post was written, various other approaches have appeared. (Thanks to the commenters for pointing them out.) The idea of using a post-commit hook instead of a Makefile is now on CTAN: gitinfo package.

Generation Z and the Alphabet

I teach generation Z, people who are now in their early twenties.

Generation Z follows Generation Y, which follows Generation X.

What will we call the next generation? We’ve run out of letters! “Generation [”?

Well, it doesn’t matter. Read on…

I was reviewing old exam questions in my introductory algorithms and data structures class. Here’s the question:

This looks innocent enough. However, several students were openly annoyed by this question.

What’s the problem? It has nothing to do with priority queues or heap order or anything else algorithmic. If you aren’t generation Z, you’ll never guess.

They asked me to please use numbers instead of letters. Why? It turns out that comparison between letters is no longer constant time! As one student put it, with a straight face, it’s really hard to determine if, say, Q is higher or lower than some other letter. Helpful students sagely suggested to their fellow students to just start by making a list of the alphabet on a separate piece of paper for this type of exercise. This was met with earnest nodding.

It was quite clear that I had made this question needlessly difficult by making it about letters.

This is a sterling example of a skill that is utterly natural to my generation, which has looked things up alphabetically countless times. I have no harder time comparing M and S than I have comparing 5 and 13. But, of course, Generation Z has never looked anything up alphabetically. It’s an utterly useless skill honed in the olden days of outdated information technology, like knowing how a slide rule works or typing on a T9 mobile phone keypad. Generation Z finds this as hard (and as useless) as I find comparing Ψ and Φ. I can do it, because I memorised the Greek alphabet when I was eight or so, and can still rattle it off in the right order. But it takes linear time in the size of the alphabet.

So, from now on, I guess I’ll use plain old numbers in this type of exam question.

Also, the generations following Generation Z can be safely called Generation W, V, U, etc. Nobody will notice.

Also, I feel old.

Exponential Time Algorithms at ICALP 2011

The accepted papers for ICALP 2011 have been announced.

Based on my quick perusal of the track A papers with abstracts, here’s a list of papers related to exponential time computation, together with references to online versions. I’m probably missing some.

  • Isolde Adler, Stavros Kolliopoulos, Philipp Klaus Krause, Daniel Lokshtanov, Saket Saurabh and Dimitrios Thilikos. Tight Bounds for Linkages in Planar Graphs. PDF at Adler’s web page.
  • Sanjeev Arora and Rong Ge. New Algorithms for Learning in Presence of Errors. PDF at Ge’s web page
  • Hans L. Bodlaender, Bart M. P. Jansen and Stefan Kratsch. Preprocessing for Treewidth: A Combinatorial Analysis through Kernelization. arxiv:1104.4217.
  • Andrei Bulatov and Dániel Marx. Constraint satisfaction parameterized by solution size.
  • Amin Coja-Oghlan and Angelica Pachon-Pinzon. The decimation process in random k-SAT. arxiv:1102.3145
  • Marek Cygan, Marcin Pilipczuk, Michal Pilipczuk and Jakub Wojtaszczyk. Subset Feedback Vertex Set is Fixed Parameter Tractable. arxiv:1004.2972.
  • Daniel Lokshtanov and Dániel Marx. Clustering with Local Restrictions. PDF at Lokshtanov’s web page.
  • Danny Hermelin, Matthias Mnich, Erik Jan Van Leeuwen and Gerhard J. Woeginger. Domination when the Stars Are Out. arxiv:1012.0012

Unfortunately, nothing about ETH. That would have been fun.

Dansk algoritmeterminologi (Danish Algorithm Terminology)

This post is about Danish terminology and was originally written in Danish.

Coddled egg on hash (Danish: letkogt æg på biksemad). Source: Wikimedia Commons.

The purpose of this overview is to summarise (and to some extent propose) Danish technical terminology for algorithms and data structures. The motivation is partly to ease the translation of technical terms both between and within the two languages, and partly to make Danish terminology available to those who need to express themselves in writing, or for outreach purposes, outside narrow specialist circles.

Among colleagues, in most spoken situations you will make yourself best understood by using the English terms with an adapted Danish-English pronunciation and syntax (»til sidst merger du /arraysene/«, roughly “finally you merge the arrays”). This approach can also be recommended if your primary interest is to signal membership of the professional community.

I have great sympathy for the Anglo-Saxon tradition of coining technical terms that are both meaty and often slightly ridiculous (mouse, bubble sort, stack), partly because this creates useful analogies in an otherwise alienating digital reality, and partly because the informality breaks down the awe one might otherwise feel when encountering technical terminology. Unfortunately, both effects are lost when less common English words (merge, browse, hash, array) are used untranslated.

To my knowledge there is no printed Danish-language material on algorithms and data structures apart from Schmidt and Schwartzbach’s lecture notes »Programmeringsteori og datastrukturer« and »Grafalgoritmer og algoritmisk problemløsningsteknik« from Aarhus University in the 1990s, and Polyteknisk Forlag’s »Find formlen – algoritmer og datastrukturer« from 2007. Their terminology is included here. I am grateful to be made aware of documented occurrences of words I may have overlooked. Some of the constructions below are, however, entirely my own proposals; I have marked them with a cautionary exclamation mark.

afslutning

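noun
eng. closure

transitiv afslutning (eng. transitive closure)

»et endeligt dimensionalt kompakt Hausdorffrum, der er afsluttet i primidealrummet mht. hylster-kerne topologien« (“a finite-dimensional compact Hausdorff space that is closed in the prime ideal space with respect to the hull-kernel topology”)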

spredefunktion (!)

eng. hash function

spredeværdi (eng. hash value)

spredetabel (eng. hash table)

universel spredning (eng. universal hashing)

Comments:

The prevailing Danish usage is to adopt the English word hash with an adapted Danish pronunciation ([hasj]), as in hashfunktion, hashværdi, hashtabel, at hashe.

In English, the noun hash denotes a mixed dish, typically based on reheated leftovers from yesterday, and to hash means to chop. The rich theory behind hash functions even includes a »leftover hash lemma«. Unfortunately, in Danish we do not associate hash with biksemad (the Danish dish) but with the Arabic word for hemp, hashish, a widespread intoxicant.

If you want a Danish word that keeps the English metaphor alive, you can try hakkefunktion (“chopping function”), which is also morphologically close to the original. (My dictionary informs me that hash came into English from the French hacher, which is also the root of hatchet, whereas the Danish hakke has Low German roots.) More exciting is rodefunktion (“mess function”). For a while I have said biksefunktion (bikseværdi, universalbiks, gøgebiks) to myself, which captures both the messy and the culinary perspective. I am very torn on this matter and will presumably think better of it some day.

bredde først-søgning

noun
eng. breadth first search

Note the punctuation, cf. først til mølle-princip (“first come, first served principle”).

del og hersk

noun
eng. divide and conquer

»den hurtige fouriertransformation er en del og hersk-algoritme« (“the fast Fourier transform is a divide and conquer algorithm”)

Alternatives: del og kombinér (used at AU).

Note the punctuation, cf. gør det selv-mand (“do-it-yourself man”).

dybde først-søgning

noun
eng. depth first search

dynamisk programmering

noun
eng. dynamic programming

Comments: The word programmering is just as misleading in this context in Danish as programming is in English. The meaning is “to lay a plan”, attested in Danish since 1959 according to DDO, not “to write instructions for a machine”. The method was formalised and named in the 1950s by Bellman, who describes the background of the terminology in his autobiography Eye of the Hurricane, 1984.

flette

verb
-r, -de, -t
eng. merge

flettesortering (eng. merge sort)

Vi kan flette to sorterede lister i lineær tid. (“We can merge two sorted lists in linear time.”)
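The example sentence for flette can be made concrete. The following is a minimal sketch in Python; the function name merge is chosen for illustration only, and both inputs are assumed to be sorted already.

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list.

    Each element is looked at once, so the running time is linear
    in len(a) + len(b).
    """
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        # Take the smaller front element; "<=" keeps the merge stable.
        if a[i] <= b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    # One of the lists is exhausted; the rest of the other is already sorted.
    result.extend(a[i:])
    result.extend(b[j:])
    return result
```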

komplet

eng. complete

Et problem er NP-komplet, hvis det tilhører NP og er NP-hårdt. (“A problem is NP-complete if it belongs to NP and is NP-hard.”)

The alternative is fuldstændig, which is used at AU. Komplet is morphologically closer to the established English term.

graf

In mathematics, the term graf refers both to the plot of a mathematical function and to a combinatorial structure. This unfortunate coincidence of terms occurs in both English (graph) and German (Graph), where the terminology was created in the 1930s. The elements of a graph are called hjørner (eng. vertices, ger. Ecken), knuder (eng. nodes, ger. Knoten), or simply punkter (eng. points). The elements are connected by kanter (eng. edges, ger. Kanten), or simply linjer. When the graph is rettet (eng. directed), the edges are often called buer (eng. arcs, ger. Bögen) or rettede kanter (eng. directed edges). An internally vertex-disjoint vej (eng. path) with the same head and tail is a kreds (eng. circuit) or cykel. If the graph forms a tree, the elements are often called knuder in Danish.

See: rettet graf, knude

grådig

adjective
eng. greedy

  • grådig algoritme
  • grådig knudefarvning

indsættelsessortering

eng. insertion sort

hob

noun, common gender
-en, -ene
eng. heap.

Mads sætter et element i hoben. Lise fjerner et element fra hoben. Det mindste element ligger øverst i hoben. (“Mads puts an element into the heap. Lise removes an element from the heap. The smallest element lies at the top of the heap.”)

binærhob or binær hob (eng. binary heap).

rækkebaseret hob (eng. array based heap).

hobsortering (eng. heap sort)

Alternatives to hob are bunke (used at AU) or dynge, both of which roll off the tongue more easily. But hob is closer to the established English term heap and is used at KU, SDU and DTU.

hægtet liste

eng. linked list

Used at KU, RUC and DTU. At AU the term kædet liste is used. You may also come across lænket liste, which can be attractive because it is morphologically (but not semantically) closer to the English terminology.

dobbelthægtet liste (eng. doubly linked list)

En hægtet liste er en rekursiv datastruktur som er enten tom eller en reference til en knude bestående af et element og en reference til en hægtet liste. (“A linked list is a recursive data structure that is either empty or a reference to a node consisting of an element and a reference to a linked list.”)
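The recursive definition of hægtet liste translates almost word for word into code. A minimal sketch in Python, under the assumption that the empty list is represented by None; the names Node, length and items are illustrative only.

```python
class Node:
    """A knude: one element plus a reference to a linked list."""

    def __init__(self, item, rest=None):
        self.item = item
        self.rest = rest  # a linked list, i.e. either None (empty) or a Node


def length(lst):
    """Number of elements, following the recursive definition directly."""
    if lst is None:  # the empty list
        return 0
    return 1 + length(lst.rest)


def items(lst):
    """Collect the elements front to back into an ordinary Python list."""
    result = []
    while lst is not None:
        result.append(lst.item)
        lst = lst.rest
    return result
```

For example, Node("a", Node("b", Node("c"))) is a list of three elements, and None is the empty list.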

knude

eng. node

The Danish word /node/ means something else (a musical note).

kviksort

eng. quicksort

A common alternative is to use the English spelling and an adapted English pronunciation of quicksort; cf. kviksølv, kviksand, but quickstep. The algorithm was named by C. A. R. Hoare and could just as well be called hoaresortering instead.

See splitelement

kø

noun, common gender
-en, -ene
eng. queue, from French queue

Søren stiller et element i køen. Mette fjerner et element fra køen. (“Søren puts an element into the queue. Mette removes an element from the queue.”)

Først ind-først ud-kø (eng. first in, first out queue).

prioritetskø (eng. priority queue)

An obsolete Danish spelling of kø is likewise queue (pronounced the same as kø; cf. the spelling bøf for beuf).

Common at all Danish institutions of learning. The English verbs for insertion and removal, enqueue and dequeue, sit so awkwardly in the mouths of Danish speakers that they are rarely used, even in English-flavoured speech. Swedish has the handy verb att köa for joining a queue, and it is tempting to use att avköa for dequeue, but in Danish it seems clumsy to say »Søren køer elementet« and »Mette afkøer elementet«.

linearitmisk

adjective
-, -e
eng. linearithmic.
[lineɑˈʁidmisg]

Neologism, a contraction of lineær and logaritmisk. Growth rate proportional to $N\log N$.

  • flettesortering kører i linearitmisk tid (“merge sort runs in linearithmic time”)

lineær probering

eng. linear probing

Probere and probering have been used in Danish for centuries within metallurgy, in the same sense as eng. probe.

markovkæde

eng. Markov chain

opslag

noun
eng. query

Alternative: forespørgsel

rettet graf (!)

eng. directed graph, no. rettet graf, sv. riktad graf.

urettet graf

rettet kant, rettet cykel, rettet vej

To my knowledge there is no agreed terminology for directed graphs in Danish other than the terms orienteret graf and uorienteret or ikke-orienteret graf. Unfortunately, oriented graph means something else in English. The terms rettet and urettet avoid the possible misunderstanding, are shorter, roll off the tongue (in Danish: are mundrette, ha!), and are common words in Danish both as adjective and as verb, cf. »ensrettet vej« (“one-way street”) and »rette et våben mod nogen« (“point a weapon at someone”).

Jørgen Bang-Jensen from SDU, who authored the standard reference on directed graphs, says digraf in Danish (and digraph in English), which I find both neat and easy to say. The word digraf already exists in Danish, where in orthography it denotes two letters that together represent a single sound. Note, however, that this terminology offers no suggestion for rettet kant (dikant?), nor does it solve the problem of describing a graph as urettet (udigraf?).

række (!)

eng. array

At DTU, tabel is used, which can collide with uses such as symboltabel (eng. symbol table, which is not an array), with English words such as hash table, and with the use of tabel (or table) for two-dimensional arrays. Række is closer to the English array, above all in the sense of “ordered arrangement”, e.g. »the soldiers were arrayed in the yard«.

Another Danish tradition, used for instance at AU, is to say vektor for array. That makes good mathematical sense; for some years this usage collided with the data type Vector in Java, which was Java’s standard implementation of the doubling-based dynamic array. Use of Vector has since been overshadowed by the collections package, so the word is more or less available again.

separat hægtning

eng. separate chaining or direct chaining or just chaining

shellsortering

eng. Shell sort

The sorting method is named after Donald Shell. Note the spelling with a lower-case initial letter, cf. dieselmotor.

splitelement

eng. partition element or pivot element

Certain sources (but neither Sedgewick’s textbook, Hoare’s original article, nor Knuth’s books) call the splitelement a pivoteringselement (eng. pivot element), from French pivot: “pin around which something turns”. I have not been able to find a good justification for this terminology; I suspect that it refers to pivoting in military formations and was possibly coined by Cormen, Leiserson and Rivest’s textbook. The term is downright misleading, since the other elements do not pivot around the splitelement, which, by the way, does not stand still either. The benefit of introducing mystifying terminology, whether in English or in Danish, for a perfectly ordinary partitioning process has never revealed itself to me. (In connection with the simplex algorithm in linear programming, on the other hand, pivoting makes good sense.)

Alternatives: pivoteringselement, opdelingselement

stak

noun, common gender
-ken, -ke, -kene
eng. stack, from Old Norse stakkr.

Søren trykker (eng. push) et element på stakken. Mette popper staktoppen. (“Søren pushes an element onto the stack. Mette pops the top of the stack.”)

Først ind/sidst ud-kø (eng. first in, last out queue)

staktop (eng. stack top)

The insertion operation is often called push in English, which is better translated as tryk than as skub if you want to preserve the analogy with a spring-loaded stack of plates. There is no colourful Danish word for eng. pop, and »poppe op« and »pop op-bog« already occur in Danish. You can of course abandon the push and pop metaphors altogether and simply »lægge på stakken« (“put on the stack”) and »fjerne fra stakken« (“remove from the stack”), which is both clear and uncontroversial and therefore perhaps the most commendable solution. If you want to stay closer to the original German terminology, you can try kælderlager, indkældre and udkældre, which in themselves are strong words that are easy to say, but (as far as I know) entirely unestablished at Danish institutions of learning.

symboltabel

eng. symbol table

søgetræ

eng. search tree

binært søgetræ
2-3-søgetræ, cf. 1-0-føring
rødt-sort søgetræ, cf. sort-hvidt tv
top-ned 2-3-4-træ
bund-op 2-3-4-træ

topologisk sortering

eng. topological sort

trie

noun, common gender
-en, -er, -erne
eng. trie

The English trie is a neologism denoting a tree-like data structure, named after the middle syllable of retrieval, but nevertheless mostly pronounced like try, not like tree. In Danish one can amuse oneself by looking for a translation that keeps the pun alive, but that is difficult and, given the relative rarity of tries, unnecessary. Trie should presumably behave like the Danish nouns die and gie, i.e. common gender and with a diphthong.

udspændende træ

eng. spanning tree

letteste, udspændende træ (eng. minimum spanning tree)

A common alternative to letteste, udspændende træ is minimalt, udspændende træ. Note, however, that in »minimalt udspændende træ« (in speech, or in writing without the comma) minimalt is an adverb and describes (quite meaninglessly) the degree of spanning rather than the total weight of the tree.

udvalgssortering

eng. selection sort

Alternatives: udvælgelsessortering, udtagelsessortering.

vej

eng. path

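Dijkstras algoritme finder korteste veje (“Dijkstra’s algorithm finds shortest paths”)

Det er NP-hårdt at finde en hamiltonvej i en graf (“It is NP-hard to find a Hamiltonian path in a graph”)

An alternative is sti.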

Alan Turing Art (Talk at Malmö Konsthall)

Henrik Olesen, from Some Illustrations to the life of Alan Turing

On Wednesday, 19 January 2011, I will give a general audience talk about Alan Turing at Malmö Konsthall, a contemporary art museum.

The occasion is a current exhibition of Henrik Olesen, one of Denmark’s most important contemporary artists according to the programme.

My presentation is based on a talk I gave in the Fall 2010 Teknik- och Naturvetarcirkeln for Folkuniversitetet, but for the art museum I tone down the technical aspects and instead build the talk around a presentation of various artworks featuring Turing, including:

There is more, including theatre plays, stamps, and scientific and political awards.

See you there!

How Google Works—Film at 11!

During 2010 I have been involved in the production of a TV programme about algorithms, in particular Google’s PageRank. It went online just a few hours ago:

I just sat through the gruelling task of watching myself.

The background is that IT University of Copenhagen, like all Danish Universities, has agreed to produce popular science content for the Danish public service broadcasting corporation DR. Thus, the show will be aired on the Danish TV channel DR2 some time during 2011. For now, it’s online at ITU’s own digital channel ITView. It’s all in Danish, of course.

The part I like best, and which took the longest time and most passion to think about and produce, is in the middle. The conceit is that the PageRank algorithm could have been invented in the Middle Ages and used by the Catholic church to rank books instead of web pages.

A computing monk doing algebraic graph theory

So we shot some scenes with monks in an old monastery who executed the algorithm by hand, using quill and paper. Actually seeing the finished scenes made my day today. It looks great, thanks to the people involved in actually filming this. I take endless joy from seeing authentic-looking monks drawing directed graphs and pretending to compute eigenvectors.

If you look closely, in the still you can see a drawing that looks suspiciously like figure 2 from “Authoritative sources in a hyperlinked environment” (Jon Kleinberg, J ACM 46(5): 604-632 (1999)). The programme is mainly about the PageRank algorithm of Brin and Page, but I like to imagine that the clever monks are actively researching alternative ranking algorithms.

Computing the Tutte Polynomial on GitHub

We described an algorithm to compute the Tutte polynomial in

  • Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto, Computing the Tutte polynomial in vertex-exponential time. 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, October 25-28, 2008, Philadelphia, PA, USA. IEEE Computer Society 2008, pp. 677–686. [PDF]

In fact, we even have an implementation of this algorithm, which has been available on request from the authors for a while. We have now put this implementation on the public repository GitHub:

You can either download a tar archive from there, or (if you use Git) clone the repository.

The code uses some tricks to speed up the implementation. For example, the coefficients are computed modulo several small prime moduli and then assembled with the Chinese remainder theorem.
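The assembly step can be illustrated in isolation. The sketch below is not taken from the C implementation: the function name crt and the moduli in the example are illustrative assumptions, and pow(m, -1, p) requires Python 3.8 or later. It recovers a nonnegative integer from its residues, provided the integer is smaller than the product of the pairwise coprime moduli.

```python
def crt(residues, moduli):
    """Chinese remainder theorem for pairwise coprime moduli.

    Returns (x, m), where m is the product of the moduli and x is the
    unique integer with 0 <= x < m and x % p == r for every pair (r, p).
    """
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        # Choose t so that x + m * t is congruent to r modulo p;
        # pow(m, -1, p) is the inverse of m modulo p.
        t = ((r - x) * pow(m, -1, p)) % p
        x += m * t
        m *= p
    return x, m


# The classical example: remainders 2, 3, 2 modulo 3, 5, 7.
print(crt([2, 3, 2], [3, 5, 7]))  # prints (23, 105)
```

A coefficient is recovered correctly as soon as the product of the primes exceeds it, which is why several small primes suffice.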

However, the code remains a very faithful implementation of the underlying inclusion–exclusion idea and uses no other algorithmic ideas. This is the fastest worst-case implementation known to us and outperforms other implementations (for example, on dense graphs) around n(G) = 16. However, for specific input graphs you might meet in the wild, for instance graphs with few edges, small cuts, or many symmetries, other algorithms could perform much better. A very good implementation of many of these other ideas by Gary Haggard, David J. Pearce and Gordon Royle, can be found at

The work in our implementation is done in a C program “tutte_bhkk”, which can be run from the command line. The input is given in 0/1 adjacency matrix format; more precisely, the input is “N row1 row2 ... rowN”, where N is the number of vertices and rowJ is the Jth row of the adjacency matrix. For example, a triangle is given as

3 0 1 1 1 0 1 1 1 0
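If the graph is already available as an adjacency matrix, the input line can be produced mechanically. A small sketch in Python; the helper name bhkk_input is hypothetical and not part of the repository.

```python
def bhkk_input(adjacency):
    """Format a 0/1 adjacency matrix as "N row1 row2 ... rowN"."""
    n = len(adjacency)
    entries = [str(n)]
    for row in adjacency:
        if len(row) != n:
            raise ValueError("adjacency matrix must be square")
        # Any truthy entry counts as an edge and is written as 1.
        entries.extend(str(int(bool(value))) for value in row)
    return " ".join(entries)


triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
print(bhkk_input(triangle))  # prints 3 0 1 1 1 0 1 1 1 0
```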

The output is a table of coefficients, where the entry at row i, column j gives the coefficient of the monomial x^i y^j for i, j = 0, 1, 2, … in

T_G(x,y) = \sum_{F\subseteq E} (x-1)^{c_F(G)-c(G)}(y-1)^{c_F(G)+|F|-n(G)}

where G is the input graph, V is the vertex set of G, E is the edge set of G, c(G) is the number of connected components in G, c_F(G) is the number of connected components in the subgraph of G with vertex set V and edge set F, and n(G) is the number of vertices in G.

For example, for the triangle we obtain the output

0 1
1
1

or equivalently, T_G(x, y) = x + x^2 + y.
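For tiny graphs, the relation between the formula and the coefficient table can be checked by expanding the subset sum directly. The sketch below is a naive enumeration of all 2^|E| edge subsets, not the algorithm implemented in “tutte_bhkk”; the function names are made up for illustration, vertices are assumed to be numbered 0, …, n−1, and math.comb requires Python 3.8 or later.

```python
from collections import defaultdict
from itertools import combinations
from math import comb


def components(n, edges):
    """Number of connected components on vertices 0..n-1 (union-find)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    count = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def tutte_brute_force(n, edges):
    """Return {(i, j): coefficient of x^i y^j}, omitting zero coefficients."""
    c_G = components(n, edges)
    coeff = defaultdict(int)
    for size in range(len(edges) + 1):
        for F in combinations(edges, size):
            c_F = components(n, F)
            a = c_F - c_G        # exponent of (x - 1)
            b = c_F + size - n   # exponent of (y - 1)
            # Expand (x - 1)^a (y - 1)^b with the binomial theorem.
            for i in range(a + 1):
                for j in range(b + 1):
                    sign = (-1) ** ((a - i) + (b - j))
                    coeff[i, j] += sign * comb(a, i) * comb(b, j)
    return {mono: c for mono, c in coeff.items() if c != 0}


triangle = [(0, 1), (1, 2), (2, 0)]
print(tutte_brute_force(3, triangle) == {(1, 0): 1, (2, 0): 1, (0, 1): 1})  # prints True
```

The three nonzero entries are exactly the table above: row 0 holds the coefficient of y, and rows 1 and 2 hold the coefficients of x and x^2.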

The python module “tutte.py” is a very simple wrapper that serves two purposes.

First, it connects “tutte_bhkk” to the networkx library, which is a collection of graph algorithms and data structures for python. In particular, “tutte.py” exports the function tutte_poly(G), which returns the Tutte polynomial of a given networkx.Graph.

For example, you can write another python script like this:

from tutte import tutte_poly
from networkx import chvatal_graph

print(tutte_poly(chvatal_graph()))

Second, “tutte.py” can be called from the command line and serves as a convenient interface to “tutte_bhkk”. The input can be given either as an edge list on standard input, or in a compact shorthand format. The output is either a table of coefficients (default) or TeX.

Some examples:

$ python tutte.py --petersen
$ python tutte.py --short="0--1 1--2 2--0" --output=tex
$ python tutte.py
0 1
1 2
2 0
^D