Category Archives: Uncategorized

π out of 4 EATCS Members Don’t Care About Publication Models

Bummer.

The European Association for Theoretical Computer Science has around 1000 members, including many European TCS researchers and everybody who attended conferences like ICALP and ESA in a particular year. Traditionally, EATCS has had a close relationship to scientific publisher Springer, who publishes the proceedings of EATCS’s “flagship conference” ICALP.

A recurring theme at the EATCS general assembly meetings for the past many years has been the publication model for ICALP. Should “we” continue with Springer, or move to an open-access publisher like LIPIcs? After considerable deliberation, a ballot among EATCS members was held in late 2011.

For more background:

  1. EATCS general assembly at ICALP 2010 on my blog.
  2. EATCS general assembly at ICALP 2011 on my blog.
  3. EATCS ballot on Luca Aceto’s blog.

I have been absolutely thrilled that the EATCS council decided to send this question to its members. Well done. I want to see more of this.

Unfortunately, the outcome is not what I would have hoped. The EATCS council required a quorum of 25% of EATCS members for the ballot to be valid. I think this is a reasonable requirement – the ICALP publishing model is an important decision that should be safely moored.

Alas, the quorum was not reached. On 9 November 2011, EATCS president Burkhard Monien informed the EATCS that out of the 1094 members, only 260 had voted. And that’s just slightly less than 1 out of 4. It’s 23.8%.

Informative graphic explaining the magnitude of the value 13 in relation to the value 1094


I am, of course, somewhat miffed that 834 EATCS members don’t seem to think that the way their research is published is sufficiently important to execute three or four mouse clicks to make their voice heard one way or the other. After all, it’s only about publication counts, accessibility, dissemination, ethics, prestige, jobs, cvs, and most everything else by which we are evaluated.

Monien adds,

However, I also want to let you know that the majority of the 260 votes were in favor of LIPIcs. EATCS will observe the further development carefully.

So at least we know the opinion of those EATCS member who care. Next time we just need to remember to inform intelligent adults of the importance to actually vote.

Sorting Networks Activity

Sorting Network in action

A 9-input sorting network in action

Computer Science Unplugged is a wonderful initiative that collects activities that communicate some basic ideas of computer science – without using computers.

IT University of Copenhagen opened its doors to the public last Friday in connection with the city-wide “Culture Night”, so I used the opportunity to implement the Sorting Networks activity.

ITU has a big stair case in its atrium, which had the proper size for a 9-input sorting network. I measured everything up, found a good-looking network in Knuth’s The Art of Computer Programming, and sketched the layout for the friendly folks at Facilities Management.

The network was “drawn” on the stairs using adhesive tape, which took a few hours on the day before the event. It looked absolutely splendid. The activity was manned with enthusiastic undergraduate students for several hours, who herded eager and intrigued guests of all ages through the maze of comparators. A smashing success.

Also, a student volunteered to translate the Wikipedia page on sorting networks into Danish, so the outreach activity had a permanent, digital result as well: Sorteringsnetværk.

Originally, I had also promised myself to produce a large poster or two next to the activity, which should explain what’s actually going on, why sorting is important, and (in particular) show a few more concrete networks of various sizes. This would have given the whole activity a bit more scientific muscle. Alas, I failed to get this done in time. Better next time.

I also failed to show up in time for the kick-off, instructing the student volunteers in how the whole thing was supposed to work. Apparently, they spent a good number of attempts running the network in the reverse direction (from the top of the stairs and down, which arguably is the obvious direction), and tried to find the bug! So now they know that sorting networks don’t necessarily work in both directions. (But insertion sort does!) This may be a fresh research question: find an optimal bidirectional sorting network for 9 inputs.

All in all, a splendid event. Thanks to everybody who helped, or just wanted to get themselves sorted out.

ICALP 2011

I am back from sunny Zürich and ICALP 2011.

Attendee at ICALP 2011 business meeting. The image shows the free ice cream and name tag.

This was a very well-organised event hosted at ETH. Michael Hoffmann had everything under complete control and remained confident, alert, calm and accessible. I’ve tried go take a glimpse behind the curtains, such as looking at the two-page to-do list for technical staff in the lecture halls, and the level of detail and relevance is just off the scale. I hope Michael can summarise his preparation and tips for everybody’s benefit.

Future Publication of the ICALP Proceedings

The EATCS general assembly is by tradition a long and dry event without free beer. To break new ground, the organisers handed out free ice cream. ’tis not beer, but a welcome step in the right direction. I hope the organisers of next year’s ICALP, Warwick, take note.

Burkhard Monien’s EATCS Annual report 2010-2011 [PDF] is available on-line, so there’s no need to summarise it here.

The exciting question is the publication model for ICALP: will the EATCS flagship conference stay with Springer’s LCNS series (or some transformation of it) or not? I blogged about this last year, from the 2010 assembly:

Monien explained the EATCS council wants to get this right, ICALP is a big steamer, should not go in zig-zag course, and stick to its decisions. He was unsure if the next poll will be implemented in the Fall, but promised “beginning of next year, at the latest.” I’m not holding my breath.

Well, the Fall of 2010 has come and gone. The EATCS council had another meeting during ICALP 2011, and Monien summarised the decision. Here’s what I was able to jot down quickly from his presentation:

  • The decision is between Springer and the LIPIcs series and will be decided by the members of EATCS by a ballot “at the end of the summer.”
  • The EATCS Council recommends to go with Springer.
  • The poll will include the following documents: The proposals from both Springer and LIPIcs. The recommendation of the EATCS council. A dissenting opinion from a minority in the Council. A list of pros and cons.
  • The final decision is made by simple majority among EATCS members. A quorum of 25% is necessary.

For a quick calculation, the report currently lists 787 EATCS members. So some 200 members would need to respond. Get out the vote!

As a sign that times are changing, the ICALP proceedings were online and digitally available to all conference attendees at the time of the conference. Splendid.

Name Tags

I obsess about name tags.

ICALP 2011 chose the neck band solution, which I derided in a brilliant piece of investigative journalism Post ALGO 2009.

In the comments of that post, G. Philip suggested to print names on both sides of the tag, so as to increase the probability of legibility close to 1. Remarkably, this was exactly what the organisers had done!

(Again, enterprising conference attendees managed to defeat the good intentions. The lunch tickets happened to fit exactly in the name tag pocket, so that with probability close to 1/2, the tag showed the name badge, and with probability close to 1/2, it showed lunch tickets. Incidentally, lunch tickets also had your name on them, but only printed on one side. Thus, in total, attendees were identifiable with probability around 3/4.)

But the big problem with neck bands is that the name badge ends up below table height during lunch and also otherwise is awkward to inspect.

The layout of the ICALP 2011 badge is fair. A good part of the real estate is actually taken up by the attendee’s name. Also, accents were correct as far as I could tell.

In fact, the organisers chose to display first names in larger size, giving the event a friendly touch. I like this. However, during a thoughtful lunch conversation, somebody (I don’t know who because his name tag was dangling below the table) mentioned that the last name is more useful for striking up conversations. Especially newcomers would prefer to know that they’re talking to Turing instead of Alan.

I’m undecided about the name size issue. Maybe another EATCS poll can decide it.

Exponential Time Algorithms at ESA 2011

The list of accepted papers for ESA 2011 is online. Below is my own quick take on which papers are about exponential time algorithms.

ESA 2011 is colocated with IPEC under the ALGO 2011 umbrella, so there will be plenty of exciting results that week.

  • Dimitrios Thilikos. Fast sub-exponential Algorithms and Compactness in Planar Graphs.
  • Gwenaël Joret, Christophe Paul, Ignasi Sau, Saket Saurabh, Stéphan Thomassé, Hitting and Harvesting Pumpkins, arXiv:1105.2704.
  • Fedor Fomin, Ioan Todinca and Yngve Villanger, Exact algorithm for the maximum induced planar subgraph problem, Todinca’s slides from Worksh. Graph. Decomp. 2010.
  • Marek Cygan, Marcin Pilipczuk, Michal Pilipczuk and Jakub Wojtaszczyk, Scheduling partially ordered jobs faster than 2n.

I’m probably missing some results, but without online abstracts it’s hard to tell. I cannot judge a paper from its title alone. Comments and corrections are welcome, in particular links to online manuscripts.

Filter Bubble in Weekendavisen

I managed to excite Danish weekly Weekendavisen about the societal impacts of internet personalisation, along the lines of Eli Pariser’s recent book The Filter Bubble. This resulted in a nice, meaty two page article starting on the front page of the paper’s science section Idéer.

This is part of my ongoing effort to unleash the meme of the Algorithmic Lens in the public discourse. Two years ago I initiated an article about the algorithmic lens on the sciences in the same paper.

The Social Science Filter Bubble

The creation of such an article is a give-and-take between me and the journalist. I’d collaborated with him previously on the production of a TV programme about how Google’s page rank algorithm works, so we had established a level of common trust. Still, we come from vastly different epistemological backgrounds.

Here’s a detail that made me smile.

I originally suggested a formulation along the following lines:

Dewey and Habermas are fine, but today the public sphere is shaped by algorithmic processes, and we’re just in the beginning of that development. And just like we turn to Newton to understand gravity, we must turn to Turing and his disciples to understand some of the forces that influence the behaviour of individuals and groups in the information society.

Plenty of dropped names, and it was rejected for being too opaque. Consequently, in the final version, the references to philosophers Dewey and Habermas remain, but Newton and Turing have been removed. So, the readership of Weekendavisen is of course expected to know the public sphere and the civil society (as rightly they should). But they are protected from having to google who the other two guys are.

This is an example of the “old” filter bubble, where media through self-selection and editorial policy shape a common frame of reference, an understanding of what a Citizen is supposed to educate herself about.

In the Brave New World to come, an algorithm would have processed my quote. It would perhaps have replaced “Dewey and Habermas” by “Philosophers” or “Some social scientists” or “Dead White Males” on a per-reader basis, based on the epistemological priors, background knowledge, and ideological preferences of the reader.

Further reading

If you came here from Weekendavisen because you googled my name – provided your personalisation settings allowed me to burst your filter bubble –, and want to read more, check out thefilterbubble.com. That’s Eli Pariser’s blog, including links to his book, his TED talk video, and a list of 10 ways to pop your filter bubble.

Alternatively, check out my recent (Danish) TV production How Google works, which explains the PageRank algorithm. Now, the Good Olde Days.

Or you can hire me to give a talk; I have several ready, such as Hvordan Google virker or The Algorithmic Lens. See the full list at Popular Science talks.

Update: I also appeared on Danish public radio about this: P1 Morgen 4 Jun 2011, 7m45s, around 9:30.

Update: And even the the local paper Sydsvenskan: De digitala skygglapparna (4 July 2011) (in Swedish).

Update: And even in Svenska Dagbladet:
Isolerade i nätbubblan (15 August 2011) (in Swedish).

Image source: Wikimedia Commons

Including Git Revision Identifiers in LaTeX

I keep most of my LaTeX source files under the revision control system Git.

Stupid git


Git is opinionated software. It allows you to do many things, but others run counter to its dogmas. In particular, Git does not modify source files.

Other systems do. For example, you can ask Subversion to replace the placeholder $Revision in the committed source file with the latest revision number. This is useful for displaying revision numbers or dates in your LaTeX documents, for example.

From Git’s perspective, modification of source file is pure evil. Not only is this feature absent from Git, but the very request betokens moral corruption.

From Git’s perspective, the client programme (in this case, LaTeX) is responsible for including revision information in the output. Git humbly makes that information available. But it does not modify the source.

Stephan Hennig’s vc bundle consists of a number of scripts that extract revision information from various version control systems, including Git, and make it available as LaTeX macros. The information is taken from a separate, automatically generated input file vc.tex, so that the main LaTeX source file remains untouched by Git.

This shifts the problem to the generation of vc.tex. I’ve played around with various solutions based on Makefiles, and here is my current setup.

The vc.tex file is automatically generated and defines three LaTeX macros. Typically, it looks like this:

%%% This file is generated by Makefile.
%%% Do not edit this file!
%%%
\gdef\GITAbrHash{f61c739}
\gdef\GITAuthorDate{Fri May 13 10:34:51 2011 +0200}
\gdef\GITAuthorName{Thore Husfeldt}

The main LaTeX source includes the vc.tex file in the preamble and can now freely use these macros. For example, the revision information can be included in a footnote on the title page.

\documentclass{article}
...
\input{vc.tex}
...
\begin{document}
\maketitle
\let\thefootnote\relax
\footnotetext{Base revision~\GITAbrHash, \GITAuthorDate, \GITAuthorName.}
...

The responsibility of producing an up-to-date vc.tex rests on the Makefile:

latexfile = main

all: $(latexfile).pdf

$(latexfile).pdf : $(latexfile).tex vc.tex
	while (pdflatex $(latexfile) ; \
	grep -q "Rerun to get cross" $(latexfile).log ) do true ; \
	done

vc.tex: .git/logs/HEAD
	echo "%%% This file is generated by Makefile." > vc.tex
	echo "%%% Do not edit this file!\n%%%" >> vc.tex
	git log -1 --format="format:\
		\\gdef\\GITAbrHash{%h}\
		\\gdef\\GITAuthorDate{%ad}\
		\\gdef\\GITAuthorName{%an}" >> vc.tex

The interesting rule is the last one. It runs git log to produce vc.code. The hardest thing for me was to get the dependencies right. I think I’ve got it now. The input file vc.tex needs to be regenerated whenever it predates the last commit. As far as I understand the Git internals, the modification time of .git/logs/HEAD should give me a reliable timestamp for when the last commit happened, so I made my rule for vc.tex depend on that.

Of course, it’s a cheap operation, so we could generate vc.tex anew every time we run pdflatex. But then every call to make would recompile the source (because vc.tex has changed). To avoid that, we could leave vc.tex out of the dependencies for $(latexfile).pdf. But then a commit (which modifies the revision number but not any source files) would not lead to an automatic recompile. The LaTeX document would only display new revision information whenever it is edited after that revision.

If there’s a cleaner way of checking for “is vc.tex outdated compared to the Git repository”, please tell me.

TODO: Make the LaTeX document reflect that it corresponds to uncommitted edits after the latest revision. This should be doable by comparing the modification times of the LaTeX source files and .git/logs/HEAD. A cruder way is to let git status tell the Makefile if working directory is “clean”.

UPDATE (21 Feb 2012): Since this post was written, various other approaches have appeared. (Thanks to the commenters for pointing them out.) The idea of using a post-commit hook instead of a Makefile is now on CTAN: gitinfo package.

Generation Z and the Alphabet

I teach generation Z, people who are now in their early twenties.

Generation Z follows Generation Y, which follows Generation X.

What will we call the next generation? We’ve run out of letters! “Generation [”?

Well, it doesn’t matter. Read on…

I was reviewing old exam questions in my introductory algorithms and data structures class. Here’s the question:

This looks innocent enough. However, several students were openly annoyed by this question.

What’s the problem? It has nothing to do with priority queues or heap order or anything else algorithmic. If you aren’t generation Z, you’ll never guess.

They asked me to please use numbers instead of letters. Why? I turns out that comparison between letters is no longer constant time! As one student put it, with a straight face, it’s really hard to determine if, say, Q is higher or lower than some other letter. Helpful students sagely suggested to their fellow students to just start by making a list of the alphabet on a separate piece of paper for this type of exercise. This was met with earnest nodding.

It was quite clear that I had made this question needlessly difficult by making it about letters.

This is a sterling example of a skill that is utterly natural to my generation, who has looked things up alphabetically countless times. I have no harder time comparing M and S than I have comparing 5 and 13. But, of course, Generation Z has never looked anything up alphabetically. It‘s an utterly useless skill honed in the olden days of outdated information technology, like knowing how a slide rule works or typing on a T9 mobile phone keypad. Generation Z finds this as hard (and as useless) as I find it comparing Ψ and Φ. I can do it, because I memorised the Greek alphabet with I was eight or so, and can still rattle it off in the right order. But it takes linear time in the size of the alphabet.

So, from now on, I guess I use plain old numbers in this type of exam questions.

Also, the generations following Generation Z can be safely called Generation W, V, U, etc. Nobody will notice.

Also, I feel old.

Exponential Time Algorithms at ICALP 2011

The accepted papers for ICALP 2011 have been announced.

Based on my quick perusal of the track A papers with abstracts, here’s a list of papers related to exponential time computation, together with references to online version — I’m probably missing some.

  • Isolde Adler, Stavros Kolliopoulos, Philipp Klaus Krause, Daniel Lokshtanov, Saket Saurabh and Dimitrios Thilikos. Tight Bounds for Linkages in Planar Graphs. PDF at Adler’s web page.
  • Sanjeev Arora and Rong Ge. New Algorithms for Learning in Presence of Errors. PDF at Ge’s web page
  • Hans L. Bodlaender, Bart M. P. Jansen and Stefan Kratsch. Preprocessing for Treewidth: A Combinatorial Analysis through Kernelization. arxiv:1104.4217.
  • Andrei Bulatov and Dániel Marx. Constraint satisfaction parameterized by solution size.
  • Amin Coja-Oghlan and Angelica Pachon-Pinzon. The decimation process in random k-SAT. arxiv:1102.3145
  • Marek Cygan, Marcin Pilipczuk, Michal Pilipczuk and Jakub Wojtaszczyk. Subset Feedback Vertex Set is Fixed Parameter Tractable. arxiv:1004.2972.
  • Daniel Lokshtanov and Dániel Marx. Clustering with Local Restrictions. PDF at Lokshtanov’s web page.
  • Danny Hermelin, Matthias Mnich, Erik Jan Van Leeuwen and Gerhard J. Woeginger. Domination when the Stars Are Out. arxiv:1012.0012

Unfortunately, nothing about ETH. That would have been fun.

Dansk algoritmeterminologi

This post in Danish.

Coddled egg on hash, letkogt æg på biksemad. Kilde: Wikimedia Commons.

Formålet med denne oversigt er at sammenfatte (og til en vis grad foreslå) dansk fagterminologi for algoritmer og datastrukturer. Movitationen er dels at lette oversættelsen af fagtermer både mellem og inden for begge sprog, dels at stille dansk terminologi til rådighed for dem, der måtte have behov for at udtrykke sig skriftligt eller i formidlingsøjemed uden for snævre fagkrede.

Blandt fagfæller vil man i de fleste mundtlige situationer kunne gøre sig bedst forståelig ved at anvende de engelske termer med tillempet dansk-engelsk udtale og syntaks (»til sidst merger du /arraysene/«). Denne tilgang kan også anbefales, hvis man primært er interesseret i at signalere tilhørighed til fagfællesskabet.

Jeg har stor sympati for den angelsaksiske tradition for at finde fagtermer som er både kødfulde og ofte lidt latterlige (mouse, bubble sort, stack), dels fordi det skaber nyttige analogier i en ellers fremmedgørende digital virkelighed, dels fordi uhøjtideligheden nedbryder den benovelse, men ellers kan føle i mødet med fagterminologi. Desværre går begge disse effekter tabt, når mindre gængse engelske ord (merge, browse, hash, array) bruges uoversatte.

Der findes mig bekendt intet trykt dansksproget material i algoritmer og datastrukturer ud over Schmidt og Scharzbachs noter »Programmeringsteori og datastrukturer« og »Grafalgoritmer og algoritmisk problemløsningsteknik« fra Aarhus Universitet fra 1990erne og Polyteknisk forlags »Find formlen – algoritmer og datastrukturer« fra 2007. Deres terminologi er medtaget her. Jeg er taknemmelig for
at blive gjort opmærksom på dokumenterede forekomster af ord jeg måtte have overset. Nogle af konstruktionerne forneden er dog helt mine egne forslag; jeg har markeret dem med et advarende udråbstegn.

afslutning

substantiv
eng. closure

transitiv afslutning

»et endeligt dimensionalt kompakt Hausdorffrum, der er afsluttet i primidealrummet mht. hylster-kerne topologien«

spredefunktion (!)

eng. hash function

spredeværdi (eng. hash value)

spredetabel (eng. hash table)

universel spredning (eng. universal hashing)

Kommentarer:

Den udbredte danske terminologi er at anvende det engelske ord hash i tillempet dansk udtale ([hasj]), som i hashfunktion, hashværdi, hashtabel, at hashe.

På engelsk betegner substantivet hash en blandet ret typisk baseret på genopvarmede rester fra i går, og to hash betyder at hakke. I den rige teoridannelse bag spredefunktioner findes der sågar et »leftover hash lemma«. Desværre forbinder vi på dansk ikke hash med biksemad, men med det arabiske ord for hamp, hashish, et udbredt rusmiddel.

Har man lyst til at bruge et ord på dansk, der holder den engelske metafor i live, kan man forsøge sig med hakkefunktion, som også morfologisk ligger tæt op ad originalet. (Min ordbog informerer mig om at hash kommer til engelsk fra de franske hacher, som også er rod til hatchet, mens det danske hakke har plattyske rødder.) Mere spændende er rodefunktion. Jeg har i en periode
sagt biksefunktion (bikseværdi, universalbiks, gøgebiks) til mig selv, som indeholder både det rodede og det kulinariske perspektiv. Jeg er meget splittet i denne sag og kommer formentlig på bedre tanker engang.

bredde først-søgning

substantiv
eng. breadth first search

Bemærk tegnsætning, jf. først til mølle-princip.

del og hersk

substantiv
eng. divide and conquer

»den hurtige fouriertransformation er en del og hersk-algoritme«

Alternativer: del og kombinér (brugt ved AU).

Bemærk tegnsætningen, jf. gør det selv-mand.

dybde først-søgning

substantiv
eng. depth first search

dynamisk programmering

substantiv
eng. dynamic programming

Kommentarer: Ordet programmering er i denne sammenhæng lige så misvisende på dansk som det er på engelsk. Betydningen er »at lægge en plan«, som på dansk er kendt fra 1959 ifølge DDO, ikke »at skrive instruktionerer til en maskine«. Metoden blev formaliseret og
navngivet af Bellman i 1950erne, som beskriver baggrunden for terminologien i sin selvbiografi Eye of the Hurricane, 1984.

flette

verbum
-r, -de, -t
eng. merge

flettesortering

Vi kan flette to sorterede lister in lineær tid.

komplet

eng. complete

Et problem er NP-komplet, hvis det tilhører NP og er NP-hårdt.

Alternativet er fuldstændig, som bruges ved AU. Komplet ligger morfologisk nærmere det gængse engelske begreb.

graf

I matematikken henviser betegnelsen graf både til tegningen af en matematisk funktion, og til en kombinatorisk struktur. Dette uheldige begrebssammenfald optræder på både engelsk (graph) og tysk (Graph), hvor terminologien blev skabt i 1930erne. Grafens elementer hedder hjørner (eng. vertices, ty. Ecken), knuder (eng. nodes, ty. Knoten), eller bare punkter (eng. points).
Elementerne er forbundet med kanter (eng. edges, ty. Kanten), eller bare linjer. Når grafen er rettet (eng. directed), hedder kanterne ofte buer (eng. arcs, ty. Bögen) eller rettede kanter (eng. directed edges). En internt knudedisjunkt vej med samme hovede og hale er en
kreds (eng. circuit) eller cykel. Hvis grafen udgør et træ, kaldes elementerne ofte for knuder på dansk.

Se: rettet graf, knude

grådig

adjektiv
eng. greedy

  • grådig algoritme
  • grådig knudefarvning

indsættelsessortering

eng. insertion sort

hob

substantiv, fælleskøn
-en, -ene
eng. heap.

Mads sætter et element i hoben. Lise fjerner et element fra hoben. Det mindst element ligger øverst i hoben.

binærhob eller binær hob (eng. binary heap).

rækkebaseret hob (eng. array based heap).

hobsortering (eng. heap sort)

Alternativer til hob er bunke (som anvendes ved AU) eller dynge, som begge ligger bedre i munden. Men hob ligger nærmere den etablerede engelske betegnelse heap og optræder ved både KU, SDU og DTU.

hægtet liste

eng. linked list

Bruges ved KU, RUC og DTU. Ved AU bruges betegnelsen kædet liste. Man kan også støde på lænket liste, som kan være attraktiv fordi den ligger morfologisk (men ikke inholdsmæssigt) nærmere den engelske terminologi.

dobbelthægtet liste

En hægtet liste er en rekursiv datastruktur som er enten tom eller en reference til en knude bestående af et element og en reference til en hægtet liste.

knude

eng. node

Det danske ord /node/ er noget andet.

kviksort

eng. quicksort

Et udbredte alternativ er at bruge engelsk stavning og en tillempet engelsk udtale af quicksort, jf. kviksølv, kviksand, men quickstep. Algoritmen blev navngivet af C. A. R. Hoare og kunne sagtens hedde hoaresortering i stedet.

Se splitelement

substantiv, fælleskøn
-en, -ene
eng. queue, fra fransk queue

Søren stiller et element i køen. Metter fjerner et element fra
køen.

Først ind-først ud-kø.

prioritetskø (eng. priority queue)

Forældet dansk stavemåde for er ligeledes queue (med samme udtale som , jf. stavemåden bøf for beuf).

Almindelig ved alle danske læreanstalter. De engelske verber for indsættelse og fjernelse, enqueue og dequeue, er så lidt mundrette for danske sprogbrugere, at de sjældent anvendes, selv i engelsk-præget talesprog. Svensk har det praktiske verbum att köa for at stille sig i kø, og det ligger nært at bruge att avköa for dequeue, men på dansk virker det kluntet at bruge »Søren køer elementet« og »Mette afkøer elementet«.

linearitmisk

adjektiv
-, -e
eng. linearithmic.
[lineɑˈʁidmisg]

Neologisme, sammentrækning af lineær og logaritmisk. Funktionstilvækst proportional med $Nlog N$.

  • flettesortering kører i linearitmisk tid

lineær probering

eng. linear probing

Probere og probering er blevet brugt på dansk i århundreder inden for metallurgien i samme betynding som eng. probe.

markovkæde

eng. Markov chain

opslag

substantiv
eng. query

Alternativ: forespørgsel

rettet graf (!)

eng. directed graph, no. rettet graf, sv. riktad graf.

urettet graf

rettet kant, rettet cykel, rettet vej

Der er ikke mig bekendt nogen vedtaget terminologi for rettede grafer på dansk andet end betegnelsen orienteret graf og uorienteret eller ikke-orienteret graf. Desværre betyder oriented graph på engelsk er noget andet. Betegnelserne rettet og urettet undgår den mulige misforståelse, er kortere, mundrette (ha!), og gængse ord i dansk både som adjektiv og verbum, jf. »ensrettet vej« og »rette et våben mod nogen«.

Jørgen Bang-Jensen fra SDU, som har forfattet standardreferencen om rettede grafer, siger digraf på dansk (og digraph på engelsk), hvilket jeg finder både fikst og mundret. Ordet digraf findes allerede på dansk og er betegner i retskrivningen to bogstaver, der sammen repræsenterer én lyd. Bemærk at denne terminologi dog ikke giver noget forslag til rettet kant (dikant?) eller problemet med at betegne en graf som urettet (udigraf?).

række (!)

eng. array

På DTU bruges tabel, hvilket kan kollidere med anvendelser som symboltabel (eng. symbol table, som ikke er en række), engelske ord som hash table og brugen af tabel (eller table) for todimensionelle rækker. Række ligger tættere op ad den engelske array, frem for alt i betydningen »ordnet opstilling«, fx »the soldiers were arrayed in the yard«.

En anden dansk tradition, brugt fx ved AU, er at bruge vektor for array. Det giver matematisk god mening; i nogle år kolliderede denne brug med datatypen Vector i Java, som var Javas standardimplementation af den fordoblingsbaserede dynamiske række. Brugen af Vector er blevet overskygget af collectionspakken, så ordet er sådan set ledigt igen.

separat hægtning

eng. separate chaining eller direct chaining eller bare chaining

shellsortering

eng. Shell sort

Sorteringsmetoden er opkaldt efter Donald Shell. Bemærk stavemåden med lille begyndelsesbogstav, jf. dieselmotor.

splitelement

eng. partition element eller pivot element

Visse kilder (men hverken Sedgewick lærebog, Hoares originalartikel eller Knuths bøger) kalder splitelementet for pivoteringselement (eng. pivot element), fra fransk pivot: »tap hvorom noget drejer sig«. Jeg har ikke været i stand til at finde en god begrundelse for denne terminologi – jeg har en mistanke om, at den henviser til
pivotering i militærformationer og er muligvis skabt af Cormen, Leiserson og Rivests lærebog. Betegnelsen er direkte misvisende, idet de andre elementer ikke pivoterer omkring splitelementet, som forresten heller ikke står stille. Nytten af at introducere mystificerende terminologi, hvad enten det er på engelsk eller dansk, for en ganske gemen opdelingsproces har aldrig åbenbaret sig for mig. (I forbindelse med simpleksalgoritmen i lineær programmering giver pivotering derimod god mening.)

Alternativer: pivoteringselement, opdelingselement

stak

substantiv, fælleskøn
-ken, -ke, -kene
eng. stack, fra oldnordisk stakkr.

Søren trykker (eng. push) et element på stakken. Mette popper staktoppen.

Først ind/sidst ud-kø

*staktop (eng. stack top)

Indsættelsesoperationen hedder ofte push på engelsk, som oversættes med tryk bedre end med skub, hvis man vil bevare analogien til en fjederstøttet tallerkenstak. Der er ikke noget farverigt danskt ord for eng. pop, og »poppe op« og »pop op-bog« forekommer allerede på dansk. Man kan selvfølgeligt helt forlade push og pop-metaforerne og blot »lægge på stakken« og »fjerne fra stakken«, hvilket er både klart og ukontroversielt og derfor måske den mest anbefalelsesværdige løsning. Vil man lægge sig nærmere op ad den oprindelige tyske terminologi, kan man forsøge sig med kælderlager, indkældre og udkældre, som i for sig er kraftige og mundrette ord, men (så vidt jeg ved) helt uetablerede ved danske læreanstalter.

symboltabel

eng. symbol table

søgetræ

eng. search tree

binært søgetræ
2-3-søgetræ, jf. 1-0-føring
rødt-sort søgetræ, jf. sort-hvidt tv
top-ned 2-3-4-træ
bund-op 2-3-4-træ

topologisk sortering

eng. topological sort

trie

substantiv, fælleskøn
-en, -er, -erne
eng. trie

Det engelske trie er en neologisme som betegner en træ-lignende datastruktur navngivet efter midterstavelsen i retrieval, men udtales alligevel mest som try, ikke som tree. Man kan på dansk more sig med at finde en overstættelse som holder ordspillet i live, men det er vanskeligt og – givet triers relative sjældenhed – unødvendigt. Trie bør vel opføre sig som de danske substantiver die og gie, dvs. fælleskøn og med diftong.

udspændende træ

eng. spanning tree

letteste, udspændende træ

Et udbredt alternativ for letteste, udspændende træ er minimalt, udspændende træ. Bemærk dog, at i »minimalt udspændende træ« (i talesprog eller skriftligt uden komma) er minimalt et adverbium, og bekriver (ganske meningsløst) graden af udspændthed, i stedet for træets totale vægt.

udvalgssortering

eng. selection sort

Alternativer: udvælgesessortering, udtagelsessortering.

vej

eng. path

Dijkstras algorithme finder korteste veje

Det er NP-hårdt at finde en hamiltonvej i en graf

Et alternativ er sti.

How Google Works—Film at 11!

During 2010 I have been involved in the production of a TV programme about algorithms, in particular Google’s PageRank. It went online just a few hours ago:

I just sat through the gruelling task of watching myself.

The background is that IT University of Copenhagen, like all Danish Universities, has agreed to produce popular science content for the Danish public service broadcasting corporation DR. Thus, the show will be aired on the Danish TV channel DR2 some time during 2011. For now, it’s online at ITU’s own digital channel ITView. It’s all in Danish, of course.

The part I like best, and which took the longest time and most passion to think about and produce, is in the middle. The conceit is that the PageRank algorithm could have been invented in the Middle Ages and used by the Catholic church to rank books instead of web pages.

A computing monk doing algebraic graph theory

So we shot some scenes with monks in an old monastery who executed the algorithm by hand, using quill and paper. Actually seeing the finished scenes made my day today. It looks great, thanks to the people involved in actually filming this. I take endless joy from seeing authentic-looking monks drawing directed graphs and pretending to compute eigenvectors.

If you look closely, in the still you can see a drawing that looks suspiciously like figure 2 from “Authoritative sources in a hyperlinked environment” (Jon Kleinberg, J ACM 46(5): 604-632 (1999)). The programme is mainly about the PageRank algorithm of Brin and Page, but I like to imagine that the clever monks have active research in alternative ranking algorithms.