October | 2016 | Thore Husfeldt

The Orthogonal Vectors problem is given two sets of d-dimensional 01-vectors to decide if there is an orthogonal pair: two vectors, one from each set, with zero dot product. Obviously, the problem can be decided in time O(n²d) by trying all pairs.

Here’s the a complete reduction from the Orthogonal Vectors problem to evaluation of Perl regular expressions.

chomp($_ = <STDIN>);
s/0/\\d/g;
tr/1 /0|/;
print !!(<STDIN> =~ /$_/)

This is fully functional Perl code. Save the script as ov.pl and run it. The process expects two lines on standard input, each describing a set of vectors (as 01-sequences, separated by space). The script returns “1” if and only if the two sets contain an orthogonal pair of vectors.

>perl ov.pl
1000 0100 0010 1001
1110 1101 1111 1100^D
1

This construction shows that matching an n-character string against an n-character regular expression requires quadratic time under the Strong Exponential Time Hypothesis. So, the obvious quadratic-time dynamic programming algorithm for regex matching is in some sense optimal.

As to the Perl code, the first line eats the trailing newline of standard input. The next two lines contain the main part of the reduction, replacing “0” by “\d” (any digit), “1” by “0”, and “ ” by “|” to build a regex. The last line of the code attempts to match the second line from standard input to the regex. Oh, the double exclamation points? They are a perlism to enforce Boolean context, so that the output is “1” (true) exactly if the regex matches. The reduction is so obviously linear that it hurts.

Tell me this isn’t cute.

The core of the construction underlies the celebrated recent results of Backus and Indyk, Bringmann and Künnemann, and Abboud et al. that show similar bounds for various string edit distances. This particular example is just very concrete version of the NFA-acceptance reduction that appears on Karl Bringmann and Sebastian Krinninger’s slides for an MPI course on Complexity Theory of Polynomial-Time Problems (Summer 2016), there attributed to Russell Impagliazzo. I find that reduction very teachable. The step from NFA acceptance to a concrete Perl script is minimal, but makes it even more concrete (executable code!).

Amir Abboud, Arturs Backurs, Virginia Vassilevska Williams: Tight Hardness Results for LCS and Other Sequence Similarity Measures. IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pp. 59-78.
Arturs Backurs, Piotr Indyk: Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false). Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015. pp. 51-58.
Karl Bringmann, Marvin Künnemann: Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping. IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015. pp. 79-97.

Explained using only the ten hundred words people use most often.

What is a good way of figuring out a problem?

First: it must be so simple and clear that we can explain it to a computer. (Or a Human. That’s not so different.)

Second: it must be fast enough so that it is done before lunch. (Or before the sun burns out. that’s not so different.)

Since people started using computers, lots of people get paid to think about ways to figure out problems. that’s what i’m doing. Today we understand that some problems are hard, and others are pretty easy.

I want to help us becoming better at talking about what makes a problem hard.

So far, most of the people who think about this stuff look only at the size of the problem.

But some of us feel that you can think more clearly about problems when you not only look at their size, but at some other things as well. Sometimes, this way of thinking even helps us to come up with better ways to figure out problems!

Two other people and myself have come up with a way to finish the game for all five-letter words — as long as you are forced to visit only around ten words. Or maybe twenty. So we found out that the number of words all-in-all was not so important. instead, what makes the problem hard is how many you are forced to visit. We can even find the shortest such path!

Read more here: Another man, myself, and a woman, ‘Shortest path through stuff you didn’t choose’, place where people who care about this kind of question meet each year (the name sounds like a soft drink), year twenty-hundred and ten and two.

Appendix

The above is a summary of a poster I made for a research sharing day at ITU.

hard-problems-poster (draft) in pdf (designed for printing in A0)

I loathe poster sessions, partly because of the unclear concept of who the audience is, and mainly because the social interaction exhausts me.

But to motivate me I gave myself the challenge to design a poster in the style of Randall Munroe’s Up-Goer Five: Use only the 1000 most used words of English, be concrete, whimsical, accessible, and informative. Munroe has since made a whole book in this style: Thing Explainer, which is quite lovely.

I did the whole thing in Omnigraffle Professional 5, which is hard pressed to handle a document with almost 6000 elements. Drawings are free-hand in Keynote (basically because I don’t have a good way of digitising my free-hand drawings — I am a competent artist, but Munroe’s style is sufficiently simple.) The font is xkcd-font from the iPython project, licensed under CCANC 3.0.

As a starting point I used a 2012 paper of Andreas, Nina, and myself that presents an efficient FPT-algorithm for shortest Steiner cycle. That’s as accessible a result as I have. The big drawing at the end shows a path from TEARS to SMILE passing through seven intervening emotions (WORRY, SPITE, etc.) This is a highly nontrivial result and a cute story that I’m insanely happy with, I haven’t found a good way of expressing this fact in only 1000 words.

Andreas Björklund, Thore Husfeldt, and Nina Taslaman, Shortest cycle through specified elements, Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms (SODA 2012), January 17–19, 2012, Kyoto, Japan, SIAM, pages 1747-1753. [PDF].

Thore Husfeldt

Monthly Archives: October 2016

Orthogonal Vectors to Perl Regex

Hard Problems and How To Figure Them Out

Appendix