Monthly Archives: December 2021

Digitisation in Higher Education

Summary of my keynote talk on 4 November 2021 at the eponymous Norwegian national conference Nasjonal konferanse: Digitalisering i høyere utdanning 2021, hosted at Bergen University and organised with the Norwegian Directorate for Higher Education and Skills. The summary is somewhat disjointed and includes some of the slides.

1. Digitisation in Higher Education (summary)

The first part of the talk is about the potential use of digital technology to improve learning outcomes in specific contexts. When I gave the talk I showed a bunch of ephemeral examples that I had used myself, or which seemed otherwise cool, instructive, useful, or exciting in late 2021. These examples won’t stand the test of time, so I won’t repeat them here. A well-known and long-established example is the language learning application Duolingo.

Benefits include immediate feedback, scalability, transparency of assessment, accessibility, spaced repetition, and individual pacing or tracking of material. Common to the contexts where digital technology seems to be largely beneficial are

  1. learning outcomes can be validly tracked and assessed,
  2. it makes sense to design a clear progression or scaffolding through the curriculum,
  3. relevant feedback can be provided at scale and automated.

More problematic are assistive technologies, often using machine learning or other forms of decision-making processes collectively known as artificial intelligence, that ensure surface conformity of written material. These have clear benefits for authoring but also equally clear negative consequences for assessment. Modern word processors allow the production of professional-looking prose and automate surface markers of quality (layout, grammar, style, phrasing, vocabulary), which traditionally were used as proxies for content in assessment of learning outcomes; online natural language processing services like Google Translate invalidate traditional forms of language proficiency assessment based on written material. In 2021, a beta version of the programming environment GitHub Copilot automatically solves programming exercises from the task description. The same tools that perform plagiarism checks can also plagiarise very well.

So there is stuff that becomes somewhat easier by digitisation (transmission of knowledge), and where we have many ideas for what to do, and stuff that becomes much harder (valid assessment), and where we have very little idea about what to do.

A third theme is stuff that does become easier, but maybe oughtn’t. This is found under the umbrella of educational data mining: predicting student performance, retention, success, satisfaction, achievement, or dropout rate from harvested data such as age, gender, demographics, religion, socio-economic status, parents’ occupation, interaction with application platform, login frequency, discussion board entries, spelling mistakes, dental records, social media posts, photos, and everything else. The promise of this field (and in particular of machine-learned predictions based on polygenic scores) is enormous, and there is an emerging literature of success stories in educational research. I strongly dislike this trend, but that’s just, like, my opinion.

This summarises the first part, with the foci

  1. what works?
  2. what fails?
  3. what should we not primarily evaluate along the works/fails axis?

There are of course a ton of other problems, which I expanded on in the full talk:

  1. digitisation favours certain disciplines and traditions of instruction
  2. it systematically undermines the authority of educators
  3. the intellectual property rights issues are ill understood and seem to have very little attention from educational administrators
  4. routine student surveillance

2. What the Hard Part Is

I’m pretty confident about part 1 (digitisation for education), and return to a confident stance in part 3 (education for digitisation). In contrast, part 2 is characterised by epistemic humility. It’s my best current shot at what makes the whole topic hard to think about.

Over the years I have learned a lot about “How to think about digitisation” from the DemTech project at IT University of Copenhagen, which looks at digitisation of democratic processes, in particular election technology. “How to digitise elections” is a priori an easier problem than “How to digitise higher education”, because the former is much more well-defined. Still, many of the same insights and phenomena apply.

These problems include a (largely unhelpful) eagerness of various stakeholders to digitise for the sake of it. An acerbic term for this is “magical IT gravy”:

[…] politicians who still think that you can just drench a problem domain in “magical IT gravy” and then things become perfect.

Poul Henning Kamp, 2021

Briefly put, if you digitise domain X then you will digitise your description of X. (A fancy word for this is reification.) Since (for various reasons, including lack of understanding, deliberate misrepresentation, social biases, status, and political loyalty) the description of X is seldom very good, digitisation reform typically deforms X. Since digitisation is seen as adaptive within organisations, it bypasses the conversations that such transformation would normally merit.

Example: Electronic Elections

My ITU colleague Carsten Schürmann showed me this insightful taxonomy:

Hierarchy of values related to digitisation. Adapted from Carsten Schürmann.

Let’s use voting as a concrete example to talk us through this. At the bottom is Functionality: voting is to increase a counter by 1 every time somebody casts a vote. This is easy to implement, and easy to digitise; any programmer can write an electronic voting system. Above that is Security: making sure the voting process is free (i.e., secret, private, non-coerced, universally accessible, etc.). This involves much harder computer science, for instance, cryptography. You need to have a pretty solid computer science background to get this right. Above that tier is Validity: how do we ensure that the numbers spat out by the election computer have anything to do with what people actually voted? This is even harder, in particular without allowing coercion by violating secrecy. (Google “vote selling” if you’ve never thought about this.) Finally, after thinking long and hard about all these issues (which by now include social choice theory, logic, several branches of mathematics) you arrive at what elections are actually about: Trust. The overarching goal of voting is to convince the loser that they lost. And this requires that the process is transparent.

Applications at the Functionality stage are easy (if you’re a competent programmer), they are mono-disciplinary, and they happen by themselves because geeks just want to see cool stuff work. The higher up you go in the hierarchy, the more disciplines need to be included, so the requirement for inter-disciplinary work increases. An organisational super-structure (such as a university) is needed to incentivise and coördinate the things at the top.
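To make the Functionality tier concrete, here is a minimal sketch of "increase a counter by 1 every time somebody casts a vote". The function name tally and the example ballots are made up for illustration; this is not taken from any real election system.

```python
from collections import Counter


def tally(ballots):
    """Count one vote per ballot and return a Counter: candidate -> votes."""
    counts = Counter()
    for choice in ballots:
        counts[choice] += 1  # the whole Functionality tier: add 1 per vote
    return counts


# Hypothetical ballots, purely for illustration.
ballots = ["Alice", "Bob", "Alice", "Carol", "Alice"]
result = tally(ballots)
winner, votes = result.most_common(1)[0]
print(winner, votes)
```

Note what the sketch leaves out: nothing here keeps ballots secret, lets anyone check the count, or gives the loser a reason to believe the result. Those are the Security, Validity, and Trust tiers, and none of them is a few lines of code.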

What is Higher Education About?

To translate the above model from voting to Higher Education, the Functionality part is transmission of knowledge: all the cool and exciting ideas I mentioned in part 1. Security corresponds to preserving privacy of learners throughout their educational history; this already conflicts with both educational tracking (e.g., for individualised progression) and reporting (e.g., for credentials). The first steps, probably in the wrong direction, can be seen in how universities in Europe react to the European GDPR legislation, which already takes up a significant amount of organisational resources. This layer is correctly seen as annoying, depressing, time-consuming and boring, but still outshines in enthusiasm the next layer: Validity. The very features of information technology that make it incredibly useful for transmission of knowledge (speed, universality, faithful copying, depersonalised expression, accessibility, bandwidth) make it incredibly difficult to perform valid assessments, in particular in high-stakes, large-scale exams of universal skills. This is because digital technology makes it very easy to misrepresent authorship and originality. (Think of playing the violin, drawing, programming, parallel parking, mathematics, translating between languages: which digital artefacts would constitute proof of proficiency?)

Yet this is all still manageable, and some universities or educational systems may even choose to get this right. For instance, I think I know what to do.

But the really Hard Problem of Digitisation is the one at the top of the hierarchy: What is higher education even about?

I don’t know the answer to that.

Here are some possible answers. Maybe they’re all correct:

What is higher education about?

The narrative I tell myself is very much the first one, the capital-E Enlightenment story of knowledge transmission, critical thinking, and a marketplace of ideas. We can call this the Faculty view, or Researcher’s, or Teacher’s. The Economist’s view in the 2nd row is, thankfully, largely consistent with that; it’s no mystery that the labour market in knowledge economies pays a premium for knowledgeable graduates—the two incentives are somewhat aligned. (It does, however, make a big difference in how important transmission of knowledge and validity of assessment are in relation to each other. Again, “why not both?” and there is no conflict. Critical thinking versus conformity is a much more difficult tension to resolve.)

Still, the question whether Higher Education creates human capital (such as knowledge) or selects for human capital is (and remains) very hard to talk about among educators. Very few teachers, including myself, are comfortable with the idea that education favours the ability to happily and unquestioningly perform meaningless, boring, and difficult tasks exactly like the boss/teacher wanted (but was unable/unwilling to specify honestly or clearly) in dysfunctional groups, and that it selects for these traits by attrition. Mind you, it may very well be true, but let’s not digitise that. (I strongly recommend Bryan Caplan’s excellent, thought-provoking, and disturbing book The Case Against Education to everybody in education.)

The two other perspectives on what Higher Education is about (Experiential and Social, both Student perspectives) are much more difficult to align with the others. Students-as-consumers-of-an-experience (“University is Live Role Playing”) and students-as-social-primates (“University is a Meet Market”) have very different utility functions than both Faculty and Economists.

The point is: if we digitise the phenomenon of Higher Education, we need to understand what it’s about. (Or maybe what we want it to be about.) Currently, Higher Education seems to do a passable job of honouring all four explanations. I guess my position is largely consistent with Chesterton’s Fence:

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”

G.K. Chesterton, The Thing (1929)
Typical slate fencing on the Talyllyn Railway, Wales, Wikimedia Commons, Public Domain.

Even if we (as educators and educational managers) were to understand what education is about, and even if we were to agree on good ideas for how to proceed, none of us lives in a world where good decisions are just made. We have to communicate with stakeholders who are themselves beholden to other stakeholders, and need to understand that these individuals live in a complicated social or organisational space where they are evaluated on their individual, institutional, or ideological loyalty. Not on the quality of their decisions.

We must expect most organisations, including most universities, to make decisions about digitisation in higher education that are based on much more important factors (to the stakeholders) than “whether it works”. Given the choice between trusting the advice of a starry-eyed snake-oil salesman (or, in our case, a magical IT gravy salesman) and a reasonable, competent, experienced, responsible, and careful teacher, most responsible organisations will choose the former.(!) The feedback loops in education are simply too long to incentivise any other behaviour.

3. Educating for Digitisation

The last part of the talk—about how to prepare higher education for digitisation—is really short. Because (I believe) I know what to do. I consider it a solved problem.

Here is the list of things to do:

  1. Do teach basic programming.

That’s it.

Basic programming is teachable, is mostly learnable by the higher education target demographic, is easily diagnosed at scale, works extremely well with digitisation-as-a-tool, is extremely useful now (in 2021), applies to many domains, will likely be even more useful in future and in more domains, is highly compatible with labour market demands, and is an enabling tool in all academic disciplines.

If you’re on board with this, my best advice for anyone with the ambition to teach basic programming is to adopt the following curriculum:

  1. Basic programming
  2. Nothing else

Point 2 is really important. In particular, my advice is to not add content like applications, history of computing, ethics, societal aspects, reflection, software design, project management, machine learning, app development, human–computer interaction, etc. Mind you, these topics are extremely interesting, and I love many of them dearly. (And they should all be taught!) But unless you are a much better teacher than me, you will not be able to make the basic programming part work if you add them.

That’s it. The full talk had better flow, included some hands-on demonstrations of cool applications, and had more jokes. I am really grateful to the organisers for giving me the chance to think somewhat coherently about these issues, and had a bunch of very fruitful and illuminating conversations with colleagues both before and after the talk. I learned a lot.

Algoritmer og informationssøgning (Algorithms and Information Search)

Presentation on 1 November 2021 at the annual meeting of the Danish Gymnasiernes, Akademiernes og Erhvervsskolernes Biblioteksforening, GAEB in Vejle, Denmark.

I had the pleasure of giving a series of lectures on algorithms for information search on the internet to the association of Danish librarians. This is a highly educated crowd with a lot of background knowledge, so we could fast-forward through some of the basic stuff.

I am, however, quite happy with the historic overview slide that I created for this occasion:

Here’s an English translation:

What do we expect from the search engine? | Searching | Ranking | Ranking | Recommendation
What’s searching about? | Which information exists? | What has high quality? | What is relevant for me? | What ought I consume?
Focus | Find information | Find information | Find information | Maintain my status, prevent or curate information
Core technology to be explained | Crawlers, keyword search, categories | Reference network topology | Implicit user profiles, nearest neighbour search, cookies | Like, retweet
Example company | Yahoo | Google | Acxiom | Twitter, Facebook
Source of profit | Advertising | User data | Attention, interaction
Worry narrative | Privacy, filter bubbles | Misinformation, tribalism, bias
Rough overview of the historical development of information search on the internet

Particularly cute is the row about “what is to be explained”. I’ve given talks on “how searching on the internet works” regularly since the 90s, and it’s interesting that the content and focus of what people care about and what people worry about (for various interpretations of “people”) changes so much.

  1. Here is the late 2000s version. How Google works (in Danish). Really nice production values. I still have hair. Google is about network topology, finding stuff, and the quality measure is objectively defined by network topology and the PageRank algorithm.
  2. In 2011, I was instrumental in placing the filter bubble narrative in Danish and Swedish media (Weekendavisen, Svenska Dagbladet). Suddenly it’s about subjective information targeting. I gave a lot of popular science talks about filter bubbles. The algorithmic aspect is mainly about clustering.
  3. Today, most attention is about curation and manipulation, and the algorithmic aspect is different again.
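For readers who want to see what "quality defined by network topology" means in the first item above, here is a minimal sketch of the PageRank idea as a simple power iteration. It assumes the common textbook formulation with a damping factor of 0.85; the function name, the tiny link graph, and the handling of pages without outgoing links are illustrative choices, not a description of what any search engine actually runs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores for a dict: page -> list of linked pages."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by everyone else, "d" by nobody.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

The point of the sketch is that the score depends only on who links to whom. Nothing about the user enters the computation, which is exactly what changes in the later, personalised stages.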

I briefly spoke about digital education (digital dannelse), social media, and disinformation, but it’s complicated. A good part of this was informed by Hugo Mercier’s Not Born Yesterday and Jonathan Rauch’s The Constitution of Knowledge, which (I think) get this right.

The bulk of the presentation was based on somewhat tailored versions of my current Algorithms, explained talk, and an introduction to various notions of group fairness in algorithmic fairness, which can be found elsewhere in slightly different form.

There was a lot of room for audience interaction, which was really satisfying.