A Predictive Text

01 Apr 2009 Justin B Rye

The poor Babel fish […] has caused more and bloodier wars than any­thing else in the history of creation.

Douglas Adams

Last night, while I was sleeping, I had a vision of the future.  Some­body who claimed to be me aged 71 appeared before me and said he had a message about how the world would end in thirty years' time, and how it was all my fault.  Ridiculous, I said; this is just a dream, and we both know it.  If you say so, he replied; but I can still prove I'm real.  And he did: he dictated two sentences that generate the same cryptographic signature when I apply my private GPG key.  I'd go into details of how such a proof works, but I did promise to leave all the geeky stuff in my Linux sub­directory – suffice it to say that you can only do that if you know my secret pass­phrase and have access to some impossible amount of computational resources.  Besides, both sentences were slanderous claims about my personal life that I really don't care to repeat.

Mind you, it doesn't prove that what he told me is true.  It could all be a twisted practical joke; for a start, nothing in his story explained how he was appearing to me in the dream, or what I was meant to do about it, or even whether there was any point trying…

He started by describing the state of the world in the 2020s and '30s.  Much of what he said I have managed to forget, but I do remember a few high­lights.  Most of Indonesia was under a viciously enforced quarantine after an accident at what may or may not have been a secret Taiwanese‐owned bio­weapons plant.  London was gearing up to be the new Venice, every­one having more or less given up on reversing global warming after the development of shales as a viable source of fossil fuels.  And although UN peace­keeping camera‐drones seemed to have damped down the Sino‐Indian border conflict, they had turned West Africa from a crisis into the world's nastiest reality TV show.

He did mention a couple of positives, such as that he owned a palm­top device which claimed to be a quantum computer (not that he reckoned its qCPU deserved the label).  My visitor took all this for granted as trivial; he had come to warn me about some­thing more important.

It all started in the late 2010s, when there was a big fad for the use of catch‐phrases and trendy coinages spread not by word of mouth but via wiki‐style jargon lexicons, with different youth‐culture factions showing their allegiances by the use of different on­line meme‐pools.  To start with it was confined largely to text messaging, which had always had its own set of alternative spelling conventions more suited to the medium.  But then the rise of voxing (effectively, hands‐free texting via speech‐recognition) triggered a great deal of ignorant knee‐jerk controversy about how the language was going to the dogs.  Linguists got to appear on chat shows to explain that scientifically speaking it was less significant than phenomena like Cockney rhyming slang or French verlan.  The urge to belong to an exclusive socio­linguistic in‐group has always been one of the driving forces behind language evolution.  Even when teen­agers started using their phones to “spell­check” their conversations and filter out un­fashionable turns of phrase, this was still only a slight lubricant to the eternal vocabulary churn that goes on in all living languages.  Every­thing was just groovy.

It was widely taken for granted at this stage that mass‐media technologies were a force opposed to language change – a tragic mis­conception founded on the way the orthography had been pickled in printing ink for the past few centuries.  The truth is, hearing Mr Cholmondley‐Warner reading the BBC news did nothing to persuade youngsters to speak like him; after all, they already knew how ridiculously their grand­parents talked.  They might adopt urban‐elite slang off the telly in place of the local rustic vernacular… but that's an example of change being accelerated, not slowed.  Indeed, just as the introduction of printing in England had coincided with the Great Vowel Shift, the rise of net­work television in the US was accompanied by new developments like the Northern Cities Shift.  More unusual was the fact that many of the contributors to the on­line lexicons were non‐native speakers of English as a language of inter­national pop culture.  There was an increasing trend towards global cross‐pollination, with dialects splitting off on the basis of sub­cultural affiliations rather than simply by geography.

Meanwhile, soft­ware vendors were peddling tourist phrase­book apps that could run on a phone (by this time, a device you could wear in your ear).  The idea was that you could mutter in English and have your words rendered into Spanish in real time; but (as the pod­cast linguists warned) that's really tricky, since languages are far more than just dictionaries – for a start the words also have to go in a different order and carry different kinds of grammatical endings.  The big‐name company that invested most heavily in wearable Universal Translators went spectacularly bankrupt, while a simple King's‐English/LOL­speak converter proved to be much easier and more saleable.

Then around 2020 the people working on Natural Language Processing made a series of break­throughs.  Computers gained the ability to compose and parse sentences smoothly and effectively.  It should have meant workable translator devices for all, except that nobody wanted to risk that again – instead it was the Sales and Marketing departments who bought this technology.  Soft­ware agents that were capable of generating plausible conversational English flooded the virtual forums.  Although far from AIs, they could pass for slightly stoned and extremely shallow adolescents well enough to affect the flow of traffic and of advertising revenue.

The automatic reaction was an explosion in consensus reputation net­works incorporating related algorithms.  Human beings were still better than machines at handling colloquialisms and word­play, so now verbal eloquence tests came to be used (much like CAPTCHAs) to distinguish the bots from the humans, the goths from the vandals, and the trend‐setters from the dweebs.  Suddenly, people were being objectively scored for how well they talked the talk.  Once employers realised there was a publicly accessible ranking system for socio­linguistic status, they soon started putting minimum “rep” values in vacancy adverts.  Parents could no longer pretend to their children that academic grades mattered more than popularity…

Civilisation was indeed doomed, though not for that reason.

By now, there were shill‐bots trying to pass as with‐it by using the very latest chic words for things – or even better, random lexical innovations of their own.  That was kind of cool, in fact, and inspired people to populate the chat­spaces with silver‐tongued mock‐bots that did nothing but spout slightly over‐the‐top jargon; the success­ful ones were adopted as cultural icons.  It might not have made much sense, but attitudes to prestige dialects have never had much to do with rationality.  Meanwhile there were vox devices that could pick up your sub­vocalised murmurs and re­organise your sentences to stop your English teacher shouting at you, or (more often) garble your phrases in novel ways to impress your friends and baffle every­one else.  If you could learn to do the shuffle without your voxer, that was even more hip.

Over the course of the '20s, it turned into a Red Queen's Race. Impressionable teens were under more and more pressure to use the trendy new iDialects, with their increasingly divergent vocabulary, pronunciation, and grammar; they needed to distinguish themselves not just from their lamer rivals but from the swarming hordes of synthetic wannabes.  Advertising firms had ever‐larger NLP departments to calculate the kinds of new features that would strike consumers as catchy and sophisticated, and the movie and music industries scrambled to join the band­wagon.  The equipment for handling modified rule­sets (marketed as “raps”) grew more and more power­ful and accessible.  None of this was seen as a problem – after all, as long as the raps were available on­line you could always get things machine‐translated.  Indeed, all the hottest immersive soap operas of the decade provided automatic “tribalisation”; soon it was easy to recognise the bad‐guy characters because they were the only ones who were Standard English monoglots.  No two factions agreed what the appropriate label was for such a mon'glo (or shvoon, or mutie…), but it was always an insult.

There may have been a diverse eco­system of sources for these linguistic innovations, but there were just two rival rap dissemination schemes: GLOSS and PºPVºX.  PºPVºX had been the very first of them, and had swallowed most of its competitors.  However, it was increasingly under­mined by a plague of mal­ware raps, known, inevitably, as the pox.  Eventually PºPVºX announced that they were going to have to introduce draconian new authentication mechanisms for their channels – and lost their user­base over­night.  GLOSS had won a world­wide monopoly.

If GLOSS had been a private company, this would never have been tolerated.  But it wasn't a company; it was an in­formal community, in the old Open Source Soft­ware tradition, and any attempt to regulate or censor it just led to “free” variants rising to predominance.  It was one of these un­official forks that demonstrated how encrypted channels could hook into the social net­works' existing authentication mechanisms, wiping out the pox while avoiding all the draw­backs.  Except for one: now the only way to understand what a clique member was saying was to have access to that clique's rap, and the only way to do that was to get a fully validated rep with the clique, vouched for by people who passed the fluency tests.  Things had started to turn ugly.

What's more, the generation that had grown up with this accelerating process of language change were old enough to have children of their own now, and as they aged their capacity for absorbing new syntactic, morphological, and phonological features was waning.  Their parents still didn't understand them, but now they couldn't quite understand their younger siblings, either.  Here's where the pharming industry got to do its bit; new drugs were hitting the market, un­attractively known as “plasticisers”, that allowed adults to re­invigorate their in­born talent for language acquisition.  No more pain­ful grammar lessons! Just a couple of eye­drops every day and you too could be picking up the new lingo as effort­lessly as a child soaking up its mother‐tongue.  Given that the alternative was losing rep hand over fist, people were soon taking black‐market plasticisers in large doses.

This was more artificial than ever, though that had become academic since children who had been exposed to plasticisers in utero were growing up as native speakers.  Far from having any problem learning the dialects, they seemed happy to accept the idea that their rules were in a natural state of flux.  For the first time, language development had been yoked to the run­away speed‐doubling of the on­line world.  English evolved and diversified about as much over the course of the '20s as it had in the previous thousand years.  Then it did the same again in the first two years of the '30s.  By 2033 it was impossible to measure, because many of the new changes went in directions that natural languages had never explored.  Aspectual systems where all imperfectives were palindromes.  Phonologies with only a handful of segmental phonemes and dozens of distinct stress/tone categories.  Grammars with no such things as words or sentences, only poly­synthetic root‐complexes.

Not that the authorities were sitting idly by while all this happened!  They had leapt into action, passing new laws to enforce the correct use of Standard English (or Japanese, or whatever), with the immediate effect that law‐abiding citizens who happened to have slightly the wrong accent needed gloss‐links too.  Since the prescriptivists each wanted their own personal speech habits used as the universal yard­stick of correct­ness, the longer‐term result was to spark secessionist move­ments wherever there were regional variants.  Standard English had turned itself into a patch­work of deeply un­cool warring factions, without even a shared sub­culture or a proper rep frame­work to justify the effort of joining.

The use of “viral” soft­ware agents for commercial promotions was also tightly regulated, with the result that the profitable part of the market was left to the most malevolently virulent of illegal bot­nets.

The one thing the world's governments did achieve before they disintegrated into civil war was rounding up those disgusting subversives, the descriptive linguists.  So by the time it happened, we were safe in a concentration camp in Greenland, where we were forbidden to speak any­thing more recent than King James Bible English.  The gulag joke was that things would have to stop once word‐order patterns could flip between SOV and VOS half‐way through a sentence… except that in reality there obviously had to be a limit on how fast people could re­learn an ever‐changing grammar.

What every­body had over­looked was that the humans involved weren't just racing one another; they were being chased.  And once they faltered, they were easy meat.  The rep‐system fluency filters started classifying the spam­bots as prestigious community members and the humans as imposters… As each social network fell, its neighbours with overlapping ID registries would come under attack, and meanwhile, GLOSS gave the ad­ware a direct pipe into people's chemically softened heads.  By 2038, over half the human race had been turned into dysphasic obsessive‐compulsives.  And as the world descended from balkanised militarism into outright gibbering lunacy, the bombs fell like a cleansing rain, and the story was over.

Did I have any questions?  Well, unfortunately the one that came into my head was: how is that my fault?  Which made my elder self angry.  It was my fault because I'd written some web page that inspired some­body to write a PHP script that ended up being used as part of the alpha release of PºPVºX; so if I hadn't written it, none of this would have happened.  And then he started ranting about how computational linguistics should never have become cool; that was where it all went wrong.

Nonsense, I told him: how could you blame me?  What about the programmers who wrote the spam­ware, or indeed the fail­safes on those weapons of mass destruction?  It was their fault, not mine.  Anyway, if he'd done some­thing wrong, he should punish himself – I wasn't guilty of any­thing yet!

That made him so furious it was the end of the whole dream.

So… is this the same page as he was talking about?  If so, could I have not written it?  I have no idea.  Just in case, I've thrown in a few deliberate lies and distortions.  Still, I mean, he must have been able to remember how hard up I am for new material – if he didn't want me to write about it, he shouldn't have told me.  It's his fault now, not mine.