It's mine and you can't have it. Actually if I make it public on Cafe Press you could, but then certain parties might notice and demand it be shut down and ask for all moneys. And I'm not talking about Darwin.
Wednesday, January 30, 2013
Sunday, January 27, 2013
Asteroid Mining and Detecting Others' von Neumann Probes
With the announcement of "firefly", 3D-printing spacecraft to mine asteroids, we're getting closer to exploring space with multiple smaller craft and to more immediately economically rewarding activities, which are what will drive space exploration faster.
Of course it's also exciting because I think exploration of low-gravity bodies will give us more information about life elsewhere in the universe than we expect it to. While reasoning about extraterrestrial life invariably means making assumptions we don't even know we're making, based on what we know about the evolution of life on Earth and the number of planets in the rest of the universe, the development of some kind of replicators outside the solar system seems overwhelmingly likely. If we think at least partly self-reproducing probes are possible - and notice above that investors right here on 2013 Earth are trying to convince people they are - then we might be better off trying to get information about extraterrestrial life from artifacts already here in the solar system than from signals.
It is also likely that lower gravity bodies are better for any entity that wants to continue spreading, since gravity wells are energetically expensive to get in and out of. If you can get matter without descending onto a high gravity surface, you should. (Yes, "but what if aliens have antigravity" - but if we're going to bother thinking about it, we have to make guesses with what we know now. Otherwise maybe they'll ride unicorns. More seriously, if they don't care about gravity, why would they waste time with small gravity bodies like Earth? Mine the cores of gas giants. Hide just outside event horizons to evade detection.)
I've previously given my arguments in detail for why these artifacts might already be here, and where we might look. Comets and asteroids were the answer, so of course I'm excited that these mining probes may explore a number of asteroids during my lifetime. If there's something obvious, excellent (and frightening).
If they don't find anything it could mean:
1. There's really nothing there to find. Intelligent life is much rarer than we think. Replicator chemistry is either not as inevitable as it seems, or there's a Great Filter between algae and interstellar expansion, or life is just rare enough that we're isolated.
OR
2. Something is there to find, but we don't notice it at first.
Because we're looking for something alien - something completely outside our experience - it's hard to say what a gas chromatograph of chewed-up alien von Neumann probe chemistry would look like. (This is why I hope full rocks are towed back, so we can have people in Earth orbit doing real chemistry on them.)
So how to distinguish 1 from 2? Keep looking, and follow up any interesting chemistry we find, "interesting" meaning any low-entropy repeating patterns, either temporally or spatially, on low-gravity bodies. I very much doubt we're going to find a metal ship crouching amidst a flying rubble pile. I do think we'll find strange chemistry that's worth looking into, at least insofar as it's relevant to the origin of life on Earth, and at least with comets that's no longer controversial. I haven't yet seen a model which examines what fraction of asteroids we would expect to be colonized by theoretical replicators, so I'm not sure at what rate I should de-weight my expectation of finding alien artifacts on asteroids, as more asteroids are mined without the merest
The Dying Earth Genre As Horror of the Irrational
The dying Earth subgenre is one which has recently received increasing attention, but it's hard to define. Not all far future sf is Dying Earth and not all Dying Earth work takes place in the far future, but you know it when you see it; it's kind of like pornography that way. Whether it's even truly a sub-genre is up for debate. It's not clear that most authors who write a story with dying Earth elements think of themselves as doing that, or being influenced even subconsciously by conventions established therein. Of course it was Jack Vance's work that provided the name and possibly it's the collection of tribute stories that came out a few years ago that awoke interest in this kind of writing. Other examples are Wolfe's Book of the New Sun, Farmer's Dark is the Sun, and Delany's Dhalgren, the latter of which is often inexplicably omitted from these lists. The far-future glimpse of the giant crab-things on the beach in Wells's The Time Machine should not be omitted, nor should the exhausted ecology in the last scene of Stephen Baxter's Evolution.
Illustration credit Don Dixon.
Dying Earth settings tend to be dark and cold and maybe (as above) glowing red, due to the failing Sun. Nature's cycles or even physical laws themselves have gone perversely off the rails or ceased completely, like a kind of cosmic menopause. (Whether this extends beyond our provincial corner of the universe, and if so why, is another question.) Interestingly, this derangement of reality is usually not the fault of humans - these aren't ecological morality tales - or at least the exhaustion and littering of the planet is pictured dispassionately, as an aggregate trend that has no moral meaning. (Although not a dying Earth work, Stephenson's Anathem has hints of a long history as resources are scavenged from old ruins, although the tone in describing this activity is again very matter-of-fact.) This creates a setting of an incomprehensibly huge and uncaring universe, a clockwork winding down despite any designs harbored by the characters or their ancestors, and indeed some of the profound questions of the human condition become meaningless against such an unforgiving vast backdrop: lie down and die now, or continue struggling to pass on genes and values and create happiness? It's all going to disappear in a few years, so how can it matter? In reality, we're all going to die, either now or a bit later, and this is the moral choice we all face already. It's just whether or not we already know what's going to kill us. It's not surprising then that Dying Earth works are good vehicles for exploring questions of meaning.
The collapse of reality mirrors or brings about a collapse in human society, where reason falls apart. Often the human race comes into contact with forces or beings we can't understand, either revealing themselves in the twilight of existence, or appearing as the clock strikes midnight. And there's a criticism of the subgenre: some works have struck me as yet another excuse to write fantasy but call it science fiction; the breakdown in nature and society returns us to an era of taverns and swords and magic. This is why Jack Vance's work is not my favorite. It's an easy trick to write a sword and sorcery novel but subvert the simplistic paradigm that science fiction differs from fantasy in that "it's in the future, so it might actually happen" (China Mieville has some interesting things to say about this attitude). But dying Earth writers are not the only ones who have used a faraway setting to throw out all the rules of history and write familiar settings. Another cheating technique science fiction writers use is the intervening apocalypse, to reset society and technology. The most honest and original solution to this problem for my money is Vinge's Slow Zone, but there are lots of cheats people use to make world-building easier.
The horror these authors convey at the extinction of reason is at the core of dying Earth prose, more important I think than the used-up Earth or the cooling Sun, and it's this that strikes a chord of unease in many readers, children of the Enlightenment as we are. The far future setting is just a way to get to a place where the laws of reality are broken, although if you're as bold as Delany was with Dhalgren, you can break them in the modern American Midwest. In fact I think we can throw out some of the traditional entrants in the subgenre and simultaneously reconsider some of strange fiction's early heroes as exploring this at least as thoroughly. Indeed, although the Earth is literally dying at the end of The Time Machine, we can make a pretty good guess at how the giant crab things got there (some iteration of evolution) and why they're struggling to survive (the Sun is burning out due to well-understood but inevitable physics).
Non-Euclidean geometry, by I2ebis.
In contrast, Lovecraft might not have been writing about the far future or the death of Earth as such, but he conveyed something far more unsettling that is at the core of the more disturbing dying Earth works, and it's this: we comfort ourselves by describing rationally the small slice of experience that our limited brains can deliver. Even if reason is not an illusion, then there are "black swans" which we cannot hope to have encountered in our brief existence, but which are no less important for our naivete. Asimov's Nightfall hints at this in the gibbering insanity that heretofore unknown darkness brings. In the real world, there are gamma ray bursts, comet swarms, clouds of debris around the galaxy that we rotate into periodically, supervolcano eruptions, and magnetic solar storms. And these at least are all things that we know exist! To the modern naturalist worldview, confident that we have either already understood everything important, or ultimately will, because we can, this is terrifying. As an aside, I don't personally know any unreconstructed theist who is a fan of strange fiction, and I predict it wouldn't seem that strange; they already think the world is fundamentally incomprehensible.
Of all the early strange fiction writers, Hodgson is the one who did this best. Where Lovecraft often gives us the details of the pantheons he has created, Hodgson leaves us in the same fog that his characters suffer. The House on the Borderland is better known, but The Night Land is a better example (and here's a great resource on that work). He leaves the a-rational horrors lurking in the shadows, their forms not fully understood, exactly as experienced by his characters. King's The Mist seems like a modern cinematic version of The Night Land, except set in a familiar place.
From the Night Land website, by Stephen Fabian.
A specific manifestation of the horror of the irrational, and one which more extreme horror has begun using in recent decades, is the divorce of experience from matter; that is, the existence of consciousness separate from damage to a body or control of the experience, more plainly, the possibility of hell. Lovecraft warns of something like this happening when Cthulhu awakens. Recently Iain Banks wrote Surface Detail, exploring the morality of simulated hells, but it wouldn't be correct to consider this horror of the irrational because there are still "rules"; the hells are a simulation, and the universe of the Culture Banks has created is eminently rational and I would argue is actually an extension of Enlightenment ideals.
A final common thread about far-future works in general, particularly from the classic period, is that they often explain how they got into our hands, as in Stapledon's Last and First Men, where the author claims to be a mere telepathic mouthpiece for a far future historian. This is interesting because it's not obvious why these works would feel called upon to defend their authenticity, as compared to other works of science fiction.
It should be pointed out that among Lovecraft's concerns was the creeping perverse derangement of high European, especially English, reason by the infiltration of what he may have called the sinister, dark and Oriental races; more on this here. Surely modern California would have presented a nightmare vision to his sensibilities, one which doesn't seem to bother most people today.
Friday, January 25, 2013
Markov Metal!
More on Markov chains here. These lyrics were (more or less) written by a computer; I was only the editor. If you're dubious that this counts as computer-lyric: could someone really write these in 15 minutes?
I generated them using the following as input: Metallica albums Kill 'Em All through the Black Album; Megadeth Peace Sells through Cryptic Writings; and Slayer Reign in Blood through Seasons in the Abyss. I then curated the best sentences and arranged them so they vaguely rhymed. I used this Markov site (and a synonym site for the title). You'll see a familiar phrase every so often and if you really squint, you can kind of force a narrative coherence on some of the verses. But be honest, is it really that much less coherent or borderline plagiarist than most metal lyrics?
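For anyone who wants to try this at home, here is a minimal Python sketch of a word-level Markov generator. It is not the code behind the Markov site mentioned above, whose implementation isn't described here. The function names `build_word_chain` and `generate_line` and the tiny placeholder corpus are assumptions made up for illustration.

```python
import random
from collections import defaultdict


def build_word_chain(text, order=1):
    """Map each run of `order` consecutive words to the words seen right after it."""
    if order < 1:
        raise ValueError("order must be at least 1")
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        # Repeated followers are kept, so common continuations get picked more often.
        chain[state].append(words[i + order])
    return chain


def generate_line(chain, max_words=8, seed=None):
    """Random-walk the chain from a random starting state, up to max_words words."""
    if not chain:
        return ""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    order = len(state)
    out = list(state)
    while len(out) < max_words:
        followers = chain.get(tuple(out[-order:]))
        if not followers:
            break  # dead end: this state only occurred at the very end of the input
        out.append(rng.choice(followers))
    return " ".join(out[:max_words])


if __name__ == "__main__":
    # Placeholder text, not real lyrics; swap in your own corpus.
    corpus = "fire and blood and fire and steel and blood"
    chain = build_word_chain(corpus, order=1)
    for _ in range(3):
        print(generate_line(chain, max_words=6))
```

The curating and rhyming step stays manual: the chain only knows which word tends to follow which, and nothing about meter or song structure.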
A better metal-song generator would include song structure as well. I'd like to do this for riff generation too but I've already wasted enough of my (and your) time. Enjoy!
SPIRIT VOID ONSLAUGHT
I hunger penniless
What might be unsteady
The flag of fire bursting with bated breath
The priest that they'll set free
When twilight blanket's welcome
But still has disappeared
But who's to cry out to each other again
Just leave you see it's crystal clear
Make a familiar face tomorrow, blackened
Dismembered destiny
A New World Order comes back
In bomb shelters filled with loneliness
I don't tread on your soul
Dethrone the word as bad omens in
Lie to myself, the murder's complete
There's nothing else I wonder as the past begins again
Come walk with fire with needles
This is an insane game
But I'll get caught up dead
My mother put him in like rain
Won't hesitate to take some crazy shit has passed
And the Reasons that cleanse you
Where I've seen it rise
Step outside in silent agony within you
Blistering of your mind
There's no more than my time has been stricken by bloodshed
Move can't you want desire
Life overturned, spanning the God.
Now no mercy for you
Part of life is no remorse
The wall down now
No IOU's, forgotten children
Shortest Straw has found me
Iron fist coming Down in Vanity
Exploiting their Appetite they fury
We want the war to support the heavens to be
If you committed me
I can't feel velocity
Taken my name and reality
What evil I set free
Anonymous existence rendered useless
At the tone of war dreams
A child to recall, the night fall
Souls of machinegun fire screaming
When death staring down
Soon You will be done
Another fight to save the word
We go on the undead altar and beg salvation
Chill your life, I deal in hell
To cut off through the swords
Kept restrained, disapprobation, but you try again
No end until I miss the creations of sacrifice curse
Bones of placid faces
The basis of clay now to be
I burn deep down thinking its done
Feeling the Part of blood
Stretching out on numbered days,
Fragments of pain
To slay all this kingdom of the sound of arms
I can't say back somehow your master plan of war
Hear my head against the stake you'd crush me
I'll Take No recess and smashing your steeds
Kills the sheep, you feed me
Trust me luck deserted me
No deed or dying soul could it come to me
Welcome to kill, not see more seriously
Memories can't believe
Experience pleasures of misery
Voices oppress like the number that is the need
The destroyer born faceless without eyes and sweat
We lied to live again
Killing, you raise the school of life
Just empty gun fire
A plan of Living drives you through your life
While you're fighting to lie
As soon you'll do lying dying time
Foreclosure of the hatred comes Burning inside
like a couple grains of demise
Seeking life
Will I know it to be written
Now you're next to kill a kiss
Shouting to begin whipping
On judgment day looking back to live
I realize you see how to live
For you hear evermore you've lost my soul
Fire so grim, I say you will kill
I can subdue but me through the shadows
Eternally rot amidst the gods
As I see won't take my soul without
I am stalking the things are a god
Don't pay dying one, command you for now
Oh so far beyond the needle
Diffused compulsions
I don't care
Bombard 'till submission
No repent, we are no end of sleeping in blood
So be looking back to mankind strapped in
Peace find in blood possessed with hell
Upending the way across the end
The kingdom of puppets
Another child draws near Inferno's coming
Dying to each other lover
The rising immortal in the hour from the dark.
I'm stoned
Sunday, January 20, 2013
More Fun With Markov Chains - And Language Learning
This is cross-posted to my cognition and evolution blog.
Once I started playing with Markov chains I couldn't leave well enough alone, so I went back and played with Markov chains at the character level again, this time with several languages. As you add more elements to the state, the output text starts "looking more like" the language that it came from. To measure this, I checked to see if Google Translate's language autodetect could still tell what it was looking at. This led to a prediction about language learning.
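A character-level chain with an n-character state can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool actually used for these experiments. The names `build_char_chain` and `scramble` are hypothetical, and `order` here means the number of characters in the state.

```python
import random
from collections import defaultdict


def build_char_chain(text, order):
    """Map each `order`-character state to the characters that followed it in text."""
    if order < 1:
        raise ValueError("order must be at least 1")
    chain = defaultdict(list)
    for i in range(len(text) - order):
        chain[text[i:i + order]].append(text[i + order])
    return chain


def scramble(text, order, length, seed=None):
    """Generate up to `length` characters that mimic text's local letter statistics."""
    rng = random.Random(seed)
    chain = build_char_chain(text, order)
    if not chain:
        return ""
    out = rng.choice(sorted(chain))  # start from a random state seen in the input
    while len(out) < length:
        followers = chain.get(out[-order:])
        if not followers:
            break  # dead end: this state only appeared at the very end of the input
        out += rng.choice(followers)
    return out[:length]


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog " * 3
    for order in (1, 2, 3):
        print(order, repr(scramble(source, order, 60)))
```

With a longer state, each generated character is constrained by more of what came before it, which is why the output drifts toward looking like the source language.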
Using the Spanish Wiki entry for Apollo 11, with 1-character states, out of 5 trials the computer thought it was seeing Estonian twice, and then Welsh, Irish, and Galician. Using the English Apollo 11 article, it thought it was seeing Welsh three times, then Afrikaans and English. With 2-character states for both languages, the translator guessed correctly 5 out of 5.
Then I pulled a dirty trick: I used Old English (from the introduction to Beowulf of course), complete with thorns and diphthongs, as input. With 1-character states, the translator consistently thought it was seeing Welsh, 5 for 5. With 2-character states, it still answered Welsh four times, and German once (remember, with the modern languages it could already reliably detect languages scrambled at this level). With 3-character states the translator said English twice, then Icelandic, German, and Welsh. Surprising that this was the first time it responded with Icelandic, with all those thistles and thorns floating around.
Next I started giving it blocks of unmolested Old English text. Unsurprisingly, it couldn't consistently say that this was an ancestral form of English (how many literate modern English speakers would, if they'd never seen it?). The translator said English three times and Danish twice. What!? We translator-Danes in the days of Markov! Feeding it full blocks of the Canterbury Tales, it had no problem seeing that Middle English was English.
Finally, using an online string generator as well as a little Excel function I wrote, I fed it totally random strings. The response was consistently Maltese.
Why Maltese? Maybe because of the X's, who knows. But I'm not trying to reverse engineer Google's translation engine. I'm more interested in whether its wild guesses on Old English and the low-element-state-scrambled Markov chains reveal something. One obviously is the possible relationship between languages; that Danish and Icelandic should appear in the translator's guesses with scrambled Old English is interesting but not surprising. But the preponderance of Welsh was also interesting. It's unlikely that the translator is noticing anything about a Celtic deep substrate of English, especially since it couldn't see the more recent Old English substrate of English! More likely there's something about Welsh that makes it a good guess in badly scrambled text, as possibly with Maltese. More sound combinations allowed? Lots of single-letter words? If this is true, then the languages that permit the most sounds and sound combinations will:
a) be the "last resort" guesses for translation engines, and
b) take longer for children to start speaking.
Why b.? When children are learning their first language, imagine the difficulty of identifying individual words. All they're getting is a stream of sound. Now, when you learn a new word, you recognize the other words around it; not so when you're twelve months old. There is evidence that what kids are doing is trying to find word boundaries, and that part of the input comes from looking for sound combinations that appear less frequently, as in "worD Boundary" - English doesn't allow "db" to occur at the beginning or end of a word (although some languages do), but we allow d and b to run up against each other between words. In a language with sound combinations that are less constrained, it will be harder for kids to identify the word boundaries, and it will take them longer to start speaking. This prediction has verified parallels in morphosyntax. Cree and Fulani are notorious for being horribly irregular in terms of verbs and plurals, respectively. In most languages, children are proficient grammatically around age 5, but grammatical maturity is delayed in these languages for several years by the irregularities.
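To make the word-boundary idea concrete, here is a rough Python sketch. It is a toy, not a model of what children actually do: it assumes the sound stream has already been chopped into syllable-sized units, it uses made-up nonsense words, and the 0.9 threshold is an arbitrary choice of mine. It estimates how often each unit is followed by each other unit, and guesses a word boundary wherever that transition is rare.

```python
from collections import Counter

def transition_probs(units):
    """Estimate P(next unit | current unit) from one unsegmented stream."""
    pair_counts = Counter(zip(units, units[1:]))
    first_counts = Counter(units[:-1])  # the last unit has no follower
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment(units, threshold=0.9):
    """Guess a boundary wherever the transition probability drops below threshold.

    The threshold is an arbitrary illustrative value, not an empirical one.
    """
    probs = transition_probs(units)
    words, current = [], [units[0]]
    for a, b in zip(units, units[1:]):
        if probs[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Hypothetical nonsense words, each built from three two-letter "syllables".
spoken = ["tupiro", "golabu", "bidaku", "golabu",
          "tupiro", "bidaku", "tupiro", "golabu"]
stream = [w[i:i + 2] for w in spoken for i in range(0, 6, 2)]
print(segment(stream))
```

In this toy language every within-word transition has probability 1 and every between-word transition is well below that, so the words fall out cleanly. The prediction is about what happens when that gap shrinks: if almost any sound can follow almost any other, within-word and between-word transitions look alike and no threshold separates them.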
So my prediction is: if the orthography of Welsh and Maltese reflects their phonology, and that phonology has a less constraining set of rules about what sounds can occur together, then children's vocabularies will grow more slowly in those languages relative to most others. The most extreme would be Khoi-San, i.e. the famous "Bushman" click languages, which have the richest sound inventory of any language on Earth (not just the clicks). I'm not familiar with the phonology, just that their inventory is huge, and I'm presuming in my prediction that those sounds aren't severely constrained in terms of how they can appear in combination with each other.
Saturday, January 19, 2013
Fun With Markov Chains
I give you Garkov, where the Garfield Comic meets Markov chains. Japanese T-shirt designers have apparently been using this process for a while.
From Coding Horror
While not as entertaining as the Nietzschean Family Circus, it's not without its charm.
At the letter-level you can play fun games with Markov chains - for example, coming up with similar-sounding but not identical names. I started by throwing in the top American cities. I got a few that were believable and coherent but not identical to the inputs (Bernard, Costa Maria, and Pennectico) but unless you have a large corpus, that boundary of coherent but not identical is pretty sharp. Even then you can still get some clever portmanteaus out of the deal: Thorntonio, Mobilene, and Charleston-Salem. Even better, there was an obvious name for a gay men's magazine about the scene in Alaska - Manchorage (I'm sure someone's ahead of me on that) and my personal favorite Markovian city name, Allen West. Going from 300ish city names down to 50 states gave only one novel one in a number of tries, Monsaskana, but a few clever portmanteaus, like Coloridaho, Florado, and my favorite, Tennsylabamaska.
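The generator I used isn't shown here, so the following is only a minimal Python sketch of the letter-level idea, under my own assumptions: the function names, the `^` and `$` padding markers, the lowercasing, and the default 2-letter state are all illustrative choices. It learns which letter follows each 2-letter state in the input names, walks the chain to produce candidates, and keeps only the ones that aren't identical to an input.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding markers; assumed not to occur in the names

def build_model(names, order=2):
    """Map each `order`-letter state to the list of letters seen after it."""
    model = defaultdict(list)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, order=2, rng=random, max_len=30):
    """Walk the chain from the start state until END (or max_len letters)."""
    state, out = START * order, []
    while len(out) < max_len:
        nxt = rng.choice(model[state])
        if nxt == END:
            break
        out.append(nxt)
        state = state[1:] + nxt
    return "".join(out)

def novel_names(names, order=2, tries=200, seed=0):
    """Return generated names that are not simply copies of the inputs."""
    rng = random.Random(seed)
    model = build_model(names, order)
    known = {n.lower() for n in names}
    found = set()
    for _ in range(tries):
        candidate = generate(model, order, rng)
        if candidate and candidate not in known:
            found.add(candidate)
    return sorted(found)

states = ["Colorado", "Florida", "Idaho", "Alabama",
          "Alaska", "Tennessee", "Pennsylvania"]
print(novel_names(states))
```

With only a handful of inputs most walks reproduce an input exactly, which is the sharp boundary described above. Portmanteaus appear only where two names share a 2-letter state, as Colorado and Florida do with "or".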
From now on when I need something like this I'm going to automate it this way. For president names, no way, only forty-four of those. I did the same thing with another Markov generator which operates at the sentence level (far more interesting) and tried it on one of my own blog posts, where I discuss the zoo hypothesis response to the Fermi paradox. At a mere 367 words, using a 2-element state, it has the same problem of either spitting out the same sentences verbatim or giving sentence portmanteaus: That we've been staring them in the face the whole universe is A kind of frightening abjectly humbling realization is in fact the best case scenario I expect because it means they will seem incomprehensible if we even recognize them. (Compare to the original post.) It's more interesting but less coherent with a 1-element state: To spoil the whole universe is exactly what we haven't noticed aliens Ever try to do This wanting as alive even if they're trying to get our own ignorance. The Markov process is producing grammatical sentences even at this level, notwithstanding run-ons or semantic incoherencies, whether green, colorless or otherwise.
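One way to see why a short text with a 2-element state mostly parrots itself is to count how many states ever have a choice of what comes next. The Python sketch below is an illustration only: the helper names are made up, and splitting on whitespace is my simplifying assumption about what counts as a word. It builds a word-level model and reports the fraction of states with more than one distinct follower.

```python
from collections import defaultdict

def build_word_model(text, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def branching_fraction(text, order):
    """Fraction of states with more than one distinct follower.

    Near 0, the chain can only replay the source verbatim; higher values
    leave room for novel (and less coherent) sentences.
    """
    model = build_word_model(text, order)
    if not model:
        return 0.0
    branching = sum(1 for followers in model.values() if len(set(followers)) > 1)
    return branching / len(model)

sample = "a b a c a b d"
print(branching_fraction(sample, order=1))  # 2 of 3 states branch
print(branching_fraction(sample, order=2))  # 1 of 4 states branch
```

Running the same comparison on a 367-word post versus a whole novel would be one rough way to check whether corpus size, rather than state size alone, is what makes the output interesting.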
In case you wanted input text from a more middlebrow writer than myself, I used Hemingway's Old Man and the Sea for a bigger corpus and with 2-element states got a few nice ones: Now there was no one that they rose and they had razor-sharp cutting edges on both sides. Knowing it was a little later to save the blood in the sky. Borrow two dollars and a skiff in the bow he could not fail myself and die on A Monday morning. Strong enough Now for the fish made He was letting the current made against the line and he is too wise to jump. And then 3 elements: Shoulders and braced his left hand and arm he took the bait just now. The skiff shake as he jerked and pulled on the fish and he had found a way of leaning forward against the bow he could not remember the prayer and then He would say them fast so that they made a half-garland on the projecting steel.
J.G. Ballard had characters who were poets who spent their days programming, and they were celebrated for the brilliance of the verse that their algorithms produced. (To read this as dystopian is to oversimplify Ballard with assumptions he exploded - it was a challenge to artists. I.e., how is this not what you're doing, except you're using the hardware inside your skull, and you're not sure how it works?) While the text above isn't about to fool a publisher that you're in possession of a lost Hemingway manuscript, it gets you partway there. That is to say, if I were an undergraduate in some bullshit post-modern theory class I needed for elective credit, for my term papers I would get hold of a bunch of secondary literature, stuff it in the Markov meat machine, and then curate the sentences and use spell check to make sure it's coherent. That takes the concentration and understanding out of it; at that point you're just editing.
It's worth pointing out that humans differ in how easily they create new words like this, mostly because we're organized semantically rather than phonetically. It's not easy, at least for healthy people. If I ask you to name words that start with p, you'll start slowing down after 5-10 words, but if I ask you to name things that have to do with palace, you'll have a much easier time. In some pathological states, like Wernicke's aphasia, you can't help but make up words, but the words people make up still follow the rules of their native language. Interestingly, when people "speak in tongues" during religious ceremonies, the tongues in which they speak to their gods have the same phonological rules as their native one.
Of any statistical technique, Markov chains have most made me wish I was a programmer. For instance, it doesn't seem that it would be any harder to reverse this process to recognize affixes, rather than predict the following character. That is to say: in English we use "-s" (or "-es") and "-ing" as suffixes on nouns and verbs. In any corpus, these will appear more often than other endings, and they will be less predicted by the letters preceding them than other clusters. By doing this, you could feed in a corpus in any language and the program would be able to pick out the likely affixes and particles, based on these properties. In fact, it's not implausible that this is how children are decoding language when learning their first one.
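A rough Python sketch of the suffix half of that idea follows. The scoring is my own guess at how to turn "frequent" and "poorly predicted by the preceding letter" into numbers: count how many distinct words end in a candidate, measure the entropy of the letter just before it, and multiply the two. The function names, the 3-letter cap, and the minimum count are all arbitrary assumptions.

```python
import math
from collections import Counter, defaultdict

def suffix_scores(words, max_len=3, min_count=2):
    """For each candidate ending, return (word count, entropy of preceding letter)."""
    preceding = defaultdict(Counter)
    for word in {w.lower() for w in words}:
        for n in range(1, max_len + 1):
            if len(word) > n:
                preceding[word[-n:]][word[-n - 1]] += 1
    scores = {}
    for suffix, prev in preceding.items():
        total = sum(prev.values())
        if total < min_count:
            continue  # too rare to say anything about
        entropy = -sum((c / total) * math.log2(c / total) for c in prev.values())
        scores[suffix] = (total, entropy)
    return scores

def top_suffixes(words, k=5, **kwargs):
    """Rank endings by count * entropy (a made-up but simple combined score)."""
    scores = suffix_scores(words, **kwargs)
    ranked = sorted(scores, key=lambda s: (-scores[s][0] * scores[s][1], s))
    return ranked[:k]

corpus = ["walking", "talking", "jumping", "singing", "running",
          "cats", "dogs", "birds"]
print(top_suffixes(corpus, k=2))  # ['ing', 's']
```

Note what happens to "ng" and "g": they are just as frequent as "ing" here, but they are always preceded by the same letter, so their entropy is zero and they drop out. A real corpus would be much noisier, and prefixes would need the mirror-image treatment.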
Dude, Owls Are So Metal
Even their sh*t is metal. These are owl pellets, with the little facies of the unfortunate dinees. Owls can poop Obituary cover art dude! From David Stillman's Santa Barbara outdoors blog.
Thursday, January 17, 2013
Short Film: VESSEL
What I like about this nifty short film: aliens with animal-like materially self-interested motives (just like you and I have) and a non-sappy ending.
Saturday, January 12, 2013
Lone Star Must Be at the Center of the Milky Way
How do I know he's there? Because the center of the galaxy tastes like raspberries, according to new radio telescope findings. And why is that? There can be only one reason...
Tuesday, January 8, 2013
Spar, by Kij Johnson
Marshall Maresca recommended this excellent piece at his blog. It richly deserves all the awards it was nominated for and received. At its core it explores the incommensurability of experience and on top of that, features a truly alien alien, which Gregory Benford described as the holy grail of sf. And (bonus) it avoids the pitfalls of fictions of ideas, because there's something strongly allegorical going on too. Finally, it also avoids the smarmy sense some stories emit of being proud of themselves for being pointlessly shocking, because it's not. That's too much spoilage already, so just read and enjoy. Perhaps not work safe.
Sunday, January 6, 2013
What Do Robots and Mariachi Have in Common? (Hint: Metal)
Reign in Blood Mariachi Style, with Dave Lombardo himself on drums:
Now Motorhead's Ace of Spades being played by robots.