Better Language Scrambling?
I should preface this by saying that I can barely script, let alone code, and have absolutely no sense of the resources (either manhours to implement or server toll) that might be required. My second idea will unequivocally be easier in both arenas, but that's not going to stop me from voicing the first.
Ideally, it would be nice to have a small dictionary implemented (the length of a pocket dictionary, perhaps a few thousand words?), with each word arbitrarily weighted against skill level; instead of random scrambling, words would only be scrambled if their weight fell higher than the listening character's skill level. Articles and prepositions might be weighted at inept or amateur (thus allowing (likely erroneous) contextual interpretation), while more complex words or concepts might require adept/expert-level knowledge of the language to understand. To avoid the need for too many fallbacks and contingencies, misspelled words might always be scrambled (for characters less than master/grandmaster), and rationalized IC as "local dialects."
It seems a little silly when, as a Journeyman listener, a word as common as "Greetings" gets scrambled to "Ergetjlgu," just because it fell on the wrong side of a coin toss.
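A minimal sketch of the dictionary idea, for illustration only. The rank ordering, the toy word weights, and the names `SKILL_LEVELS`, `WORD_WEIGHTS`, `scramble` and `translate` are all assumptions made up for this example, not anything taken from the game's code.

```python
import random
import re

# Hypothetical ordering of the skill ranks named in this thread (1 = lowest).
SKILL_LEVELS = ["inept", "amateur", "novice", "journeyman",
                "adept", "expert", "master", "grandmaster"]
RANK = {name: i + 1 for i, name in enumerate(SKILL_LEVELS)}

# Toy weighted dictionary: word -> minimum rank needed to understand it.
WORD_WEIGHTS = {"the": 1, "in": 1, "come": 1, "here": 1,
                "greetings": 2, "forest": 3, "meeting": 4, "denouement": 7}


def scramble(word, rng):
    """Shuffle the letters of a word (a stand-in for the real scrambler)."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters)


def translate(sentence, listener_skill, rng=None):
    """Scramble only the words whose weight exceeds the listener's rank."""
    rng = rng or random.Random()
    rank = RANK[listener_skill]

    def replace(match):
        word = match.group(0)
        weight = WORD_WEIGHTS.get(word.lower())
        if weight is None:
            # Unknown or misspelled word: treated as "local dialect",
            # understood only at master rank and above.
            weight = RANK["master"]
        return word if rank >= weight else scramble(word, rng)

    # Only runs of letters are touched; punctuation and spaces pass through.
    return re.sub(r"[A-Za-z']+", replace, sentence)


# Every word here is rank 2 or lower, so a journeyman hears it unchanged.
print(translate("Greetings, come here!", "journeyman"))
```

A lookup like this is one dictionary access per word, so the per-sentence cost should stay small even with a few thousand entries.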
The second, much easier but much less interesting, implementation of this idea is to simply weight each word on the fly based on the number of characters in the word. Okay, "greetings" might still be scrambled, but with a few exceptions, there's a moderate correlation in the English language between the length of a word and its complexity. I might not expect a Novice speaker to know what "denouement" means, but "Come here!" probably shouldn't be scrambled for anybody but the most Inept students.
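A rough sketch of the length-based version, assuming eight ranks numbered 1 to 8. The formula (half the word length, clamped to the rank range) is an arbitrary assumption chosen so that four-letter words need rank 2 and "denouement" needs rank 5; `length_weight` and `is_understood` are hypothetical names.

```python
def length_weight(word, max_rank=8):
    """Weight a word on the fly from its length alone."""
    return min(max_rank, max(1, len(word) // 2))


def is_understood(word, listener_rank):
    """A word is understood when the listener's rank meets its weight."""
    return listener_rank >= length_weight(word)


for word in ["come", "here", "greetings", "denouement"]:
    print(word, length_weight(word))
```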
This might be a bit "fluffy" (i.e. not adding much to coded gameplay), but I think it'd make some RP (read: eavesdropping) more interesting.
EDIT: Looking back over my logs, it appears scrambling already goes by word length, as anything under five characters is coming out fine, and five-letter words are translated about half the time. So, my second suggestion's kind of pointless.
Still, I think the first one (running speech through a dictionary) would be fun. If it's viable (i.e. wouldn't take a negative toll on server response time), I'd be willing to compile and weight a (plain-text, would still need parsing) dictionary to ease up on manhours required.
Re: Better Language Scrambling?
Awesome, if you put together the dictionary, I'll plug it in
Re: Better Language Scrambling?
Mask wrote: Awesome, if you put together the dictionary, I'll plug it in
Seriously?
I'll get on it this weekend.
Re: Better Language Scrambling?
Cool ideas for language scrambling:
1) Simplify the language spoken based on skill level, i.e. convert:
"Excuse me, sir, but I would like to purchase your finest short blade"
to:
"Me want shortsword"
2) For low skill levels, make people occasionally use the wrong words, i.e. convert:
"I would like a bag of wool"
to:
"I would like to bag your bull"
How could this be made to work? Hmm.
If we had a simple dictionary of words together with how common they were, like for modern English, we could probably tweak it a little to give it a more FK-ish feel - for example, I would say that we would use the word 'fireball' and 'broadsword' a bit more often than in modern English...
If the dictionary had some simple markup in terms of nouns, verbs, prepositions etc, it would be cool to occasionally mix those up for lower skilled speakers. Also, some very uncommon words would just be untranslatable, no matter how short they were.
Re: Better Language Scrambling?
I had some ugly distractions this weekend, but I'm a little ways through this and making decent progress.
I tried looking for a comparable list for modern English (i.e. "how common words are"), but there's nothing out there that's comprehensive enough, even for a short dictionary. So, I'm suffering and doing it by hand. What you're going to get in the first draft is just a list of words with arbitrary numbers assigned, 1 to 8, corresponding to mastery levels. I want distribution among levels to be on a bell curve, but that might wait until the second draft.
Even just tagging words "noun," "verb," etc. would literally double the file size (and the amount that needs to be parsed), but if that's not a problem it's a simple matter to do. Foreseeable problems arise when words can be both nouns and verbs, or other parts of speech, depending on their context. Like "walk." How will the scrambler know the difference between "I want to walk to Waterdeep" and "I went to Waterdeep Walk", when it tries to switch funny words around?
Untranslatable words can just be removed from the dictionary, and then be treated however typos are treated.
I don't know why you'd want to bag my bull, but I think that would involve a rhyming dictionary, which is an entirely different beast. Also, I'm flattered.
Re: Better Language Scrambling?
Wow, I think manually writing a dictionary is too awesome an undertaking. What if we were just to get a bunch of FR novels and run them all through a program which would just count the number of instances of each word in each book and use that as a commonality indicator based on firing them into a distribution and extracting different quantiles?
With the results of that, we could fire it through some other program which would find out the word-type from another dictionary and add that meta data in?
A selection of FR novels and a tool like this:
https://code.google.com/p/epub2txt/
And we have a dictionary! I might have a quick go at this and send you the result for review - PM me your email address.
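A quick sketch of the counting step, assuming the novels have already been converted to plain-text strings. Cutting the frequency ranking into eight equal-sized buckets is just one simple way to read "extracting different quantiles", and `count_words` and `assign_levels` are made-up names for this example.

```python
import re
from collections import Counter


def count_words(texts):
    """Count how often each lowercase word appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))
    return counts


def assign_levels(counts, levels=8):
    """Map each word to a level: 1 = most common bucket, `levels` = rarest."""
    ranked = [word for word, _ in counts.most_common()]
    total = len(ranked)
    # Position i in the ranking falls into bucket (i * levels) // total.
    return {word: 1 + (i * levels) // total for i, word in enumerate(ranked)}


books = ["The mage threw a fireball. The orc drew the broadsword."]
for word, level in sorted(assign_levels(count_words(books)).items()):
    print(level, word)
```

The output is a word-to-level table in the same 1 to 8 shape as a hand-weighted list, so the two could be compared or merged before anything gets plugged in.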
Re: Better Language Scrambling?
http://www.lipsum.com/
Lorem ipsum is a bunch of rubbish text generated to fill paragraph forms before they are filled with words, phrases and actual text. This is a generator for it. I find it's a good source of readable gibberish similar to Latin.
Justice is not neccesarily honourable, it is a tolerable business, in essence you tolerate honour until it impedes justice, then you do what is right.
Spelling is not necessarily correct
Re: Better Language Scrambling?
Mask wrote: Wow, I think manually writing a dictionary is too awesome an undertaking. What if we were just to get a bunch of FR novels and run them all through a program which would just count the number of instances of each word in each book and use that as a commonality indicator based on firing them into a distribution and extracting different quantiles?
With the results of that, we could fire it through some other program which would find out the word-type from another dictionary and add that meta data in?
A selection of FR novels and a tool like this:
https://code.google.com/p/epub2txt/
And we have a dictionary! I might have a quick go at this and send you the result for review - PM me your email address.
Oh, I'd already got the word list. It was just a matter of weighting it. XD
And, really, frequency of use is not an indicator of ease of use. People learn to count in their native tongue at, what, three years? Four? But how often do you think you'll find "eight" or "thirteen" printed in a novel? =/
Honestly, this is nothing difficult, it's just a little time-consuming. But, if you like, we can try it your way and see what comes up.
Re: Better Language Scrambling?
Mask wrote: Cool ideas for language scrambling:
1) Simplify the language spoken based on skill level, i.e. convert:
"Excuse me, sir, but I would like to purchase your finest short blade"
to:
"Me want shortsword"
This would be really awesome, though from experience I can tell you that it would also be really hard.
Something for the longer term (hoping not to hijack this thread): once Lisira's dictionary is in, I could generate lists of nonsense words that look "elvish", "orcish", etc. These could be substituted in for uncommon words, instead of just scrambling them. For example, instead of
An elf says "There is a great meeting in the forest" --> An elf says "There is a great lfdsknh in the fprdtt"
you might get
An elf says "There is a great meeting in the forest" --> An elf says "There is a great omentie in the lassiya"
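One way the nonsense-word substitution could work, as a sketch only. The syllable lists are invented for this example (not real Elvish or Orcish), and `SYLLABLES` and `fake_word` are hypothetical names. Seeding the generator from the word itself means the same word always comes out as the same nonsense word, which a plain random scramble would not give you.

```python
import random

# Invented syllable pools per language; purely illustrative.
SYLLABLES = {
    "elvish": ["la", "ssi", "ya", "o", "men", "tie", "el", "na", "ri", "the"],
    "orcish": ["gash", "ruk", "ug", "zog", "nar", "bur", "dak", "mog"],
}


def fake_word(word, language):
    """Build a pseudo-word for `word` that looks like `language`."""
    # A string seed is deterministic across runs, so each word maps
    # to one stable replacement per language.
    rng = random.Random(f"{language}:{word.lower()}")
    syllable_count = max(2, round(len(word) / 3))
    pool = SYLLABLES[language]
    return "".join(rng.choice(pool) for _ in range(syllable_count))


for word in ["meeting", "forest"]:
    print(word, "->", fake_word(word, "elvish"), "/", fake_word(word, "orcish"))
```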
Re: Better Language Scrambling?
Pirro wrote: Something for the longer term (hoping not to hijack this thread): once Lisira's dictionary is in, I could generate lists of nonsense words that look "elvish", "orcish", etc. These could be substituted in for uncommon words, instead of just scrambling them. For example, instead of
An elf says "There is a great meeting in the forest" --> An elf says "There is a great lfdsknh in the fprdtt"
you might get
An elf says "There is a great meeting in the forest" --> An elf says "There is a great omentie in the lassiya"
+1