December 8, 2018

Random signatures

The goal is to generate pronounceable names consisting of three components, e.g., Ritin Vogga Umkithessix, for a total of eight syllables. By pronounceable, I mean that English (or German, or Romanian and so on) speakers should find that the names seem pronounceable, and be able to make a credible attempt at pronouncing them; they will most likely not agree completely on the pronunciation, but that's perfectly acceptable.

Phonotactics

  • The vowels are a, e, i, o, and u, and the diphthongs are ai, au, ei, eu, oi, ou and ui. They are intended to be pronounced /ai/, /au/ and so on, but English pronunciation is fine.
  • The consonants are b, ch (as in chill), d, f, g, h, j (as in jam), k, l, m, n, p, qu, r, s, sh, t, th (as in thin), v, w, x, y (as in yes), z, and zh (the sound of s in vision).
  • The first syllable may begin with a vowel; otherwise, all syllables begin with either one consonant, or a two-consonant cluster in which the first consonant is one of b, f, g, k, p, s, sh, v, z and the second is l or r, or the first is d, t, or th and the second is r.
  • Except the last syllable, syllables end in a vowel, a diphthong, or a vowel plus m or n.
  • A single consonant between two vowels may be geminated, except ch, h, j, qu, sh, th, w, x, y, and zh, which may not.
  • The last syllable may end in one of the vowels a, e, o, in a diphthong, in a single consonant, or in a pronounceable cluster of two consonants.
  • In words of two or more syllables, the dynamic stress falls on the penultimate syllable.

Implementation

The main function is RandSignature(), which generates a random, pronounceable, three-part name such as Fraunems Zhadum Thegonvonjask or Aimave Fashes Amzequev. The function selects one of four syllable patterns (2, 2, 4, or 2, 3, 3, or 3, 1, 4, or 3, 2, 3) with equal probability, and calls RandName() to generate each of the three components.
Function RandSignature() As String

  Dim NumSyllables As Variant
  Dim i As Long

  Select Case RandChoice(4)
  Case 0: NumSyllables = Array(2, 2, 4)
  Case 1: NumSyllables = Array(2, 3, 3)
  Case 2: NumSyllables = Array(3, 1, 4)
  Case 3: NumSyllables = Array(3, 2, 3)
  End Select

  For i = LBound(NumSyllables) To UBound(NumSyllables)
    RandSignature = RandSignature _
      & IIf(i > LBound(NumSyllables), " ", "") & RandName(NumSyllables(i))
  Next i

End Function
The bulk of the implementation is in the function RandName(NumSyllables), which generates one pronounceable name component and implements the phonotactic rules. It iterates with i from 1 to NumSyllables, accumulating the current syllable in the string s. After NumSyllables syllables have been generated and concatenated, it finalizes the name ending, and finally replaces placeholder letters with the correct spelling. Since this is a simple-minded example, all multiple choices are equiprobable. The placeholder letters and their spellings are:

Placeholder  Spelling  Notes
c            ch        The sound of ch in chill
K            ck        Only at end of word
kk           ck
q            qu
S            sh
T            th        As in English, presumably
Z            zh        The sound of s in measure or vision
W            u         Second component of a diphthong; spelled w at end of word
Y            i         Second component of a diphthong; spelled y at end of word

While generating the name, the diphthongs ai, au, ei, eu, and so on are represented by aY, aW, eY, eW and so on; that is to say, W and Y stand for the second component of a diphthong. They are replaced with u and i after the name component has been generated, unless they fall at the end of the word, in which case they are replaced with w and y. Thus, uzoYnisk will be spelled Uzoinisk, but efoW will be spelled Efow. I have the feeling that monoglot native English speakers would be more willing to accept Efow and Koxaloy than the more regular spellings Efou and Koxaloi.
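The spelling step can be tried in isolation. The following is a minimal sketch that applies the same replacements as the end of RandName to a placeholder string; the names SpellName and DemoSpellName are hypothetical and are not part of the original code.

```vba
' Hypothetical helper: converts a placeholder string, as built during
' generation, into its final spelling.
Function SpellName(ByVal Raw As String) As String

  Dim s As String
  s = Raw

  ' Single-letter placeholders for digraphs.
  s = Replace$(s, "c", "ch")
  s = Replace$(s, "K", "ck")
  s = Replace$(s, "q", "qu")
  s = Replace$(s, "S", "sh")
  s = Replace$(s, "T", "th")
  s = Replace$(s, "Z", "zh")

  ' Second components of diphthongs.
  s = Replace$(s, "W", "u")
  s = Replace$(s, "Y", "i")

  ' Spelling adjustments: doubled k, and n before b or p.
  s = Replace$(s, "kk", "ck")
  s = Replace$(s, "nb", "mb")
  s = Replace$(s, "np", "mp")

  ' A word-final u or i is spelled w or y.
  If Right$(s, 1) = "u" Then
    s = Left$(s, Len(s) - 1) & "w"
  ElseIf Right$(s, 1) = "i" Then
    s = Left$(s, Len(s) - 1) & "y"
  End If

  ' Capitalize the first letter.
  SpellName = UCase$(Left$(s, 1)) & Mid$(s, 2)

End Function

Sub DemoSpellName()
  Debug.Print SpellName("uzoYnisk")  ' prints Uzoinisk
  Debug.Print SpellName("efoW")      ' prints Efow
End Sub
```

The order matters: c is expanded to ch before K introduces a new c, so the ck produced from K is not expanded a second time.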
  • The first syllable may begin with a vowel, with probability 1/5. The second and subsequent syllables will always begin with a consonant.
  • When generating an initial consonant, avoid the combinations iy, mh, mm, my, nh, nn, ny, uw, Ww and Yy, either because they are hard to pronounce (iy, uw, Ww and Yy), or because they would be misleading (mh, nh), or because native English speakers wouldn't know how to make them (my, ny), or because they would be generated later by the letter-doubling logic.
  • A single consonant between vowels has a 1/3 chance of being doubled, unless it is one of chjqSTwxyZ, which cannot be doubled.
  • After one of bfgkpsSvz there is a 1/3 chance for an l or an r, and after one of dtT there is a 1/6 chance for an r, giving a combination known as muta cum liquida, which can be pronounced as a cluster at the beginning of the syllable.
  • The syllable onset is followed by a vowel, chosen so that the combinations yi, quu and wu do not occur. Once per word, the vowel may become a diphthong; otherwise, an m or an n may be added.
  • At the end of the word, syllables may end in a consonant or a cluster. If the final syllable vowel is an i or a u, then a final consonant or cluster is always added.
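The avoid-list test from the second rule above can be isolated as a small function. This is a minimal sketch, assuming that both arguments are single placeholder letters; the names IsAvoidedPair and DemoIsAvoidedPair are hypothetical. It looks up the two-letter pair in a space-separated list with InStr. Because InStr also reports a match for a single letter, and returns 1 for an empty search string, the sketch only looks up genuine two-letter pairs.

```vba
' Hypothetical helper: True if the consonant Cons may not follow the
' letter Prev. Both arguments are expected to be single letters.
Function IsAvoidedPair(ByVal Prev As String, ByVal Cons As String) As Boolean

  Const Avoided As String = "iy mh mm my nh nn ny uw Ww Yy"

  ' Only a genuine two-letter pair is looked up; anything else is
  ' reported as not avoided (the function result defaults to False).
  If Len(Prev) = 1 And Len(Cons) = 1 Then
    IsAvoidedPair = InStr(1, Avoided, Prev & Cons, vbBinaryCompare) > 0
  End If

End Function

Sub DemoIsAvoidedPair()
  Debug.Print IsAvoidedPair("m", "h")  ' True: mh is on the list
  Debug.Print IsAvoidedPair("a", "h")  ' False: ah is allowed
  Debug.Print IsAvoidedPair("", "h")   ' False: nothing precedes h
  Debug.Print IsAvoidedPair("w", "w")  ' False: only Ww is on the list
End Sub
```

The spaces in the list keep a pair from matching across two neighbouring entries, and the explicit vbBinaryCompare keeps W and w distinct regardless of any Option Compare setting.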
Function RandName(ByVal NumSyllables As Long) As String

  Dim i As Long
  Dim s As String
  Dim DiphthongUsed As Boolean '= False

  For i = 1 To NumSyllables
    s = ""
    If i > 1 Or RandChance(4 / 5) Then
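      Do
        s = Mid$("bcdfghjklmnpqrsStTvwxyzZ", 1 + RandChoice(24), 1)
      ' At the start of the word there is no previous letter to test; without
      ' the length check a lone h, m, n, w or y would match inside the list
      ' and could never begin a name.
      Loop Until Len(RandName) = 0 Or _
        InStr("iy mh mm my nh nn ny uw Ww Yy", Right$(RandName, 1) & s) = 0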
      If Len(RandName) > 0 And InStr("aeiou", Right$(RandName, 1)) > 0 _
        And InStr("chjqSTwxyZ", s) = 0 And RandChance(1 / 3) _
      Then
        s = s & s
      End If
    End If
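    ' Len(s) = 1 excludes doubled consonants and an empty onset; InStr
    ' returns 1 when the string searched for is empty.
    If Len(s) = 1 And InStr("bfgkpsSvz", s) > 0 And RandChance(1 / 3) Then
      s = s & Mid$("lr", 1 + RandChoice(2), 1)
    ElseIf Len(s) = 1 And InStr("dtT", s) > 0 And RandChance(1 / 6) Then
      s = s & "r"
    End If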
    If Right$(s, 1) = "y" Then
      s = s & Mid$("aeou", 1 + RandChoice(4), 1)
    ElseIf Right$(s, 1) = "q" Or Right$(s, 1) = "w" Then
      s = s & Mid$("aeio", 1 + RandChoice(4), 1)
    Else
      s = s & Mid$("aeiou", 1 + RandChoice(5), 1)
    End If
    If DiphthongUsed Then
      If RandChance(1 / 4) Then s = s & Mid$("mn", 1 + RandChoice(2), 1)
    Else
      If InStr("aeo", Right$(s, 1)) > 0 And RandChance(1 / 3) Then
        s = s & Mid$("mnWY", 1 + RandChoice(4), 1)
      ElseIf InStr("i", Right$(s, 1)) > 0 And RandChance(1 / 3) Then
        s = s & Mid$("mn", 1 + RandChoice(2), 1)
      ElseIf InStr("u", Right$(s, 1)) > 0 And RandChance(1 / 3) Then
        s = s & Mid$("mnY", 1 + RandChoice(3), 1)
      End If
      DiphthongUsed = InStr("WY", Right$(s, 1)) > 0
    End If
    RandName = RandName & s
  Next i

  If Len(RandName) < 2 Or InStr("iu", Right$(RandName, 1)) > 0 _
    Or (InStr("aeo", Right$(RandName, 1)) > 0 _
    And RandChance(IIf(Right$(RandName, 1) = "e", 2 / 3, 1 / 3))) _
  Then
    RandName = RandName & Mid$("dfgKlmnprsStTvxz", 1 + RandChoice(16), 1)
  End If

  ' The argument of RandChoice must equal the length of the string to pick from.
  If InStr("n", Right$(RandName, 1)) > 0 And RandChance(1 / 2) Then
    RandName = RandName & Mid$("dkts", 1 + RandChoice(4), 1)
  ElseIf InStr("lr", Right$(RandName, 1)) > 0 And RandChance(1 / 2) Then
    RandName = RandName & Mid$("kmnpts", 1 + RandChoice(6), 1)
  ElseIf InStr("f", Right$(RandName, 1)) > 0 And RandChance(1 / 2) Then
    RandName = RandName & Mid$("kts", 1 + RandChoice(3), 1)
  ElseIf InStr("s", Right$(RandName, 1)) > 0 And RandChance(1 / 2) Then
    RandName = RandName & Mid$("kt", 1 + RandChoice(2), 1)
  ElseIf InStr("km", Right$(RandName, 1)) > 0 And RandChance(1 / 2) Then
    RandName = RandName & Mid$("ts", 1 + RandChoice(2), 1)
  ElseIf InStr("p", Right$(RandName, 1)) > 0 And RandChance(1 / 2) Then
    RandName = RandName & Mid$("s", 1 + RandChoice(1), 1)
  End If

  RandName = Replace$(RandName, "c", "ch")
  RandName = Replace$(RandName, "K", "ck")
  RandName = Replace$(RandName, "q", "qu")
  RandName = Replace$(RandName, "S", "sh")
  RandName = Replace$(RandName, "T", "th")
  RandName = Replace$(RandName, "Z", "zh")
  RandName = Replace$(RandName, "W", "u")
  RandName = Replace$(RandName, "Y", "i")
  RandName = Replace$(RandName, "kk", "ck")
  RandName = Replace$(RandName, "nb", "mb")
  RandName = Replace$(RandName, "np", "mp")

  If Right$(RandName, 1) = "u" Then
    RandName = Left$(RandName, Len(RandName) - 1) & "w"
  ElseIf Right$(RandName, 1) = "i" Then
    RandName = Left$(RandName, Len(RandName) - 1) & "y"
  End If

  RandName = UCase$(Left$(RandName, 1)) & Mid$(RandName, 2)

End Function
The function RandChance(Chance) returns True with a probability equal to Chance, a number between 0 and 1. RandChance(0) is of course always False, and RandChance(1) is always True.
Function RandChance(ByVal Chance As Double) As Boolean

  RandChance = Rand < Chance

End Function
RandChoice(NumChoices) returns an integer between 0 and NumChoices − 1, inclusive; if a number between 1 and NumChoices, inclusive, is needed, then 1 must be added to the result.
Function RandChoice(ByVal NumChoices As Long) As Long

  RandChoice = Int(IIf(NumChoices > 1, NumChoices, 1) * Rand)

End Function
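The idiom Mid$(Letters, 1 + RandChoice(n), 1), used throughout RandName, requires n to equal the length of the string. A possible way to keep the two in step is sketched below; the helper RandLetter is hypothetical, and it assumes that it lives in the same module as the RandChoice function above.

```vba
' Hypothetical helper, assumed to be in the same module as RandChoice:
' returns one letter of Letters, each with equal probability.
Function RandLetter(ByVal Letters As String) As String

  ' The count is derived from Len(Letters), so it cannot drift away
  ' from the string. An empty string yields an empty result.
  If Len(Letters) > 0 Then
    RandLetter = Mid$(Letters, 1 + RandChoice(Len(Letters)), 1)
  End If

End Function

Sub DemoRandLetter()
  Dim i As Long
  ' Prints ten randomly chosen vowels on one line.
  For i = 1 To 10
    Debug.Print RandLetter("aeiou");
  Next i
  Debug.Print
End Sub
```

With such a helper, a call like Mid$("lr", 1 + RandChoice(2), 1) could be written as RandLetter("lr").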
The multiplicative congruential random number generator with modulus 2147483647 and multiplier 69621 is taken from Stephen K. Park and Keith W. Miller, "Random number generators: good ones are hard to find", in Communications of the ACM, volume 31 (1988), number 10, pp. 1192-1201. The method of computing the next number in the sequence

x = ax mod m

is due to Linus Schrage, first described in his paper "A more portable Fortran random number generator", in ACM Transactions on Mathematical Software, vol.5 (1979), pp.132-138.
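The identity behind this method can be worked out as follows, with a = 69621 and m = 2147483647. This is a sketch of the standard argument, not a quotation from either paper; q and r correspond to the constants Quotient and Remainder in the function below.

```latex
\begin{align*}
m &= aq + r, \qquad q = \lfloor m/a \rfloor = 30845, \quad r = m \bmod a = 23902,\\
x &= q\,\lfloor x/q \rfloor + (x \bmod q),\\
ax &= aq\,\lfloor x/q \rfloor + a\,(x \bmod q)
    = (m - r)\,\lfloor x/q \rfloor + a\,(x \bmod q)\\
   &= m\,\lfloor x/q \rfloor + \bigl[\,a\,(x \bmod q) - r\,\lfloor x/q \rfloor\,\bigr],\\
ax \bmod m &=
  \begin{cases}
    t, & t \ge 0,\\
    t + m, & t < 0,
  \end{cases}
  \qquad t = a\,(x \bmod q) - r\,\lfloor x/q \rfloor .
\end{align*}
```

For 0 < x < m, the first product is at most a(q - 1) < m, and, because r < q, the second is at most rx/q < x < m. Both products therefore fit in a 32-bit Long, and t lies strictly between -m and m, so a single conditional addition of m is enough.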
Function Rand(Optional ByVal Seed = -1) As Double

  Const Modulus As Long = 2147483647
  Const Multiplier As Long = 69621
  Const Quotient As Long = Modulus \ Multiplier '= 30845
  Const Remainder As Long = Modulus Mod Multiplier '= 23902

  Static RandValue As Long '= 0

  Dim i As Long

  If RandValue <= 0 Or Seed >= 0 Then
    Seed = Seed Mod Modulus
    If Seed <= 0 Then Seed = _
      (1 + (((CLng(Date) Mod 65536) * 25173 + 13849) Mod 65536) \ 338) _
      * (1 + Int(CDbl(Timer) * 128#))
    RandValue = Seed
    For i = 1 To 100
      Rand
    Next i
  End If

  RandValue = _
    Multiplier * (RandValue Mod Quotient) - Remainder * (RandValue \ Quotient)
  If RandValue < 0 Then RandValue = RandValue + Modulus

  Rand = RandValue / Modulus

End Function
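Because Rand reseeds itself whenever it is called with a non-negative Seed, a sequence of names can be made repeatable by passing an explicit seed first. The following is a minimal sketch, assuming that Rand and RandSignature are in the same module; the procedure name and the seed value 12345 are arbitrary choices made for illustration.

```vba
' Assumes Rand and RandSignature from this post are in the same module.
Sub DemoRepeatableSignature()

  Dim First As String
  Dim Second As String

  Rand 12345                ' reseed with an explicit positive seed
  First = RandSignature()

  Rand 12345                ' same seed again, so the sequence restarts
  Second = RandSignature()

  Debug.Print First
  Debug.Print Second
  Debug.Assert First = Second   ' both names come from the same sequence

End Sub
```

A seed of 0, or any multiple of the modulus, does not give a repeatable sequence: after Seed Mod Modulus the value is not positive, so Rand falls back to the date-and-time seed.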
The function RandSignature() can be used to insert one or more generated names into the current document or worksheet, or simply, as in the following example, to display the generated names in the immediate window.
for i = 1 to 70: _
s = RandSignature: _
? s & iif(i mod 2 = 1, space$(40 - len(s)), vbcrlf);: _
next i
Javloyaf Quimt Quoudeshachiv            Pouyun Quekajo Chinchinjath
Quepruv Xouthowo Xijonwa                Chimnem Jatoirip Ateibro
Keyappuz Rusha Thiheique                Trundra Jarrilips Javraxo
Rethrechuy Chum Lojamthemom             Dafro Lithixex Gopuprin
Xojinemt Boy Bomboizlenwax              Pittoz Linim Zijenlouhuns
Porriv Baxoumo Bonthovlist              Veddarud Xike Pumdebruck
Xima Juyuthey Begrowev                  Joushaggurm Graw Choudichawilk
Pemwans Quannit Ximpralouzond           Xuichunchug Rathuy Thohutham
Keshems Lowamvumt Powemgliv             Remthraquiv Quo Gofettanzhin
Zrechumchip Dey Azlaboshruy             Rumsajox Rap Rizhaukazzon
Gruimunt Tantuy Rashimwerra             Thullombo Zuz Seitazixenk
Zlidaw Quonthavvuck Fliklorrog          Jublinzley Thut Kauzheddixa
Quivimt Thinramwon Fauthrusud           Dossuy Lamesh Zijaibraque
Dadde Ragoppufk Zirroufrask             Bonsaw Reshank Itheuframif
Loraquick Eck Rasibojaz                 Quoizretre Thizha Fotushumt
Lorrizhiln Leshra Puckimxish            Thozappa Joxish Krivvimnuth
Zaxeupo Vrums Promrihiblof              Gladith Gimshrux Ithabboslin
Tharav Frinfem Kuishonsatuv             Linrorruth But Xezhuzhechav
Droizhothash Rempo Bremthinga           Xezzuxo Bijeg Puiquimblask
Jezhis Farirt Brambubbemmut             Zhiquozhuy Xumt Pekrumdrunfush
Jibbouthreck Zhimbams Jukixond          Sikuflips Xith Bunvuchokrem
Thruvid Sohey Imnunmimjunt              Jinolled Fod Rudoxizhow
Ruzhovle Bamey Zennimnirn               Gaituged Zhur Grichuddeuchind
Uchigips Vlogat Rozhushuy               Kucho Quepray Lollikamel
Zhaquid Piquew Kruikrofockand           Zhoixo Xejimrin Augitro
Iddiv Johimgeck Jorekas                 Fichul Quaikladdix Regurat
Givimplim Quew Sullujusuth              Xakril Tachishla Rinmoment
Slezha Jowo Luisichovemt                Leithog Loija Vossijassin
Chumpraz Xerew Rozhavunim               Fiha Biquim Dunvlakacho
Sonslo Lissil Lippaushonduz             Vezzuth Ponequir Takoikra
Pemmuck Lamdis Imminjoplor              Shrasunaz Ow Bupimwawag
Bruson Krusloshen Thunwimbups           Rinchid Kuckuixux Oigotre
Vimzhuithug Xiquid Shlujira             Truilanthif Kil Sritenrinar
Zhikluth Jichesh Aiquigripish           Glenquimla Jixem Ranrizhey
Rinvlito Kow Frindosufums               Fumbush Braubrix Trontawija
