-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
57 lines (45 loc) · 2.46 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
# ntih (Naming Things Is Hard)
# Usage
> ntih [OPTIONS] [[MIN:]MAX] [COUNT] < EXAMPLE TEXT > WORDS
Build a Markov chain of character pairs by sampling words from input text.
Using that chain, generate COUNT words with length between MIN and MAX characters.
Given MAX and not MIN generate words of exactly that length.
OPTIONS
-f CHAIN_FILE : File where the chain will be stored.
-i INPUT_FILE : Input file from which to read sample words.
-r : Force regeneration of CHAIN_FILE.
ENVIRONMENT
$NTIHFILE : Default CHAIN_FILE from which to load a chain.
$NTIHDICT : Default INPUT_FILE from which to generate a chain.
# Example
$> ntih 1:6 100 | sort | column -c 80
aemeez bsiono dumung gshopa isluab lhedhe natych rxisty unkyup wyl
aigibr ceubla eapjs hé jamé logood on scorva unshev xem
ångfuc cimius écigae hodt jaryde lsserk peogta shdovo urdogb ythhip
ångmyr climoc ectmak hoke jibruc lyevac pique sveral utlixs yve
ångtif craesi emliel htnava jociz mb plyplu tcafon vim zacqué
ångtod dabune émochp hyanys jutezb mirhog punqué tlosua vlopyr zerplo
atulkw ddysmi fdregy hysfud khmikh mnseyb qadley trsvex vutshn zhizoi
awma débuzs ferbre ibbuys knodki msscsl qommoh twipus vuttge ziccav
azurlu dhuthp filwon icucli kru nafinb r twogga whorwa zircoo
blia dudotp fucuff irrhot lathit nagluk rwopré ulyrud wonien zs
# What
Here's a simple line encoding for a markov chain of characters in a word:
<char previous><char current> <char next><integer weight> ...
Entries looks like this:
^l a38 e24 i23 l1 o18 u9 y2
an $49 a11 c27 d48 e12 f1 g22 i19 k7 l1 n9 o4 q1 s54 t53 u3 w1 x1
nd $12 a13 b5 e42 f1 h2 i21 l12 m2 o7 p1 r5 s17 t1 u4
fr $1 a10 e15 i2 o9 u2 y1
db a3 i1 l1 u2
Each rule gives possible next characters given the previous two characters.
There are two special characters '^' and '$' denoting start and end points.
To build a word a starting character is selected randomly, then for each
step a next character is selected randomly weighted by the rules.
Function ntih_render reads a chain of this format then uses it to generate words.
Function ntih_build generates a chain in this format by reading text input.
Function ntih wraps the build and render functions in a nice interface.
# Links
GNU - https://www.gnu.org/software/
Bash - https://tiswww.case.edu/php/chet/bash/bashtop.html
Greg's Wiki - http://mywiki.wooledge.org/