Subversion Repositories blog

Rev

Rev 48 | Rev 51 | Go to most recent revision | Show entire file | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed

Rev 48 Rev 49
Line 5... Line 5...
5
computer. Since I hate word puzzles quite a bit, I'd taken similar steps in the
5
computer. Since I hate word puzzles quite a bit, I'd taken similar steps in the
6
past. For example, [this recent challenge][4]: 
6
past. For example, [this recent challenge][4]: 
7
7
8
Fill in the 4x4 square crossword with common uncapitalized words, using no 
8
Fill in the 4x4 square crossword with common uncapitalized words, using no 
9
letter more than once:
9
letter more than once:
10
NAGS
-
 
11
E
10
12
W
11
<table>
-
 
12
   <tr><td>N</td><td>A</td><td>G</td><td>S</td></tr>
-
 
13
   <tr><td>E</td><td></td><td></td><td></td></tr>
-
 
14
   <tr><td>W</td><td></td><td></td><td></td></tr>
-
 
15
   <tr><td>T</td><td></td><td></td><td></td></tr>
13
T
16
</table>
14
17
15
My secret weapon? [GNU coreutils][1]. Regex are a great tool, but I rarely have
18
My secret weapon? [GNU coreutils][1]. Regex are a great tool, but I rarely have
16
to use some of the more obscure features, which hurts on the occasions where
19
to use some of the more obscure features, which hurts on the occasions where
17
they're called for. So the NPR puzzle can be a good way to practice and learn!
20
they're called for. So the NPR puzzle can be a good way to practice and learn!
18
21
Line 20... Line 23...
20
`/usr/share/dict/words`. The format of this file is one word per line. Every 
23
`/usr/share/dict/words`. The format of this file is one word per line. Every 
21
form of a word, including contractions and possessives, gets its own line. We 
24
form of a word, including contractions and possessives, gets its own line. We 
22
use `|` (pipe) to chain the output of one command as the input of the next. 
25
use `|` (pipe) to chain the output of one command as the input of the next. 
23
Cat simply outputs a file, and wc -l counts the lines in it.
26
Cat simply outputs a file, and wc -l counts the lines in it.
24
27
25
    laptop:~$ cat /usr/share/dict/words | wc -l
28
    `laptop:~$ cat /usr/share/dict/words | wc -l`
-
 
29
26
    98569
30
    `98569`
27
31
28
2. I assume no apostrophes are in the puzzle. Grep reads input and outputs only
32
2. I assume no apostrophes are in the puzzle. Grep reads input and outputs only
29
those lines that match a regular expression (regex).  Using the -v option to 
33
those lines that match a regular expression (regex).  Using the -v option to 
30
grep changes it to output only lines that don't match our pattern.
34
grep changes it to output only lines that don't match our pattern.
31
35
32
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | wc -l
36
    `laptop:~$ cat /usr/share/dict/words | grep -v "'" | wc -l`
-
 
37
33
    74059
38
    `74059`
34
39
35
3. That's a lot of words to fuddle around with, so lets winnow this down. 
40
3. That's a lot of words to fuddle around with, so lets winnow this down. 
36
Firstly, we only care about 4 letter words. We can use grep to give us only 
41
Firstly, we only care about 4 letter words. We can use grep to give us only 
37
these words, using the regular expression "^....$". Caret (^) represents 
42
these words, using the regular expression "^....$". Caret (^) represents 
38
the start of a line, and $ represents the end of one. Each period is a single 
43
the start of a line, and $ represents the end of one. Each period is a single 
39
free choice character for grep, matching exactly one character in the input. 
44
free choice character for grep, matching exactly one character in the input. 
40
45
41
46
42
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^....$" | wc -l
47
    `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^....$" | wc -l`
-
 
48
43
    3174
49
    `3174`
44
50
45
4. Having cut the search space by 96 percent, we now turn to the clues for... 
51
4. Having cut the search space by 96 percent, we now turn to the clues for... 
46
clues. Fortunately, nags and newts define which letters every word can start 
52
clues. Fortunately, nags and newts define which letters every word can start 
47
with. Grep treats symbols within [] as alternatives, meaning any any one 
53
with. Grep treats symbols within [] as alternatives, meaning any any one 
48
symbol within can match the input. Below alters the regex from 3 to only match 
54
symbol within can match the input. Below alters the regex from 3 to only match 
49
words starting with a, g, s, e, w or t.
55
words starting with a, g, s, e, w or t.
50
56
51
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt]...$" | wc -l
57
    `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt]...$" | wc -l`
-
 
58
52
    777
59
    `777`
53
60
54
5. Rules say no two letters repeat in the puzzle, so we'll exclude all words 
61
5. Rules say no two letters repeat in the puzzle, so we'll exclude all words 
55
with the letters from nags and newts anywhere other than the first letter. As 
62
with the letters from nags and newts anywhere other than the first letter. As 
56
an alternative to -v, we can use carets inside brackets to indicate "not". 
63
an alternative to -v, we can use carets inside brackets to indicate "not". 
57
64
58
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | wc -l
65
    `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | wc -l`
-
 
66
59
    131
67
    `131`
60
68
61
6. Next, we can rule out words with repeated letters, like solo and wool. To do
69
6. Next, we can rule out words with repeated letters, like solo and wool. To do
62
this quickly, we'll need to use [backreferences][2]. Backreferences can be 
70
this quickly, we'll need to use [backreferences][2]. Backreferences can be 
63
[slow][3], but since our dataset is so tiny, it will be fine to add it to the
71
[slow][3], but since our dataset is so tiny, it will be fine to add it to the
64
end of the pipeline.
72
end of the pipeline.
65
73
66
    cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)" | wc -l
74
    `cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)" | wc -l`
-
 
75
67
    106
76
    `106`
68
77
69
7. Starting to get close! From here on out, this plays a lot like sudoku. Our 
78
7. Starting to get close! From here on out, this plays a lot like sudoku. Our 
70
goal is now to start constructing regex for each word. We replace the leading 
79
goal is now to start constructing regex for each word. We replace the leading 
71
alternative for a specific letter. To start off, we've only got 7 options for 2
80
alternative for a specific letter. To start off, we've only got 7 options for 2
72
across:
81
across:
73
82
74
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^e[^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)"
83
    `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^e[^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)"`
-
 
84
75
    echo
85
    `echo`
-
 
86
76
    ecru
87
    `ecru`
-
 
88
77
    emir
89
    `emir`
-
 
90
78
    epic
91
    `epic`
-
 
92
79
    euro
93
    `euro`
-
 
94
80
    evil
95
    `evil`
-
 
96
81
    expo
97
    `expo`
-
 
98
-
 
99
We now write a different regex without negations to get the same list.
82
100
83
We now write a different regex without negations to get the same list. 
-
 
84
    laptop:~$ cat /usr/share/dict/words | grep "^e[cmpuvx][hipr][cloru]$" | grep -vE "([a-z]).*(\1)" | wc -l
101
    laptop:~$ cat /usr/share/dict/words | grep "^e[cmpuvx][hipr][cloru]$" | grep -vE "([a-z]).*(\1)" | wc -l
-
 
102
85
    7
103
    7
86
104
87
Now we build a similar regex for 2 down. Adding in what we know about it's 
105
Now we build a similar regex for 2 down. Adding in what we know about it's 
88
intersection with 2 across (cmpuvx) is the sudoku like step:
106
intersection with 2 across (cmpuvx) is the sudoku like step:
-
 
107
89
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmpuvx][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)"
108
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmpuvx][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)"
-
 
109
90
    achy
110
    achy
-
 
111
91
    acid
112
    acid
-
 
113
92
    amid
114
    amid
-
 
115
93
    amok
116
    amok
-
 
117
94
    avid
118
    avid
95
119
96
We rewrite this one as 
120
We rewrite this one as 
97
121
98
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmv][hio][dky]$" | grep -vE "([a-z]).*(\1)" | wc -l
122
    laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmv][hio][dky]$" | grep -vE "([a-z]).*(\1)" | wc -l
-
 
123
99
    5
124
    5
100
125
101
Applying the same logic to 3 down yields "^g[ir][lriu][bdlmp]$", and 4 down 
126
Applying the same logic to 3 down yields `"^g[ir][lriu][bdlmp]$"`, and 4 down 
102
yields "^s[lu][cilmoru][bdfhkmopr]$". 
127
yields `"^s[lu][cilmoru][bdfhkmopr]$"`. 
103
128
104
8. The last positions in each down regex constructs a new regex for 4 across:
129
8. The last positions in each down regex constructs a new regex for 4 across:
105
130
106
    cat /usr/share/dict/words | grep -v "'" | grep "^t[dky][bdlmp][bdfhkmopr]$" | grep -vE "([a-z]).*(\1)"
131
    `cat /usr/share/dict/words | grep -v "'" | grep "^t[dky][bdlmp][bdfhkmopr]$" | grep -vE "([a-z]).*(\1)"`
-
 
132
107
    typo
133
    `typo`
108
134
109
A unique solution to 4 across!
135
A unique solution to 4 across!
110
136
111
9. Revisiting 2 down with this new fact also yields a unique answer. I leave
137
9. Revisiting 2 down with this new fact also yields a unique answer. I leave
112
it from here as an exercise to the reader to solve the entire puzzle.
138
it from here as an exercise to the reader to solve the entire puzzle.