Rev 48 | Rev 51 | Go to most recent revision | Show entire file | Ignore whitespace | Details | Blame | Last modification | View Log | RSS feed
| Rev 48 | Rev 49 | ||
|---|---|---|---|
| Line 5... | Line 5... | ||
| 5 | computer. Since I hate word puzzles quite a bit, I'd taken similar steps in the |
5 | computer. Since I hate word puzzles quite a bit, I'd taken similar steps in the |
| 6 | past. For example, [this recent challenge][4]: |
6 | past. For example, [this recent challenge][4]: |
| 7 | 7 | ||
| 8 | Fill in the 4x4 square crossword with common uncapitalized words, using no |
8 | Fill in the 4x4 square crossword with common uncapitalized words, using no |
| 9 | letter more than once: |
9 | letter more than once: |
| 10 | NAGS |
- | |
| 11 | E |
10 | |
| 12 | W |
11 | <table> |
| - | 12 | <tr><td>N</td><td>A</td><td>G</td><td>S</td></tr> |
|
| - | 13 | <tr><td>E</td><td></td><td></td><td></td></tr> |
|
| - | 14 | <tr><td>W</td><td></td><td></td><td></td></tr> |
|
| - | 15 | <tr><td>T</td><td></td><td></td><td></td></tr> |
|
| 13 | T |
16 | </table> |
| 14 | 17 | ||
| 15 | My secret weapon? [GNU coreutils][1]. Regex are a great tool, but I rarely have |
18 | My secret weapon? [GNU coreutils][1]. Regex are a great tool, but I rarely have |
| 16 | to use some of the more obscure features, which hurts on the occasions where |
19 | to use some of the more obscure features, which hurts on the occasions where |
| 17 | they're called for. So the NPR puzzle can be a good way to practice and learn! |
20 | they're called for. So the NPR puzzle can be a good way to practice and learn! |
| 18 | 21 | ||
| Line 20... | Line 23... | ||
| 20 | `/usr/share/dict/words`. The format of this file is one word per line. Every |
23 | `/usr/share/dict/words`. The format of this file is one word per line. Every |
| 21 | form of a word, including contractions and possessives, gets its own line. We |
24 | form of a word, including contractions and possessives, gets its own line. We |
| 22 | use `|` (pipe) to chain the output of one command as the input of the next. |
25 | use `|` (pipe) to chain the output of one command as the input of the next. |
| 23 | Cat simply outputs a file, and wc -l counts the lines in it. |
26 | Cat simply outputs a file, and wc -l counts the lines in it. |
| 24 | 27 | ||
| 25 | laptop:~$ cat /usr/share/dict/words | wc -l |
28 | `laptop:~$ cat /usr/share/dict/words | wc -l` |
| - | 29 | ||
| 26 | 98569 |
30 | `98569` |
| 27 | 31 | ||
| 28 | 2. I assume no apostrophes are in the puzzle. Grep reads input and outputs only |
32 | 2. I assume no apostrophes are in the puzzle. Grep reads input and outputs only |
| 29 | those lines that match a regular expression (regex). Using the -v option to |
33 | those lines that match a regular expression (regex). Using the -v option to |
| 30 | grep changes it to output only lines that don't match our pattern. |
34 | grep changes it to output only lines that don't match our pattern. |
| 31 | 35 | ||
| 32 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | wc -l |
36 | `laptop:~$ cat /usr/share/dict/words | grep -v "'" | wc -l` |
| - | 37 | ||
| 33 | 74059 |
38 | `74059` |
| 34 | 39 | ||
| 35 | 3. That's a lot of words to fuddle around with, so lets winnow this down. |
40 | 3. That's a lot of words to fuddle around with, so lets winnow this down. |
| 36 | Firstly, we only care about 4 letter words. We can use grep to give us only |
41 | Firstly, we only care about 4 letter words. We can use grep to give us only |
| 37 | these words, using the regular expression "^....$". Caret (^) represents |
42 | these words, using the regular expression "^....$". Caret (^) represents |
| 38 | the start of a line, and $ represents the end of one. Each period is a single |
43 | the start of a line, and $ represents the end of one. Each period is a single |
| 39 | free choice character for grep, matching exactly one character in the input. |
44 | free choice character for grep, matching exactly one character in the input. |
| 40 | 45 | ||
| 41 | 46 | ||
| 42 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^....$" | wc -l |
47 | `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^....$" | wc -l` |
| - | 48 | ||
| 43 | 3174 |
49 | `3174` |
| 44 | 50 | ||
| 45 | 4. Having cut the search space by 96 percent, we now turn to the clues for... |
51 | 4. Having cut the search space by 96 percent, we now turn to the clues for... |
| 46 | clues. Fortunately, nags and newts define which letters every word can start |
52 | clues. Fortunately, nags and newts define which letters every word can start |
| 47 | with. Grep treats symbols within [] as alternatives, meaning any any one |
53 | with. Grep treats symbols within [] as alternatives, meaning any any one |
| 48 | symbol within can match the input. Below alters the regex from 3 to only match |
54 | symbol within can match the input. Below alters the regex from 3 to only match |
| 49 | words starting with a, g, s, e, w or t. |
55 | words starting with a, g, s, e, w or t. |
| 50 | 56 | ||
| 51 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt]...$" | wc -l |
57 | `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt]...$" | wc -l` |
| - | 58 | ||
| 52 | 777 |
59 | `777` |
| 53 | 60 | ||
| 54 | 5. Rules say no two letters repeat in the puzzle, so we'll exclude all words |
61 | 5. Rules say no two letters repeat in the puzzle, so we'll exclude all words |
| 55 | with the letters from nags and newts anywhere other than the first letter. As |
62 | with the letters from nags and newts anywhere other than the first letter. As |
| 56 | an alternative to -v, we can use carets inside brackets to indicate "not". |
63 | an alternative to -v, we can use carets inside brackets to indicate "not". |
| 57 | 64 | ||
| 58 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | wc -l |
65 | `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | wc -l` |
| - | 66 | ||
| 59 | 131 |
67 | `131` |
| 60 | 68 | ||
| 61 | 6. Next, we can rule out words with repeated letters, like solo and wool. To do |
69 | 6. Next, we can rule out words with repeated letters, like solo and wool. To do |
| 62 | this quickly, we'll need to use [backreferences][2]. Backreferences can be |
70 | this quickly, we'll need to use [backreferences][2]. Backreferences can be |
| 63 | [slow][3], but since our dataset is so tiny, it will be fine to add it to the |
71 | [slow][3], but since our dataset is so tiny, it will be fine to add it to the |
| 64 | end of the pipeline. |
72 | end of the pipeline. |
| 65 | 73 | ||
| 66 | cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)" | wc -l |
74 | `cat /usr/share/dict/words | grep -v "'" | grep "^[agsewt][^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)" | wc -l` |
| - | 75 | ||
| 67 | 106 |
76 | `106` |
| 68 | 77 | ||
| 69 | 7. Starting to get close! From here on out, this plays a lot like sudoku. Our |
78 | 7. Starting to get close! From here on out, this plays a lot like sudoku. Our |
| 70 | goal is now to start constructing regex for each word. We replace the leading |
79 | goal is now to start constructing regex for each word. We replace the leading |
| 71 | alternative for a specific letter. To start off, we've only got 7 options for 2 |
80 | alternative for a specific letter. To start off, we've only got 7 options for 2 |
| 72 | across: |
81 | across: |
| 73 | 82 | ||
| 74 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^e[^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)" |
83 | `laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^e[^nagsewt][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)"` |
| - | 84 | ||
| 75 | echo |
85 | `echo` |
| - | 86 | ||
| 76 | ecru |
87 | `ecru` |
| - | 88 | ||
| 77 | emir |
89 | `emir` |
| - | 90 | ||
| 78 | epic |
91 | `epic` |
| - | 92 | ||
| 79 | euro |
93 | `euro` |
| - | 94 | ||
| 80 | evil |
95 | `evil` |
| - | 96 | ||
| 81 | expo |
97 | `expo` |
| - | 98 | ||
| - | 99 | We now write a different regex without negations to get the same list. |
|
| 82 | 100 | ||
| 83 | We now write a different regex without negations to get the same list. |
- | |
| 84 | laptop:~$ cat /usr/share/dict/words | grep "^e[cmpuvx][hipr][cloru]$" | grep -vE "([a-z]).*(\1)" | wc -l |
101 | laptop:~$ cat /usr/share/dict/words | grep "^e[cmpuvx][hipr][cloru]$" | grep -vE "([a-z]).*(\1)" | wc -l |
| - | 102 | ||
| 85 | 7 |
103 | 7 |
| 86 | 104 | ||
| 87 | Now we build a similar regex for 2 down. Adding in what we know about it's |
105 | Now we build a similar regex for 2 down. Adding in what we know about it's |
| 88 | intersection with 2 across (cmpuvx) is the sudoku like step: |
106 | intersection with 2 across (cmpuvx) is the sudoku like step: |
| - | 107 | ||
| 89 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmpuvx][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)" |
108 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmpuvx][^nagsewt][^nagsewt]$" | grep -vE "([a-z]).*(\1)" |
| - | 109 | ||
| 90 | achy |
110 | achy |
| - | 111 | ||
| 91 | acid |
112 | acid |
| - | 113 | ||
| 92 | amid |
114 | amid |
| - | 115 | ||
| 93 | amok |
116 | amok |
| - | 117 | ||
| 94 | avid |
118 | avid |
| 95 | 119 | ||
| 96 | We rewrite this one as |
120 | We rewrite this one as |
| 97 | 121 | ||
| 98 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmv][hio][dky]$" | grep -vE "([a-z]).*(\1)" | wc -l |
122 | laptop:~$ cat /usr/share/dict/words | grep -v "'" | grep "^a[cmv][hio][dky]$" | grep -vE "([a-z]).*(\1)" | wc -l |
| - | 123 | ||
| 99 | 5 |
124 | 5 |
| 100 | 125 | ||
| 101 | Applying the same logic to 3 down yields "^g[ir][lriu][bdlmp]$", and 4 down |
126 | Applying the same logic to 3 down yields `"^g[ir][lriu][bdlmp]$"`, and 4 down |
| 102 | yields "^s[lu][cilmoru][bdfhkmopr]$". |
127 | yields `"^s[lu][cilmoru][bdfhkmopr]$"`. |
| 103 | 128 | ||
| 104 | 8. The last positions in each down regex constructs a new regex for 4 across: |
129 | 8. The last positions in each down regex constructs a new regex for 4 across: |
| 105 | 130 | ||
| 106 | cat /usr/share/dict/words | grep -v "'" | grep "^t[dky][bdlmp][bdfhkmopr]$" | grep -vE "([a-z]).*(\1)" |
131 | `cat /usr/share/dict/words | grep -v "'" | grep "^t[dky][bdlmp][bdfhkmopr]$" | grep -vE "([a-z]).*(\1)"` |
| - | 132 | ||
| 107 | typo |
133 | `typo` |
| 108 | 134 | ||
| 109 | A unique solution to 4 across! |
135 | A unique solution to 4 across! |
| 110 | 136 | ||
| 111 | 9. Revisiting 2 down with this new fact also yields a unique answer. I leave |
137 | 9. Revisiting 2 down with this new fact also yields a unique answer. I leave |
| 112 | it from here as an exercise to the reader to solve the entire puzzle. |
138 | it from here as an exercise to the reader to solve the entire puzzle. |