-
Notifications
You must be signed in to change notification settings - Fork 74
/
utm.bf
491 lines (399 loc) · 18.4 KB
/
utm.bf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
[A universal Turing machine from Yurii Rogozhin's article "Small universal
Turing machines", in Theoretical Computer Science, 168(2):215-240, 20 November
1996. Thus, a very direct proof that brainfuck is Turing-complete. For i/o
formats and so on, read below; for fuller detail, dig up the article.
If you just want a quick and complete test case, the input b1b1bbb1c1c11111d
should produce the output 1c11111.
Daniel B Cristofani (cristofdathevanetdotcom)
http://www.hevanet.com/cristofd/brainfuck/
This Turing machine achieves Turing-completeness not by simulating other
Turing machines directly, but by simulating a Turing-complete class of
tag-systems (a computational model invented by Emil Post and named after the
children's game "tag"). A tag-system transforms strings over an alphabet A =
{a[1], a[2], ... a[n], a[n+1]} as follows: a positive integer m is chosen, and
so is a function P that maps each a[i] for 1<=i<=n to a string P(a[i]) over
the alphabet A. Now:
1. if the string being transformed has fewer than m elements, the whole
process stops now.
2. m elements are removed from the beginning of the string
3. call the first element removed a[k]; if k=n+1 the whole process stops now.
4. P(a[k]) is appended to the string.
5. steps 1-5 are repeated.
The particular class of tag-systems this Turing machine simulates is the class
where m=2, the initial string has length at least 2, and all P(a[i]) where
1<=i<=n are of the form a[n]a[n]B[i] where B[i] is some string over the
alphabet A (B[i] is the empty string if and only if i=n).
The input for this brainfuck program is mildly complex, and there is no error
checking. The complexity comes from the encoding of tag-systems in terms of
Turing machine tape configurations. Note that the set of initial tape
configurations that represent tag-systems from the above class is a small,
though Turing-complete, subset of the set of possible initial tape
configurations for this Turing machine; and the following brainfuck program is
only designed to accept inputs from that subset.
-The representation of a symbol a[i] from the alphabet A is a string of 1s
which is one element longer than twice the combined length of all P(a[j])
where 1<=j<i.
-a value like P(a[i]) = a[n]a[n]a[w]a[x]...a[y]a[z] is represented as follows:
b 1
b 111...(as many as required to represent a[z] as described above) b
b 111...(to represent a[y] as described above) b
.
.
.
b 111...(to represent a[x]) b
b 111...(to represent a[w]) b
b 111...(to represent a[n]) b
b 111...(as many as for a[n] as described above, MINUS the number of 1s that
represent a[i]; and no final b)
-The function P is represented by listing all its outputs in the order
P(a[n]), P(a[n-1]), ... P(a[2]), P(a[1]). The representation of P(a[n+1])=STOP
is done for you by the brainfuck program.
-The initial string a[q]a[r]...a[s]a[t] to be transformed by the tag-system is
represented as
111...(as many as required to represent a[q] as above) c
111...(to represent a[r]) c
.
.
.
111...(to represent a[s]) c
111...(to represent a[t]; note that there is no final c.)
-The input to this program is a function P as described above, then a single
b, then the initial string to be transformed. Run all the 1s, bs, and cs
together in one line with nothing between, followed by a linefeed, or a period
if a linefeed is problematic for your implementation.
-The output format, if the program terminates, is the same as the input format
for the initial string, and represents the final state of the transformed
string, with a final a[n+1] appended due to a peculiarity of the Turing
machine's algorithm.
-An example.
A tag-system over the alphabet A = {a[1], a[2], a[3], a[4]}, where m = 2 and:
P(a[1]) = a[3]a[3]a[2]a[1]a[4]
P(a[2]) = a[3]a[3]a[1]
P(a[3]) = a[3]a[3]
P(a[4]) = STOP
meets the criteria above; and when applied to the initial string a[2]a[1]a[1]
it gives:
a[2]a[1]a[1]
a[1]a[3]a[3]a[1]
a[3]a[1]a[3]a[3]a[2]a[1]a[4]
a[3]a[3]a[2]a[1]a[4]a[3]a[3]
a[2]a[1]a[4]a[3]a[3]a[3]a[3]
a[4]a[3]a[3]a[3]a[3]a[3]a[3]a[1]
a[3]a[3]a[3]a[3]a[3]a[1]
and then it's done.
Now, the encoding:
a[1] is 1
a[2] is 11111111111
a[3] is 11111111111111111
a[4] is 111111111111111111111
P(a[1]) is b1 b111111111111111111111b b1b b11111111111b b11111111111111111b
b1111111111111111
P(a[2]) is b1 b1b b11111111111111111b b111111
P(a[3]) is b1 b11111111111111111b b
the initial string is 11111111111c1c1
and so the whole input is
b1 b11111111111111111b b
b1 b1b b11111111111111111b b111111
b1 b111111111111111111111b b1b b11111111111b b11111111111111111b
b1111111111111111
b
11111111111c1c1
which when run together for input to the program becomes
b1b11111111111111111bbb1b1bb11111111111111111bb111111b1b111111111111111111111bb1bb11111111111bb11111111111111111bb1111111111111111b11111111111c1c1
The output should be
11111111111111111 c
11111111111111111 c
11111111111111111 c
11111111111111111 c
11111111111111111 c
1 c
111111111111111111111
--that is,
11111111111111111c11111111111111111c11111111111111111c11111111111111111c11111111111111111c1c111111111111111111111
For those interested, the state table of the Turing machine itself is
10<L1 201L2 30cR1 40cL2
11<L1 210R2 311R3 410R4
1b>R1 2b>L3 3b<R4 4bcL2
1<0R1 2<>L2 3< H 4<
1>bL1 2><R2 3>bR3 4><R4
1c0R4 2cbR2 3c1R1 4cbR4
where the initial state is 1, the blank symbol is "0", tape cells are set as
per the input but with the termination code P(a[n+1])=STOP represented as a
"< b" at the left end, and the head is initially at the first "1" in the code
for the initial string. If and when the machine halts, the head is at the
leftmost cell, a "<"; the representations of the rules are intact, in a form
isomorphic to their original form (each "b" replaced with ">" and each "1"
unchanged); they are followed by a series of "1" cells, then a "c" (the
leftmost one at that time), then the cells representing the final state of the
transformed string, then a "c" and a sequence of "1" cells representing a[n+1]
as mentioned.
The minimal test case b1b1bbb1c1c11111 represents the tag-system where P(a[1])
= a[1]a[1] and P(a[2]) = STOP, applied to the string a[1]a[1]a[2]. This runs
for 518 steps of the Turing machine, exercising all 23 Turing machine
instructions, before halting with the output string a[1].
Here is the brainfuck program that implements this Turing machine. The basic
memory layout is as follows.
Each Turing machine cell is represented by a brainfuck cell, with the symbols
"0 1 b < > c" represented by 0, 1, 2, 3, 4, 5 respectively. The brainfuck
cells representing the Turing machine cells are laid out contiguously from the
beginning of the tape, except that the head of the Turing machine is
represented by a gap of three brainfuck cells, just to the left of the
brainfuck cell that represents the current Turing machine cell. At the start
of each cycle, the rightmost of these three cells holds the Turing machine
state, where states 1-4 are represented by 1-4 and "halt" (here treated as a
separate state) is represented by 0. The other two cells hold zeroes.
Now to walk through the code:
+++>++>>>+[
Set up 3 2 0 0 1, representing "< b" and the Turing machine head, in the
initial state 1; we can put this at the left end of the brainfuck array
because the Turing machine will never go left from the "<".
Next, start the main input-processing loop. Each time through this loop, we
begin at the rightmost tape cell that we have filled so far, or at the state
cell of the Turing machine head if it is to the right of all tape cells (as it
is initially). Each time through, we read a character; if it is "1" or "c", we
add the appropriate code to the right end of the tape; if it is a "b", we not
only add the code to the end of the tape but also move the head to the right
of it, since the head must follow the rightmost "b" when the Turing machine
starts; if the input character is a linefeed or other terminator, we add
nothing to the tape but position the brainfuck pointer at the zero that
follows the last filled tape cell, thus ending this loop.
>>,[>+++++<[[->]<<]<[>]>]
Read input, producing
... x 0 'i 0 ...
where "x" is a nonzero cell and "i" is the input.
While i is nonzero, run this loop:
-set up
... x 0 'i 5 0 ...
-If the input was six or greater, the [[->]<<] part will five times decrement
both i and 5; the sixth time, it will only decrement i, and move to the cell
left of i, producing
... x '0 i 0 0 ...
following which the code <[>]> will restore the pointer to i:
... x 0 'i 0 0 ...
In short, while i is at least six, the net effect of each iteration of the
loop [>+++++<[[->]<<]<[>]>] is to reduce i by 6; so repeated iterations will
change i to i mod 6; call this j. Then the loop will be run once more.
Now legitimate input characters give the values 1, 2, 3, 4 when reduced mod 6;
"1" gives 1, "b" gives 2, "c" gives 3, and linefeed and "." and "d" all give
4. On the last run through the loop [>+++++<[[->]<<]<[>]>] , the [[->]<<] part
will decrement both j and 5 repeatedly until j is zeroed, i.e. it will zero j
while reducing 5 by j, leaving
... x 0 '0 r 0 ...
where r is 5-j. The code <[>]> leaves this configuration unchanged, and the
loop exits.
>-[<<+++++>>-[<<---->>-[->]<]]
If r was 1, i mod 6 was 4, meaning a terminator. So we don't fill any tape
cell but leave
... x 0 0 '0 ...
If r was 2, i mod 6 was 3, meaning "c". So we set up
... x 5 0 '0 ...
If r was 3, i mod 6 was 2, meaning "b". In this case we set up
... x 1 0 '0 ...
and skip the innermost [->] loop, then step left leaving
... x 1 '0 0 ...
(note the pointer position; only in this case is the pointer immediately to
the right of a nonzero cell.)
If r was 4, i mod 6 was 1, meaning "1". In this case we set up
... x 1 0 '1 0 ...
and enter the inner [->] loop, resulting in
... x 1 0 0 '0 ...
after which we step left, producing
... x 1 0 '0 ...
<[<-<[<]+<+[>]<<+>->>>]
Now we step left. If and only if i was "b", we will enter this loop which will
move the Turing machine head. Now the situation, including the head which is
somewhere off to the left, is
... 2 0 0 1 y ... y y y y '1 0 0 0 ...
where each y is a 1 (since that is the only symbol that occurs between one b
and the next b, in input strings that represent tag-systems).
Now we zero one y and scan left:
... 2 0 '0 1 y ... y y y 0 1 0 0 0 ...
set two more 1's, and scan right (these two, and the 1 that was formerly the
state cell of the head, now serve as tape symbols, so we label them y);
... 2 y y y y ... y y y '0 1 0 0 0 ...
next, we set up the head at its new position, and we set the cell to the left
of it to b, and move the pointer just right of the 1 which is the state:
... 2 y y y y ... y 2 0 0 1 '0 0 0 ...
and now we're to the right of the rightmost nonzero cell, as we should be, so
we end the head-moving loop.
(In the degenerate case where we received two b's in a row, the process goes:
...2 0 0 1 '1 0 0 0 ...
...2 0 '0 0 1 0 0 0 ...
...2 y y '0 1 0 0 0 ...
...2 2 0 0 1 '0 0 0 ...
as it should.)
After either performing or skipping that loop, there are four possibilities
corresponding to the four possible inputs.
... x x x x 1 '0 ... if the input was a "1"
... x 2 0 0 1 '0 ... if the input was a "b"
... x x x x 5 '0 ... if the input was a "c"
... x x x x 0 '0 ... if the input was a terminator.
<]<[<]>[
Now we move left and close the loop. If the input was anything but a
terminator, this puts us at the rightmost nonzero cell and we repeat the
input-processing loop. If the input was a terminator, this puts us at the zero
after the rightmost nonzero cell, and the input is already finished; then we
scan left to the gap that represents the Turing machine head, and position the
pointer at the state cell. Then we begin the main loop, which will be executed
once for each step of the Turing machine's operation, stopping when the state
cell holds 0 (representing the halt state).
-[>++++++<-]>[<+>-]
At the beginning of each iteration of the main processing loop, the
configuration is
... w x 0 0 's y z ...
where w, x, y, z are tape cells, with y being the current tape cell, and s is
the state (1-4). First we combine the current state with the current symbol;
we add (6*(s-1)) onto y, then move the result back into the former location of
s. Call the combination g; its values range from 0 to 23, one for each
state-symbol combination.
... w x 0 0 '0 g z ...
... w x 0 0 g '0 z ...
Now we have to use g to select a new symbol, a new state, and a direction to
move the head. We will provisionally move the head right one cell, and set a
direction flag; if that flag is nonzero, we will move the head left two cells,
resulting in a total movement of one cell to the left. That is, we want to use
g in
... w x 0 0 g 0 z ...
to construct an appropriate
... w x y d 0 s z ...
where d is the direction flag, s is the new state, and y is the new symbol;
then if d is nonzero ("move left after all") we want to shift this to produce
... w 0 0 s x y z ...
The way we use g to pick new values for s, y, and d is a very general scheme
for mathematical functions of one variable, and this UTM is one place we need
the generality. (See my rot13.b for a place where I didn't need the
generality, and used this method anyway, leading to comically inconcise but
fast code.) The basic idea is like
>set f(0)<[>set f(1)<-[>set f(2)<-[>set f(3)<-[...&c...]]]]
The argument (in this case, g) is gradually decremented; if it was (say) five,
the program will enter the five outermost loops of this nest but will skip the
sixth and those inside it, since by that point the argument will have been
zeroed. So we just make sure the output cells have the right values for f(5)
at that point, by setting them inside the fifth loop but outside the sixth.
We get the outputs from the state table, naturally, reading down the columns.
Recall that "01b<>c" map to 0 1 2 3 4 5, and that "d=1" means "left".
+<<<+++>+>
g>=0; set s=1, y=3, d=1
[-
g>=1; same values. There's some continuity in the table; only changes from one
combination to the next will be commented hereafter.
[<<+>->-
g>=2; y=4, d=0
[<<[-]>>-
g>=3; y=0
[<<++>+>-
g>=4; y=2, d=1
[<<-->->>+++<-
g>=5; y=0, d=0, s=4
[<<+>+>>--<-
g>=6; y=1, d=1, s=2
[<<->->-
g>=7; y=0, d=0
[<<++++>+>>+<-
g>=8; y=4, d=1, s=3
[>-<-
g>=9; s=2
[<<->->-
g>=10; y=3, d=0
[<<->>-
g>=11; y=2
[<<+++>>>-<-
g>=12; y=5, s=1
[<<---->>>++<-
g>=13; y=1, s=3
[<<++>>>+<-
g>=14; y=3, s=4
[>[-]<-
g>=15; s=0 (this is the halt condition; having it produce d=0 is useful, since
moving left would take us outside the brainfuck array, and the capability of
actually not moving the head has been omitted as unnecessary, given that we're
only going to output the part of the tape that holds the final state of the
transformed string.)
[<<->>>+++<-
g>=16; y=2, s=3
[<<->>>--<-
g>=17; y=1, s=1
[<<++++>+>>+<-
g>=18; y=5, d=1, s=2
[<<[-]>->>++<-
g>=19; y=0, d=0, s=4
[<<+++++>+>>--<-
g>=20; y=5, d=1, s=2
[<->>++<[<<->>-]]
Here's a tricky part. g==21 is never produced by the Turing machine from
correct input, so the remaining states to consider are g==22 and g==23, which
should give (y=3, d=0, s=4) and (y=2, d=0, s=4) respectively. So we set up
d=0, s=4 which are common to both, then we take the 2 or 3 that remains in g
and subtract it from y to produce the right result for y.
]]]]]]]]]]]]]]]]]]]]
We arrive here when g has been zeroed. Again, the layout now is
... w x y d '0 s z ...
and y, d, and s have the correct values.
<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]
If d==1 we do:
... w x y '0 0 s z ...
... w x y s 0 '0 z ...
... w x '0 s 0 y z ...
... w '0 0 s x y z ...
to move the head left and leave the pointer at the leftmost cell of the head,
where it would be if d had been 0 also.
>>]
Go to the state cell, and if it is nonzero (not the halt state) go through the
main Turing machine loop again.
>[-[---[-<]]>]
Now the Turing machine has halted. The situation is
4 0 0 '0 x x ... x x 0 0 0 ...
where each x is either 1, 4, or 5, and the leftmost 5 ("c") marks the start of
the transformed string. So we scan through the tape looking for that 5, and
incidentally clearing the tape as we go. When we find a 1, we decrement it,
skip the loop [---[-<]], move right, and start fresh. When we find a 4, we
decrement it down to 0, skip the loop [-<], move right, and start fresh. When
we find a 5, we decrement it down to 0, move left, move right to the space the
5 occupied (now 0), and end this loop.
>[+++[<+++++>--]>]
Now we want to output the part of the tape that represents the transformed
string; the situation is
... 0 '0 x x x ... x x x 0 0 0 ...
where each x is either 1, representing "1", or 5, representing "c". To
transform each to the right ASCII value we first add 3, producing 4 for "1"
and 8 for "c", then multiply the resulting value by 5/2 while moving it left;
this is safe because we know the value is even. We scan through the string
this way, after which it consists of 10s (for "1") and 20s (for "c"), and the
whole string has been shifted one cell left.
+<++[[>+++++<-]<]
Now we set two more cells to make the linefeed, producing
... 0 x x x ... x x x '2 1 0 0 ...
and multiply each cell by five while moving it right, producing
... '0 0 x x x ... x x x 11 0 0 ...
where each x is either 50 (for "1") or 100 (for "c").
>>[-.>]
Now we scan right, reducing each cell by one and outputting it; the values are
49 (ASCII for "1"), 99 (ASCII for "c"), and 10 (ASCII for the final linefeed).
After this loop there are no more commands, so the program terminates.]
The entire program again without comments:
+++>++>>>+[>>,[>+++++<[[->]<<]<[>]>]>-[<<+++++>>-[<<---->>-[->]<]]
<[<-<[<]+<+[>]<<+>->>>]<]<[<]>[-[>++++++<-]>[<+>-]+<<<+++>+>
[-
[<<+>->-
[<<[-]>>-
[<<++>+>-
[<<-->->>+++<-
[<<+>+>>--<-
[<<->->-
[<<++++>+>>+<-
[>-<-
[<<->->-
[<<->>-
[<<+++>>>-<-
[<<---->>>++<-
[<<++>>>+<-
[>[-]<-
[<<->>>+++<-
[<<->>>--<-
[<<++++>+>>+<-
[<<[-]>->>++<-
[<<+++++>+>>--<-
[<->>++<
[<<->>-
]]]]]]]]]]]]]]]]]]]]]]<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]>>]
>[-[---[-<]]>]>[+++[<+++++>--]>]+<++[[>+++++<-]<]>>[-.>]