How does the Brainfuck Hello World actually work?
O

6

148

Someone sent this to me and claimed it is a hello world in Brainfuck (and I hope so...)

++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.

I know the basics: it works by moving a pointer and incrementing and decrementing values...

Yet I still want to know, how does it actually work? How does it print anything on the screen in the first place? How does it encode the text? I do not understand at all...

October answered 30/5, 2013 at 12:57 Comment(5)
Must be pretty hard to maintain an application written in this language ..Ansermet
@ring0: nah, that's a write-only language.Cortie
what's it's practical use ?Canny
@YashVerma it doesn't need one..Oldenburg
@YashVerma It's clearly specified in the name of the language.Zerlina
E
303

1. Basics

To understand Brainfuck, imagine an infinite array of cells, each initialized to 0.

...[0][0][0][0][0]...

When a Brainfuck program starts, the pointer points to some cell.

...[0][0][*0*][0][0]...

If you move the pointer right with >, it moves from cell X to cell X+1:

...[0][0][0][*0*][0]...

If you increase cell value + you get:

...[0][0][0][*1*][0]...

If you increase cell value again + you get:

...[0][0][0][*2*][0]...

If you decrease cell value - you get:

...[0][0][0][*1*][0]...

If you move the pointer left with <, it moves from cell X to cell X-1:

...[0][0][*0*][1][0]...
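The four steps above can be replayed in a few lines of Python. This is only a sketch: the function name `run_moves` is made up for illustration, and the "infinite" array is approximated with a dictionary whose missing cells read as 0.

```python
from collections import defaultdict

def run_moves(program, start=0):
    """Apply only the > < + - commands; return (pointer, non-zero cells)."""
    tape = defaultdict(int)  # any cell not touched yet reads as 0
    ptr = start              # the starting cell is arbitrary; 0 is an assumption
    for cmd in program:
        if cmd == ">":
            ptr += 1         # move to cell X+1
        elif cmd == "<":
            ptr -= 1         # move to cell X-1
        elif cmd == "+":
            tape[ptr] += 1   # increase the current cell
        elif cmd == "-":
            tape[ptr] -= 1   # decrease the current cell
    return ptr, {i: v for i, v in tape.items() if v != 0}

# Same sequence as the diagrams: right, plus, plus, minus, left.
print(run_moves(">++-<"))  # (0, {1: 1})
```

The pointer ends up back on the starting cell, and the cell to its right holds 1, which matches the last diagram.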

2. Input

To read a character you use the comma ,. It reads a character from standard input and writes its decimal ASCII code to the current cell.

Take a look at an ASCII table. For example, the decimal code of ! is 33, while a is 97.

Well, let's imagine your BF program memory looks like:

...[0][0][*0*][0][0]...

Assuming standard input contains a, the comma , operator reads it and stores its decimal ASCII code, 97, in memory:

...[0][0][*97*][0][0]...

You generally want to think of it that way, but the truth is a bit more complex: BF does not read a character but a byte (whatever that byte is). Let me show you an example:

On Linux:

$ printf ł

prints:

ł

which is a specifically Polish character. It is not part of ASCII. Here it is encoded as UTF-8, so it takes more than one byte in memory. We can prove it by making a hexadecimal dump:

$ printf ł | hd

which shows:

00000000  c5 82                                             |..|

The zeroes are the offset. c5 is the first and 82 is the second byte representing ł (in the order we will read them). |..| is the printable-character column; neither byte is a printable ASCII character, so each is shown as a dot.

Well, if you pass ł as input to a BF program that reads a single byte, program memory will look like:

...[0][0][*197*][0][0]...

Why 197? Because 197 decimal is c5 hexadecimal. Seems familiar? Of course: it's the first byte of ł!
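The same check can be done without a shell. A small Python sketch (the variable names are only illustrative) that encodes ł as UTF-8 and looks at the first byte, which is what a single , would store:

```python
data = "\u0142".encode("utf-8")            # "\u0142" is ł
print(" ".join(f"{b:02x}" for b in data))  # c5 82
first_byte = data[0]                       # the byte a single , would read
print(first_byte)                          # 197
```

A second , would read the remaining byte, 0x82 (130 decimal).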

3. Output

To print a character you use the dot . It treats the current cell value as a decimal ASCII code and prints the corresponding character to standard output.

Well, let's imagine your BF program memory looks like:

...[0][0][*97*][0][0]...

If you use the dot (.) operator now, BF prints:

a

because the decimal ASCII code of a is 97.

So, for example, a BF program like this (97 pluses, 2 dots):

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++..

will increase the value of the cell it points to up to 97 and print it out 2 times:

aa
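For a program this simple (no loops, no pointer moves) the result can be predicted just by counting symbols. A rough Python sketch of that reasoning, not a real interpreter:

```python
program = "+" * 97 + ".."                # 97 pluses followed by 2 dots
cell = program.count("+")                # 97 increments starting from 0
output = chr(cell) * program.count(".")  # one character per dot
print(output)  # aa
```

Counting only works here because every + acts on the same cell; as soon as > or [ ] appear you need to track the pointer.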

4. Loops

In BF a loop consists of a loop begin [ and a loop end ]. You can think of it as a while loop in C/C++ whose condition is the current cell value: the body runs as long as that value is not 0.

Take a look at the BF program below:

++[]

++ increments the current cell value twice:

...[0][0][*2*][0][0]...

And [] is like while(2) {}, so it's an infinite loop.

Let's say we don't want this loop to be infinite. We can do for example:

++[-]

Each time the loop runs, it decrements the current cell value. Once the current cell value is 0, the loop ends:

...[0][0][*2*][0][0]...        loop starts
...[0][0][*1*][0][0]...        after first iteration
...[0][0][*0*][0][0]...        after second iteration (loop ends)

Let's consider yet another example of a finite loop:

++[>]

This example shows that a loop doesn't have to finish on the cell it started on:

...[0][0][*2*][0][0]...        loop starts
...[0][0][2][*0*][0]...        after first iteration (loop ends)

However, it is good practice to end where we started. Why? Because if a loop ends on a different cell than the one it started on, we can't easily predict where the cell pointer will be afterwards. To be honest, this practice makes Brainfuck less brainfuck.
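Both finite loops above translate almost literally into Python while loops. A sketch, assuming the tape is a short list and the pointer starts on index 2 as in the diagrams (the `iterations` counter is extra bookkeeping, not part of the BF program):

```python
# ++[-] : the loop runs until the current cell reaches 0
tape = [0, 0, 2, 0, 0]
ptr = 2
iterations = 0
while tape[ptr] != 0:   # [
    tape[ptr] -= 1      # -
    iterations += 1     # bookkeeping only
                        # ]
print(tape, iterations)  # [0, 0, 0, 0, 0] 2

# ++[>] : the loop ends on a different cell than it started on
tape2 = [0, 0, 2, 0, 0]
ptr2 = 2
while tape2[ptr2] != 0:  # [
    ptr2 += 1            # >
                         # ]
print(ptr2)  # 3
```

In the second loop the condition is re-checked on the new cell, which is already 0, so the body runs only once.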

Enthrall answered 8/11, 2013 at 22:27 Comment(10)
Cool, now I understood it :)October
What about , [ and ]? :(Scevour
Generally my goal was just to explain bf ideology, but you're right. I ll boost my answer soon.Enthrall
That was a perfect solution to the novice trying to comprehend this language ideology. Congrats, and great post.Valise
Best Brainfuck intro I've seen. Honestly you undo BF a bit by your postThresher
@Enthrall the real (Brain)f*ckers don't care about Da Rulez: $ echo "++++++++[>++++[>++>+++>+++>+<<<<-]>+>->+>>+[<]<-]>>.>>---.+++++++..+++.>.<<-.>.+++.------.--------.>+.>++." | bf Hello world!Boneset
This is the best way I can understand the Brainf*** language! Now I know how my custom Brainf*** interpreter converts the code to a string!Liqueur
I guess that if you need a project for your spare time, you can always add Unicode support to Brainfuck.Genethlialogy
After your post, BF is !BF anymore!Latticework
a perfect introduction; I guess that the biggest value of BF is in creating unusually looking and puzzling code patternsBargain
R
58

Wikipedia has a commented version of the code.

+++++ +++++             initialize counter (cell #0) to 10
[                       use loop to set the next four cells to 70/100/30/10
    > +++++ ++              add  7 to cell #1
    > +++++ +++++           add 10 to cell #2 
    > +++                   add  3 to cell #3
    > +                     add  1 to cell #4
    <<<< -                  decrement counter (cell #0)
]                   
> ++ .                  print 'H'
> + .                   print 'e'
+++++ ++ .              print 'l'
.                       print 'l'
+++ .                   print 'o'
> ++ .                  print ' '
<< +++++ +++++ +++++ .  print 'W'
> .                     print 'o'
+++ .                   print 'r'
----- - .               print 'l'
----- --- .             print 'd'
> + .                   print '!'
> .                     print '\n'

To answer your questions, the , and . characters are used for I/O. The text is ASCII.

The Wikipedia article goes on in some more depth, as well.

The first line initialises a[0] = 10 by simply incrementing ten times from 0. The loop from line 2 effectively sets the initial values for the array: a[1] = 70 (close to 72, the ASCII code for the character 'H'), a[2] = 100 (close to 101 or 'e'), a[3] = 30 (close to 32, the code for space) and a[4] = 10 (newline). The loop works by adding 7, 10, 3, and 1, to cells a[1], a[2], a[3] and a[4] respectively each time through the loop - 10 additions for each cell in total (giving a[1]=70 etc.). After the loop is finished, a[0] is zero. >++. then moves the pointer to a[1], which holds 70, adds two to it (producing 72, which is the ASCII character code of a capital H), and outputs it.

The next line moves the array pointer to a[2] and adds one to it, producing 101, a lower-case 'e', which is then output.

As 'l' happens to be the seventh letter after 'e', to output 'll' another seven are added (+++++++) to a[2] and the result is output twice.

'o' is the third letter after 'l', so a[2] is incremented three more times and the result is output.

The rest of the program goes on in the same way. For the space and capital letters, different array cells are selected and incremented or decremented as needed.
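The arithmetic in this explanation can be checked with a few lines of Python. This is just a sketch of the setup loop and the first letter, using a list named `a` to mirror the a[0]..a[4] notation above:

```python
a = [0] * 5
a[0] = 10                # +++++ +++++
while a[0] != 0:         # [
    a[1] += 7            #   > +++++ ++
    a[2] += 10           #   > +++++ +++++
    a[3] += 3            #   > +++
    a[4] += 1            #   > +
    a[0] -= 1            #   <<<< -
                         # ]
print(a)                 # [0, 70, 100, 30, 10]

a[1] += 2                # > ++
print(chr(a[1]))         # H

# Letter distances used later in the program
print(ord("l") - ord("e"), ord("o") - ord("l"))  # 7 3
```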

Restrain answered 30/5, 2013 at 13:6 Comment(5)
But WHY does it print? Or how? The comments explain the intent of each line, not what it does.October
It prints because the compiler knows that , and . are used for I/O, much like C prints by using putchar. It is an implementation detail handled by the compiler.Restrain
And also because it's setting the required cells to the integer values for the ASCII characters in "Hello World"Maverick
I expected a more in depth explanation... but :/October
@October - I added the in-depth explanation of the code from Wikipedia to the answer. You can see the linked article for more information.Restrain
C
12

Brainfuck lives up to its name. It uses only 8 characters, > < + - . , [ ], which makes it one of the quickest programming languages to learn but one of the hardest to write programs in and understand... and you finally end up f*cking your brain.

It stores values in an array: [72 ][101 ][108 ][108 ][111 ]

Let the pointer initially point to cell 1 of the array:

  1. > move pointer to right by 1

  2. < move pointer to left by 1

  3. + increment the value of cell by 1

  4. - decrement the value of cell by 1

  5. . print value of current cell.

  6. , take input to current cell.

  7. [ ] loop: the body repeats until the current cell is 0. In +++[ -] the loop runs 3 times, because the 3 + before it set the counter to 3 and each - decrements the counter by 1.

The values stored in cells are ASCII values:

so referring to the array above, [72 ][101 ][108 ][108 ][111 ], if you match the ASCII values you'll find that it spells Hello.

Congrats! You have learned the syntax of BF.

——-Something more ———

Let us make our first program, i.e. Hello World, after which you'll be able to write your name in this language.

+++++ +++++[> +++++ ++ >+++++ +++++ >+++ >+ <<<<-]>++.>+.+++++ ++..+++.>++.<<+++++ +++++ +++++.>.+++.----- -.----- ---.>+.>.

Breaking it into pieces:

+++++ +++++[> +++++ ++ 
                  >+++++ +++++ 
                  >+++ 
                  >+ 
                  <<<<-]

This fills 4 cells (one per >) using a counter of 10, something like this pseudocode:

increments = [7, 10, 3, 1]
cells = [0, 0, 0, 0]
i = 10                        # counter kept in cell 0
while i > 0:
    for k in range(4):
        cells[k] += increments[k]
    i -= 1

The counter is stored in cell 0. > moves to cell 1 and adds 7 to its value, the next > moves to cell 2 and adds 10 to its previous value, and so on.

<<<< returns to cell 0, and - decrements its value by 1.

Hence, after the loop completes, cells 1 to 4 hold: [70,100,30,10]

>++. 

moves to the 1st cell, increments its value by 2 (two '+') and then prints ('.') the character with that ASCII value. For example, in Python: chr(70+2) # 'H'

>+.

moves to the 2nd cell, adds 1 to its value (100+1) and prints ('.') it, i.e. chr(101) # 'e'. There is no > or < in the next piece, so it keeps working on the current cell:

+++++ ++..

The current cell holds 101, so this computes 101+7 and prints it twice (there are two '.'), i.e. chr(108) # 'l', twice. In Python this can be written as:

for _ in range('+++++ ++..'.count('.')):   # one print per '.'
    print(chr(101 + 7), end='')            # prints l

———Where is it used?——-

It is just a joke language made to challenge programmers and is not used practically anywhere.

Crowl answered 11/1, 2018 at 9:16 Comment(0)
B
9

To answer the question of how it knows what to print, I have added the calculation of ASCII values to the right of the code where the printing happens:

> just means move to the next cell
< just means move to the previous cell
+ and - are used for increment and decrement respectively. The value of the cell is updated when the increment/decrement happens

+++++ +++++             initialize counter (cell #0) to 10

[                       use loop to set the next four cells to 70/100/30/10

> +++++ ++              add  7 to cell #1

> +++++ +++++           add 10 to cell #2 

> +++                   add  3 to cell #3

> +                     add  1 to cell #4

<<<< -                  decrement counter (cell #0)

]            

> ++ .                  print 'H' (ascii: 70+2 = 72) //70 is value in current cell. The two +s increment the value of the current cell by 2

> + .                   print 'e' (ascii: 100+1 = 101)

+++++ ++ .              print 'l' (ascii: 101+7 = 108)

.                       print 'l' dot prints same thing again

+++ .                   print 'o' (ascii: 108+3 = 111)

> ++ .                  print ' ' (ascii: 30+2 = 32)

<< +++++ +++++ +++++ .  print 'W' (ascii: 72+15 = 87)

> .                     print 'o' (ascii: 111)

+++ .                   print 'r' (ascii: 111+3 = 114)

----- - .               print 'l' (ascii: 114-6 = 108)

----- --- .             print 'd' (ascii: 108-8 = 100)

> + .                   print '!' (ascii: 32+1 = 33)

> .                     print '\n'(ascii: 10)
Bedroom answered 20/3, 2017 at 6:34 Comment(0)
W
5

All the answers are thorough, but they lack one tiny detail: printing. When building your Brainfuck translator, you also have to handle the character ., which is what a print statement looks like in Brainfuck. Whenever your translator encounters a . character, it should print the byte currently pointed to.

Example:

Suppose you have char *ptr pointing at the start of [0] [0] [0] [97] [0]... and the Brainfuck statement is >>>. Your pointer should move 3 cells to the right, landing at [97], so now *ptr = 97. When your translator then encounters the ., it should call

write(1, ptr, 1)

or any equivalent printing statement to print the byte currently pointed to, which has the value 97, so the letter a will be printed on standard output.
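For comparison, here is roughly what the same step looks like if the translator is written in Python instead of C. It is a sketch under the same assumptions as the example (a five-cell tape with 97 in the fourth cell); `os.write` is used because it is the closest analogue of the write(1, ptr, 1) call above.

```python
import os

tape = bytearray([0, 0, 0, 97, 0])
ptr = 0

ptr += 3                                 # >>>
os.write(1, bytes(tape[ptr:ptr + 1]))    # .  writes one byte to fd 1 (stdout): a
```

Slicing one element out of the bytearray plays the role of "pointer plus length 1" in the C call.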

Waldman answered 18/7, 2016 at 10:0 Comment(0)
E
2

I think what you are asking is how Brainfuck knows what to do with all the code. There is an interpreter, written in a higher-level language such as Python, that decides what a dot or a plus sign in the code means.

The interpreter reads your code character by character and says: OK, there is a > symbol, so I have to advance the memory location. The code is simply: if (current character) == ">", memlocation += 1, written in the higher-level language. Similarly, if (current character) == ".", then print (contents of memory location).
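To make that concrete, here is a minimal interpreter sketch in Python. It is not any particular official implementation: the names `run` and `HELLO` are made up, and the 30,000-cell tape, 8-bit wrap-around cells, and "end of input reads as 0" are assumptions (real interpreters differ on these points).

```python
def run(code, data=b""):
    """Interpret Brainfuck source and return everything it printed as bytes."""
    # Pre-compute which [ matches which ] so jumps are a dictionary lookup.
    jump, stack = {}, []
    for i, ch in enumerate(code):
        if ch == "[":
            stack.append(i)
        elif ch == "]":
            j = stack.pop()
            jump[i], jump[j] = j, i

    tape = [0] * 30000          # assumption: 30,000 cells
    ptr = pc = pos = 0          # cell pointer, program counter, input position
    out = bytearray()
    while pc < len(code):
        ch = code[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # assumption: cells are bytes
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(tape[ptr])               # "print" the current byte
        elif ch == ",":
            tape[ptr] = data[pos] if pos < len(data) else 0  # assumption: EOF is 0
            pos += 1
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]                       # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]                       # go back to the matching [
        pc += 1                                 # any other character is ignored
    return bytes(out)

HELLO = ("++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++."
         ">++.<<+++++++++++++++.>.+++.------.--------.>+.>.")
print(run(HELLO).decode("ascii"), end="")  # Hello World!
```

Collecting the output in a bytearray instead of printing directly keeps the function easy to test; a real interpreter would usually write each byte to standard output as soon as it sees a dot.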

Hope this clears it up. tc

Estes answered 29/7, 2018 at 10:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.