• How the compiler, the Library and the Kernel work - Part 2

    In the previous part of this little series, we talked about the compiler, and what it does with the header files, in our attempt to demystify their usage. In this part, I want to show you what the compiler’s output looks like, and how we end up with our executable file.

    The compiler’s composition

    Generally speaking, a compiler belongs to a family of software called translators. A translator’s job is to read some source code in a source language, and generate (translate it to) some source code in a target language.

    Now, you might think that most compilers you know don’t do that. You input a (source code) file, and you get a binary file, ready to run whenever you want it to. Yes, that’s what you get, but it’s not the compiler alone that does all this. If you remember from the last installment of this series, when you call the compiler like gcc some_file.c or clang some_file.c, in essence you are calling the compilation driver, with the file as a parameter. The compilation driver then calls 1) the preprocessor, 2) the (actual) compiler, 3) the assembler and, last but not least, the linker. At least when it comes to gcc, these pieces of software are called cpp, cc1, gas (executable name: as) and collect2 (a small wrapper that in turn invokes the actual linker, ld) respectively.

    In that little software collection up top, the one we casually call the compiler, we can easily spot at least 3 (yeah, that’s right) translators, each acting as we mentioned earlier, that is, taking some input in a source language and producing some output in a target language.

    The first is the preprocessor. The preprocessor accepts C source code as its source language, and produces C source code again (as its target language), but with various elements of the source resolved, such as header file inclusions, macro expansions, etc.

    The second is the compiler. The compiler accepts (in our case) C source code as its source language, and translates it to some architecture’s assembly language. In my case, when I talk about the compiler, I’m gonna assume that it produces x86 assembly.

    The last one is the assembler, which accepts some architecture’s assembly language as input and produces what’s called a binary, or object, representation of it; that is, it translates the assembly mnemonics directly to the bytes they correspond to in the target architecture.

    At this point, one could argue that the linker is a translator too: it accepts binary (object) code and translates it into an executable file, that is, it resolves references and fits the binary code into the segments of the file that is to be produced. For example, on a typical GNU/Linux system, this phase produces the executable ELF file.

    The (actual) compiler’s output: x86 assembly.

    Before we go any further, I would like to show you what the compiler really creates:

    For the typical hello world program we demonstrated in our first installment, the compiler will output the following assembly code:

    	.file	"hello.c"
    	.section	.rodata
    .LC0:
    	.string	"Hello world!"
    	.text
    	.globl	main
    	.type	main, @function
    main:
    	pushq	%rbp
    	movq	%rsp, %rbp
    	subq	$16, %rsp
    	movl	%edi, -4(%rbp)
    	movq	%rsi, -16(%rbp)
    	movl	$.LC0, %edi
    	call	puts
    	movl	$0, %eax
    	leave
    	ret
    	.size	main, .-main
    	.ident	"GCC: (GNU) 4.8.2 20131212 (Red Hat 4.8.2-7)"
    	.section	.note.GNU-stack,"",@progbits

    To produce the above file, we had to use the following gcc invocation command: gcc -S -fno-asynchronous-unwind-tables -o hello.S hello.c. We used -fno-asynchronous-unwind-tables to remove the .cfi directives, which tell gas (the GNU assembler) to emit DWARF Call Frame Information tags, which are used to reconstruct a stack backtrace when a frame pointer is missing.

    For more useful compilation flags that control the intermediate stages of the compilation process, try these:

    • -E: stop after preprocessing, and produce a *.i file
    • -S: the one we used; stop after the compiler, and produce a *.s file
    • -c: stop after the assembler, and produce a *.o file.

    The default behaviour is to use none, and stop after the linker has run. If you want to run a full compilation and keep all the intermediate files, use the -save-temps flag.

    From source to binary: the assembler.

    The next part of the compilation process is the assembler. We have already discussed what the assembler does, so here we are going to see it in practice. If you have followed along so far, you should have two files: hello.c, which is the hello world’s C source code file, and hello.S, which is what we created earlier, the compiler’s (x86) assembly output.

    As you can imagine, the assembler operates on that last file. To see it run and emit binary, we invoke it like this: as -o hello.bin hello.S, which produces the following. (If you look closely, you can even spot the encoded instructions in the dump: for instance, the bytes \C9\C3 right before the "Hello world!" string are the leave and ret instructions from the listing above.)

    ELF\00\00\00\00\00\00\00\00\00\00>\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\F0\00\00\00\00\00\00\00\00\00\00\00@\00\00\00\00\00@\00\00\00UH\89\E5H\83\EC\89}\FCH\89u\F0\BF\00\00\00\00\E8\00\00\00\00\B8\00\00\00\00\C9\C3Hello world!\00\00GCC: (GNU) 4.8.2 20131212 (Red Hat 4.8.2-7)\00\00.symtab\00.strtab\00.shstrtab\00.rela.text\00.data\00.bss\00.rodata\00.comment\00.note.GNU-stack\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00 \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00@\00\00\00\00\00\00\00 \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\B8\00\00\00\00\00\000\00\00\00\00\00\00\00	\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00&\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00`\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00,\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00`\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\001\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00`\00\00\00\00\00\00\00
    \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\009\00\00\00\00\00\000\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00m\00\00\00\00\00\00\00-\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00B\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\9A\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\9A\00\00\00\00\00\00\00R\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\B0\00\00\00\00\00\00\F0\00\00\00\00\00\00\00
    \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00	\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\A0\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\F1\FF\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00	\00\00\00\00\00\00\00\00\00\00\00\00\00 \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00hello.c\00main\00puts\00\00\00\00\00\00\00\00\00\00\00\00\00
    \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00	\00\00\00\FC\FF\FF\FF\FF\FF\FF\FF
    

    Last but not least: the linker

    We saw what the assembler emits, which is to say, binary code. However, that binary code still needs further processing. To explain that, we need to go back a little.

    In our first installment of the series, we said that when you call a function like printf(), the compiler only needs its prototype to do type checking and ensure that you use it legally. For that you include the header file stdio.h. But since that contains the function prototype only, where is the source code for that function? Surely it must be somewhere, since the call executes successfully to begin with, yet we haven’t come across printf’s source code so far. So where is it?

    The function’s compiled code is located in the .so (shared object) of the standard C library, which on my system (Fedora 19, x64) is libc-2.17.so. I don’t want to expand on that further, as I plan to do so in the next installment of the series; however, what we have said so far is enough for you to understand the linker’s job:

    The linker resolves the (thus far) undefined reference to printf by locating the printf symbol and (in layman’s terms) making our call point to it, so that execution can jump to printf’s code whenever we need it during our program’s run.

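    To see the same mechanism in miniature, here is a small sketch (the file names greet.h, greet.c and main.c, and the function greet, are made up purely for this illustration): the compiler only ever sees the prototype while compiling main.c, and it is the linker that later matches the call site to the definition living in another object file, exactly as it does for printf and libc.

    /* greet.h - only a prototype, just like stdio.h only carries printf's prototype */
    void greet (const char *name);

    /* greet.c - the definition; compiled separately, e.g. into greet.o */
    #include <stdio.h>
    #include "greet.h"

    void
    greet (const char *name)
    {
        printf ("Hello, %s!\n", name);
    }

    /* main.c - its object file carries an undefined reference to greet;
       the linker resolves it when it combines main.o and greet.o (leave
       greet.o out, and you get the familiar "undefined reference" error) */
    #include "greet.h"

    int
    main (void)
    {
        greet ("world");
        return 0;
    }
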
    To invoke the linker on our file, at least according to its documentation, we should do the following: ld -o hello.out /lib/crt0.o hello.bin -lc. Then we should be able to run the file like this: ./hello.out.

    Epilogue

    That’s the end of part 2 of my series that explains how your code turns into binary, and how your computer (at least when it comes to the software side) runs it. In part 3, I am going to discuss the C library and the kernel at greater length.


  • My Linux from Scratch Experience

    For the past two to three days, I have been busy creating my very own Linux distribution using the well-known Linux from Scratch. This post is an account of my experience with the process: what I liked, what I learned from it, what surprised me, and more.

    Linux from Scratch: An introduction

    If you are here, then you most likely already know what Linux from Scratch is, but for the sake of completeness (or in case you don’t know what it is, but are keen on learning) I will provide a short introduction to it here.

    Linux from Scratch (from now on, lfs) is a book providing a series of steps that guide you through the creation of a fully functional GNU/Linux distribution. Although the original book creates a “barebones” distribution, with only fundamental tools in it, the resulting distribution provides a fine environment for further experimentation or customization.

    Apart from the basic book, the lfs project also has 3-4 more books to read if you want to extend the basic system (such as blfs, Beyond Linux from Scratch), automate the process, create a distribution that is more secure, or cross-compile an lfs system for different machines.

    My experience with building LFS

    A small introduction about my background

    I have been a UNIX (-like) systems (full-time) user for about 2.5 years now. During that time I have seen myself go from being what you would call a Linux newbie, not knowing how to use a system without a GUI installed (have I mentioned that Ubuntu was my favourite distribution?), to being an arguably experienced UNIX programmer, trying to learn more about the various aspects of UNIX systems, and delving deeper and deeper into them every day (while also feeling pain when using something other than a UNIX-like system).

    During that time, I have learned about the Unix way of working with the system, using the shell and the system’s toolchain to write software and otherwise manipulate the system. I ditched my old knowledge of IDEs and GUIs, and set out to master the command line and the associated tools. (Anecdote: I remember, when I first came to Unix from Windows, searching the net for a C/C++ IDE to do development.) I remember reading about how people worked another way in Unix land, using an editor and the shell, and I decided to force myself to learn to work that way. I still remember trying to use vim and gcc, and ending up liking this way better, because it seemed a more natural way to interact with the software development process than using an IDE and pressing the equivalent of a “play” button, so that magic ensues for the next few seconds until I have a result.

    Time has passed since then, and through hours and hours of reading and working with the system, I did learn quite a lot about it. My Google Summer of Code experience in 2013 expanded my system knowledge even further (that’s what you get when you have to work with the system kernel, the C library and a compiler).

    But in all that time of using Unix-like systems, I never had the chance to create one myself. And although my background did allow me to know quite a few things about the inner workings of such a system, I never actually saw all these software components combine in front of my very eyes to create that beauty we know as a GNU/Linux distribution. And that left a bad taste in my mouth, because I knew what was happening, but I wanted to see it happen right in front of my eyes.

    Knowing about the existence of lfs, and not actually going through it, also made matters worse for me, as I knew that I could actually “patch” that knowledge gap of mine, but I never really tried to do it. I felt that I was missing out on a lot, and that lfs would be instrumental to my understanding of a Linux system. Having attempted it some years ago and getting stuck at the very beginning had also created an innate fear in me that it was something that would be above my own powers.

    Until two days ago, when I said to myself: “You know what? I have seen and done a lot of things in a UNIX system. I am now much more experienced than I was when I last tried it. And I know I want to at least try it, even if it gives me nothing but infinite confusion. Because if I do manage to get through it, I will learn so many more things, or at least be reassured that my preexisting knowledge was correct.” And that thought was the greatest motivation I had had to do it in a fairly long time.

    So, I sat at my desk, grabbed a cup of coffee and off I went!

    The process

    Preparation and the temporary toolchain

    The book itself is several chapters long, each of which performs another “big step” in the creation of the distribution.

    The first few chapters are preparatory chapters, where you ensure the integrity of the build environment, download any build dependencies you may be lacking, create a new partition that will host the lfs system, and create the user account that will do the building of the temporary toolchain.

    The temporary toolchain building is a more exciting process. In essence you compile and collect several pieces of software that will later be used to compile the distribution’s toolchain and other software.

    You start off by building binutils, and that is to get a working assembler and linker. After having a working assembler and linker, you proceed with compiling gcc. Next up is unpacking the Linux headers, so that you can compile glibc against them.

    Having the basic parts of the toolchain compiled, you then proceed with installing other software that is needed in the temporary toolchain, like gawk, file, patch, perl etc.

    Building the main system

    After getting done with the temporary toolchain, you then chroot into the lfs partition. You start off by creating the needed directories (like /bin, /boot, /etc, /home etc.) and then continue with building the distribution software, utilising the temporary toolchain. For instance, you construct a new gcc, and you compile sed, grep, bzip, the shadow utility that manages the handling of passwords, etc., all while making sure that things don’t break, and running countless tests (that sometimes take longer than what the package took to compile) to ensure that what you build is functional and reliable.

    Final configuration

    Next on the list are the various configuration files that reside in /etc, and the setup of sysvinit, the distribution’s init system.

    Last, but not least, you compile the Linux kernel and set up grub so that the system is bootable.

    At this point, if all has gone well and you reboot, you should boot into your new lfs system.

    What did I gain from that?

    Building lfs was a very time-consuming process for me. It must have taken about 7-8 hours at the very least. Not so much because of the compilation and testing (I was compiling with MAKEFLAGS='-j 4' on a Core i5), but because I didn’t complete some steps correctly and later needed to go back and redo them (along with everything that followed), and because of the time it took to research some issues, programs or various other things before I issued a command at the shell.

    Now if I were to answer the question “What did I gain from that”, my answer would be along the lines of “Infinite confusion, and some great insight at some points”.

    To elaborate on that,

    • lfs mostly served as a reassurance that indeed, what I did know about the system was mostly correct.
    • I did have the chance to see the distribution get built right before my eyes, which was something I longed for a great amount of time.
    • It did make me somewhat more familiar with the configure && make && make install cycle.
    • It made me realise that the directories in the system are the simple result of a mkdir command, and that the configuration files in the /etc folder are handwritten plain files. (Yeah, I feel stupid about that one - I don’t know what I was expecting. This was probably the result of the “magic” that the distro-making process entailed for me.)
    • I got to see the specific software that is needed to create a distribution, and it demonstrated to me how I can build it, customize that build, or even change that software to my liking.
    • And last but not least, something that nearly every lfs user says after a successful try: I knew that package managers did a great many things in order to maintain the system, and that much of the work I would normally have to do was done nearly automatically, but boy, was I underestimating them. After lfs, I developed a new appreciation for a good package manager.

    Epilogue

    Lfs was, for the most part, a great experience. As a knowledge expander, it works great. As a system that you keep and continue to maintain? I don’t know. I know that people have done that in the past, but I decided against maintaining my build, as I figured it would be very time consuming, and that if I ever wanted to gain the experience of maintaining a distro, I would probably fork something like Crux.

    In the end, if you ask me whether I can recommend it to you, I will say that I’m not so sure. It will provide you with some insight into the internals of a GNU/Linux distribution, but it won’t make you a better programmer as some people claim (most of the process revolves around the configure && make && make install cycle, and hand-writing some configuration files).

    In the end, it is yourself who you should ask. Do you want that knowledge? Is it worth the hassle for you? Do you want the bragging rights? Are you crazy enough to want to maintain it? These are all questions to which you will get as many answers as there are people you ask.

  • How the compiler, the Library and the kernel work - Part 1

    Before we get any further, it might be good if we provided some context.

    Hello world. Again.

    #include <stdio.h>
    
    int
    main (int argc, char **argv)
    {
        printf ("Hello world!\n");
    
        return 0;
    }

    Every user space (read: application) programmer has written a hello world program. Only god knows how many times this program has been written. Yet, most programmers’ knowledge of the program is limited to something along the lines of:

    • It sends the string passed as a parameter to the system to print.
    • It takes the printf function from stdio.h and prints the string

    and various other things, which range anywhere from plain wrong to partially correct.

    So why not demystify the process?

    Enter the C preprocessor.

    You may have heard of the C Preprocessor. It’s the first stage of a C or C++ file’s compilation, and it’s actually responsible for things such as:

    • inclusion of header files (it does so by replacing #include <header.h> with the contents of that file, and, recursively, of the files it includes),
    • macro expansion, such as the famous comparison of two numbers (a greater than b). In essence, if you define the following macro #define gt(a, b) ((a > b) ? 1 : 0), then in a statement such as this:
     if (gt (5, 3)) printf ("The first parameter is greater than the second.\n");
    

    gt (5, 3) gets expanded to the macro definition, so after the preprocessor has run you end up with something like this:

     if (((5 > 3) ? 1 : 0)) printf ("The first parameter is greater than the second.\n");
    
    • conditional compilation (things such as:
    #ifdef WIN32 
        printf ("We are on windows\n"); 
    #endif

    amongst others). You can see it for yourself: write the hello world program, and pass it to cpp: cpp hello_world.c

    So now that we know what it does, it’s time to dispel a common myth regarding it: some people believe that the header files include the functions to be called. That’s wrong. What a header does include is function prototypes (and some type definitions, etc.) only. It doesn’t include the body of the function to be called.

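    To give you an idea of what that actually looks like, the declaration that stdio.h provides for printf boils down to something roughly like the following line (simplified here; the real glibc header decorates it with attributes and restrict qualifiers):

    extern int printf (const char *format, ...);   /* just the prototype: name, return type, parameter types - no body anywhere in sight */

    The body of printf is nowhere in that header; as we will see, it lives in the C library itself.
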
    Some people find that fact quite surprising, though it isn’t, once you understand what the compiler does with it.

    Say hello to the compiler.

    Here we are gonna unmask another pile of misconceptions. First of all, some people think that when they call gcc on the command line they are actually calling the compiler. They are not. In fact they are calling the software commonly known as the compilation driver, whose job is to run all the software needed to fully turn source into binary, including the preprocessor, the actual compiler, the assembler and finally the linker.

    Having said that, the actual compiler that’s getting called when you call gcc is called cc1. You may have seen it sometimes when the driver reports errors. Wanna take a look at it, to make sure I’m not lying to you? (Hint: I’m not!) Fair enough. Why don’t you type this in the command line: gcc -print-prog-name=cc1. It should tell you where the actual compiler is located on your system.

    So now that we have this (misconception) out of our minds, we can continue with our analysis. Last time we talked about it, we said that the header files include prototypes and not the whole function.

    You may know that in C, you usually declare a function before you use it. The primary reason for doing this is to provide the compiler with the ability to perform type checking, that is, to check that the arguments passed are correct, both in number and in type, and to verify that the returned value (assuming there is one) is being used correctly. Below is a program that demonstrates the function prototype:

    #include <stdio.h>
    
    int add_nums (int first, int second);
    
    int
    main (void)
    {
        printf ("5 + 5 results in %d\n", add_nums (5, 5));
    
        return 0;
    }
    
    int
    add_nums (int first, int second)
    {
        return first + second;
    }

    In this particular example, the prototype gives the compiler a wide variety of information. It tells it that the function add_nums takes two int arguments and returns an integer to the calling function. Now the compiler can verify that I am passing correct arguments to it when I call it inside printf. If I don’t include the function prototype, and do something slightly evil such as calling add_nums with float arguments, then this might happen:

    5 + 4 results in 2054324224
    

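    For reference, the kind of “slightly evil” code that produces garbage like the above would look roughly like the sketch below (my own reconstruction, not the exact program that printed that output). Without the prototype, the compiler falls back to the old implicit-declaration rule (typically with a warning): it assumes add_nums returns an int and takes whatever arguments it is given, so the float arguments get promoted to double and reinterpreted on the receiving end:

    #include <stdio.h>

    /* note: no prototype for add_nums anywhere before the call */

    int
    main (void)
    {
        /* 5.0f and 4.0f are promoted to double and passed where ints are
           expected, so add_nums reads garbage and the result is nonsense */
        printf ("5 + 4 results in %d\n", add_nums (5.0f, 4.0f));

        return 0;
    }

    int
    add_nums (int first, int second)
    {
        return first + second;
    }
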
    Now that you know that the compiler (the real one) only needs the prototype and not the actual function code, you may be wondering how the compiler actually compiles the call if it doesn’t know the function’s code.

    Now is the time to bring down another misconception. The word compiler is just a fancy name for a kind of software otherwise known as a translator. A translator’s job is to take input and turn it from one language (the source language) into a second language (the target language), whatever that may be. Most of the time, when you compile software, you compile it to run on your computer, which runs on a processor from the x86 architecture family. A processor is typically associated with an assembly language for that architecture (which is just human-friendly mnemonics for common processor tasks), so your x86 computer runs x86 assembly (ok, that’s not 100% true, but for simplicity’s sake at the moment, it should serve. We will see why it’s not true later.) So the compiler (in a typical translation) translates (compiles) your C source code to x86 assembly. You can see this by compiling your hello world example and passing the compiler the -S parameter (which asks it to stop after the x86 assembly is produced), like so: gcc -S hello.c.

    Conclusion

    In this part, we saw how the compiler and the preprocessor work with our code, in an attempt to demystify the so-called library calls. In the next part, we are going to study the assembler and the linker, and in the final part the loader and the kernel.

  • GSOC Week 11 report

    Introduction

    This week was spent investigating the runtime and debugging executables with gdb. It was fruitful in the sense that it provided me with some interesting pieces of information. Without any further ado, let’s present our findings:

    My findings

    Before starting to play with libpthread and glibc, I wanted to make sure that the goruntime behaved the way I believed it behaved, and to verify some further assumptions about it. These assumptions had to do with the total number of goroutines and the total number of machine threads at various checkpoints in the language runtime.

    • The first thread in the program is initialised during runtime_schedinit.
    • The number of m’s (kernel threads) depends on the number of goroutines. The runtime basically attempts to create an equal number of m’s to run the goroutines. We can observe that every time a new goroutine is created, there are a number of calls to initiate a new kernel thread.
    • There are at least two kernel threads. One that supports the runtime (mainly the garbage collector) and one that executes the code of the go program.

    There is only one small piece of code in the goruntime that creates some confusion for me, and that is the code for initialising a new m. Let me first present the code that confuses me:

    M*
    runtime_newm(void)
    {
    
        ...
    	mp = runtime_mal(sizeof *mp);
    
        ...
    	mcommoninit(mp);
    	mp->g0 = runtime_malg(-1, nil, nil);
    
        ...
    	if(pthread_attr_init(&attr) != 0)
    		runtime_throw("pthread_attr_init");
    	if(pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) != 0)
    		runtime_throw("pthread_attr_setdetachstate");
    
        ...
    }

    I purposely compacted the function for brevity, as it only serves to demonstrate a point. Now, my confusion lies in the line mp->g0 = runtime_malg(-1, nil, nil). It is a piece of code that allocates memory for a new goroutine. Now I am ok with that, but what I do not understand is that new kernel threads (m’s) are supposed to pick and run a goroutine from the global goroutine pool - that is, run an existing one, and not create a new one. Now, runtime_malg is given parameters that don’t initialise a new goroutine properly, but still, new memory is allocated for a new goroutine, and it is returned to mp->g0 from runtime_malg.

    Assuming I have not misunderstood something, and I am not mistaken (which is kind of likely), this is behaviour that could lead to a number of questions and/or problems. For instance, what happens to the goroutine created by runtime_malg? Is it killed after the m is assigned a new goroutine to execute? Is it parked on the global goroutine list? Is it just ignored? Does it affect the runtime scheduler’s goroutine count? This is the last thing I feel I want to clear up regarding gccgo’s runtime.

    gdb

    This week, I also ran the executables created by gccgo through gdb. It was a fertile attempt that, most of the time, confirmed my findings in the goruntime. It also provided some other nice pieces of information regarding the crashing of goroutines, but also left me with a question.

    The code in question that I ran through gdb is this:

    package main
    
    import "fmt"
    
    func say(s string) {
        for i := 0; i < 5; i++ {
            fmt.Println(s)
        }
    }
    
    func main() {
        fmt.Println("[!!] right before a go statement")
        go say("world")
        say ("hello")
    }

    Your very typical hello-world-like goroutine program. Now, setting a breakpoint in main (not the program’s main, which is main.main; main, as far as the runtime is concerned, is the runtime entry point, in go-main.c) and running it through gdb yields the following results:

    Breakpoint 1, main () at ../../../gcc_source/libgo/runtime/go-main.c:52
    52 runtime_check ();
    2:  __pthread_total = 1
    1: runtime_sched.mcount = 0
    (gdb) next
    53 runtime_args (argc, (byte **) argv);
    2: __pthread_total = 1
    1: runtime_sched.mcount = 0
    54 runtime_osinit ();
    2: __pthread_total = 1
    1: runtime_sched.mcount = 0
    63: runtime_schedinit ();
    2: __pthread_total = 1
    1: runtime_sched.mcount = 1

    Up until now, nothing unexpected. The kernel thread is registered with the runtime scheduler during its initialisation process in runtime_schedinit, and that’s why runtime_sched.mcount is reported as zero several times before schedinit is run.

    68 __go_go (mainstart, NULL);
    2: __pthread_total = 1
    1: runtime_sched.mcount = 1
    (gdb) display runtime_sched.gcount
    3: runtime_sched.gcount = 0

    That too is ok, because a new goroutine is registered with the scheduler during the call to __go_go. Now I am gonna fast forward a bit, to a more interesting point.

    ...
    [DEBUG] (in runtime_gogo) new goroutine's status is 2
    [DEBUG] (in runtime_gogo) number of goroutines now is 2
    [New Thread 629.30]
    
    Program received SIGTRAP, Trace/breakpoint trap.
    0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
    3: runtime_sched.gcount = 2
    2: __pthread_total = 2
    1: runtime_sched.mcount = 2
    (gdb) info threads
     Id   Target  Id       Frame
     6    Thread  629.30   0x08048eb7 in main.main () at goroutine.go:12
     5    Thread  629.29   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
    *4    Thread  629.28   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3

    This is getting weird. I mean, libpthread is reporting that 2 threads are active, but gdb reports that 3 are active. Anyway, let’s continue:

    [DEBUG] (in runtime_stoptheworld) stopped the garbage collector
    [DEBUG] (in runtime_starttheworld) starting the garbage collector
    [DEBUG] (in runtime_starttheworld) number of m's now is: 2
    [DEBUG] (in runtime_starttheworld) [note] there is already one gc thread
    [!!] right before a go statement
    
    Program received signal SIGTRAP, Trace/breakpoint trap.
    0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
    3: runtime_sched.gcount = 2
    2: __pthread_total = 2
    1: runtime_sched.mcount = 2
    (gdb) continue
    ... (output omitted by me for brevity)
    
    [DEBUG] (in runtime_newm) Right before the call to pthread_create.
    a.out: ./pthread/pt-create.c:167: __pthread_create_internal: Assertion `({ mach_port_t ktid = __mach_thread_self (); int ok = thread->kernel_thread == ktid;
    __mach_port_deallocate ((__mach_task_self + 0), ktid); ok; })' failed.
    [New Thread 629.31]
    
    Program received signal SIGABRT, Aborted.
    0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
    3: runtime_sched.gcount = 3
    2: __pthread_total = 2
    1: runtime_sched.mcount = 3

    Oh my goodness. At first glance, this seems to be a very serious inconsistency between libpthread and the goruntime. At this point, the go scheduler reports 3 threads (3 registered threads, which means that the flow of execution has passed mcommoninit, the kernel thread initialisation function which also registers the kernel thread with the runtime scheduler) whereas libpthread reports 2 threads.

    But WAIT! Where are you going? Things are about to get even more interesting!

    (gdb) info threads
     Id   Target  Id       Frame
     7    Thread  629.31   0x01f4da00 in entry_point () from /lib/i386-gnu/libpthread.so.0.3
     6    Thread  629.30   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
     5    Thread  629.29   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
    *4    Thread  629.28   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3

    GDB reports 4 threads. Yes, 4 threads ladies and gentlemen. Now take a look closely. 3 threads are in the same frame, with the one with id 4 being the one currently executed. And there is also a pattern. 0x01da48ec is the value of the eip register for all 3 of them.

    That’s one thing that is for certain. Now I already have an idea: why not change the current thread to the one with id 7? I’m sold on the idea, let’s do this:

    (gdb) thread 7
    [Switching to thread 7 (Thread 629.31)]
    #0  0x01f4da00 in entry_point () from /lib/i386-gnu/libpthread.so.0.3
    (gdb) continue
    Continuing.
    
    Program received signal SIGABRT, Aborted.
    [Switching to Thread 629.28]
    0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
    3: runtime_sched.gcount = 3
    2: __pthread_total = 2
    1: runtime_sched.mcount = 3
    (gdb) info threads
     Id   Target  Id       Frame
     7    Thread  629.31   0x01dc08b0 in ?? () from /lib/i386-gnu/libc.so.0.3
     6    Thread  629.30   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
     5    Thread  629.29   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3
    *4    Thread  629.28   0x01da48ec in ?? () from /lib/i386-gnu/libc.so.0.3

    Damn. But I am curious. What’s the next instruction to be executed?

    (gdb) x/i $eip
    => 0x1da48ec: ret

    And what is the next instruction to be executed for the thread with id 7?

    (gdb) x/i $eip
    => 0x1dc08b0: call *%edx

    Conclusion

    Apparently, there is still much debugging left to do to find out what is really happening. But we have got some leads in the right direction that will hopefully lead us to finally finding out where the problem lies, and to correcting it.

    Most importantly, my immediate plan, before I start playing around with libpthread, is to attempt the same debugging run on the same code under Linux (x86). Seeing as go runs cleanly on Linux, it would provide some clues as to what the expected results should be, and where the execution diverges substantially, a clue that might be vital to finding the problem.

  • GSOC week 10 report

    Introduction

    This week was spent attempting to debug the gccgo runtime via print statements. There were many things that I gained from this endeavour, the most significant of which is a great deal of information regarding the bootstrapping of a go process. Let’s proceed to presenting this week’s findings, shall we?

    Findings

    The process bootstrapping sequence

    The code that begins a new go-process is conveniently located in a file called go-main.c, the most significant part of which is the following:

    int
    main (int argc, char **argv)
    {
      runtime_check ();
      runtime_args (argc, (byte **) argv);
      runtime_osinit ();
      runtime_schedinit ();
      __go_go (mainstart, NULL);
      runtime_mstart (runtime_m ());
      abort ();
    }
    
    static void
    mainstart (void *arg __attribute__ ((unused)))
    {
      runtime_main ();
    }

    The process is as follows:

    • First, runtime_check runs and registers os_Args and syscall_Envs as runtime_roots with the garbage collector. I am still investigating what exactly this function is doing, but it seems like some early initialisation of the garbage collector.
    • Secondly, runtime_args is run. Its job is to call a specific argument handler for the arguments passed to main.
    • Thirdly, runtime_osinit is run, whose job is to call the low-level _CPU_COUNT function to get the number of CPUs (in a specific data structure that represents a set of CPUs).
    • After that, runtime_schedinit is run, whose job is to create the very first goroutine (g) and system thread (m); it then continues with parsing the command line arguments and the environment variables. After that it sets the maximum number of CPUs that are to be used (via GOMAXPROCS), runs the first goroutine, and does the last pieces of the scheduler’s initialisation.
    • Following runtime_schedinit, __go_go is run, a function whose purpose is to create a new goroutine, tell it to execute the function passed to it as the first parameter, and then queue that goroutine in the global ready-to-run goroutine pool.
    • Last but not least, runtime_mstart runs, which seems to start the execution of the kernel thread created during runtime_schedinit.

    The very last piece of code that is run (and most probably the most important) is runtime_main. Remember that this is passed as a parameter to a goroutine created during the __go_go call, and its job is to mark the goroutine that called it as the main OS thread, to initialise the scheduler, and to create a goroutine whose job is to release unused memory (from the heap) back to the OS. It then starts executing the process’s user-defined instructions (the code the programmer wrote) via a call to a macro that directs it to __go_init_main in the assembly generated by the compiler.

    Runtime_main is also the function that terminates the execution of a go process, with a call to runtime_exit, which seems to be a macro for the exit function.

    Other findings

    During our debugging sessions we found out that the total count of kernel threads running in a simple program is at least two. The first one is the bootstrap M (the one initialised during the program’s initialisation, inside runtime_schedinit), and there is at least one more, created (I am still investigating the validity of this claim) to be used by the garbage collector.

    A simple go program, such as one doing arithmetic or printing a hello-world-like message, evidently has no issue running. The issues arise when we use a go statement. With all our debugging messages activated, this is how a simple go program flows:

    root@debian:~/Software/Experiments/go# ./a.out
    [DEBUG] (in main) before runtime_mcheck is run
    [DEBUG] (in main) before runtime_args is run
    [DEBUG] (in main) before runtime_osinit is run
    [DEBUG] (in main) before runtime_schedinit is run
    [DEBUG] (in main) before runtime_mstart is run
    [DEBUG] (in runtime_mstart) right before the call to runtime_minit
    [DEBUG] (in mainstart) right before the call to runtime_main
    [DEBUG] (in runtime_main) Beginning of runtime_main
    [DEBUG] (start of runtime_newm) Total number of m's is 1
    [DEBUG] (in runtime_newm) Preparing to create a new thread
    [DEBUG] (in runtime_newm) Right before the call to pthread_create
    [DEBUG] (in runtime_newm) pthread_create returned 0
    [DEBUG] (in runtime_mstart) right before the call to runtime_minit
    [DEBUG] (end of runtime_newm) Total number of m's is 2
    Hello, fotis
    [DEBUG] (in runtime_main) Right before runtime_exit

    And this is how a goroutine powered program fails:

    root@debian:~/Software/Experiments/go# ./a.out
    [DEBUG] (in main) before runtime_mcheck is run
    [DEBUG] (in main) before runtime_args is run
    [DEBUG] (in main) before runtime_osinit is run
    [DEBUG] (in main) before runtime_schedinit is run
    [DEBUG] (in main) before runtime_mstart is run
    [DEBUG] (in runtime_mstart) right before the call to runtime_minit
    [DEBUG] (in mainstart) right before the call to runtime_main
    [DEBUG] (in runtime_main) Beginning of runtime_main
    [DEBUG] (start of runtime_newm) Total number of m's is 1
    [DEBUG] (in runtime_newm) Preparing to create a new thread
    [DEBUG] (in runtime_newm) Right before the call to pthread_create
    [DEBUG] (in runtime_newm) pthread_create returned 0
    [DEBUG] (in runtime_mstart) right before the call to runtime_minit
    [DEBUG] (end of runtime_newm) Total number of m's is 2
    [DEBUG] (start of runtime_new) Total number of m's is 2
    [DEBUG] (in runtime_newm) Preparing to create a new thread.
    [DEBUG] (in runtime_newm) Right before the call to pthread_create
    a.out: ./pthread/pt-create.c:167: __pthread_create_internal: Assertion `({ mach_port_t ktid = __mach_thread_self (); int ok = thread->kernel_thread == ktid;
    __mach_port_deallocate ((__mach_task_self + 0), ktid); ok; })' failed.
    Aborted

    Work for the next week

    I will of course continue to debug via print statements until I have knowledge of the exact flow of execution in the go runtime. Right now I have a very good knowledge of the flow, but there are some things that I need to sort out. For instance, it is not exactly clear to me why we call certain functions, or what they are supposed to be doing at certain points. After I sort this out, I also plan to start debugging libpthread, to see what libpthread’s status is during a hello-world-like program and during a goroutine-powered program, and to see whether we find something interesting in libpthread (like how many threads libpthread reports versus how many the goruntime reports).