bssdata
See the top level README for information on where to find the schematic and programmers reference manual for the ARM processor on the raspberry pi. Also find information on how to load and run these programs.

Based on uart02, the purpose of this example is to demonstrate what you would need to do if you assume .bss is zeros or use .data. And you are asking, what does that even mean? As stated in the top level README, I personally write code so I don't have to mess with what I am about to show you.

Although not carved in stone, many toolchains (assembler, C compiler, linker) use the terms .text, .data and .bss to describe where things go in a binary created by the toolchain. I have seen other names for segments; you will just have to translate. These are all chunks of memory, if you want to think of it that way. Especially with bare metal embedded, you will eventually find a system where some of the memory range is a rom, and your program is there, and some of the memory range is ram, and you want your read/write variables there, etc. The toolchain keeps these segments of memory separate from each other, and the linker places these items in the binary depending on what the linker is told to do through a configuration file, script, command line, whatever mechanism. The kind of binary output is affected by this as well.

How can there be more than one kind of binary output file? Most of the "binary" files we run today are more than just the machine code and data that make up our program. The files tend to have some sort of header so we can detect what they are. If you look at a windows/microsoft .exe file, the first two letters are, or at least used to be, MZ. An elf file, popular with linux, starts with the byte 0x7F followed by the letters ELF.
Then, depending on the file format, there are lots of things that might be in the file, for example a bunch of stuff related to debugging. If you compile for debugging or compile with debugging symbols, the file can have extra info to help the debugger find things in the code to show you on the debugger gui, so you can understand where you are in the higher level source code (the binary file contains machine code). You will see in the disassembly below of a .elf file that there are some global names like _start and fun, etc. These strings are in the elf binary just in case we want to do things like disassemble. Without those symbols in the .elf file all we would see are some hex numbers, no ascii names.

Depending on the binary format and how you linked things, each segment may be in a separate part of the binary file, and the binary file would have information for the loader to place these things at the right addresses so that the code will run properly, or at least it puts them where you told it, right or wrong.

.text refers to the code itself, the machine code that is your program. Note that your program, the machine code, is considered read-only.

.bss is used for storage of global stuff (variables, structs, etc) that was not initialized in the program (this will be explained).

.data is used for storage of global stuff that was initialized in the program.

.rodata is read only data. This is global stuff that was declared to be variables or whatever, but declared to be read only (const). Depending on the flavor and version of toolchain or linker script you are using, .rodata might be combined into the .text segment, since both are read-only segments as far as the toolchain is concerned; bugs in your code may say otherwise.

So we take the fun.c program in this directory. Note the fun.c part of this example is non-functional; don't load it, don't run it.
unsigned int fun2 ( unsigned int );

const unsigned int x=2;
unsigned int y;
unsigned int z=7;

void fun ( unsigned int a )
{
    unsigned int n;
    n=5;
    fun2(a);
    fun2(x);
    fun2(y);
    fun2(z);
    fun2(n);
}

Here is the linker script used.

MEMORY
{
    calvin : ORIGIN = 0x1000, LENGTH = 0x1000
    hobbes : ORIGIN = 0x2000, LENGTH = 0x1000
    susie : ORIGIN = 0x3000, LENGTH = 0x1000
    rosalyn : ORIGIN = 0x4000, LENGTH = 0x1000
}

SECTIONS
{
    .text : { *(.text*) } > calvin
    .bss : { *(.bss*) } > hobbes
    .rodata : { *(.rodata*) } > susie
    .data : { *(.data*) } > rosalyn
}

When compiled, linked with the simple linker script and disassembled it looks like this

Disassembly of section .text:

00001000 <_start>:
    1000: eb000001 bl 100c <fun>
    1004: eafffffe b 1004 <_start+0x4>

00001008 <fun2>:
    1008: e12fff1e bx lr

0000100c <fun>:
    100c: e92d4008 push {r3, lr}
    1010: ebfffffc bl 1008 <fun2>
    1014: e3a00002 mov r0, #2
    1018: ebfffffa bl 1008 <fun2>
    101c: e59f3020 ldr r3, [pc, #32] ; 1044 <fun+0x38>
    1020: e5930000 ldr r0, [r3]
    1024: ebfffff7 bl 1008 <fun2>
    1028: e59f3018 ldr r3, [pc, #24] ; 1048 <fun+0x3c>
    102c: e5930000 ldr r0, [r3]
    1030: ebfffff4 bl 1008 <fun2>
    1034: e3a00005 mov r0, #5
    1038: ebfffff2 bl 1008 <fun2>
    103c: e8bd4008 pop {r3, lr}
    1040: e12fff1e bx lr
    1044: 00002000 andeq r2, r0, r0
    1048: 00004000 andeq r4, r0, r0

Disassembly of section .bss:

00002000 <y>:
    2000: 00000000 andeq r0, r0, r0

Disassembly of section .rodata:

00003000 <x>:
    3000: 00000002 andeq r0, r0, r2

Disassembly of section .data:

00004000 <z>:
    4000: 00000007 andeq r0, r0, r7

I have all the types represented

const unsigned int x=2;
unsigned int y;
unsigned int z=7;

void fun ( unsigned int a )
{
    unsigned int n;

The variable x is declared using const. This tells the compiler that this is a variable, it has this name, I want it initialized to some value before my program starts, but I will only ever read from it; I will never change this variable's contents.
You will find this variable ends up in either .rodata or .text.

The variable y is a global variable that has not been initialized. We are supposed to be able to assume that when our program starts this variable will be initialized to zero. This variable will be found in the .bss segment.

The variable z is a global variable as well, but it is initialized. We expect it to be this value when our program starts.

Variable a is a parameter. It is passed in based on the compiler rules for that processor, etc; typically it lives in a register or on the stack.

Lastly, variable n is a local variable. It also does not have a named segment, but typically lives on the stack or in registers or both. In this case, with such a simple program, the optimizer completely removed the variable from having a home. We loaded the variable with a constant and then used the variable in a function call; that was replaced with the constant being fed right into the register used to send the parameter to the function.

unsigned int n;
n=5;
...
fun2(n);

Was optimized to a simple mov 5 and call the function:

    1034: e3a00005 mov r0, #5
    1038: ebfffff2 bl 1008 <fun2>

The variable n's home, if you will, is embedded in the bits of the instruction itself (note the lower bits of that instruction).

The simple linker script defined four separate memory regions, and then associated those regions with the various segment definitions. Many times in a linker script you will see the words rom and ram and flash and eeprom used to define the memory regions. I intentionally used non-computer like names, both in simple and in memmap, to get you over the idea that those names have any special meaning to the linker tool. It would be a mistake to think the linker knows eeprom from ram and does something for you as a result.

Because of the linker script we saw that our variables landed where we told the toolchain to put them.

    calvin : ORIGIN = 0x1000, LENGTH = 0x1000
...
    .text : { *(.text*) } > calvin

results in .text starting at address 0x1000

Disassembly of section .text:

00001000 <_start>:
    1000: eb000001 bl 100c <fun>

    hobbes : ORIGIN = 0x2000, LENGTH = 0x1000
    .bss : { *(.bss*) } > hobbes

results in .bss starting at address 0x2000

Disassembly of section .bss:

00002000 <y>:
    2000: 00000000 andeq r0, r0, r0

and x is in .rodata in the disassembly I created above and z is in .data.

Note that in the .text, .data and .rodata segments the data in the binary is filled with non-zero values. That doesn't mean you can't have some zeros there; the point is that the z variable is shown as a 7 in the binary as we wanted. If you can read the assembly you will also note that even though the compiler knows we initialized z to a 7, the code reads y and z from their proper memory locations and does not optimize them away like it did with x and n.

So it appears that the .elf file we created has all the parts defined to be in all the right places. For this to work, though, there needs to be a program that reads the .elf file and places these items in memory at the right places before allowing the program to run. This comes in many forms but can be called a loader. When running a .exe file in windows, or a .elf file or other file format in linux, there is a loader in the operating system that reads this extra info in the binary file and places the bits and bytes in the right place in ram. We don't have a loader; we are running bare metal embedded here, so we have to do these things ourselves.

On the raspberry pi, for example, this is solved by using the toolchain to convert your program from a .elf to a .bin file. A .bin file is, for the most part, commonly assumed to be just the bits and bytes of your program, a literal image of your memory.
Now to clarify, the kernel.img file for the raspberry pi may not represent memory starting at ARM's address zero. In fact these programs are compiled as .bin files to be loaded at address 0x8000, and the gpu that boots the raspberry pi does that unless told otherwise using a script file that it looks for.

If you have dabbled in these things before, you may have found this can be dangerous. For example, what if you defined one segment to be at address 0x10000000 and another at address 0x70000000? Let's say you have 0x100 bytes at 0x10000000 and only two bytes at 0x70000000. If you want to make a single file that holds the memory image of these two segments, that file will need to be 0x70000002 - 0x10000000 = 0x60000002 bytes in size. That is a huge file, all to hold 0x102 bytes. Maybe you can see why most of the time our operating systems, etc don't actually use memory images of the programs but these hybrid files, which are part machine code, part raw data and part descriptions of where things are to go.

Imagine the typical bare metal embedded situation. Your processor powers up and boots off of a rom of some flavor (rom, prom, eprom, eeprom, flash), something that is non-volatile and as a result read only, or at least for the practical purposes of booting your processor that memory space is read only. The ram in your bare metal system comes up filled with random garbage, because that is what the transistors that store that ram do; they have not been initialized, and there is no rule that says they have to be, or have to be initialized to a specific value. Many systems use dram, which often has to be initialized in some form or fashion, and as a result you might end up filling that memory with some value, or leave it with the last value you used during initialization.

So we have our rom, and that really needs to have .text in it. We have some ram that will no doubt be the home for .bss and .data. But we have a problem.
How is the ram at the .bss and .data addresses going to get loaded with the zeros or non-zero values we are expecting? We don't have an operating system here. The answer is a bit complicated. At some point before we start using any of the .bss or .data variables in our program (which we as C programmers assumed would be zero or whatever value we initialized them to) we need to prepare that memory to meet those expectations. And we need to do it in a way that doesn't require any .bss or .data variables. Even worse, the non-zero items in .data need to be saved somewhere in the non-volatile memory so that we don't lose that information; we need to somehow get that data saved in the rom and get it copied to ram in the right place.

The typical solution is to have the bootstrap or startup code do this. This isn't necessarily the boot code for the processor. Even when running a program on an operating system, the solution may be to have .bss zeroed by the first bit of asm in your program before main() is called. The toolchain often supplies this startup code; if you don't tell it not to, the toolchain will use its default linker script and default startup code (which, as we complicate things, have an intimate relationship).

Assume we don't want to compile everything, count up how many bytes are in each segment, hardcode those numbers by hand into some asm, re-build, make sure the sizes and offsets have not changed, repeat until they don't, and end up with startup code that is custom to this program. Add or remove a variable somewhere, or re-arrange them in the code, and you would have to re-touch your startup code. Possible but not wise; the better answer is generic startup code. But to have generic startup code we need to know where all these segments are and what size they are, etc.
The gnu solution has two parts. First, you use the linker script language to define some variables. These variables are filled in by the linker and will ultimately contain the starting address for a segment like .bss or .data, and the size or ending address or both. In the case of .data we also need to tell the linker script two things. One is the non-volatile memory space where we want .data to live when the power is off; the other is the ram address space where we want it to live when we are running. Our variables are read-write; they just happen to be initialized to some number on start, then we can change them in our program later.

So if you look at the real example in this directory and the memmap file

MEMORY
{
    bob : ORIGIN = 0x8000, LENGTH = 0x1000
    ted : ORIGIN = 0xA000, LENGTH = 0x1000
}

SECTIONS
{
    .text : { *(.text*) } > bob
    __data_rom_start__ = .;
    .data : {
        __data_start__ = .;
        *(.data*)
    } > ted AT > bob
    __data_end__ = .;
    __data_size__ = __data_end__ - __data_start__;
    .bss : {
        __bss_start__ = .;
        *(.bss*)
    } > ted
    __bss_end__ = .;
    __bss_size__ = __bss_end__ - __bss_start__;
}

you see there is more stuff in the SECTIONS section of the linker script. I don't really want to explain all of it; it is fairly straightforward, including the ted AT bob thing. We are pretending here that bob is rom or flash (where .text lives) and ted is ram, sram, dram, whatever. You have to be super careful to place these variables inside or outside of the right brackets to have it all work; this takes practice and some iterations to get right. I may not have it right, but the above works today.

As mentioned way above, I intentionally did not use memory segment names like rom and ram, to demonstrate that the linker script sees those as ascii labels and for the most part doesn't care what you call them.

Now this example is strange because I wanted to try to show the problems you will face with a single program, not having to have you load more than one program, etc.
Notice how above I carefully stated that you need to initialize .bss and .data at some point before you use them, and without using them to get to that point. Most of the time you are going to see some sort of assembly solution, in the assembly code that runs before your main() C function is called. This assembly code works in harmony with the linker script variables.

The __bss_start__ and such variables are addresses as far as the toolchain is concerned, not values. When developing, I tried to have the program display the value of __bss_start__ by declaring it an external global variable. What the compiler did was take the address __bss_start__, read that memory location and print that value. So in vectors.s I made some other global variables and then initialized them to the linker script variables. These are in the .text section, so they are filled in for us, and .bss and .data are not required to find these values and use them to prepare .bss and .data.

Where my weird solution comes in is that I don't have asm code that zeros .bss and copies .data from point a to point b. I do this in the C code, late in my program. As mentioned, the reason why is I want to show you that when you display these global variables before preparing memory they, well, as you now expect, have the wrong value. Then once we copy and zero things they have the right values.

//display before initialized
hexstring(x);
hexstring(y);
hexstring(z);
//zero out .bss
for(ra=bss_start;ra<bss_end;ra+=4) PUT32(ra,0);
//copy .data from non-volatile .text to its home where the code expects it
//to be.
for(ra=data_start,rb=data_rom_start;ra<data_end;ra+=4,rb+=4)
    PUT32(ra,GET32(rb));
//display the variables again now that ram is prepped.
hexstring(x);
hexstring(y);
hexstring(z);

I used my serial bootloader, xmodemed the program over and ran it. The last part, the interesting part, of the output is:

12345678
0000A008
000082EC
0000A000
00000000
00000000
00000000
00000000
00000002
00000007

here again with comments

12345678
0000A008 this is basically __bss_start__
000082EC __data_rom_start__
0000A000 __data_start__
00000000 display of the x variable before memory prep
00000000 display of the y variable before memory prep
00000000 display of the z variable before memory prep
00000000 display of x after memory prep
00000002 display of y after memory prep
00000007 display of z after memory prep

In this case apparently memory was zeroed by someone, so the .bss data actually looks right even though that was just dumb luck. You could easily modify my bootloader (or I should have) to make that memory random or non-zero, further demonstrating the problem.

So after all of that, I repeat, I don't do this with my code. Why don't I do this? First and foremost, these days I try to write portable code. This code is not portable if you do this. You have to start messing with linker scripts that are specific to the gnu toolchain and, even worse, sometimes specific to the version of binutils. Then your startup code, which comes before the first call to a C function, relies on linker script variables that are specific to the gnu linker and linker version. The linker script goes from pretty to very ugly very fast, and warrants extra explaining as to what it is doing. It is just not portable, and it is ugly. (Remember beauty is in the eye of the beholder; you may find all of my code ugly, but then you probably wouldn't be reading this far down into this file if that were the case.)
Instead of this

unsigned int fun2 ( unsigned int );

const unsigned int x=2;
unsigned int y;
unsigned int z=7;

void fun ( unsigned int a )
{
    unsigned int n;
    n=5;
    fun2(a);
    fun2(x);
    fun2(y);
    fun2(z);
    fun2(n);
}

write your code like this:

unsigned int fun2 ( unsigned int );

const unsigned int x=2;
unsigned int y;
unsigned int z;

void fun ( unsigned int a )
{
    unsigned int n;
    n=5;
    y=0;
    z=7;
    fun2(a);
    fun2(x);
    fun2(y);
    fun2(z);
    fun2(n);
}

and guess what, you don't have a .data segment anymore; you can remove that from the linker script along with all the baggage that goes with it. Now you do need .bss, but you don't need to zero it out. You just need to have it accurately defined in the linker script to an address range that is actually ram. As for .rodata, if your toolchain needs it: the earlier example separated it to demonstrate things, but I simply have .rodata be part of the same space as .text. So after changing those few lines of C code I would then go from this

MEMORY
{
    bob : ORIGIN = 0x8000, LENGTH = 0x1000
    ted : ORIGIN = 0xA000, LENGTH = 0x1000
}

SECTIONS
{
    .text : { *(.text*) } > bob
    .rodata : { *(.rodata*) } > bob
    __data_rom_start__ = .;
    .data : {
        __data_start__ = .;
        *(.data*)
    } > ted AT > bob
    __data_end__ = .;
    __data_size__ = __data_end__ - __data_start__;
    .bss : {
        __bss_start__ = .;
        *(.bss*)
    } > ted
    __bss_end__ = .;
    __bss_size__ = __bss_end__ - __bss_start__;
}

to this

MEMORY
{
    bob : ORIGIN = 0x8000, LENGTH = 0x1000
    ted : ORIGIN = 0xA000, LENGTH = 0x1000
}

SECTIONS
{
    .text : { *(.text*) } > bob
    .rodata : { *(.rodata*) } > bob
    .bss : { *(.bss*) } > ted
}

and painfully simple startup code

    mov sp,#0x8000
    mov r0,pc
    bl notmain

Yes, there is a cost. Some of those initializations that are now in .text can take up more room than they used to. Worst case for these 32 bit or smaller variables is you have one instruction that gets the value from .text, one instruction that gets the address for it in ram, and an instruction that writes the value to ram.
Plus a location in .text to hold the address in ram for that variable and a location to hold the constant we want to write to it, kind of like this

    1010: e59f503c ldr r5, [pc, #60] ; 1054 <fun+0x48>
    1014: e59f403c ldr r4, [pc, #60] ; 1058 <fun+0x4c>
    1018: e3a03000 mov r3, #0
    101c: e5853000 str r3, [r5]
    1020: e3a03007 mov r3, #7
    1024: e5843000 str r3, [r4]

    1054: 00002004 andeq r2, r0, r4
    1058: 00002000 andeq r2, r0, r0

Because this example used small variables, the mov r3,#0 for example was capable of holding the constant in the instruction encoding itself. Same for the #7, but had it been some other number, say

z = 0x1234;

    1010: e59f503c ldr r5, [pc, #60] ; 1054 <fun+0x48>
    1014: e3a03000 mov r3, #0
    1018: e59f4038 ldr r4, [pc, #56] ; 1058 <fun+0x4c>
    101c: e5853000 str r3, [r5]
    1020: e59f3034 ldr r3, [pc, #52] ; 105c <fun+0x50>
    1024: e5843000 str r3, [r4]

    1054: 00002004 andeq r2, r0, r4
    1058: 00002000 andeq r2, r0, r0
    105c: 00001234 andeq r1, r0, r4, lsr r2

For this particular processor family, other processors like x86 manage constants differently...

Now the two locations in .text, for example

    1054: 00002004 andeq r2, r0, r4
    1058: 00002000 andeq r2, r0, r0

are not additional costs, because those would have been used by the code that reads the variables as well (I have .bss and .data separate here)

    101c: e59f3020 ldr r3, [pc, #32] ; 1044 <fun+0x38>
    1020: e5930000 ldr r0, [r3]
    1024: ebfffff7 bl 1008 <fun2>
    1028: e59f3018 ldr r3, [pc, #24] ; 1048 <fun+0x3c>
    102c: e5930000 ldr r0, [r3]
    1030: ebfffff4 bl 1008 <fun2>

    1044: 00002000 andeq r2, r0, r0
    1048: 00004000 andeq r4, r0, r0

The point here is that the address of each of these variables still took up the same amount of .text space. What we didn't have when we used a .data and assumed .bss was zeroed for us is the code to initialize each variable one at a time. There would have been a small loop for .bss and a small loop for .data; if .bss and/or .data were of any decent size then there is a lot less waste.
Another thing that may be gnawing at you is that this whole thing is about global variables. Raise your hand if you use global variables. Many folks go out of their way not to. I happen to use them from time to time; I used to always and only use them, but now it is a bit of a mixture. Local variables you have to initialize inline, one at a time, and that is as costly as the solution I am proposing, so you are already likely programming using that one at a time solution. So you are already in tune with my solution to this .bss and .data problem.

The most important thing, though, is what happens when you use local variables, do those initializations locally, and manage the size of your functions. The optimizer (if you use it) will remove a lot of this extra code and memory. For example:

unsigned int fun2 ( unsigned int );

const unsigned int x=2;
unsigned int y;
unsigned int z=7;

void fun ( unsigned int a )
{
    unsigned int n;
    n=5;
    fun2(a);
    fun2(x);
    fun2(y);
    fun2(z);
    fun2(n);
}

The variable x is a read-only variable. Variable n is local and only used to feed the fun2() function.

    1014: e3a00002 mov r0, #2
    1018: ebfffffa bl 1008 <fun2>

    1034: e3a00005 mov r0, #5
    1038: ebfffff2 bl 1008 <fun2>

The compiler did not waste the .text space and clock cycles to fetch x from rom; it simply encoded it inline. Likewise the local variable n did not consume stack space (there was no stack frame created at all, in fact); the value was encoded directly in the instruction as well. When you use globals you can see that it has to get the address, then read the contents of that address, and then it can do something with your variable. If you change the variable it has to go through those steps again to save the variable.
This whole example and lengthy README is here to hopefully help you to realize what happens when you take one of my examples:

unsigned int fun2 ( unsigned int );

const unsigned int x=2;
unsigned int y;
unsigned int z;

void fun ( unsigned int a )
{
    unsigned int n;
    y=0;
    z=2;
    n=5;
    fun2(a);
    fun2(x);
    fun2(y);
    fun2(z);
    fun2(n);
}

And start adding things or changing things:

unsigned int fun2 ( unsigned int );

const unsigned int x=2;
unsigned int y;
unsigned int z;
unsigned int m=12;

void fun ( unsigned int a )
{
    unsigned int n;
    y=0;
    z=2;
    n=5;
    fun2(a);
    fun2(x);
    fun2(y);
    fun2(z);
    fun2(n);
    fun2(m);
}

And then spend a sleepless night or weekend struggling to understand why m is not 12 when used in the code... Well, now you know. And now you know why I don't do it (not all the reasons but some); you are welcome to do your own thing. And now you know what my statement in the top level README is all about.

UPDATE:

Since the ARM runs completely out of ram on the raspberry pi, and usually there is no reason to split the different segments around, we can pack them all up. Here is a simple solution for this platform.
bootstrap.s

.globl _start
_start:
    mov sp,#0x00010000
    bl notmain
hang: b hang

notmain.c

const unsigned int readonly=7;
unsigned int dotdata=9;
unsigned int dotbss[16];

void notmain ( void )
{
    dotbss[3]+=readonly;
}

lscript

MEMORY
{
    ram : ORIGIN = 0x8000, LENGTH = 0x18000
}

SECTIONS
{
    .text : { *(.text*) } > ram
    .bss : { *(.bss*) } > ram
    .rodata : { *(.rodata*) } > ram
    .data : { *(.data*) } > ram
}

> arm-none-eabi-as bootstrap.s -o bootstrap.o
> arm-none-eabi-gcc -O2 -c notmain.c -o notmain.o
> arm-none-eabi-ld -T lscript bootstrap.o notmain.o -o hello.elf
> arm-none-eabi-objdump -D hello.elf

hello.elf: file format elf32-littlearm

Disassembly of section .text:

00008000 <_start>:
    8000: e3a0d801 mov sp, #65536 ; 0x10000
    8004: eb000000 bl 800c <notmain>

00008008 <hang>:
    8008: eafffffe b 8008 <hang>

0000800c <notmain>:
    800c: e59f300c ldr r3, [pc, #12] ; 8020 <notmain+0x14>
    8010: e593200c ldr r2, [r3, #12]
    8014: e2822007 add r2, r2, #7
    8018: e583200c str r2, [r3, #12]
    801c: e12fff1e bx lr
    8020: 00008024 andeq r8, r0, r4, lsr #32

Disassembly of section .bss:

00008024 <dotbss>:
    ...

Disassembly of section .rodata:

00008064 <readonly>:
    8064: 00000007 andeq r0, r0, r7

Disassembly of section .data:

00008068 <dotdata>:
    8068: 00000009 andeq r0, r0, r9

> arm-none-eabi-objcopy hello.elf -O binary kernel.img
> ls -al kernel.img
-rwxr-xr-x 1 root root 108 Sep 23 20:47 kernel.img
> hexdump -C kernel.img
00000000  01 d8 a0 e3 00 00 00 eb  fe ff ff ea 0c 30 9f e5  |.............0..|
00000010  0c 20 93 e5 07 20 82 e2  0c 20 83 e5 1e ff 2f e1  |. ... ... ..../.|
00000020  24 80 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |$...............|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000060  00 00 00 00 07 00 00 00  09 00 00 00              |............|
0000006c

This:

MEMORY
{
    ram : ORIGIN = 0x8000, LENGTH = 0x18000
}

SECTIONS
{
    .text : { *(.text*) } > ram
    .bss : { *(.bss*) } > ram
    .rodata : { *(.rodata*) } > ram
    .data : { *(.data*) } > ram
}

Is not nearly as ugly as this:

SECTIONS
{
    .text : { *(.text*) } > bob
    __data_rom_start__ = .;
    .data : {
        __data_start__ = .;
        *(.data*)
    } > ted AT > bob
    __data_end__ = .;
    __data_size__ = __data_end__ - __data_start__;
    .bss : {
        __bss_start__ = .;
        *(.bss*)
    } > ted
    __bss_end__ = .;
    __bss_size__ = __bss_end__ - __bss_start__;
}

Both are using compiler/linker tricks to reach a goal. The less ugly one gives you everything you want: you get your .bss already zeroed, and you get .data where you can use it. With that simpler linker script "all you have to do is" make sure that you have at least one .data item or .rodata item, so that objcopy is forced to place them after .bss in the image and forced to pad .bss with zeros in the image in order to place .data and/or .rodata in the right place.

You can use this on the Raspberry Pi and it will work just fine. On other embedded platforms, where you have non-volatile memory (rom/flash) for booting the code and a separate place for ram, and you want to keep your code in rom and data in ram, you have to use the more ugly solutions, or do as I do and simply don't have .data and don't care if .bss is zeroed.