forked from dwelch67/raspberrypi
See the top level README for information on where to find the
schematic and programmers reference manual for the ARM processor
on the raspberry pi. Also find information on how to load and run
these programs.
Based on uart02, the purpose of this example is to demonstrate what
you would need to do if you assume .bss is zeros or use .data. And
you are asking, what does that even mean? As stated in the top level
README I personally write code so I dont have to mess with what I am
about to show you.
Although not carved in stone, many toolchains (assembler, compiler (C),
linker) use terms .text and .data and .bss to describe where things
go in a binary created by the toolchain. Now I have seen other names
for segments, you will just have to translate.
These are all chunks of memory if you want to think of it that way, esp
with bare metal embedded you will eventually find a system where some
of the memory range is a rom, and your program is there and some of the
memory range is ram and you want your read/write variables there, etc.
The toolchain keeps these segments of memory separate from each other
and the linker places these items in the binary depending on what the
linker is told to do through a configuration file, script, command line,
whatever mechanism. What kind of binary output you produce is affected
by this as well. How can there be more than one kind of binary output file?
Most of the "binary" files we run today are more than just the machine
code and data that makes up our program. The files tend to have some
sort of header so we can detect that is what they are, if you look at
a windows/microsoft .exe file the first two letters are or at least used
to be MZ, an elf file popular with linux starts with the byte 0x7F
followed by the letters ELF.
Then depending on the file format there are lots of things that might
be in the file, for example a bunch of stuff related to debugging, if
you compile for debugging or compile with debugging symbols the file
can have extra info to help the debugger find things in the code to show
you on the debugger gui to allow you to understand where you are in
the higher level source code (the binary file contains machine code).
You will see in the disassembly below of a .elf file, there are some
global names like _start and fun, etc. These strings are in the
elf binary just in case we want to do things like disassemble. Otherwise
without those symbols in the .elf file all we would see are some
hex numbers, no ascii names. Depending on the binary format and how
you linked things each segment may be in separate parts of the binary
file, and the binary file would have information for the loader to
place these things at the right addresses so that the code will run
properly, or at least it puts it where you told it, right or wrong.
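As a rough illustration of the header idea, here is a minimal sketch in C
that looks at the first bytes of a file image and guesses the format. The
names identify_magic and file_kind are made up for this example, they are
not part of this repo or of any toolchain.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum file_kind { KIND_UNKNOWN, KIND_ELF, KIND_MZ };

/* Look at the first few bytes of a file image and guess its format. */
enum file_kind identify_magic(const unsigned char *buf, size_t len)
{
    /* ELF files start with the byte 0x7F followed by the letters ELF.
       The string is split so that \x7f is not read as a longer hex escape. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) return KIND_ELF;
    /* DOS/Windows executables start with the letters MZ. */
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z') return KIND_MZ;
    return KIND_UNKNOWN;
}
```

A raw .bin memory image has no such header, so a check like this reports
it as unknown, the first bytes are simply the first instruction.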
.text refers to the code itself, the machine code that is your program.
note that your program, the machine code, is considered read-only.
.bss is used for storage of global stuff (variables, structs, etc)
that were not initialized in the program (this will be explained).
.data is used for storage of global stuff that was initialized in the
program.
.rodata is read only data, this is global stuff that was declared to
be variables or whatever but declared to be read only (const). Depending
on the flavor and version of toolchain or linker script you are using
.rodata might be combined in the .text segment since both are read-only
segments as far as the toolchain is concerned, bugs in your code may
say otherwise.
so we take the fun.c program in this directory. Note the fun.c part
of this example is non-functional, dont load it, dont run it.
unsigned int fun2 ( unsigned int );
const unsigned int x=2;
unsigned int y;
unsigned int z=7;
void fun ( unsigned int a )
{
unsigned int n;
n=5;
fun2(a);
fun2(x);
fun2(y);
fun2(z);
fun2(n);
}
Here is the linker script used.
MEMORY
{
calvin : ORIGIN = 0x1000, LENGTH = 0x1000
hobbes : ORIGIN = 0x2000, LENGTH = 0x1000
susie : ORIGIN = 0x3000, LENGTH = 0x1000
rosalyn : ORIGIN = 0x4000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > calvin
.bss : { *(.bss*) } > hobbes
.rodata : { *(.rodata*) } > susie
.data : { *(.data*) } > rosalyn
}
When compiled, linked with the simple linker script and disassembled it
looks like this
Disassembly of section .text:
00001000 <_start>:
1000: eb000001 bl 100c <fun>
1004: eafffffe b 1004 <_start+0x4>
00001008 <fun2>:
1008: e12fff1e bx lr
0000100c <fun>:
100c: e92d4008 push {r3, lr}
1010: ebfffffc bl 1008 <fun2>
1014: e3a00002 mov r0, #2
1018: ebfffffa bl 1008 <fun2>
101c: e59f3020 ldr r3, [pc, #32] ; 1044 <fun+0x38>
1020: e5930000 ldr r0, [r3]
1024: ebfffff7 bl 1008 <fun2>
1028: e59f3018 ldr r3, [pc, #24] ; 1048 <fun+0x3c>
102c: e5930000 ldr r0, [r3]
1030: ebfffff4 bl 1008 <fun2>
1034: e3a00005 mov r0, #5
1038: ebfffff2 bl 1008 <fun2>
103c: e8bd4008 pop {r3, lr}
1040: e12fff1e bx lr
1044: 00002000 andeq r2, r0, r0
1048: 00004000 andeq r4, r0, r0
Disassembly of section .bss:
00002000 <y>:
2000: 00000000 andeq r0, r0, r0
Disassembly of section .rodata:
00003000 <x>:
3000: 00000002 andeq r0, r0, r2
Disassembly of section .data:
00004000 <z>:
4000: 00000007 andeq r0, r0, r7
I have all the types represented
const unsigned int x=2;
unsigned int y;
unsigned int z=7;
void fun ( unsigned int a )
{
unsigned int n;
the variable x is declared using const, this tells the compiler that
this is a variable, it has this name, I want it initialized to some
value before my program starts, but I will only ever read from it I
will never change this variable's contents. You will find this variable
ends up in either .rodata or .text
The variable y, is a global variable, that has not been initialized. We
are supposed to be able to assume that when our program starts this
variable will be initialized to zero. This variable will be found in
the .bss segment.
The variable z is a global variable as well, but it is initialized. We
expect it to be this value when our program starts.
Variable a is a parameter, it is passed in based on the compiler rules
for that processor, etc. typically it lives in a register or on the
stack.
Lastly variable n is a local variable, it also does not have a named
segment, but typically lives on the stack or in registers or both. In
this case with such a simple program the optimizer completely removed
the variable from having a home, loading the variable with a constant
and then passing the variable to a function was replaced with the constant
being fed right into the register used to send the parameter to the function.
unsigned int n;
n=5;
...
fun2(n);
Was optimized to a simple mov 5 and call the function:
1034: e3a00005 mov r0, #5
1038: ebfffff2 bl 1008 <fun2>
The variable n's home if you will is embedded in the bits in the
instruction itself (note the lower bits of that instruction).
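To make "the lower bits of that instruction" concrete, here is a small
sketch that pulls the constant back out of an ARM (A32) data processing
instruction with an immediate operand. The helper name arm_mov_immediate
is hypothetical. It assumes the classic encoding where bits 7:0 hold an
8 bit value and bits 11:8 hold a rotate count applied in steps of two.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n in 0..31). */
static uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Hypothetical helper: recover the constant from an A32 instruction
   such as mov r0,#5 (0xe3a00005). */
uint32_t arm_mov_immediate(uint32_t insn)
{
    uint32_t imm8 = insn & 0xFFu;        /* bits 7:0, the 8-bit value */
    unsigned rot = (insn >> 8) & 0xFu;   /* bits 11:8, rotate count */
    return ror32(imm8, 2u * rot);        /* rotation is in steps of two bits */
}
```

Feeding it 0xe3a00005 gives back the 5 that was variable n, and 0xe3a00002
gives back the 2 that was x.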
The simple linker script defined four separate memory regions, and then
associated those regions with the various segment definitions. Many times
in a linker script you will see the words rom and ram and flash and eeprom
to define the memory regions. I intentionally used non-computer like
names both in simple and in memmap, to get you over this idea that those
names have any special meaning to the linker tool. This would be a
mistake to think the linker knows eeprom from ram and does something
for you as a result.
Because of the linker script we saw that our variables landed where we
told the toolchain to put them.
calvin : ORIGIN = 0x1000, LENGTH = 0x1000
...
.text : { *(.text*) } > calvin
results in .text starting at address 0x1000
Disassembly of section .text:
00001000 <_start>:
1000: eb000001 bl 100c <fun>
hobbes : ORIGIN = 0x2000, LENGTH = 0x1000
.bss : { *(.bss*) } > hobbes
results in .bss starting at address 0x2000
Disassembly of section .bss:
00002000 <y>:
2000: 00000000 andeq r0, r0, r0
and x is in .rodata in the disassembly I created above
and z is in .data.
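If you want to double check where an address from the disassembly landed,
a sketch like the following does the lookup by hand. The table simply
repeats the four MEMORY entries from the simple linker script, the struct
and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct region { const char *name; uint32_t origin; uint32_t length; };

/* The four regions from the simple linker script. */
static const struct region regions[] = {
    { "calvin",  0x1000, 0x1000 },
    { "hobbes",  0x2000, 0x1000 },
    { "susie",   0x3000, 0x1000 },
    { "rosalyn", 0x4000, 0x1000 },
};

/* Return the name of the region holding addr, or NULL if none does. */
const char *region_of(uint32_t addr)
{
    for (size_t i = 0; i < sizeof regions / sizeof regions[0]; i++)
        /* unsigned subtraction wraps for addr < origin, so one compare is enough */
        if (addr - regions[i].origin < regions[i].length)
            return regions[i].name;
    return NULL;
}
```

The names are only labels here too, the lookup cares about the address
ranges and nothing else.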
note that for the .text, .rodata and .data segments the binary holds
the actual values, mostly non-zero. doesnt mean you cant have some zeros
there, point being the z variable is shown as a 7 in the binary as we
wanted.
if you can read the assembly you will also note that even though the
compiler knows that we initialized z to a 7, the code reads y and z
from their memory locations (the ldr from 0x2000 and 0x4000) and does
not optimize them away. The const x and the local n on the other hand
were turned into mov r0,#2 and mov r0,#5.
So it appears that the .elf file we created has all the parts defined
to be in all the right places. for this to work though there needs
to be a program that reads the .elf file and places these items in
memory at the right places, before allowing the program to run. This
comes in many forms but can be called a loader. When running a .exe
file in windows or a .elf file or other file format in linux, there is
a loader in the operating system that reads this extra info in the
binary file and places the bits and bytes in the right place in ram.
We dont have a loader, we are running bare metal embedded here, we have
to do these things ourselves. How this is solved on the raspberry pi
for example is that you use the toolchain to convert your program
from a .elf to a .bin file. A .bin file is for the most part or commonly
assumed to be just the bits and bytes of your program, a literal image
of your memory. Now to clarify the kernel.img file for the raspberry
pi may not represent memory starting at ARM's address zero, in fact these
programs are compiled as .bin files to be loaded at address 0x8000, and
the gpu that boots the raspberry pi does that unless told otherwise
using a script file that it looks for. if you have dabbled in these
things before you may have found this can be dangerous. For example
what if you defined one segment to be at address 0x10000000 and another
at address 0x70000000. Lets say you have 0x100 bytes at 0x10000000
and only two bytes at 0x70000000, if you want to make a single file
that holds the memory image of these two segments that file will need
to be 0x70000002 - 0x10000000 = 0x60000002 bytes in size, that is a huge
file. all to hold 0x102 bytes. Maybe you can see why most of the time
our operating systems, etc dont actually use memory images of the
programs but these hybrid files which are part machine code, part raw
data and part descriptions of where things are to go.
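The arithmetic above can be sketched in a few lines of C. This is a
hosted illustration, not something to run on the pi, and the struct and
function names are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct segment { uint64_t addr; uint64_t size; };

/* Size in bytes of one flat memory image that covers every segment:
   from the lowest start address to the highest end address. */
uint64_t flat_image_size(const struct segment *seg, size_t count)
{
    if (count == 0) return 0;
    uint64_t lo = seg[0].addr, hi = seg[0].addr + seg[0].size;
    for (size_t i = 1; i < count; i++) {
        if (seg[i].addr < lo) lo = seg[i].addr;
        if (seg[i].addr + seg[i].size > hi) hi = seg[i].addr + seg[i].size;
    }
    return hi - lo;
}

/* Bytes that actually carry program contents. */
uint64_t payload_size(const struct segment *seg, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) total += seg[i].size;
    return total;
}
```

With the two segments from the text, 0x100 bytes at 0x10000000 and two
bytes at 0x70000000, the image comes out as 0x60000002 bytes for a
payload of 0x102 bytes.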
Imagine the typical bare metal embedded situation. your processor
powers up and boots off of a rom of some flavor (rom, prom, eprom, eeprom,
flash) something that is non-volatile and as a result read only or at
least for practical purposes of booting your processor that memory
space is read only. The memory in your bare metal system comes up filled
with random garbage because that is what the transistors that store that
ram do, they have not been initialized, and there is no rule that says
they have to be or have to be initialized to a specific value. Many
systems use dram, which often has to be initialized in some form or
fashion and as a result you might end up filling that memory with some
value, or leave it with the last value you used during initialization.
So we have our rom and that really needs to have .text in it, we have
some ram that will no doubt be the home for .bss and .data. But we have
a problem. how is the ram at the .bss and .data addresses going to
get loaded with the zeros or non-zero values we are expecting? We dont
have an operating system here? The answer is a bit complicated, at some
point before we start using any of the .bss or .data variables in our
program (which we as C programmers assumed would be zero or whatever
value we initialized them to) we need to prepare that memory to meet
those expectations. And we need to do it in a way that doesnt require
any .bss or .data variables. even worse, the non-zero items in .data
need to be saved somewhere in the non-volatile memory so that we dont
lose that information, we need to somehow get that data saved in the
rom and get it copied to ram in the right place.
The typical solution is to have the bootstrap or startup code do this,
this isnt necessarily the boot code for the processor. Even when running
a program on an operating system, the solution may be to have .bss
initialized by the first bit of asm in your program before main()
is called. The toolchain often supplies this startup code, if you dont
tell it otherwise the toolchain will use its default linker script and
default startup code (which as we complicate things have an intimate
relationship). Assume we dont want to compile everything,
count up how many bytes are in each segment, and then hardcode those
numbers by hand into some asm, re-build, make sure the sizes and offsets
have not changed, repeat until they dont, and have startup code that
is custom to this program. Add or remove a variable somewhere or re-arrange
them in the code and you would have to then re-touch your startup code.
Possible but not wise, the better answer is generic startup code. But
to have generic startup code we need to know where all these segments
are and what size they are etc. The gnu solution has two parts, first
you use the linker script language to define some variables, these
variables are filled in by the linker and will ultimately contain the
starting address for a segment like .bss or .data and the size or
ending address or both. In the case of .data we also need to tell
the linker script two things. One is here is the non-volatile memory
space we want the .data to live in when the power is off, and here
is the ram address space where we want it to live when we are running,
our variables are read-write they just happen to be initialized to
some number on start, then we can change them in our program later.
So if you look at the real example in this directory and the memmap
file
MEMORY
{
bob : ORIGIN = 0x8000, LENGTH = 0x1000
ted : ORIGIN = 0xA000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > bob
__data_rom_start__ = .;
.data : {
__data_start__ = .;
*(.data*)
} > ted AT > bob
__data_end__ = .;
__data_size__ = __data_end__ - __data_start__;
.bss : {
__bss_start__ = .;
*(.bss*)
} > ted
__bss_end__ = .;
__bss_size__ = __bss_end__ - __bss_start__;
}
you see there is more stuff in the SECTIONS section of the linker script.
I dont really want to explain all of it, it is fairly straightforward
including the ted AT bob thing, we are pretending here that bob is rom
or flash (where .text lives) and ted is ram, sram, dram, whatever. You
have to be super careful to place these variables inside or outside
of the right brackets to have it all work, this takes practice and some
iterations to get right. I may not have it right, but the above works
today. As mentioned way above, I intentionally did not use memory segment
names like rom and ram to demonstrate that the linker script sees those
as ascii labels and for the most part doesnt care what you call them.
Now this example is strange because I wanted to try to show the problems
you will face with a single program, not having to have you load more
than one program, etc. Notice how above I carefully stated that you need
to initialize .bss and .data at some point before you use them and not
use them to get to that point. Most of the time you are going to see
some sort of assembly solution in the assembly code that is used before
your main() C function is called. This assembly code works in harmony with
the linker script variables. The __bss_start__ and such variables are
addresses as far as the toolchain is concerned, not values. When developing
I tried to have the program display the value of __bss_start__ by
declaring it an external global variable. What the compiler did was take
the address __bss_start__ read that memory location and print that
value. So in vectors.s I made some other global variables and then
initialized them to the other variables. These are in the .text section
so they are filled in for us and .bss and .data are not required to
find these values and use them to prepare .bss and .data. Where my
weird solution comes in is that I dont have asm code that zeros .bss
and copies .data from point a to point b. I do this in the C code late
in my program. As mentioned the reason why is I want to show you that
when you display these global variables before preparing memory they
well, as you now expect, have the wrong value. Then once we copy
and zero things they then have the right values.
//display before initialized
hexstring(x);
hexstring(y);
hexstring(z);
//zero out .bss
for(ra=bss_start;ra<bss_end;ra+=4) PUT32(ra,0);
//copy .data from non-volatile .text to its home where the code expects it
//to be.
for(ra=data_start,rb=data_rom_start;ra<data_end;ra+=4,rb+=4) PUT32(ra,GET32(rb));
//display the variables again now that ram is prepped.
hexstring(x);
hexstring(y);
hexstring(z);
I used my serial bootloader, xmodemed the program over and ran it
the last part, interesting part, of the output is:
12345678
0000A008
000082EC
0000A000
00000000
00000000
00000000
00000000
00000002
00000007
here again with comments
12345678
0000A008 this is basically __bss_start__
000082EC __data_rom_start__
0000A000 __data_start__
00000000 display of the x variable before memory prep
00000000 display of the y variable before memory prep
00000000 display of the z variable before memory prep
00000000 display of x after memory prep
00000002 display of y after memory prep
00000007 display of z after memory prep
In this case apparently memory was zeroed by someone, so the .bss
data actually looks right even though that was just dumb luck. you could
easily modify my bootloader (or I should have) to make that memory
random or non-zero further demonstrating the problem.
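Here is a hosted sketch of that idea, with plain arrays standing in for
rom and ram so it can be tried on a desktop machine. The function names
and the garbage pattern are assumptions made for this example, this is
not the code from this directory.

```c
#include <stddef.h>
#include <stdint.h>

/* Generic memory prep, one word at a time, as startup code would do it. */
void prep_memory(uint32_t *data_ram, const uint32_t *data_rom, size_t data_words,
                 uint32_t *bss, size_t bss_words)
{
    for (size_t i = 0; i < data_words; i++) data_ram[i] = data_rom[i]; /* copy .data */
    for (size_t i = 0; i < bss_words; i++) bss[i] = 0;                 /* zero .bss */
}

/* Fill pretend ram with a non-zero pattern to mimic power-up garbage. */
void fill_garbage(uint32_t *ram, size_t words)
{
    for (size_t i = 0; i < words; i++) ram[i] = 0xDEADBEEFu ^ (uint32_t)i;
}
```

Call fill_garbage() on the pretend ram first and the variables read back
as junk, call prep_memory() with a rom image holding 2 and 7 and they read
back as 2, 7 and a zeroed .bss word, no luck involved this time.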
So after all of that, I repeat, I dont do this with my code. Why dont
I do this? First and foremost, these days I try to write portable code.
This code is not portable if you do this, you have to start messing with
linker scripts that are specific to the gnu toolchain and even worse
sometimes specific to the version of binutils, then your startup code that
comes before the first call to a C function relies on gnu linker and linker
version specific
linker script variables. The linker script goes from pretty to very
ugly very fast, and warrants extra explaining as to what it is doing.
it is just not portable, and it is ugly. (remember beauty is in the
eye of the beholder, you may find all of my code ugly, but then you
probably wouldnt be reading this far down into this file if that were
the case). Instead of this
unsigned int fun2 ( unsigned int );
const unsigned int x=2;
unsigned int y;
unsigned int z=7;
void fun ( unsigned int a )
{
unsigned int n;
n=5;
fun2(a);
fun2(x);
fun2(y);
fun2(z);
fun2(n);
}
write your code like this:
unsigned int fun2 ( unsigned int );
const unsigned int x=2;
unsigned int y;
unsigned int z;
void fun ( unsigned int a )
{
unsigned int n;
n=5;
y=0;
z=7;
fun2(a);
fun2(x);
fun2(y);
fun2(z);
fun2(n);
}
and guess what, you dont have a .data segment anymore, you can remove
that from the linker script and all the baggage that goes with it. Now
you do need .bss but you dont need to zero it out you just need to
have it accurately defined in the linker script to an address range that
is actually ram. .rodata if your toolchain needs it, well the example
was demonstrating things, I simply have .rodata also part of the
same space as .text so after changing those few lines of C code I would
then go from this
MEMORY
{
bob : ORIGIN = 0x8000, LENGTH = 0x1000
ted : ORIGIN = 0xA000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > bob
.rodata : { *(.rodata*) } > bob
__data_rom_start__ = .;
.data : {
__data_start__ = .;
*(.data*)
} > ted AT > bob
__data_end__ = .;
__data_size__ = __data_end__ - __data_start__;
.bss : {
__bss_start__ = .;
*(.bss*)
} > ted
__bss_end__ = .;
__bss_size__ = __bss_end__ - __bss_start__;
}
to this
MEMORY
{
bob : ORIGIN = 0x8000, LENGTH = 0x1000
ted : ORIGIN = 0xA000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > bob
.rodata : { *(.rodata*) } > bob
.bss : { *(.bss*) } > ted
}
and painfully simple startup code
mov sp,#0x8000
mov r0,pc
bl notmain
yes there is a cost. Some of those initializations that are not in
.text can take up more room than they used to. Worst case for these
32 bit or smaller variables is you have one instruction that gets the
value from .text, one instruction that gets the address for it in ram,
an instruction that writes the value to ram. Plus a location in .text
to hold the address in ram for that variable and a location to hold
the constant we want to write to it, kind of like this
1010: e59f503c ldr r5, [pc, #60] ; 1054 <fun+0x48>
1014: e59f403c ldr r4, [pc, #60] ; 1058 <fun+0x4c>
1018: e3a03000 mov r3, #0
101c: e5853000 str r3, [r5]
1020: e3a03007 mov r3, #7
1024: e5843000 str r3, [r4]
1054: 00002004 andeq r2, r0, r4
1058: 00002000 andeq r2, r0, r0
because this example used small variables the mov r3,#0 for example was
capable of holding the constant in the instruction encoding itself.
Same for the #7 but had it been some other number say z = 0x1234;
1010: e59f503c ldr r5, [pc, #60] ; 1054 <fun+0x48>
1014: e3a03000 mov r3, #0
1018: e59f4038 ldr r4, [pc, #56] ; 1058 <fun+0x4c>
101c: e5853000 str r3, [r5]
1020: e59f3034 ldr r3, [pc, #52] ; 105c <fun+0x50>
1024: e5843000 str r3, [r4]
1054: 00002004 andeq r2, r0, r4
1058: 00002000 andeq r2, r0, r0
105c: 00001234 andeq r1, r0, r4, lsr r2
That is how it works for this particular processor family, other
processors like x86 manage constants differently...
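Whether a constant can ride along inside the instruction can be checked
with a short sketch. It assumes the classic ARM (A32) immediate form, an
8 bit number rotated right by an even amount, and the function name
arm_immediate_fits is made up for this example. Newer ARM cores have
other ways to build constants, those are not covered here.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n in 0..31). */
static uint32_t rol32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* Returns 1 if value is an 8-bit number rotated right by an even
   amount (so it fits in the instruction), otherwise 0. */
int arm_immediate_fits(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2)
        /* undo a rotate right by rot and see if only the low 8 bits remain */
        if ((rol32(value, rot) & 0xFFFFFF00u) == 0)
            return 1;
    return 0;
}
```

0 and 7 fit, which is why they showed up as mov r3,#0 and mov r3,#7.
0x1234 does not fit, which is why it needed its own word in .text and
an ldr to fetch it.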
Now the two locations in .text for example
1054: 00002004 andeq r2, r0, r4
1058: 00002000 andeq r2, r0, r0
Are not additional costs because those would have been used by the code
that reads the variables as well (I have .bss and .data separate here)
101c: e59f3020 ldr r3, [pc, #32] ; 1044 <fun+0x38>
1020: e5930000 ldr r0, [r3]
1024: ebfffff7 bl 1008 <fun2>
1028: e59f3018 ldr r3, [pc, #24] ; 1048 <fun+0x3c>
102c: e5930000 ldr r0, [r3]
1030: ebfffff4 bl 1008 <fun2>
1044: 00002000 andeq r2, r0, r0
1048: 00004000 andeq r4, r0, r0
The point here is that the address to each of these variables still took
up the same amount of .text space. What we didnt have when we used
a .data and assumed .bss was zeroed for us, is the code to initialize
each variable one at a time. there would have been a small loop for .bss
and a small loop for .data, if .bss and/or .data were of any decent size
then there is a lot less waste.
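If you want to match each ldr with the word in .text that it reads, the
following sketch does the pc relative math. It assumes ARM (A32) state,
where the pc reads as the address of the instruction plus 8, and an
ldr rX,[pc,#imm] instruction. The helper name literal_address is
hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: given the address of an A32 "ldr rX,[pc,#imm]"
   instruction and its encoding, return the address it loads from. */
uint32_t literal_address(uint32_t insn_addr, uint32_t insn)
{
    uint32_t offset = insn & 0xFFFu;   /* bits 11:0, 12-bit offset */
    uint32_t pc = insn_addr + 8u;      /* pc reads two instructions ahead */
    /* bit 23 (U) selects add or subtract */
    return (insn & (1u << 23)) ? pc + offset : pc - offset;
}
```

For the instruction e59f3020 at address 0x101c this gives 0x1044, the
word that holds the address of y.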
Another thing that may be gnawing at you is that this whole thing is
about global variables. Raise your hand if you use global variables.
Many folks go out of their way not to. I happen to use them from time
to time, used to always and only use them. But now it is a bit of
a mixture. Local variables you have to initialize inline one at a time
and that is as costly as the solution I am proposing, so you are already
likely programming using that one at a time solution. So you are already
in tune with my solution to this .bss and .data problem.
The most important thing though is that when you use local variables,
do those initializations locally, and manage the size of your functions,
the optimizer (if you use it) will remove a lot of this extra code and
memory.
for example:
unsigned int fun2 ( unsigned int );
const unsigned int x=2;
unsigned int y;
unsigned int z=7;
void fun ( unsigned int a )
{
unsigned int n;
n=5;
fun2(a);
fun2(x);
fun2(y);
fun2(z);
fun2(n);
}
the variable x is a read-only variable. variable n is local and only
used to feed the fun2() function.
1014: e3a00002 mov r0, #2
1018: ebfffffa bl 1008 <fun2>
1034: e3a00005 mov r0, #5
1038: ebfffff2 bl 1008 <fun2>
The compiler did not waste the .text space and clock cycles to fetch
x from rom, it simply encoded it inline. Likewise the local variable
n did not consume stack space, there was no stack frame created at all
in fact, the value was encoded directly in the instruction as well.
When you use globals you can see that it has to get the address then
read the contents of that address then it can do something with your
variable. If you change the variable it has to go through similar steps
to save the variable.
This whole example and lengthy README is here to hopefully help you
to realize when you take one of my examples:
unsigned int fun2 ( unsigned int );
const unsigned int x=2;
unsigned int y;
unsigned int z;
void fun ( unsigned int a )
{
unsigned int n;
y=0;
z=2;
n=5;
fun2(a);
fun2(x);
fun2(y);
fun2(z);
fun2(n);
}
And start adding things or changing things:
unsigned int fun2 ( unsigned int );
const unsigned int x=2;
unsigned int y;
unsigned int z;
unsigned int m=12;
void fun ( unsigned int a )
{
unsigned int n;
y=0;
z=2;
n=5;
fun2(a);
fun2(x);
fun2(y);
fun2(z);
fun2(n);
fun2(m);
}
And then spend a sleepless night or weekend struggling to understand
why m is not 12 when used in the code...Well now you know. And now
you know why I dont do it (not all the reasons but some), you are
welcome to do your own thing. And now you know what my statement in
the top level readme is all about.
UPDATE:
Since the ARM runs completely out of ram on the raspberry pi and usually
there is no reason to split the different segments around we can pack
them all up, here is a simple solution for this platform.
bootstrap.s
.globl _start
_start:
mov sp,#0x00010000
bl notmain
hang: b hang
notmain.c
const unsigned int readonly=7;
unsigned int dotdata=9;
unsigned int dotbss[16];
void notmain ( void )
{
dotbss[3]+=readonly;
}
lscript
MEMORY
{
ram : ORIGIN = 0x8000, LENGTH = 0x18000
}
SECTIONS
{
.text : { *(.text*) } > ram
.bss : { *(.bss*) } > ram
.rodata : { *(.rodata*) } > ram
.data : { *(.data*) } > ram
}
> arm-none-eabi-as bootstrap.s -o bootstrap.o
> arm-none-eabi-gcc -O2 -c notmain.c -o notmain.o
> arm-none-eabi-ld -T lscript bootstrap.o notmain.o -o hello.elf
> arm-none-eabi-objdump -D hello.elf
hello.elf: file format elf32-littlearm
Disassembly of section .text:
00008000 <_start>:
8000: e3a0d801 mov sp, #65536 ; 0x10000
8004: eb000000 bl 800c <notmain>
00008008 <hang>:
8008: eafffffe b 8008 <hang>
0000800c <notmain>:
800c: e59f300c ldr r3, [pc, #12] ; 8020 <notmain+0x14>
8010: e593200c ldr r2, [r3, #12]
8014: e2822007 add r2, r2, #7
8018: e583200c str r2, [r3, #12]
801c: e12fff1e bx lr
8020: 00008024 andeq r8, r0, r4, lsr #32
Disassembly of section .bss:
00008024 <dotbss>:
...
Disassembly of section .rodata:
00008064 <readonly>:
8064: 00000007 andeq r0, r0, r7
Disassembly of section .data:
00008068 <dotdata>:
8068: 00000009 andeq r0, r0, r9
> arm-none-eabi-objcopy hello.elf -O binary kernel.img
> ls -al kernel.img
-rwxr-xr-x 1 root root 108 Sep 23 20:47 kernel.img
> hexdump -C kernel.img
00000000 01 d8 a0 e3 00 00 00 eb fe ff ff ea 0c 30 9f e5 |.............0..|
00000010 0c 20 93 e5 07 20 82 e2 0c 20 83 e5 1e ff 2f e1 |. ... ... ..../.|
00000020 24 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |$...............|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000060 00 00 00 00 07 00 00 00 09 00 00 00 |............|
0000006c
This:
MEMORY
{
ram : ORIGIN = 0x8000, LENGTH = 0x18000
}
SECTIONS
{
.text : { *(.text*) } > ram
.bss : { *(.bss*) } > ram
.rodata : { *(.rodata*) } > ram
.data : { *(.data*) } > ram
}
Is not nearly as ugly as this:
SECTIONS
{
.text : { *(.text*) } > bob
__data_rom_start__ = .;
.data : {
__data_start__ = .;
*(.data*)
} > ted AT > bob
__data_end__ = .;
__data_size__ = __data_end__ - __data_start__;
.bss : {
__bss_start__ = .;
*(.bss*)
} > ted
__bss_end__ = .;
__bss_size__ = __bss_end__ - __bss_start__;
}
Both are using compiler/linker tricks to reach a goal. The less
ugly one gives you everything you want, you get your .bss already
zeroed, you get .data where you can use it. With that simpler
linker script "all you have to do is" make sure that you have at least
one .data item or .rodata item so that objcopy is forced to place them
after .bss in the image and forced to pad .bss with zeros in the image
in order to place .data and/or .rodata in the right place.
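A rough model of why that works, as a hosted C sketch. It assumes the
flat binary spans from the lowest to the highest address of the sections
that carry contents, with gaps padded, and that .bss carries no contents
of its own. The struct and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct section {
    uint32_t addr;
    uint32_t size;
    int has_contents;   /* 0 for .bss-like sections with no bytes in the file */
};

/* Size of the flat image: lowest to highest address among the sections
   that carry contents, anything in between becomes padding. */
uint32_t raw_image_size(const struct section *s, size_t count)
{
    uint32_t lo = 0, hi = 0;
    int seen = 0;
    for (size_t i = 0; i < count; i++) {
        if (!s[i].has_contents || s[i].size == 0) continue;
        uint32_t end = s[i].addr + s[i].size;
        if (!seen || s[i].addr < lo) lo = s[i].addr;
        if (!seen || end > hi) hi = end;
        seen = 1;
    }
    return seen ? hi - lo : 0;
}
```

With .text at 0x8000 (0x24 bytes), .bss at 0x8024 (0x40 bytes), .rodata
at 0x8064 and .data at 0x8068 (4 bytes each) this gives 108 bytes, the
size of kernel.img above. Take away .rodata and .data and it drops to
0x24 bytes, .bss is no longer inside the image and nobody zeroes it.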
You can use this on the Raspberry Pi and it will work just fine, on other
embedded platforms where you have non-volatile memory (rom/flash) for
booting the code and a separate place for ram and you want to keep your
code in rom and data in ram, you have to use the more ugly solutions or
do as I do and simply dont have .data and dont care if .bss is zeroed.