Commit e5fab33

Replace exchange_malloc/exchange_free demo with something unrelated to Box.
Precursor for landing overloaded-`box`, since that will decouple the `box` syntax from the exchange heap (and in fact will eliminate the use of the two aforementioned lang items).

Instead, the new demonstration program shows a definition of the `str_eq` lang item. (We do not have that many procedural lang-items to choose from, which is a good sign for our efforts to decouple the compiler from the runtime!)

(This previously used a demo of `panic_bounds_check`, but a `str_eq` demonstration is both easier to code and arguably a more interesting aspect of the language to discuss.)
1 parent 14f0942 commit e5fab33

1 file changed

+124
-31
lines changed

src/doc/trpl/unsafe.md

+124-31
@@ -649,69 +649,162 @@ it exists. The marker is the attribute `#[lang="..."]` and there are
 various different values of `...`, i.e. various different 'lang
 items'.
 
-For example, `Box` pointers require two lang items, one for allocation
-and one for deallocation. A freestanding program that uses the `Box`
-sugar for dynamic allocations via `malloc` and `free`:
+For example, there are lang items related to the implementation of
+string slices (`&str`); one of these is `str_eq`, which implements the
+equivalence relation on two string slices. This is a lang item because
+string equivalence is used for more than just the `==` operator; in
+particular, it is also used when pattern matching string literals.
+
+A freestanding program that provides its own definition of the
+`str_eq` lang item, with a slightly different semantics than
+usual in Rust:
 
 ```
-#![feature(lang_items, box_syntax, start, no_std)]
+#![feature(lang_items, intrinsics, start, no_std)]
 #![no_std]
 
-extern crate libc;
+// Our str_eq lang item; it normalizes ASCII letters to lowercase.
+#[lang="str_eq"]
+fn eq_slice(s1: &str, s2: &str) -> bool {
+    unsafe {
+        let (p1, s1_len) = str::repr(s1);
+        let (p2, s2_len) = str::repr(s2);
 
-extern {
-    fn abort() -> !;
-}
+        if s1_len != s2_len { return false; }
+
+        let mut i = 0;
+        while i < s1_len {
+            let b1 = str::at_offset(p1, i);
+            let b2 = str::at_offset(p2, i);
 
-#[lang = "owned_box"]
-pub struct Box<T>(*mut T);
+            let b1 = lower_if_ascii(b1);
+            let b2 = lower_if_ascii(b2);
 
-#[lang="exchange_malloc"]
-unsafe fn allocate(size: usize, _align: usize) -> *mut u8 {
-    let p = libc::malloc(size as libc::size_t) as *mut u8;
+            if b1 != b2 { return false; }
 
-    // malloc failed
-    if p as usize == 0 {
-        abort();
+            i += 1;
+        }
     }
 
-    p
-}
-#[lang="exchange_free"]
-unsafe fn deallocate(ptr: *mut u8, _size: usize, _align: usize) {
-    libc::free(ptr as *mut libc::c_void)
+    return true;
+
+    fn lower_if_ascii(b: u8) -> u8 {
+        if 'A' as u8 <= b && b <= 'Z' as u8 {
+            b - ('A' as u8) + ('a' as u8)
+        } else {
+            b
+        }
+    }
 }
 
 #[start]
-fn main(argc: isize, argv: *const *const u8) -> isize {
-    let x = box 1;
+fn main(_argc: isize, _argv: *const *const u8) -> isize {
+    let a = "HELLO\0";
+    let b = "World\0";
+    unsafe {
+        let (a_ptr, b_ptr) = (str::as_bytes(a), str::as_bytes(b));
+        match (a,b) {
+            ("hello\0", "world\0") => {
+                printf::print2p("Whoa; matched \"hello world\" on \"%s, %s\"\n\0",
+                                a_ptr, b_ptr);
+            }
+
+            ("HELLO\0", "World\0") => {
+                printf::print2p("obviously match on %s, %s\n\0", a_ptr, b_ptr);
+            }
+
+            _ => printf::print0("No matches at all???\n\0"),
+        }
+    }
+    return 0;
+}
 
-    0
+// To be able to print to standard output from this demonstration
+// program, we link with `printf` from the C standard library. Note
+// that this requires we null-terminate our strings with "\0".
+mod printf {
+    use super::str;
+
+    #[link(name="c")]
+    extern { fn printf(f: *const u8, ...); }
+
+    pub unsafe fn print0(s: &str) {
+        // guard against failure to include '\0'
+        if str::last_byte(s) != '\0' as u8 {
+            printf(str::as_bytes("(invalid input str)\n\0"));
+        } else {
+            let bytes = str::as_bytes(s);
+            printf(bytes);
+        }
+    }
+
+    pub unsafe fn print2p<T,U>(s: &str, arg1: *const T, arg2: *const U) {
+        // guard against failure to include '\0'
+        if str::last_byte(s) != '\0' as u8 {
+            printf(str::as_bytes("(invalid input str)\n\0"));
+        } else {
+            let bytes = str::as_bytes(s);
+            printf(bytes, arg1, arg2);
+        }
+    }
+}
+
+/// A collection of functions to operate on string slices.
+mod str {
+    /// Extracts the underlying representation of a string slice.
+    pub unsafe fn repr(s: &str) -> (*const u8, usize) {
+        extern "rust-intrinsic" { fn transmute<T,U>(e: T) -> U; }
+        transmute(s)
+    }
+
+    /// Extracts the pointer to bytes representing the string slice.
+    pub fn as_bytes(s: &str) -> *const u8 {
+        unsafe { repr(s).0 }
+    }
+
+    /// Returns the last byte in the string slice.
+    pub fn last_byte(s: &str) -> u8 {
+        unsafe {
+            let (bytes, len): (*const u8, usize) = repr(s);
+            at_offset(bytes, len-1)
+        }
+    }
+
+    /// Returns the byte at offset `i` in the byte string.
+    pub unsafe fn at_offset(p: *const u8, i: usize) -> u8 {
+        *((p as usize + i) as *const u8)
+    }
 }
 
+// Again, these functions and traits are used by the compiler, and are
+// normally provided by libstd. (The `Sized` and `Copy` lang_items
+// require definitions due to the type-parametric code above.)
+
 #[lang = "stack_exhausted"] extern fn stack_exhausted() {}
 #[lang = "eh_personality"] extern fn eh_personality() {}
 #[lang = "panic_fmt"] fn panic_fmt() -> ! { loop {} }
+
+#[lang="sized"] pub trait Sized: PhantomFn<Self,Self> {}
+#[lang="copy"] pub trait Copy: PhantomFn<Self,Self> {}
+#[lang="phantom_fn"] pub trait PhantomFn<A:?Sized,R:?Sized=()> { }
 ```
 
-Note the use of `abort`: the `exchange_malloc` lang item is assumed to
-return a valid pointer, and so needs to do the check internally.
 
 Other features provided by lang items include:
 
 - overloadable operators via traits: the traits corresponding to the
   `==`, `<`, dereferencing (`*`) and `+` (etc.) operators are all
   marked with lang items; those specific four are `eq`, `ord`,
   `deref`, and `add` respectively.
-- stack unwinding and general failure; the `eh_personality`, `fail`
-  and `fail_bounds_checks` lang items.
+- stack unwinding and general failure; the `eh_personality`, `panic`
+  `panic_fmt`, and `panic_bounds_check` lang items.
 - the traits in `std::marker` used to indicate types of
   various kinds; lang items `send`, `sync` and `copy`.
 - the marker types and variance indicators found in
   `std::marker`; lang items `covariant_type`,
   `contravariant_lifetime`, etc.
 
 Lang items are loaded lazily by the compiler; e.g. if one never uses
-`Box` then there is no need to define functions for `exchange_malloc`
-and `exchange_free`. `rustc` will emit an error when an item is needed
-but not found in the current crate or any that it depends on.
+array indexing (`a[i]`) then there is no need to define a function for
+`panic_bounds_check`. `rustc` will emit an error when an item is
+needed but not found in the current crate or any that it depends on.
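
On a stable toolchain, where `#[lang]` items cannot be defined, the comparison that the new demo's `eq_slice` performs can be sketched as ordinary functions. This is a minimal sketch, not the commit's code: `eq_ignoring_ascii_case` and `first_matching_arm` are hypothetical names, the `\0` terminators are dropped because nothing is handed to `printf`, and `first_matching_arm` only emulates, with explicit `if` tests, the order in which the demo's `match` arms would be tried.

```rust
/// Lowercases `b` if it is an ASCII uppercase letter; other bytes pass through.
fn lower_if_ascii(b: u8) -> u8 {
    if b.is_ascii_uppercase() {
        b + (b'a' - b'A')
    } else {
        b
    }
}

/// Byte-wise equality that ignores ASCII case, mirroring the demo's `eq_slice`
/// (hypothetical name; lengths must match before bytes are compared).
fn eq_ignoring_ascii_case(s1: &str, s2: &str) -> bool {
    s1.len() == s2.len()
        && s1
            .bytes()
            .zip(s2.bytes())
            .all(|(b1, b2)| lower_if_ascii(b1) == lower_if_ascii(b2))
}

/// Hypothetical stand-in for the demo's `match (a, b)`: the arms are tried in
/// source order, each literal compared with the case-insensitive relation.
fn first_matching_arm(a: &str, b: &str) -> &'static str {
    if eq_ignoring_ascii_case(a, "hello") && eq_ignoring_ascii_case(b, "world") {
        "first arm"
    } else if eq_ignoring_ascii_case(a, "HELLO") && eq_ignoring_ascii_case(b, "World") {
        "second arm"
    } else {
        "wildcard"
    }
}
```

Under this relation, `first_matching_arm("HELLO", "World")` selects the first arm, which corresponds to the "Whoa; matched" branch of the demo: with a case-insensitive `str_eq`, the lowercase literal patterns already match before the exact-case arm is reached.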
