Implement bcmp (#315)
As of LLVM 9.0, certain calls to memcmp may be converted to bcmp, which I guess
could save a single subtraction on some architectures. [1]
bcmp is just like memcmp except instead of returning the difference between the
two differing bytes, it returns non-zero instead. As such, memcmp is a valid
implementation of bcmp.
If we care about size, bcmp should just call memcmp.
If we care about speed, we can change bcmp to look like this instead:
```rust
pub unsafe extern "C" fn bcmp(s1: *const u8, s2: *const u8, n: usize) -> i32 {
let mut i = 0;
while i < n {
let a = *s1.offset(i as isize);
let b = *s2.offset(i as isize);
if a != b {
return 1;
}
i += 1;
}
0
}
```
In this PR I do not address any changes which may or may not be needed for arm
aebi as I lack proper test hardware.
[1]: https://releases.llvm.org/9.0.0/docs/ReleaseNotes.html#noteworthy-optimizations