I created a similar issue more than 3 years ago, but I am now back at the same problem. I am working on bare metal with #![no_std], and formatting a number in decimal stops execution (without panicking) if the number has more than one digit.
log_line!("{}", 9); // runs just fine
log_line!("{}", 10); // will halt
But weirdly:
log_line!("{:x}", 1000); // also works fine
log_line!("{:b}", 1000); // and so does this
I would assume that it has something to do with the amount of space that is preallocated for the formatting, but I find it extremely difficult to verify this, let alone fix it.
If anyone has an idea what could cause this, I would greatly appreciate any help.
works like a charm. The only difference I can think of is that this will actually use a udiv instruction, whereas the formatter seems to use a function from the compiler builtins crate for division.
Yeah... I don't know what I was thinking; of course there was no instruction. I tried it again with a volatile read to avoid optimization and double-checked that there was a udiv instruction, which there was, and the code still ran successfully.
I am just beyond confused. I was playing around with some target parameters and tested it again and, wouldn't you know, it actually worked. Until I added a panic for testing way further down in my code, at which point the formatter halted again. So then I removed both the panic and said parameters to see if it would still work. And it does. So now I have no idea what's going on. To me it makes no sense at all that a panic at the end of my kernel main should affect a format function that I call during setup, unless the problem is related to alignment and the panic is changing the memory layout to the point where the formatter no longer works.
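For anyone wanting to repeat that experiment, here is a minimal sketch of forcing a real runtime division with volatile reads. The function name `runtime_div_by_ten` is made up for illustration and is not from the original code. It only shows that the hardware divide works in isolation; the formatter may take a different code path.

```rust
use core::ptr::read_volatile;

/// Divides `value` by ten using operands loaded through volatile reads,
/// so the compiler cannot fold the division into a compile-time constant.
/// (Hypothetical helper, not part of the original kernel.)
pub fn runtime_div_by_ten(value: u32) -> (u32, u32) {
    let dividend_slot = value;
    let divisor_slot = 10u32;
    // SAFETY: both pointers refer to live, properly aligned local variables.
    let dividend = unsafe { read_volatile(&dividend_slot) };
    let divisor = unsafe { read_volatile(&divisor_slot) };
    // The divisor is unknown to the optimizer, so a divide has to happen at runtime.
    (dividend / divisor, dividend % divisor)
}
```

Checking the disassembly is still the only way to be sure which instruction was actually emitted for a given target.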
The fact that it does not seem to be directly related to the log_line! input indicates to me that there might be a more general issue. Alignment could be an issue, but I don't know enough about aarch64 or xargo to be of much help...
You could also try to isolate whether it's the formatting or the printing by only formatting the number and not printing it to the screen.
My experience with such "Heisenbugs", where things fail in seemingly random ways, unrelated code changes trigger the failure, and attempts to debug it make it go away, is that there is often a memory corruption problem going on somewhere.
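One way to do that, sketched here under the assumption that nothing like it exists in the project yet, is to format into a fixed-size stack buffer through `core::fmt::Write`. `FixedBuf` and `format_only` are hypothetical names, and the 64-byte capacity is arbitrary.

```rust
use core::fmt::{self, Write};

/// Fixed-size, stack-allocated sink for formatted text (hypothetical helper).
pub struct FixedBuf {
    bytes: [u8; 64],
    len: usize,
}

impl FixedBuf {
    pub const fn new() -> Self {
        FixedBuf { bytes: [0; 64], len: 0 }
    }

    /// Returns the text written so far.
    pub fn as_str(&self) -> &str {
        // Only whole `&str` chunks are copied in, so the bytes are valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl Write for FixedBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > self.bytes.len() {
            // Refuse to overflow the buffer instead of truncating silently.
            return Err(fmt::Error);
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats `value` in decimal without touching any output device.
pub fn format_only(value: u32, out: &mut FixedBuf) -> fmt::Result {
    write!(out, "{}", value)
}
```

If calling `format_only(10, &mut buf)` during setup still halts, the logger and the framebuffer are ruled out and the problem sits in the formatting path itself.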
Perhaps running out of stack.
Something misusing the heap.
Some threads trampling each other.
It would be interesting to see what code that log_line! actually expands to.
I think it's currently almost impossible for me to run out of stack since I set the stack size to 0x60000 (out of paranoia). The heap is really simple at the moment; you can see the heap allocator here: https://github.com/vE5li/alb1/blob/master/bootloader/src/memory/heap/mod.rs. I will definitely look into the heap some more, since I hadn't considered that yet. And lastly, 3 of the 4 cores are currently halted, so it's definitely not caused by multi-threading.
Your framebuffer code does a lot of pointer math; if something asks to draw to an off-screen position, it will happily write into arbitrary memory locations. While you're trying to debug the system, I'd put a check for this kind of overrun into draw_pixel(). It'll slow things down a lot, but might prevent some spurious halts.
EDIT: In particular, your logger currently has no protections against a message being too wide for the screen. In most cases, this will just write into the next line, but if you're at the bottom of the screen, it'll write into adjacent non-framebuffer memory instead.
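A rough sketch of what such a check could look like follows. The `Framebuffer` struct and its fields are assumptions for illustration, not the layout used in the linked repository, and it uses a slice instead of raw pointers so that a bad index cannot escape the buffer.

```rust
/// Hypothetical framebuffer description; the field names are illustrative only.
pub struct Framebuffer<'a> {
    pub pixels: &'a mut [u32],
    pub width: usize,
    pub height: usize,
    /// Pixels per row in memory; assumed to be at least `width`.
    pub stride: usize,
}

impl<'a> Framebuffer<'a> {
    /// Writes one pixel. Returns `false` and writes nothing if the
    /// coordinates fall outside the visible area or the backing buffer.
    pub fn draw_pixel(&mut self, x: usize, y: usize, color: u32) -> bool {
        if x >= self.width || y >= self.height {
            return false;
        }
        // Guard the index arithmetic against overflow as well.
        let index = match y.checked_mul(self.stride).and_then(|row| row.checked_add(x)) {
            Some(i) => i,
            None => return false,
        };
        match self.pixels.get_mut(index) {
            Some(pixel) => {
                *pixel = color;
                true
            }
            None => false,
        }
    }
}
```

While debugging, the `false` case could be turned into a panic or a log entry so that an off-screen write shows up immediately instead of corrupting neighbouring memory.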
I do have a panic handler implemented, but whatever is failing doesn't panic. It could be an unaligned memory access, but I currently have no method of verifying that.
That is true, but I don't have many messages currently. They hardly take up half of the screen, so I'm pretty sure that the framebuffer is not the culprit in this case. Also, the image comes out exactly as I expect, so the pointer math should be correct too.
Edit: just tested it and not using the framebuffer yields the exact same result.