[adelie-devel] Adelie on QEMU PPC

From: BALATON Zoltan <balaton_at_eik.bme.hu>
Date: Sun, 10 Feb 2019 02:25:10 +0000


I'm not subscribed here (hope this still gets through), please cc me on
any reply. I'm also not sure this is the right place to send this but I
guess you may be more interested than the general QEMU crowd and could
help more with debugging the Adelie Linux side so I'm sending it here.

I'm trying to run adelie-live-ppc-1.0-beta2-20181218.iso on QEMU (mainly
to have something that's known to work on real hardware to test with).
Eventually I managed to boot it and it seems to work, but along the way
I found some strange problems. I'm not sure whether they are bugs in QEMU
or in guest code, or how to debug them. I hope you have some ideas. Here
are the details. I'm using this command with QEMU from git master as of today:

qemu-system-ppc -M mac99,via=pmu -m 1024 -boot d \
   -cdrom adelie-live-ppc-1.0-beta2-20181218.iso \
   -d unimp,guest_errors -serial stdio

on an x86_64 host. (This approximately emulates a PowerMac3,1 but not
exactly. I've also enabled some debug to get more details on what's
happening: #define DEBUG_EXCEPTIONS in target/ppc/excp_helper.c)

I get the grub boot menu after some errors on the OF console (I'm not sure
what these are or whether they could be related to the problem). Pressing
Enter then starts loading the kernel, but this ends in an unexpected
exception around the time the initrd is loaded. It jumps off to a
non-existent handler, so I think this should not happen. This is what I
could find out about it:

DSI exception: DSISR=42000000 DAR=02e8d000
DSI exception: DSISR=42000000 DAR=02e8e000
^ These are last lines of loading /bzImage I think

ISI exception: msr=00003030, nip=0480afc0
ISI exception: msr=00003030, nip=0480c294
DSI exception: DSISR=40000000 DAR=04861650
^ Some grub code running?

DSI exception: DSISR=40000000 DAR=3fc5b1f4
DSI exception: DSISR=42000000 DAR=3fcc05fc
DSI exception: DSISR=40000000 DAR=3fcbed2c
DSI exception: DSISR=40000000 DAR=3fcbc974
DSI exception: DSISR=40000000 DAR=3fcba84c
DSI exception: DSISR=40000000 DAR=3fcb8f94
DSI exception: DSISR=40000000 DAR=3fcb7c78
DSI exception: DSISR=40000000 DAR=3fcb6be4
DSI exception: DSISR=40000000 DAR=3fcb5b64
DSI exception: DSISR=40000000 DAR=3fcb3a50
DSI exception: DSISR=40000000 DAR=3fcb2ffc
DSI exception: DSISR=40000000 DAR=3fca8890
DSI exception: DSISR=40000000 DAR=3fc9bffc
DSI exception: DSISR=40000000 DAR=3fcacc1c
DSI exception: DSISR=40000000 DAR=3fc5dfe0
^ Not sure what these are

ISI exception: msr=00003030, nip=048a903c
ISI exception: msr=00003030, nip=048b8fe4
ISI exception: msr=00003030, nip=048ada7c
^ More grub code

DSI exception: DSISR=42000000 DAR=00002000
ISI exception: msr=00003030, nip=048abfa4
invalid/unsupported opcode: 00 - 00 - 00 - 00 (00000000) 00002428 0
Invalid instruction at 00002428

It ends with the exception that should not happen and hangs there. The
interesting part is that this seems to depend on what's in memory, or on
layout or positions, so it may be a problem in guest code (such as using
an uninitialised pointer, which may work if it happens to point to data
that does no great harm but fails otherwise). It could also be a problem
in QEMU or OpenBIOS, if one of them does not provide something that grub
expects (or anything else really, as I'm only guessing here).
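
As a reading aid for the DSI lines above, the DSISR values can be decoded
bit by bit. The following is a minimal sketch in Python, assuming the
classic 32-bit PowerPC DSISR layout and covering only a subset of its
flags. The names DSISR_FLAGS and decode_dsisr are made up for this
illustration and are not part of QEMU.

```python
# Subset of the DSISR flags set on a DSI exception, assuming the classic
# 32-bit PowerPC layout (IBM numbering: bit 0 is the most significant bit).
DSISR_FLAGS = {
    0x40000000: "translation not found (page fault)",  # bit 1
    0x08000000: "access not permitted by protection",  # bit 4
    0x02000000: "store operation",                     # bit 6 (clear = load)
    0x00400000: "DABR match",                          # bit 9
}


def decode_dsisr(value):
    """Return descriptions of all known flags set in a DSISR value."""
    return [text for mask, text in DSISR_FLAGS.items() if value & mask]


# The two DSISR values that appear in the log above.
for raw in ("42000000", "40000000"):
    print(raw, decode_dsisr(int(raw, 16)))
```

Under that assumption, DSISR=42000000 would be a store to an address with
no translation and DSISR=40000000 a load from one, with DAR holding the
faulting address in both cases.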

What I've found is that when I press 'c' at the boot menu to get to grub>
prompt and then manually do:

linux /bzImage
initrd /initrd

then I get slightly different exceptions, such as:

DSI exception: DSISR=42000000 DAR=02e8d000
DSI exception: DSISR=42000000 DAR=02e8e000
DSI exception: DSISR=40000000 DAR=3fde7cdc
DSI exception: DSISR=40000000 DAR=04861650
DSI exception: DSISR=42000000 DAR=3fcbaaec
DSI exception: DSISR=40000000 DAR=3fcbb2f8
DSI exception: DSISR=40000000 DAR=3fcb3a50
DSI exception: DSISR=40000000 DAR=3fcad7f8
DSI exception: DSISR=40000000 DAR=3fcae01c
DSI exception: DSISR=40000000 DAR=3fcacc1c
ISI exception: msr=00003030, nip=048a903c
ISI exception: msr=00003030, nip=048b8fe4
DSI exception: DSISR=40000000 DAR=048b0b28
ISI exception: msr=00003030, nip=048a78d8
ISI exception: msr=00003030, nip=048971e4
ISI exception: msr=00003030, nip=048981b8
DSI exception: DSISR=40000000 DAR=048afc90
ISI exception: msr=00003030, nip=048a1750
DSI exception: DSISR=40000000 DAR=8115d380
ISI exception: msr=00003030, nip=0489fc74
DSI exception: DSISR=40000000 DAR=04830520
ISI exception: msr=00003030, nip=04837e88
DSI exception: DSISR=40000000 DAR=81165080
ISI exception: msr=00003030, nip=048ada7c
DSI exception: DSISR=42000000 DAR=00002000
DSI exception: DSISR=42000000 DAR=00003000
invalid/unsupported opcode: 00 - 00 - 00 - 00 (00000000) 0000238c 0
Invalid instruction at 0000238c

So there seems to be something non-deterministic in this. I guess my first
question would be: what does grub do here at nip=048ada7c? Is there a way
to work this out from the above (it's somewhere around loading initrd), or
can you make an iso with an unstripped grub so that, if it reproduces the
problem, we can maybe get something from the grub source? Where are the
sources of the grub that's on the iso?
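
To see which addresses the two runs have in common, the debug lines can
be parsed and intersected. This is only a sketch in Python: the regular
expressions assume the exact line format shown above, the two excerpts
are the last lines of each log, and parse_exceptions and common_addresses
are hypothetical helper names.

```python
import re

# These patterns assume the exact debug line format shown in the logs above.
DSI_RE = re.compile(r"DSI exception: DSISR=([0-9a-f]{8}) DAR=([0-9a-f]{8})")
ISI_RE = re.compile(r"ISI exception: msr=([0-9a-f]{8}), nip=([0-9a-f]{8})")


def parse_exceptions(log_text):
    """Return (dars, nips): DAR and nip values in order of appearance."""
    dars = [int(m.group(2), 16) for m in DSI_RE.finditer(log_text)]
    nips = [int(m.group(2), 16) for m in ISI_RE.finditer(log_text)]
    return dars, nips


def common_addresses(first, second):
    """Return the sorted addresses that occur in both sequences."""
    return sorted(set(first) & set(second))


# Last lines of the boot menu run and of the manual grub> run.
RUN_MENU = """\
ISI exception: msr=00003030, nip=048ada7c
DSI exception: DSISR=42000000 DAR=00002000
ISI exception: msr=00003030, nip=048abfa4
"""
RUN_MANUAL = """\
ISI exception: msr=00003030, nip=048ada7c
DSI exception: DSISR=42000000 DAR=00002000
DSI exception: DSISR=42000000 DAR=00003000
"""

dars_menu, nips_menu = parse_exceptions(RUN_MENU)
dars_manual, nips_manual = parse_exceptions(RUN_MANUAL)

for addr in common_addresses(nips_menu, nips_manual):
    print(f"common nip: {addr:08x}")
for addr in common_addresses(dars_menu, dars_manual):
    print(f"common DAR: {addr:08x}")
```

Feeding the complete logs of both runs through the same functions would
show which nip and DAR values are stable between runs and which ones move.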

Then I've tried with my patched OpenBIOS from here (at Known problems 1.):

which fixes a device tree problem I know about. Add
'-bios openbios-qemu.elf' to qemu command above to use it.

With that it does not hit the above problem and starts to boot, but it
seems to panic (judging by where it hangs). I could not make it print the
panic on serial or anywhere else to get more info. Do you know which
options could make the kernel log to serial during boot? I've tried
earlyprintk, console=ttyPZ0, console=ttyS0 and similar but could not get
output with any of these.

But while experimenting with this I sometimes managed to boot it. Finally
I've found that if I do exactly this:

1. Use -bios openbios-qemu.elf
2. press c at boot menu to get grub> prompt
3. Type these (issue 'set' 2 times!)
grub> ls
grub> set
grub> set
grub> linux /bzImage
grub> initrd /initrd
grub> boot

then it boots. Strange! So I wonder if you have an idea how this could be
debugged, to identify whether it's a QEMU, OpenBIOS or guest code problem.
I think the most likely cause is something missing from OpenBIOS, which
then leads to using uninitialised data. But without at least getting some
debug output from the kernel panic, or finding out where the crash in grub
happens, I have no idea how to find what could cause this.

Thank you,
Received on Fri Mar 01 2019 - 07:17:32 UTC
