OpenBSD on AWS EC2 Nitro
Ilya Voronin
2021-05-16 21:28:39 UTC
I was able to fix the boot error on t3a (AMD EPYC based) instances (kernel:
protection fault trap at lapic_set_lvt:rdmsr) with this patch (tested
against 6.9):

Index: arch/amd64/amd64/lapic.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
retrieving revision 1.57
diff -u -p -r1.57 lapic.c
--- arch/amd64/amd64/lapic.c    6 Sep 2020 20:50:00 -0000 1.57
+++ arch/amd64/amd64/lapic.c    16 May 2021 15:25:55 -0000
@@ -300,7 +300,8 @@ lapic_set_lvt(void)
                 *   #32559 revision 3.00
                 */
                if ((cpu_id & 0x00000f00) == 0x00000f00 &&
-                   (cpu_id & 0x0fff0000) >= 0x00040000) {
+                   (cpu_id & 0x0fff0000) >= 0x00040000 &&
+                   (cpu_id & 0x0fff0000) < 0x00800000) {
                        uint64_t msr;

                        msr = rdmsr(MSR_INT_PEN_MSG);

It seems EPYC CPUs no longer need the workaround that is being
applied here.
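
For readers following the masks, here is a minimal sketch of how the patched
condition maps to a family number. It assumes the usual CPUID leaf 1 EAX
layout (base family in bits 11:8, extended model in bits 19:16, extended
family in bits 27:20). The helper names and the sample signatures mentioned
below are illustrative and are not taken from the OpenBSD tree.

```c
#include <stdint.h>

/* Base family: bits 11:8 of the CPUID leaf 1 EAX signature. */
static unsigned
sig_base_family(uint32_t sig)
{
	return (sig >> 8) & 0xf;
}

/* Extended family: bits 27:20 of the signature. */
static unsigned
sig_ext_family(uint32_t sig)
{
	return (sig >> 20) & 0xff;
}

/* Family as usually quoted (0Fh, 15h, 17h, ...): base plus extended
 * family when the base family field is 0xf, else the base family. */
static unsigned
sig_family(uint32_t sig)
{
	unsigned base = sig_base_family(sig);

	return base == 0xf ? base + sig_ext_family(sig) : base;
}

/* The patched condition from the diff, applied to a raw signature
 * (hypothetical wrapper; the kernel tests the cpu_id variable inline). */
static int
patched_check(uint32_t sig)
{
	return (sig & 0x00000f00) == 0x00000f00 &&
	    (sig & 0x0fff0000) >= 0x00040000 &&
	    (sig & 0x0fff0000) < 0x00800000;
}
```

Under this decoding, a signature such as 0x00800f12 (extended family 8, so
family 17h) fails the new upper bound, while 0x00600f12 (family 15h) still
takes the workaround path.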

Of course, the OS still wasn't able to boot completely: the NVMe driver
doesn't work ("unable to create io q") and there is no NIC support.

On 10/7/2020 7:01 PM, Kirill Peskov wrote:
> OK, looks like ENA (Elastic Network Adapter) is the main show stopper
> here,
>
> There is a glimpse of optimism here, FreeBSD port of ENA driver is
> already out there:
>
> https://github.com/amzn/amzn-drivers/tree/master/kernel/fbsd/ena
> <https://github.com/amzn/amzn-drivers/tree/master/kernel/fbsd/ena>
>
> I'm trying to catch the AMD-specific crash logs from t3a-type instances
> to post them here.
Jonathan Gray
2021-05-19 09:25:17 UTC
On Mon, May 17, 2021 at 12:28:39AM +0300, Ilya Voronin wrote:
> I was able to fix boot error on t3a (AMD EPYC based) instances (kernel:
> protection fault trap at lapic_set_lvt:rdmsr) with this patch (tested
> against 6.9):
>
> Index: arch/amd64/amd64/lapic.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
> retrieving revision 1.57
> diff -u -p -r1.57 lapic.c
> --- arch/amd64/amd64/lapic.c    6 Sep 2020 20:50:00 -0000 1.57
> +++ arch/amd64/amd64/lapic.c    16 May 2021 15:25:55 -0000
> @@ -300,7 +300,8 @@ lapic_set_lvt(void)
>                  *   #32559 revision 3.00
>                  */
>                 if ((cpu_id & 0x00000f00) == 0x00000f00 &&
> -                   (cpu_id & 0x0fff0000) >= 0x00040000) {
> +                   (cpu_id & 0x0fff0000) >= 0x00040000 &&
> +                   (cpu_id & 0x0fff0000) < 0x00800000) {
>                         uint64_t msr;
>
>                         msr = rdmsr(MSR_INT_PEN_MSG);
>
> It seems EPYC CPUs no longer needs the workaround, which is being applied
> here.

When running virtualised, it is unclear which MSRs the hardware implements.

While you are testing for family < 17h, the 16h BKDGs have it as
RAZ/non-functional as well. The bits are documented in 15h.

BKDG for AMD Family 16h Models 00h-0Fh Processors
MSRC001_0055 Interrupt Pending
63:0 RAZ.

BKDG for AMD Family 16h Models 30h-3Fh Processors
MSRC001_0055 Interrupt Pending
63:0 RAZ

PPR for AMD Family 17h Model 71h B0
MSRC001_0055 [Reserved.] (Core::X86::Msr::IntPend)
Read-only. Reset: Fixed,0000_0000_0000_0000h.

Index: sys/arch/amd64/amd64/lapic.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
retrieving revision 1.57
diff -u -p -r1.57 lapic.c
--- sys/arch/amd64/amd64/lapic.c 6 Sep 2020 20:50:00 -0000 1.57
+++ sys/arch/amd64/amd64/lapic.c 19 May 2021 09:16:37 -0000
@@ -299,8 +299,7 @@ lapic_set_lvt(void)
                 *    Family 0Fh Processors"
                 *   #32559 revision 3.00
                 */
-               if ((cpu_id & 0x00000f00) == 0x00000f00 &&
-                   (cpu_id & 0x0fff0000) >= 0x00040000) {
+               if (ci->ci_family >= 0xf && ci->ci_family < 0x16) {
                        uint64_t msr;

                        msr = rdmsr(MSR_INT_PEN_MSG);
Index: sys/arch/i386/i386/lapic.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/i386/lapic.c,v
retrieving revision 1.47
diff -u -p -r1.47 lapic.c
--- sys/arch/i386/i386/lapic.c 30 Jul 2018 14:19:12 -0000 1.47
+++ sys/arch/i386/i386/lapic.c 19 May 2021 09:19:41 -0000
@@ -160,8 +160,7 @@ lapic_set_lvt(void)
                 *    Family 0Fh Processors"
                 *   #32559 revision 3.00
                 */
-               if ((cpu_id & 0x00000f00) == 0x00000f00 &&
-                   (cpu_id & 0x0fff0000) >= 0x00040000) {
+               if (ci->ci_family >= 0xf && ci->ci_family < 0x16) {
                        uint64_t msr;

                        msr = rdmsr(MSR_INT_PEN_MSG);
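
To make the difference between the two proposed bounds explicit, here is a
small sketch that restates them as predicates on an already-decoded family
value. The function names are hypothetical, and any checks outside the hunks
shown above are ignored.

```c
/* Upper bound from the first patch, restated on a decoded family:
 * extended family < 8 corresponds to family < 17h. */
static int
first_patch_upper_bound(unsigned family)
{
	return family < 0x17;
}

/* Condition from the second patch: families 0Fh through 15h only,
 * so family 16h (documented as RAZ) is excluded as well. */
static int
second_patch_check(unsigned family)
{
	return family >= 0xf && family < 0x16;
}
```

The two differ only for family 16h: it is still inside the first patch's
upper bound but is excluded by the second. Note also that the condition in
the hunks above no longer tests the extended model bits.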