Wednesday 24 February 2021

Multicore programming for DOS

Let's see if we can utilize two cores of an x86 processor in plain old DOS. Here's the code.

; nasm -O0 -fbin u.asm -o u.com

org     100h

; copy CPU1 code to 9000:0000 (linear address 0x90000)
mov bx, cs
mov ds, bx
mov si, cpu1
mov bx, 0x9000
mov es, bx
xor di, di
mov cx, 512
cld                     ; make sure movsb copies forward
rep movsb

jmp protected
cpu1:
mov di, 0xB800
mov es, di
mov di, 2
mov al, '1'
stosb
mov al, 15
stosb
hlt
protected:
cli

mov     eax, cs
shl     eax, 4
add     eax, gdt
mov     [gdtinfo+2], eax

mov     eax, cs
shl     eax, 4
add     [start_addr], eax

lgdt    [gdtinfo]

mov     eax, cr0
or      eax, 1
mov     cr0, eax

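; hand-encoded "jmp dword 8:start": 0x66 = operand-size prefix, 0xEA = far jump
db      0x66
db      0xEA
start_addr:
dd      start           ; 32-bit offset; cs*16 was added above to make it linear
dw      8               ; selector of the code descriptor in the GDT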
start:
bits    32
mov     ax, 0x10
mov     ds, ax
mov     es, ax

mov     edi, 0xB8000
mov     [edi+0], byte '0'
mov     [edi+1], byte 15

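; start CPU1
mov edi, 0xFEE00300     ; local APIC interrupt command register (low dword)
mov [edi], dword 0x000C4500     ; INIT IPI to all CPUs except this one
mov ecx, 10000000
delay:                  ; crude busy-wait before sending the start-up IPI
times 4 nop
loop delay
mov [edi], dword (0x000C4600 + 0x90000/0x1000)  ; start-up IPI, vector 0x90: CPU1 starts at 0x90000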

hlt
gdt:
dd      0, 0
db      0xFF, 0xFF, 0, 0, 0, 10011010b, 11001111b, 0
db      0xFF, 0xFF, 0, 0, 0, 10010010b, 11001111b, 0
gdtinfo:
dw      $ - gdt - 1
dd      0




This can also be tested in VirtualBox. I tested it on my dual-core laptop. If you have only one CPU, only the number 0 appears; if you have two CPUs, the number 1 appears as well.
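
To make the two magic numbers written to 0xFEE00300 easier to read, here is a small Python sketch that decodes them. It is only an illustration and not part of the original program; the function names and the choice of fields are my own, and the bit positions are the ones I understand the local APIC interrupt command register to use.

```python
# Decode the low dword of the local APIC interrupt command register (ICR).
# Helper names are hypothetical; only the two constants come from the listing.
DELIVERY_MODES = {
    0b000: "Fixed", 0b001: "Lowest priority", 0b010: "SMI",
    0b100: "NMI", 0b101: "INIT", 0b110: "Start-up",
}
SHORTHANDS = {0: "none", 1: "self", 2: "all including self", 3: "all excluding self"}


def decode_icr_low(value):
    """Split an ICR value into the fields that matter for INIT/SIPI."""
    return {
        "vector": value & 0xFF,                                   # bits 7:0
        "delivery_mode": DELIVERY_MODES.get((value >> 8) & 0b111, "reserved"),
        "level_assert": bool((value >> 14) & 1),                  # bit 14
        "shorthand": SHORTHANDS[(value >> 18) & 0b11],            # bits 19:18
    }


def sipi_start_address(vector):
    """A start-up IPI makes the target CPU begin in real mode at vector * 0x1000."""
    return vector << 12


init_ipi = decode_icr_low(0x000C4500)
startup_ipi = decode_icr_low(0x000C4600 + 0x90000 // 0x1000)

print(init_ipi)
print(startup_ipi, hex(sipi_start_address(startup_ipi["vector"])))
```

The first write is an INIT to every CPU except the sender, and the second is a start-up IPI whose vector 0x90 points at the 0x90000 area where the CPU1 code was copied.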


This code has to be run without EMM386 or similar memory managers loaded, because they put the CPU in virtual 8086 mode, where a program cannot switch to protected mode on its own.
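
For readers who want to check what the GDT in the listing describes, the following Python sketch decodes the three descriptors and repeats the segment-to-linear arithmetic used for the fix-ups. It is a reading aid under my own naming (decode_descriptor, linear_address and the dictionary keys are not from the original), not a replacement for the assembly.

```python
# The three 8-byte descriptors from the listing, in GDT order.
GDT = [
    bytes([0, 0, 0, 0, 0, 0, 0, 0]),                           # null descriptor
    bytes([0xFF, 0xFF, 0, 0, 0, 0b10011010, 0b11001111, 0]),   # selector 0x08
    bytes([0xFF, 0xFF, 0, 0, 0, 0b10010010, 0b11001111, 0]),   # selector 0x10
]


def decode_descriptor(raw):
    """Pull base, size and a few attribute bits out of a segment descriptor."""
    limit = raw[0] | raw[1] << 8 | (raw[6] & 0x0F) << 16
    base = raw[2] | raw[3] << 8 | raw[4] << 16 | raw[7] << 24
    access, flags = raw[5], raw[6] >> 4
    page_granular = bool(flags & 0b1000)       # G bit: limit counts 4 KiB pages
    return {
        "base": base,
        "size": (limit + 1) * (4096 if page_granular else 1),
        "present": bool(access & 0x80),
        "executable": bool(access & 0x08),
        "default_32bit": bool(flags & 0b0100),  # D/B bit
    }


def linear_address(segment, offset):
    """Real-mode address arithmetic: the 'shl eax, 4' plus 'add' in the listing."""
    return (segment << 4) + offset


code_segment = decode_descriptor(GDT[0x08 >> 3])   # selector / 8 = table index
data_segment = decode_descriptor(GDT[0x10 >> 3])
gdt_limit = 8 * len(GDT) - 1                       # what "dw $ - gdt - 1" yields

print(code_segment)
print(data_segment)
print(gdt_limit, hex(linear_address(0x9000, 0)))
```

Both descriptors turn out to be flat 32-bit segments with base 0 that cover the whole 4 GiB address space, which is why the code can use linear addresses such as 0xB8000 and 0xFEE00300 directly after the jump.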

Thursday 11 February 2021

PCI - FPGA

After the previous project, I started thinking that it might be both nice and feasible to connect my cheap FPGA board to the PCI bus in such a way that everything works in DOS without drivers, non-PnP style. Well, at least I/O ports 0x388-0x38b, as used by AdLib and OPL3 cards.

I figured this would require nothing more than directly attaching a bunch of pins from the PCI bus to my Cyclone II. One potential issue is that the Cyclone II isn't really 5 V tolerant, but I decided to ignore that, and everything has worked fine so far. If you value your hardware, though, don't do this. It could break everything.


I compiled programs such as the one below for testing.

    ; nasm -fbin 3.asm -o 3.com

    org    100h

    mov    al, 3
    mov    dx, 0x388
    out    dx, al

    int    0x20

This simply writes the value of register al to port 388h. The program ran in DOS without any drivers, and the FPGA responded immediately, in this case by changing the LED states on the FPGA board. All good. So this kind of basic PCI is in fact really simple. It should be good enough to make a PCI-based OPL3 sound card that works out of the box in DOS, like the ISA cards back in the day.

Most music, such as FM synthesis and MIDI, was played solely through I/O ports, so things like AdLib, Gravis UltraSound and Roland MT-32 should be doable in principle. Digital sound effects that used DMA and/or IRQs might be harder to do in a compatible manner.
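
As a rough illustration of what "played solely through I/O ports" means for an AdLib-style card, the Python sketch below lists the port writes a program would issue to set one OPL register. It assumes the usual AdLib/OPL3 layout (address port 0x388 and data port 0x389 for the first register bank, 0x38A and 0x38B for the second), which is my assumption and not something tested in this post; opl_write is a hypothetical helper, and the delays that real chips need between writes are left out.

```python
BASE_PORT = 0x388   # first of the four ports 0x388-0x38b


def opl_write(bank, register, value):
    """Return the (port, byte) writes needed to set one OPL register.

    bank 0 uses ports base+0/base+1, bank 1 (OPL3 only) uses base+2/base+3.
    """
    if bank not in (0, 1):
        raise ValueError("bank must be 0 or 1")
    address_port = BASE_PORT + 2 * bank
    data_port = address_port + 1
    return [(address_port, register & 0xFF), (data_port, value & 0xFF)]


# Example: one register write in each bank.
for port, byte in opl_write(0, 0x20, 0x01) + opl_write(1, 0x05, 0x01):
    print(f"out 0x{port:03X}, 0x{byte:02X}")
```

Each pair is just two `out` instructions like the one in the test program above, so a PCI target that claims these four ports sees everything it needs.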

There is a free SystemVerilog implementation of the OPL3, which I also tested (on a DE0-CV) together with a small delta-sigma modulator for analog output. My next plan was to create a PCI-based OPL3 card that uses no active components other than the FPGA and maybe a few regulators.
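
The idea behind a delta-sigma output can be shown with a few lines of Python. This is a generic first-order modulator written for illustration, not the one I used on the FPGA; the function name and the 8-bit width are assumptions. The accumulator's carry is the one-bit output, and after low-pass filtering its average level follows the input sample.

```python
def delta_sigma_bits(sample, cycles, width=8):
    """First-order delta-sigma: one output bit per clock for an unsigned sample."""
    modulus = 1 << width
    if not 0 <= sample < modulus:
        raise ValueError("sample out of range")
    accumulator = 0
    bits = []
    for _ in range(cycles):
        accumulator += sample
        bits.append(accumulator >> width)   # carry out is the output bit
        accumulator &= modulus - 1          # keep only the low 'width' bits
    return bits


bits = delta_sigma_bits(sample=64, cycles=256)
print(bits[:8], sum(bits) / len(bits))
```

In hardware this is one adder and one register, and the output pin only needs an RC filter, which fits the goal of having no active components besides the FPGA.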

This way one could do retro gaming on a somewhat newer PC that doesn't have an ISA bus. It would be nice, though, to get Sound Blaster compatible PCM working from DOS as well.

Ideally one would implement a PCIe card with similar functionality. Unfortunately, PCIe-capable FPGAs tend to be somewhat expensive. An optical TOSLINK or HDMI output for sound would also be nice. I'm not sure whether the 49.??? kHz sample rate that OPLs use works with optical or HDMI. Resampling might be a good idea anyway if PCM output is to be mixed in. It would then also be nice to implement a VGA adapter so one could get video through HDMI with optimized timing and scaling.


Here's the Verilog. Only the wires marked by * were connected for this test.

/*
            PCI

             _
 B01        | |        A01
 B02        | |        A02
 B03        | |        A03
 B04        | |        A04
 B05        | |        A05
 B06        | | INTA   A06
 B07 INTB   | | INTC   A07
 B08 INTD   | |        A08
 B09        | |        A09
 B10        | |        A10
 B11        | |        A11
 B12        | |        A12
 B13        | |        A13
 B14        | |        A14
 B15        | |        A15
*B16 CLK    | |        A16
 B17        | |        A17
 B18        | |        A18
 B19        | |        A19
 B20 AD31   | | AD30   A20
 B21 AD29   | |        A21
 B22        | | AD28   A22
 B23 AD27   | | AD26   A23
 B24 AD25   | |        A24
 B25        | | AD24   A25
*B26 C/BE3  | |        A26
 B27 AD23   | |        A27
 B28        | | AD22   A28
 B29 AD21   | | AD20   A29
 B30 AD19   | | GND    A30
 B31        | | AD18   A31
 B32 AD17   | | AD16   A32
*B33 C/BE2  | |        A33
 B34        | | FRAME  A34*
*B35 IRDY   | |        A35
 B36        | | TRDY   A36*
*B37 DEVSEL | |        A37
 B38        | |        A38
 B39        | |        A39
 B40        | |        A40
 B41        | |        A41
 B42        | | GND    A42*
 B43        | |        A43
*B44 C/BE1  | | AD15   A44
 B45 AD14   | |        A45
 B46        | | AD13   A46
 B47 AD12   | | AD11   A47
 B48 AD10   | |        A48
 B49        |_| AD9    A49*
 B50           
 B51         _  
*B52 AD8    | | C/BE0  A52*
*B53 AD7    | |        A53
 B54        | | AD6    A54*
*B55 AD5    | | AD4    A55*
*B56 AD3    | |        A56
 B57        | | AD2    A57*
*B58 AD1    | | AD0    A58*
 B59        | |        A59
 B60        | |        A60
 B61        | |        A61
 B62        |_|        A62
*/

module C2PCI(CLK50, PCICLK, LED, FRAMEn, AD, CBE, IRDYn, TRDYn, DEVSELn);

input            CLK50;
input            PCICLK;
output reg [2:0] LED;
input            FRAMEn;
input      [9:0] AD;
input      [3:0] CBE;
input            IRDYn;
inout            TRDYn;
inout            DEVSELn;

parameter IO_address = 10'h388;
parameter CBECD_IOWrite = 4'b0011;

reg Transaction;
reg DevSelOE;
reg DevSel;
wire TransactionStart = ~Transaction & ~FRAMEn;
wire TransactionEnd = Transaction & FRAMEn & IRDYn;
wire Targeted = TransactionStart & (AD==IO_address) & (CBE==CBECD_IOWrite);
wire LastDataTransfer = FRAMEn & ~IRDYn & ~TRDYn;
wire DataTransfer = DevSel & ~IRDYn & ~TRDYn;

always @(posedge PCICLK)
case(Transaction)
  1'b0: Transaction <= TransactionStart;
  1'b1: Transaction <= ~TransactionEnd;
endcase

always @(posedge PCICLK)
case(Transaction)
  1'b0: DevSelOE <= Targeted;
  1'b1: if(TransactionEnd) DevSelOE <= 1'b0;
endcase

always @(posedge PCICLK)
case(Transaction)
  1'b0: DevSel <= Targeted;
  1'b1: DevSel <= DevSel & ~LastDataTransfer;
endcase

assign DEVSELn = DevSelOE ? ~DevSel : 1'bZ;
assign TRDYn = DevSelOE ? ~DevSel : 1'bZ;

always @(posedge PCICLK)
if(DataTransfer)
LED[2:0] <= ~AD[2:0];

endmodule
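
To see what the module does clock by clock, here is a small Python model of the same state machine, fed with a single I/O write. It is a sketch under a few assumptions of mine: all registers start at 0, an undriven DEVSELn/TRDYn reads as 1 because of the bus pull-ups, no other device answers, and the bus master performs a simple single-data-phase write. The class and function names are made up for this example.

```python
IO_ADDRESS = 0x388       # same value as the Verilog parameter IO_address
CBE_IO_WRITE = 0b0011    # same value as CBECD_IOWrite


class C2PCIModel:
    def __init__(self):
        # Assumption: every register powers up as 0.
        self.transaction = self.devsel_oe = self.devsel = 0
        self.led = 0

    def pin(self):
        """Level on DEVSELn and TRDYn; 1 when released (pull-up assumed)."""
        return 1 - self.devsel if self.devsel_oe else 1

    def clock(self, frame_n, irdy_n, ad, cbe):
        """One rising PCICLK edge; all updates use the values before the edge."""
        trdy_n = self.pin()
        start = (not self.transaction) and not frame_n
        end = self.transaction and frame_n and irdy_n
        # Only AD[9:0] is wired to the FPGA, hence the 10-bit mask.
        targeted = start and (ad & 0x3FF) == IO_ADDRESS and cbe == CBE_IO_WRITE
        last_data = frame_n and not irdy_n and not trdy_n
        data = self.devsel and not irdy_n and not trdy_n
        if self.transaction:
            next_transaction = int(not end)
            next_oe = 0 if end else self.devsel_oe
            next_devsel = int(bool(self.devsel and not last_data))
        else:
            next_transaction = int(bool(start))
            next_oe = next_devsel = int(bool(targeted))
        if data:
            self.led = ~ad & 0b111
        self.transaction, self.devsel_oe, self.devsel = (
            next_transaction, next_oe, next_devsel)


def run_io_write(port, value):
    """Drive idle, address phase, one data phase, idle, idle; return model and pin trace."""
    dev, trace = C2PCIModel(), []
    cycles = [
        (1, 1, 0, 0b1111),             # bus idle
        (0, 1, port, CBE_IO_WRITE),    # address phase: FRAMEn low, command on CBE
        (1, 0, value, 0b1110),         # data phase: IRDYn low, byte lane 0 enabled
        (1, 1, 0, 0b1111),             # bus idle again
        (1, 1, 0, 0b1111),
    ]
    for frame_n, irdy_n, ad, cbe in cycles:
        trace.append(dev.pin())        # pin level sampled at this clock edge
        dev.clock(frame_n, irdy_n, ad, cbe)
    return dev, trace


dev, trace = run_io_write(0x388, 3)
print(f"LED = {dev.led:03b}, DEVSELn/TRDYn per clock = {trace}")
```

In this model the target claims the write one clock after the address phase, latches the inverted low three data bits into LED, drives DEVSELn and TRDYn high for one clock and then releases them. Because only ten address lines are compared, a port such as 0x788 would match as well.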