
Thread: 8088/8086 microcode disassembly

  1. #21

    Default

    I've found a high-resolution image of an 80186 chip. After spending way too much time staring at it, I've got most (not all unfortunately) of the microcode bits out!

    The format is similar to the 8086's but not identical: the field order, the instruction order and the register codes all differ.

    Code:
    ------------------------ 0.1100011?.00 MOV rm,imm
    00111.01101.0001000001.0 Q     -> tmpb  0 NCZ  1
    01101.10010.1001111000.0 tmpb  -> M     4 none  RNI   
    10101.00111.1100110101.0                6 W X D0,BL
    10101.00111.1001111111.0                4 none  none

    Q can now be either 8 or 16 bits wide, likely decided by the group decode ROM.
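
    Each dotted word in these listings splits into groups of 5, 5, 10 and 1 bits (21 in total). Here is a rough Python sketch of pulling one apart. The field names (`src`, `dst`, `op`, `f`) and the small register tables are only labels read off the listings in this thread, so treat them as assumptions that may contain the same bit errors as the dump.

```python
# Partial register tables, copied from the listings in this thread (unverified).
SRC = {"00111": "Q", "01101": "tmpb", "11100": "SP", "10100": "SIGMA", "00101": "IND"}
DST = {"01101": "tmpb", "10010": "M", "01110": "tmpc", "00101": "IND",
       "00110": "OPR", "11100": "SP", "00111": "none"}

def split_word(word):
    """Split 'sssss.ddddd.oooooooooo.f' into its four bit fields."""
    src, dst, op, f = word.split(".")
    if (len(src), len(dst), len(op), len(f)) != (5, 5, 10, 1):
        raise ValueError("expected 5.5.10.1 bit groups")
    return {
        "src": SRC.get(src, "?" + src),   # unknown codes are kept as raw bits
        "dst": DST.get(dst, "?" + dst),
        "op": op,                         # opcode/misc field, left undecoded here
        "f": f == "1",                    # trailing single bit
    }

# First line of the MOV rm,imm listing above
print(split_word("00111.01101.0001000001.0"))
```

    The 10-bit field is left undecoded on purpose, since its layout is exactly the part that still has open questions.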

    The jump condition in the first line may be a bit error, but this pattern of "jump to next instruction" appears a lot for some reason.

    "W X D0,BL" after RNI appears a lot, it must be a special form to write back a memory operand. The "X" flag on a memory access seems to indicate it is the last instruction (also on the 8086).

    Code:
    ------------------------ 0.01100000.00 PUSHA
    11100.01110.0111101100.0 SP    -> tmpc  1 DEC2 tmpc 
    10100.00101.1001111111.0 SIGMA -> IND   4 none  none  
    11000.00110.1100101000.0 XA    -> OPR   6 W DS,M2
    11001.00110.1100101000.0 BC    -> OPR   6 W DS,M2
    ------------------------ 0.01100000.01
    11010.00110.1100101000.0 DE    -> OPR   6 W DS,M2
    11011.00110.1100101000.0 HL    -> OPR   6 W DS,M2
    11100.00110.1100101000.0 SP    -> OPR   6 W DS,M2
    11101.00110.1100101000.0 MP    -> OPR   6 W DS,M2
    ------------------------ 0.01100000.10
    11110.00110.1100101000.0 IJ    -> OPR   6 W DS,M2
    11111.00110.1100101011.0 IK    -> OPR   6 W DS,P0
    10101.00111.1001111101.0                4 none  NWB,NX
    00101.11100.1001111000.0 IND   -> SP    4 none  RNI

    Example of register and stack accesses. The encodings for both source and destination fields are now mostly the same. POPA is similar: it reads all 8 words from memory but discards the SP word immediately (OPR -> none).
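
    To make the PUSHA listing easier to follow, here is a toy Python model of it. It assumes the internal names map to the architectural registers in the documented PUSHA order (XA=AX, BC=CX, DE=DX, HL=BX, MP=BP, IJ=SI, IK=DI), that "M2" means "decrement IND by 2 after the write" and "P0" means "leave IND alone". Segments are ignored, so this is a sketch of the data flow only.

```python
def pusha(regs, mem):
    """Toy model of the PUSHA listing: IND walks down the stack, SP is updated last.

    regs: dict of 16-bit values keyed by architectural register name.
    mem:  dict mapping stack offset -> 16-bit word (no segment handling).
    """
    ind = (regs["SP"] - 2) & 0xFFFF            # SP -> tmpc, DEC2 tmpc, SIGMA -> IND
    order = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
    for i, name in enumerate(order):
        mem[ind] = regs[name]                  # reg -> OPR, then write at IND
        if i < len(order) - 1:
            ind = (ind - 2) & 0xFFFF           # assumed meaning of "M2"
        # the last write is "P0": IND stays on the final word
    regs["SP"] = ind                           # IND -> SP
    return regs, mem

regs = {"AX": 1, "CX": 2, "DX": 3, "BX": 4, "SP": 0x0100, "BP": 6, "SI": 7, "DI": 8}
regs, mem = pusha(regs, {})
print(hex(regs["SP"]), {hex(k): v for k, v in sorted(mem.items())})
```

    Because SP is only written at the very end, the word stored for SP is its value from before the instruction, which matches the documented behaviour of PUSHA.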

    Code:
    ------------------------ 0.011010?1.00 IMUL rw,rm,imm
    00111.01101.1001111111.0 Q     -> tmpb  4 none  none  
    10101.00111.1000010111.0                4 CF1   none  
    10111.01110.1010000001.1 ZEROS -> tmpc  5 UNC  1 F
    10010.01110.1000000111.0 M     -> tmpc  4 MAXC  none

    Immediate IMUL toggles F1 before jumping to the multiply routine (which I need more time to figure out / correct bit errors).

    Code:
    ------------------------ 1.10000000.00 RESET
    10111.00100.1001111011.0 ZEROS -> PC    4 none  SUSP  
    10101.00001.1000001111.0 ONES  -> RC    4 FLUSH none  
    10111.00011.1010001000.1 ZEROS -> RD    5 UNC  8 F    ->reset2
    10111.01111.1001111111.0 ZEROS -> F     4 none  none  
    ------------------------ 0.11010111.01
    00110.01000.1001111000.0 OPR   -> A     4 none  RNI   
    10111.00000.1001111111.0 ZEROS -> RA    4 none  none  (reset2)
    10111.00010.1001111000.0 ZEROS -> RS    4 none  RNI   
    10101.00111.1001111111.0                4 none  none

    Is there some sort of branch delay slot? Otherwise the "ZEROS -> F" line after the jump to reset2 would never run, and this wouldn't clear flags...

    Code:
    ------------------------ 0.11011???.00 ESC
    10101.00111.0010010100.0                0 NF1  4
    10101.00111.1001111011.0                4 none  SUSP  
    00100.01101.0111001010.0 PC    -> tmpb  1 DEC  tmpb 
    10100.00100.0000100101.0 SIGMA -> PC    0 L8   5
    ------------------------ 0.11011???.01
    10101.00111.1001111000.0                4 none  RNI   
    10100.01101.1001111111.0 SIGMA -> tmpb  4 none  none  
    10100.00100.1001111000.0 SIGMA -> PC    4 none  RNI   
    10101.00111.1001111111.0                4 none  none

    If REP is used with FPU opcodes, decrement PC by 1 or 2 for some reason?

  2. #22
    Join Date
    Jan 2007
    Location
    Pacific Northwest, USA
    Posts
    35,121
    Blog Entries
    18

    Default

    I recall from working with pre-release steppings of the 80186 that the integrated DMA controller "borrowed" the SI and DI registers during a transfer. It cropped up in testing. If a DMA transfer hit during instructions that used SI or DI (e.g. MOVS), the registers would get clobbered and the program would crash, erratically. STI/CLI, of course, didn't do a thing as they had no effect on DMA. It took us almost 3 weeks with a Biomation (IIRC) LA to figure out what was happening. When we reported it to the application engineers at Intel, it was "oh, we found out about that a week ago." Grrr. The maddening thing was that if you stepped through code manually, it didn't happen (the DMA transfer was over before you could hit the "step" button).

    This was somewhere around stepping 7, I think.

  3. #23

    Default

    Quote Originally Posted by Chuck(G) View Post
    I recall from working with pre-release steppings of the 80186 that the integrated DMA controller "borrowed" the SI and DI registers during a transfer. It cropped up in testing. If a DMA transfer hit during instructions that used SI or DI (e.g. MOVS), the registers would get clobbered and the program would crash, erratically. STI/CLI, of course, didn't do a thing as they had no effect on DMA. It took us almost 3 weeks with a Biomation (IIRC) LA to figure out what was happening. When we reported it to the application engineers at Intel, it was "oh, we found out about that a week ago." Grrr. The maddening thing was that if you stepped through code manually, it didn't happen (the DMA transfer was over before you could hit the "step" button).

    This was somewhere around stepping 7, I think.
    Did this only happen with string instructions, or also PUSH/POP(A)? I ask because the microcode uses an internal index register which auto-increments or decrements, and is then copied back to SI/DI/SP when the instruction finishes. If the DMA controller uses the same register, it would explain the corruption.

  4. #24

    Default

    As far as I can remember (recall that this was nearly 40 years ago), we discovered it with string instructions. It could also have happened with PUSH/POP operations, but string instructions are where we trapped it. You could compare the values of SI and DI after the operation with what they were supposed to be and trap if they weren't right. In any case, the way the thing manifested was a system crash, which made it very difficult to find, as everything on that beast was interrupt-driven.

  5. #25

    Default

    Quote Originally Posted by dreNorteR View Post
    I've found a high-resolution image of an 80186 chip. After spending way too much time staring at it, I've got most (not all unfortunately) of the microcode bits out!
    Oh wow, that's awesome! I had no idea it would be so similar. I will be interested to see your disassembly and compare it to the 8088/8086 ones. I didn't see any evidence of a branch delay slot in the 8088/8086 microcode. That delay slot sometimes being empty would be a smoking gun there.

    Quote Originally Posted by dreNorteR View Post
    If REP is used with FPU opcodes, decrement PC by 1 or 2 for some reason?
    Very bizarre! Does that combination stall the CPU until an interrupt is issued? This looks like they were planning to do something else with that part of the opcode space.

  6. #26

    Default

    Since I likely won't be improving this further (or even be able to without a better photo), I've decided to upload the files.

    Quote Originally Posted by reenigne View Post
    Oh wow, that's awesome! I had no idea it would be so similar. I will be interested to see your disassembly and compare it to the 8088/8086 ones. I didn't see any evidence of a branch delay slot in the 8088/8086 microcode. That delay slot sometimes being empty would be a smoking gun there.
    It seems to be a new feature in the 186, when the "F" bit is set on a jump, it executes the following instruction before continuing at the jump target.
    Maybe jumps take an extra cycle when this feature is not used?
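
    Here is a small Python toy of that hypothesis, applied to the RESET listing from my first post. The sequencer, the tuple layout and the row indices are all made up for illustration; only the order of the register transfers comes from the dump.

```python
def run(program, start=0, max_steps=32):
    """Toy sequencer. Each entry is (label, jump_target_or_None, delay_flag).

    When delay_flag is set on a jump, the following entry executes before
    control reaches the target. A jump inside the delay slot is ignored.
    """
    trace, pc, pending = [], start, None
    for _ in range(max_steps):
        if pc >= len(program):
            break
        label, target, delay = program[pc]
        trace.append(label)
        if pending is not None:            # we just ran the delay slot
            pc, pending = pending, None
        elif target is not None and delay:
            pending, pc = target, pc + 1   # run one more entry, then jump
        elif target is not None:
            pc = target
        else:
            pc += 1
    return trace

def reset_rows(f_bit):
    return [
        ("ZEROS -> PC", None, False),
        ("ONES -> RC", None, False),
        ("ZEROS -> RD", 5, f_bit),         # UNC jump to reset2
        ("ZEROS -> F", None, False),       # candidate delay slot
        ("OPR -> A", None, False),         # unrelated row, never reached
        ("ZEROS -> RA", None, False),      # reset2
        ("ZEROS -> RS", None, False),
    ]

print(run(reset_rows(True)))
print(run(reset_rows(False)))
```

    With the flag set, "ZEROS -> F" shows up in the trace before reset2. Without it, the flags are never cleared, which is what made me suspect a delay slot in the first place.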

    Quote Originally Posted by reenigne View Post
    Very bizarre! Does that combination stall the CPU until an interrupt is issued? This looks like they were planning to do something else with that part of the opcode space.
    FPU opcodes actually generate exception 7, like on a 286 with the emulation bit set. Must be done in random logic, with the microcode only there to correct IP. I didn't know this before reading the datasheet.
    Last edited by dreNorteR; September 24th, 2020 at 09:37 AM.

  7. #27

    Default

    Having found another photo of an 80186 (http://visual6502.org/images/pages/I...die_shots.html), I made some progress on correcting my microcode dump. Final version coming soon (hopefully)!

    (There's also a 286 on that site, but only the scaled down picture is still online, and it's too small to extract any bits...)

    Here's another instruction which uses the F1 flag internally:

    Code:
    --- 49------------------ 0.01100010.00 BOUND
    10011.01100.1100001110.0 R     -> tmpa  6 R DD,P2      ;read lower bound
    00110.01101.0100101000.0 OPR   -> tmpb  1 SUBT tmpa    ;compare
    01111.01110.0010001001.1 F     -> tmpc  0 UNC  9 F     ;save user visible flags
    10100.01100.0101100000.1 SIGMA -> tmpa  1 [0C] tmpa  F ;???
    --- 50------------------ 0.01100010.01
    10110.00110.1010000010.0 CR    -> OPR   5 UNC  2       ;out of bounds, INT 5
    10011.01101.1100001111.0 R     -> tmpb  6 R DD,P0      ;read upper bound
    00110.01100.0100101000.0 OPR   -> tmpa  1 SUBT tmpa    ;compare
    10101.00111.1000010111.0                4 CF1   none   ;both tests done
    --- 51------------------ 0.01100010.10
    10100.01100.0101100000.1 SIGMA -> tmpa  1 [0C] tmpa  F ;???
    10100.00111.0001101100.0 SIGMA -> none  0 OF   12      ;overflow?
    01110.01111.0001110100.0 tmpc  -> F     0 CY   4       ;no, carry=less than
    10101.00111.0010001101.0                0 UNC  13
    --- 52------------------ 0.01100010.11
    01110.01111.0011000100.0 tmpc  -> F     0 NCY  4       ;yes, carry=not less
    10101.00111.0010010101.0                0 NF1  5       ;do second test
    10101.00111.1001111000.0                4 none  RNI   
    10101.00111.1001111111.0                4 none  none

    Tested it on real hardware and it does indeed only compare the lower bound when prefixed with REP.
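
    As a behavioural summary (not a model of the microcode itself), here is a hedged Python sketch of what the test suggests. The function name and the `rep_prefix` parameter are invented for this example; the normal case follows the documented signed BOUND check, and the REP case just encodes the observation above.

```python
def to_signed16(x):
    """Interpret a 16-bit value as a two's complement signed integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def bound_raises_int5(index, lower, upper, rep_prefix=False):
    """Return True if BOUND would raise INT 5 for these 16-bit operands."""
    i, lo, hi = (to_signed16(v) for v in (index, lower, upper))
    if i < lo:
        return True          # first test: below the lower bound
    if rep_prefix:
        return False         # observed on hardware: second test is skipped
    return i > hi            # second test: above the upper bound

print(bound_raises_int5(10, 0, 5))                    # above the upper bound
print(bound_raises_int5(10, 0, 5, rep_prefix=True))   # upper bound not checked
print(bound_raises_int5(0xFFFF, 0, 5))                # -1 is below the lower bound
```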

    Anyone have an idea what the ALU op [0C] does? It doesn't seem necessary to me.

  8. #28

    Default

    Quote Originally Posted by dreNorteR View Post

    It seems to be a new feature in the 186, when the "F" bit is set on a jump, it executes the following instruction before continuing at the jump target.
    Maybe jumps take an extra cycle when this feature is not used?
    Not unlike branches and jumps on the MIPS architecture. The instruction following the jump is always executed. On the i860, the same applies for so-called "delayed" branches.

  9. #29

    Default

    Quote Originally Posted by dreNorteR View Post
    Anyone have an idea what the ALU op [0C] does? It doesn't seem necessary to me.
    Mostly figured it out: signed comparisons use sign xor overflow, not carry. D'oh!
    So this opcode should move the sign bit into carry while preserving overflow (some left shift / rotate variant?).
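
    For anyone who wants to convince themselves of the sign-xor-overflow rule, here is a quick Python check. It computes the flags of a 16-bit subtraction the way x86 SUB/CMP defines them; the helper names are just for this example.

```python
def sub16_flags(a, b):
    """Flags of the 16-bit subtraction a - b: returns (CF, SF, OF)."""
    a &= 0xFFFF
    b &= 0xFFFF
    r = (a - b) & 0xFFFF
    cf = a < b                                   # borrow out: unsigned a < b
    sf = bool(r & 0x8000)                        # sign of the result
    of = bool((a ^ b) & (a ^ r) & 0x8000)        # operand signs differ, result sign flipped
    return cf, sf, of

def to_signed16(x):
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

samples = [0x0000, 0x0001, 0x7FFF, 0x8000, 0x8001, 0xFFFF, 0x1234, 0xFEDC]
for a in samples:
    for b in samples:
        cf, sf, of = sub16_flags(a, b)
        assert (sf != of) == (to_signed16(a) < to_signed16(b))   # signed less-than
        assert cf == (a < b)                                     # unsigned less-than
print("SF xor OF matches signed <, CF matches unsigned <")
```

    The case a=0x8000, b=1 is the kind that trips up a carry-only test: no borrow occurs, yet -32768 is clearly less than 1.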

  10. #30

    Default

    Quote Originally Posted by Chuck(G) View Post
    Not unlike branches and jumps on the MIPS architecture. The instruction following the jump is always executed. On the i860, the same applies for so-called "delayed" branches.
    Yes, that's why I used that term.

    ---
    Thinking more about the 80286 microcode, some things become clear even without having the actual bits:

    There are 35 x 4 rows: most likely 4 microinstructions of 35 bits.

    Code:
    Microcode size:
     8086 : 512 µ-instrs x 21 bits = 10752
    80186 : 576 µ-instrs x 21 bits = 12096
    80286 : 748 µ-instrs x 35 bits = 26180

    Most of the additional instructions are likely taken up by SWITCH_TASKS, LOADALL and STOREALL. Figuring out the register encoding from those three will be easy, since we know how the regs are ordered in memory.
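
    The totals in the table are just count times width. A throwaway Python check, using only the numbers quoted above (the growth ratio at the end is derived from them, not measured):

```python
# (micro-instruction count, bits per micro-instruction), as quoted in the table
roms = {"8086": (512, 21), "80186": (576, 21), "80286": (748, 35)}

def rom_bits(count, width):
    """Total number of ROM bits for a given instruction count and word width."""
    return count * width

for name, (count, width) in roms.items():
    print(f"{name:>5}: {rom_bits(count, width)} bits")

# How much bigger the 80286 ROM is than the 8086 one, in raw bits
growth = rom_bits(*roms["80286"]) / rom_bits(*roms["8086"])
print(f"80286 / 8086 = {growth:.2f}x")
```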

    ---

    Like the previous chips, the microcode is split into register selection and opcode/misc. On the 286, layout of the actual ROM reflects this, with 17 rows in the upper half and 18 in the lower. The lines coming out of the lower ROM are clearly in three groups of 6 bits.

    There are at least 51 word-sized registers, so 6 bits makes sense. New ones are:
    - 7 more temporaries
    - MSW
    - LDT selector
    - TSS selector
    - descriptor cache: 8 x 3 words

    ---

    My guess is that the third register field selects which one is used to address memory:

    LOADALL uses one of the temporary regs (the one stored at 0x812, which some old SMM docs suggest is named "tmpb"). At completion, it always has the value 0x864, pointing to the last word loaded. Does this count up from 0x800 during execution, like IND would on earlier CPUs?

    Here is my slightly brutal way to observe what the microcode is doing: reset the CPU while it is executing LOADALL!

    When the BIOS is loaded into shadow RAM, we can overwrite its entry point to gain control immediately & dump the registers using STOREALL. Precise timing of the reset doesn't matter, we only care if different intermediate values show up depending on where it stopped - and they do!

    But other memory accesses don't change this register. PUSH/POP use another one, and most instructions don't change any. So it must be selectable where the address comes from.

    (STOREALL is able to save all registers - it probably moves the first one to OPR before loading the address into it)

    ---

    Only these registers are initialized by reset:

    Code:
    MSW           = FFF0
    FLAGS         = 0002
    'X2' / 'tmph' = 002A (why?)
    
    IDTR          = base 000000, limit FFFF
    
    TSS selector  = 0000 (but descriptor cache preserved)
    LDT selector  = 0000 (same)
    
    CS            = F000, base FF0000, limit FFFF, attr 82
    IP            = FFF0
    
    DS = ES = SS  = 0000, base 000000, limit FFFF, attr 82

    Loading any segment in real mode also reloads attr with 0x82 (writable data, bit 4 is don't care).
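
    A short Python illustration of why the cached CS base matters at reset. It assumes the usual address calculation (cached base plus offset on a 24-bit bus) and the normal real-mode rule that loading a segment register sets base = selector * 16; the function names are invented for the example.

```python
ADDRESS_MASK = 0xFFFFFF          # 24 address lines on the 286

def physical(base, offset):
    """Physical address from a cached segment base and a 16-bit offset."""
    return (base + (offset & 0xFFFF)) & ADDRESS_MASK

def real_mode_base(selector):
    """Base that a real-mode segment load puts into the descriptor cache."""
    return (selector & 0xFFFF) << 4

# Values from the reset table above
reset_cs_base, reset_ip = 0xFF0000, 0xFFF0
print(hex(physical(reset_cs_base, reset_ip)))            # 0xfffff0

# After CS is reloaded with the same selector in real mode (e.g. by a far jump)
print(hex(physical(real_mode_base(0xF000), reset_ip)))   # 0xffff0
```

    So the first fetch comes from the top of the 16 MB space, and the first far jump drops execution back below 1 MB even though the selector value never changed.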
