
DEBUG and the 8087 FPU

Well, I tried to test my 8087 by using DEBUG from DOS 2.1. The results were somewhat unexpected.

The test I ran was this: http://www.ee.unb.ca/tervo/ee6373/8087ex1.htm

When I should have gotten 3FF0000000000000, I got FFF8000000000000. However, when I do an extra FST1, THEN I get 3FF0000000000000.

On my new computer, it works just fine without that extra FST1. Strange :confused:
 
Try executing an FINIT first.
If the BIOS or some other program hasn't done it, the FPU could be in an unknown state. (FFF8000000000000 is the x87 "indefinite" quiet NaN, which is just what an FPU left in a random state tends to hand back.)

Bill
 
I apologize for resurrecting this old thread, but the original post is short and the problem was not resolved. The OP could not test his 8087 FPU using some simple instructions assembled with DEBUG. (See the link to the code above.) Briefly, the source to load and store pi (Example #2) is as follows:

-a 100 ; assemble a program at address 100
3AA0:0100 FLDPI
3AA0:0102 FSTP QWORD [200]
3AA0:0106 INT 20
3AA0:0108

I found that this code did not work on a stock IBM PC 5150. I tried the suggestion of including FINIT, and that also did not work. What did work was to include an FWAIT between the two FPU instructions (between FLDPI and FSTP). I assume this is because the CPU was sending the next instruction to the 8087 before the first had completed.
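For anyone following along, here is a sketch of the session that worked (the segment value is whatever DEBUG happens to give you; FWAIT assembles to the single byte 9B, so the addresses after it shift up by one):

Code:
-a 100
3AA0:0100 FLDPI
3AA0:0102 FWAIT
3AA0:0103 FSTP QWORD [200]
3AA0:0107 INT 20
3AA0:0109
-g
Program terminated normally
-d 200 L8
3AA0:0200  18 2D 44 54 FB 21 09 40   .-DT.!.@

Read little-endian, that is 400921FB54442D18, which is pi as an IEEE double. (If your DEBUG rejects the FWAIT mnemonic, plain WAIT, or entering the 9B byte directly with E, does the same thing.)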

As an aside, I became interested in this topic because it appears to me that it would be possible to do parallel programming with both processors if the commands are carefully chosen not to conflict with each other. Also, if anyone can suggest a general rule of thumb for when FWAIT should be added, it would be of interest. I am not using a compiler; rather, I am creating hex code strings directly. Thanks! Michael
 
Well, if you don't get any decent data, I can research it. I did the x86 math package for Sorcim Supercalc back in the day for both x80 and x86--and it was done in assembly. I still have the code and can see what I did back then, too many years ago.
 
If you use an assembler or compiler, FWAIT will be automatically inserted for you.

If you are producing your own binary code, it is your responsibility...

FINIT is required at some point before you use the coprocessor in anger.

You will also need to configure the 8087's mode of operation for how you want it to behave.

FWAIT is required where the following instruction relies on the previous one having completed. In your example, the FLDPI needs to have completed before you can store the result somewhere. As a result, you need an FWAIT between the two instructions. You can either prepend an FWAIT to every coprocessor instruction as a matter of course, or (if ultimate speed is the goal) work out exactly where to place them (as you would for a pipelined processor).

I will dig my "idiot's guide" out and report back further.

Dave
 
Found it!

The rules for waiting are:

1. The CPU shouldn't attempt to start an instruction requiring the numerical coprocessor if the coprocessor is still executing an instruction.

2. The CPU should not execute an instruction that accesses the memory operand being referenced by the numerical coprocessor until the coprocessor has actually written the result.

So, in your example above:

FLDPI
FSTP ...

If you can't guarantee that the NDP is idle, you must prefix the FLDPI with an FWAIT.

You can't permit the FSTP ... to execute until the FLDPI has completed, so it should also be prefixed by FWAIT.

If you then want to see/read/process the stored result, you should insert an FWAIT after the FSTP as well (rule 2).
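Putting the two rules together on that fragment, a conservatively waited version looks like this (a sketch; the leading FWAIT can be dropped if you know the NDP is already idle, e.g. right after a waited FINIT):

Code:
FWAIT              ; rule 1: NDP must be idle before FLDPI is issued
FLDPI              ; NDP starts loading pi onto its stack
FWAIT              ; rule 1 again: FLDPI must complete before FSTP starts
FSTP QWORD [200]   ; NDP starts writing the result to memory
FWAIT              ; rule 2: don't touch [200] until the store completes
MOV AX,[200]       ; the CPU may now safely read the operand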

The other thing I was thinking of was to set up the control word.
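Something like this (a sketch; 03FFh is, if I remember rightly, the same value FINIT itself loads on an 8087 (all exceptions masked, round to nearest, 64-bit precision), so the explicit FLDCW is only needed when you want something different):

Code:
CW      DW 03FFh    ; desired control word (example value)

        FINIT       ; put the NDP into a known state
        FWAIT       ; let FINIT complete
        FLDCW CW    ; load the control word
        FWAIT       ; rule 1: wait before issuing the next NDP instruction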

Dave
 
As an aside, I became interested in this topic because it appears to me it would be possible to do parallel programming with both processors if the commands are carefully chosen not to conflict with each other.

Not only is it possible, it was Intel's original intention. If you had a large dataset to go through, you could issue an 8087 instruction and then execute 8086 instructions getting the next operation ready while the 8087 performs the calc in the background, effectively making it "free".
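A sketch of how that looks (hypothetical registers and loop; a double-precision multiply on the 8087 takes on the order of a hundred-plus clocks, so the pointer arithmetic genuinely rides along for free):

Code:
        MOV  CX,100        ; hypothetical element count
NEXT:   FWAIT              ; 8087 must be idle before the next NDP op
        FLD  QWORD [SI]    ; load the next operand
        FWAIT              ; the FLD must complete first
        FMUL QWORD [DI]    ; 8087 begins the slow multiply...
        ADD  SI,8          ; ...while the 8086 advances its pointers
        ADD  DI,8          ; and does the loop bookkeeping in parallel
        FWAIT              ; the multiply must finish before the store
        FSTP QWORD [BX]    ; write the product
        ADD  BX,8
        LOOP NEXT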

I still have the code and can see what I did back then, too many years ago.

We used that quite a bit in the 80s; I'd love to see your source.
 
I'll see if I can dig it up--it's been only, what, 35 years or so? One interesting aspect is that binary floating point was distrusted, so all of the operands and results are in BCD floating point, even if the intermediate computations aren't. I did the related FP packages for both the x80 and x86 versions of Pascal/M as well.

My understanding is that in financial circles there is still an attitude that binary floating point isn't to be trusted; decimal coprocessors have been developed. Even on the CDC STAR supercomputer, 128-bit double-precision binary floating point was suspect, so the microprogramming included a full set of BCD instructions (up to 128K digits, IIRC).

The Intel manual that I cited goes into pretty good detail about delays. Interestingly, it sometimes just uses NOPs as a delay instead of a WAIT.
 
Even on the CDC STAR supercomputer, 128-bit double-precision binary floating point was suspect, so the microprogramming included a full set of BCD instructions (up to 128K digits, IIRC).

Interesting. Why would that be, and why wouldn't binary floating point be trustworthy?
 
Write out the binary expansion for 1/10. That should be explanatory, given our decimal-based monetary system.

The IBM S/360 has both floating point (hexadecimal, in its case) and BCD math. You'd be a loon to use the floating point for doing large monetary calculations.
 
Simply put, the value "one tenth" has no finite representation in binary, just as "one third" has none in decimal. 1/3 ≠ 0.333333333...: an endlessly repeating series of digits that can approach but never equal 1/3.
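One tenth behaves exactly the same way in binary. Working it out by repeatedly doubling the fraction and peeling off the integer part:

Code:
0.1 x 2 = 0.2  -> 0
0.2 x 2 = 0.4  -> 0
0.4 x 2 = 0.8  -> 0
0.8 x 2 = 1.6  -> 1
0.6 x 2 = 1.2  -> 1
0.2 x 2 = 0.4  -> 0   (back at 0.2, so the pattern repeats)

So 1/10 = 0.000110011001100... in binary, with "0011" repeating forever; no finite string of bits is exactly one tenth.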

The resulting errors in any such approximation tend to grow as calculations using them are performed. While they can be predicted and dealt with by engineers and other ruffians, they would drive bankers and accountants crazy.

That's why financial calculations that need to be exact to the penny are often done in BCD.

It also puts a stop to people dreaming about sweeping all those fractional parts of a penny into a numbered Swiss bank account regardless of what you saw in the movie "Office Space."

;-)
 
I was trying to educate by making it an exercise. Once you do the math and determine that 1/10 is a repeating binary fraction, you have your answer. Sorry if I wasn't being obvious.

In scientific applications, it doesn't really matter, as many relationships (e.g. trig functions) are based on irrational numbers, so anything can be an approximation. There, significance rules (i.e. you can have numbers with many digits, none of which are significant).
 
Does anybody know the protocol and I/O ports via which the 80287 and 80387 are accessed? I remember digging into Intel's Hardware Reference Manual back in the day, but the information was very scarce. On the other hand, I once owned a 386DX motherboard that supported both the 80287 and the 80387, so this information must have been available to motherboard designers at the time...
 
The ports are actually enforced by the CPU hardware and are as follows:

Code:
00F0-00FF ----	coprocessor (8087..80387)

00F0	w	math coprocessor clear busy latch
00F1	w	math coprocessor reset
00F8	r/w	opcode transfer
00FA	r/w	opcode transfer
00FC	r/w	opcode transfer
 
The ports are actually enforced by the CPU hardware and are as follows:

Code:
00F0-00FF ----	coprocessor (8087..80387)

00F0	w	math coprocessor clear busy latch
00F1	w	math coprocessor reset
00F8	r/w	opcode transfer
00FA	r/w	opcode transfer
00FC	r/w	opcode transfer

That's what I wrote about. There is something wrong with this description, since the 8087 does not use this I/O-port mechanism. Also, why would one transfer opcodes from the coprocessor to the CPU? Maybe whoever wrote this meant operands instead? Anyway, are these ports software-accessible? What is the protocol? Are they exposed on the ISA bus, or intercepted by most chipsets? What transfer sizes are used (bytes, words and/or dwords)?
 