Thread: Innovations in Integer to String (itoa) Methods for 8088?

  1. #11

    Interesting to see this bumped, and whilst it's cute to do this for 16 bit, it's not a very useful range... The problem I was facing was that for scorekeeping in a game even 16 bit was too slow, and I really needed 8 digits, not five.

    I just wrote a few days ago about how my answer was to switch to using BCD:

    http://www.deathshadow.com/blog/2017...208088%20games

    Even though BCD takes more overhead for addition/subtraction -- don't even get me STARTED about multiply and divide -- in my case all I was doing was addition and I needed to display the result on screen fast, so the answer to converting a binary integer to one decimal digit per byte was to do no conversion at all... and just work in BCD in the first place. At 8 digits of precision, even if it takes ~350+ clocks per addition, it beats the 2800+ clocks to convert a 32 bit unsigned integer into ASCII, or what is for all intents and purposes unpacked BCD.
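
    For anyone who wants to see the idea in code, here is a minimal sketch of an 8-digit unpacked BCD add. The storage layout (one digit 0..9 per byte, least significant digit first), the register convention and the label names are assumptions for illustration; this is not the code from the blog post linked above.

```asm
; bcd_add8 (hypothetical name): dest = dest + src, both 8-digit unpacked BCD.
; in:  DS:SI => source digits, DS:DI => destination digits,
;      least significant digit first, DF clear
; out: destination updated, CF = 1 if the sum overflowed 8 digits
; clobbers AX, CX, SI, DI
bcd_add8:
	mov cx, 8
	clc			; no carry into the lowest digit
next_digit:
	lodsb			; AL = source digit (LODSB leaves flags alone)
	adc al, [di]		; add destination digit plus carry
	aaa			; AL = 0..9, CF = decimal carry (AH is also bumped)
	mov [di], al
	inc di			; INC does not touch CF
	loop next_digit		; LOOP does not touch flags
	ret
```

    Because the digits stay decimal the whole time, putting the score on screen is only a matter of adding '0' to each byte, with no division anywhere.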

    Though... wasn't there a trick for doing conversion to any bit-depth "better" using AAM?
    From time to time the accessibility of a website must be refreshed with the blood of owners and designers. It is its natural manure.
    CUTCODEDOWN.COM

  2. #12

    Can't beat BCD, but here's my attempt to optimize the binary version.
    Timings are the documented cycle counts plus 4 clocks per instruction byte, except for instructions right after a DIV:

    Code:
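    	; in:  DX:AX = integer less than 100_000_000
    	;      ES:DI => string buffer, DF clear (STOSW must count upwards)
    	; out: buffer filled with ASCII digits (always 8, leading zeros kept),
    	;      DI advanced by 8; AX, BX, CX, DX clobbered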
    	mov cx, 10000	;16
    	div cx		;152
    	mov bx, dx	;2
    	xor dx, dx	;3
    	mov cx, 100	;16
    	div cx		;152
    	mov cl, 10	;4
    	div cl		;88
    	add ax, '00'	;4
    	stosw		;11
    	xchg ax, dx	;7
    	div cl		;88
    	add ax, '00'	;4
    	stosw		;11
    	xchg ax, bx	;7
    	mov cl, 100	;12, CH is still 0 so CX = 100
    	xor dx, dx	;11
    	div cx		;152
    	mov cl, 10	;4
    	div cl		;88
    	add ax, '00'	;4
    	stosw		;11
    	xchg ax, dx	;7
    	div cl		;88
    	add ax, '00'	;4
    	stosw		;11
    	; = 957
    Quote (originally posted by deathshadow):
        Though... wasn't there a trick for doing conversion to any bit-depth "better" using AAM?
    AAM just divides AL by a constant (which is 10 in the documented version) - it isn't any faster than a normal DIV.
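
    To illustrate the "divides AL by a constant" point: AAM assembles to the two bytes D4 0A, and the second byte is the divisor. A small sketch, assuming an Intel 8088/8086 and an assembler that accepts `db` for hand-encoded bytes (any divisor other than 10 is undocumented behaviour):

```asm
; Documented form: divide AL by 10.
	mov al, 57
	aam			; bytes D4 0A: AH = 5 (quotient), AL = 7 (remainder)

; Same opcode with a hand-encoded divisor of 16.
	mov al, 0A7h
	db 0D4h, 10h		; AH = 0Ah (high nibble), AL = 07h (low nibble)
```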

  3. #13

    Quote (originally posted by dreNorteR):
        AAM just divides AL by a constant (which is 10 in the documented version) - it isn't any faster than a normal DIV.
    That's not true; AAM takes 83 cycles whereas DIV can take between 80 and 90 based on the operands (dividend and divisor). So AAM is slightly faster in the general case.

    However, I usually recommend against using AAM with divisors other than 10, because NEC V20/V30 processors don't support that.
    Offering a bounty for:
    - Documentation and original distribution disks for: Panasonic Sr. Partner, Corona PPC-400, Zenith Z-160 series
    - Music Construction Set, IBM Music Feature edition (has red sticker on front stating IBM Music Feature)
    - Any very old/ugly IBM joystick (such as the Franklin JS-123)

  4. #14

    The problem (or advantage) with the scheme above is that it always takes the same amount of time to get to an answer, whereas the time taken by the "Chinese remainder theorem" approach is linear in the number of digits.

  5. #15

    Quote (originally posted by Trixter):
        That's not true; AAM takes 83 cycles whereas DIV can take between 80 and 90 based on the operands (dividend and divisor). So AAM is slightly faster in the general case.

        However, I usually recommend against using AAM with divisors other than 10, because NEC V20/V30 processors don't support that.
    Another advantage of AAM might be that it sets the zero flag.
    So you wouldn't need an extra compare when looping/early-out.
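
    One thing to watch when relying on those flags: per Intel's documentation, AAM sets SF, ZF and PF from the value left in AL, which is the remainder rather than the quotient. Below is a sketch of an early-out digit loop for an 8-bit value; the routine name, labels and buffer convention are hypothetical. The quotient still gets its own test here, because ZF after AAM only says whether the digit just produced was 0.

```asm
; byte_to_dec (hypothetical name): write AL as 1 to 3 ASCII digits.
; in:  AL = value 0..255, ES:DI => one byte past the end of the buffer
; out: ES:DI => first digit written; AX clobbered
byte_to_dec:
next_digit:
	aam			; AH = AL / 10, AL = AL mod 10; ZF reflects AL only
	add al, '0'
	dec di
	mov [es:di], al		; digits are produced last-first, so store backwards
	mov al, ah		; remaining quotient (MOV leaves flags alone)
	test al, al
	jnz next_digit		; early out once nothing is left to divide
	ret
```

    With this loop a value of 7 costs one AAM and 255 costs three, so the running time follows the number of digits.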

  6. #16

    Quote (originally posted by Trixter):
        That's not true; AAM takes 83 cycles whereas DIV can take between 80 and 90 based on the operands (dividend and divisor). So AAM is slightly faster in the general case.

        However, I usually recommend against using AAM with divisors other than 10, because NEC V20/V30 processors don't support that.
    Also the digits are swapped, so the quotient is in AH and remainder in AL (forgot about that until I tried to put AAM into my code above).
    The V20/30 optimizes the AAD instruction with a hardwired multiply-by-10. Can't test it right now but I'm fairly certain that AAM works with other constants as it does on Intel.
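
    To make the swapped-digit point concrete, here is a sketch of the two-digit step written both ways. The routine names are made up for illustration, and the character constant '00' is assumed to assemble to 3030h as it does in NASM.

```asm
; Both routines: AL = value 0..99, ES:DI => buffer, DF clear.
; Both store the tens digit first, then the units digit.

two_digits_div:
	xor ah, ah		; DIV with a byte operand divides all of AX
	mov cl, 10
	div cl			; AL = tens, AH = units
	add ax, '00'
	stosw			; low byte (tens) lands first in memory
	ret

two_digits_aam:
	aam			; AH = tens, AL = units (the reverse of DIV)
	xchg al, ah		; move tens into the low byte for STOSW
	add ax, '00'
	stosw
	ret
```

    The extra XCHG eats into whatever AAM saves over DIV, so whether the swap pays off depends on the surrounding code.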
