MASM question--determining segment of a variable for conditional assembly

Chuck(G) · Dec 6, 2020

I'm a little fuzzy-minded today, and I can't for the life of me figure out the answer to this one.

Here's a code snippet (assume MASM 6.13):

Code:

        .model  small,c
        .data
one     dw      100 dup (100)
        .code
two     dw      100 dup (100)
...
clearit  macro   what,howmany
        mov     di,offset what
        if      seg what EQ @code
        push    cs
        pop     es
        endif
        mov     cx,howmany
        cld
        rep stosw
        endm

        clearit one,100
        clearit two,100
        end

Now, clearly, this won't work, as the SEG operator defers resolution until link time. A possible solution to this might be to compare the name of the segment that the variable resides in, but I don't see a MASM pseudo-op that will allow that. You can see the utility of such a facility, as one can move variables between segments and have the correct code generated regardless.

So how does one solve this?

daver2 · Dec 6, 2020

Can’t you just load SEG(what) into ES and OFFSET(what) into DI regardless of what segment ‘what’ lives in? You might have to craft the remaining code to suit.

I also remember that LES loads a 32-bit far pointer from memory into ES and a 16-bit offset register.
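
A rough sketch of what that suggestion could look like, assuming full segment directives with no groups. The names MYDATA, MYCODE, buffer and fill are made up for illustration. Note that the SEG value is a relocatable item that the linker and loader fill in, so this is a link-time answer rather than an assembly-time one.

Code:

MYDATA  segment word 'DATA'     ; hypothetical data segment
buffer  dw      100 dup (?)     ; hypothetical variable to clear
MYDATA  ends

MYCODE  segment word 'CODE'     ; hypothetical code segment
        assume  cs:MYCODE
fill    proc    near
        mov     ax,seg buffer   ; segment of buffer, resolved at link/load time
        mov     es,ax
        mov     di,offset buffer
        xor     ax,ax           ; value to store
        mov     cx,100          ; number of words
        cld
        rep stosw               ; stores AX to ES:DI, CX times
        ret
fill    endp
MYCODE  ends
        end

The same three instructions work whichever segment the variable lives in, at the cost of always emitting them and of leaving a segment fixup in the object file.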

Dave

Chuck(G) · Dec 6, 2020

I'm not talking about link- or run- time, but assembly time.

Clearly, if I know that ES=DS in this case and "one" resides in the .data segment, I don't need to generate any extra code.
Similarly, if I want to address "two" through ES, the simplest way would be to generate code to set ES = CS. This can matter a lot when writing code that's moved at run-time.

All I want to do is extract the segment name that a variable belongs in at assembly time. Since the assembly cross reference shows this, it's clear that it's known to MASM.

I suppose I could define a "shadow" symbol indicating the segment association with each variable, but that seems like the long way around.

A corollary to this is the determination of other variable attributes, such as alignment at assembly time. MASM appears to have a hole in the pseudo-op repertory.

daver2 · Dec 6, 2020

>>> I'm not talking about link- or run- time, but assembly time.

Understand.

>>> A possible solution to this might be to compare the name of the segment that the variable resides in.

I have never seen such a feature in MASM.

Dave

Chuck(G) · Dec 6, 2020

For that matter, I'm not even aware of a pseudo-op that will tell you the name of the segment that you're currently in.

It seems that with MASM 6, Microsoft took the attitude of "who needs to program new 16 bit code?"

EDIT: Belay that last comment. @CurSeg returns the name of the current segment, but there's still no way to obtain the name of the segment that an item belongs to.

Hugo Holden · Dec 7, 2020

I have been using an early version of MASM on my IBM 5155. Managed to write a few simple assembly language programs that worked.

But compared to, say, 8080 programming on my Sol-20, the issues of segmented memory, as they apply to the MASM assembler itself for the 8088, baffled me. So I just stuck with a single segment in my code, which is plenty big enough for the short and somewhat amateurish programs I have been able to write.

I wish I had enough knowledge to even adequately understand your question, because I'm sure that the answer would be very enlightening, if I could get to grips with the implications of it.

In the books I have, it is the memory segmentation, with respect to how the assembler handles that, that is the least well covered. When that sort of thing happens, it is generally my experience that the author of the text was struggling with it too.

Chuck(G) · Dec 7, 2020

The difference between MASM 5 and 6 and earlier versions is a matter of lightyears. The problem is that MS implemented new features in a sort of haphazard way. By the time the Windows standard became the 32-bit model, MS seemingly lost interest in filling out MASM capabilities, restricting themselves to adding new instruction support. That, and the migration toward C for most systems-level programming. I've used other strongly typed assembly languages where all relevant information about a data item was kept in an assembly-time array, which was convenient.

Still, in this respect, MASM is a bunch better than most microcontroller assemblers.

At any rate, I think I may have a hack solution. I'll post it later after I check it out.

Chuck(G) · Dec 8, 2020

Okay, here's my hack. One defines a macro that also defines a "shadow" textequ macro that contains the current segment.

It's nasty, but it works. Consider the following snippet that loads ES depending on the segment within which the "dws" macro occurs. Word "one" is in the _DATA segment and word "two" is in the code segment:

Code:

Microsoft (R) Macro Assembler Version 6.14.8444		    12/07/20 20:24:49
x.asm							     Page 1 - 1


					.MODEL small,c

				dws	macro	what,val
				what	dw	val
				%?&what	textequ	<@CurSeg>
					endm

				;	load es with either DS or CS

				ldeseg 	macro	what
				%	ifdef	?&what
				%	ifidn 	<?&what>,<_TEXT>
					push	cs
					pop	es
				%	elseifidn  <?&what>,<_DATA>
					push	ds
					pop	es
					endif
					endif
					endm

 0000					.data

					dws	one,0
 0000 0000		     1	one	dw	0

 0000					.code

					dws	two,0
 0000 0000		     1	two	dw	0

					ldeseg	one
 0002  1E		     1		push	ds
 0003  07		     1		pop	es
					ldeseg	two
 0004  0E		     1		push	cs
 0005  07		     1		pop	es

					end
				 
Microsoft (R) Macro Assembler Version 6.14.8444		    12/07/20 20:24:49
x.asm							     Symbols 2 - 1




Macros:

                N a m e                 Type

dws  . . . . . . . . . . . . . .	Proc
ldeseg . . . . . . . . . . . . .	Proc


Segments and Groups:

                N a m e                 Size     Length   Align   Combine Class

DGROUP . . . . . . . . . . . . .	GROUP
_DATA  . . . . . . . . . . . . .	16 Bit	 0002	  Word	  Public  'DATA'	
_TEXT  . . . . . . . . . . . . .	16 Bit	 0006	  Word	  Public  'CODE'	


Symbols:

                N a m e                 Type     Value    Attr

?one . . . . . . . . . . . . . .	Text   	 _DATA
?two . . . . . . . . . . . . . .	Text   	 _TEXT
@CodeSize  . . . . . . . . . . .	Number	 0000h	 
@DataSize  . . . . . . . . . . .	Number	 0000h	 
@Interface . . . . . . . . . . .	Number	 0001h	 
@Model . . . . . . . . . . . . .	Number	 0002h	 
@code  . . . . . . . . . . . . .	Text   	 _TEXT
@data  . . . . . . . . . . . . .	Text   	 DGROUP
@fardata?  . . . . . . . . . . .	Text   	 FAR_BSS
@fardata . . . . . . . . . . . .	Text   	 FAR_DATA
@stack . . . . . . . . . . . . .	Text   	 DGROUP
one  . . . . . . . . . . . . . .	Word	 0000	  _DATA	
two  . . . . . . . . . . . . . .	Word	 0000	  _TEXT	

	   0 Warnings
	   0 Errors

Like I said, it's nasty and I don't plan on using it. Just to demonstrate that it's possible.

WBST · Dec 10, 2020

Isn't that a little risky? The Model (Tiny, Small, Medium, Huge) will cause the current contents of CS and DS to vary, depending upon *which* CODE and DATA segment is currently addressed, and the programmer is still expected to know where the variable was defined.

Chuck(G) · Dec 10, 2020

The code is based on some assumptions, namely that the same CS value is used throughout the assembly module and that DS will refer to the default data segment. My interest is mostly for small and tiny models, but perhaps also compact. The basic idea is to provide for movable code without bringing the linker into the picture.

Color this exercise academic. What it's really asking is why ES is hard-wired to STOS (i.e. no segment override is possible on the destination).

WBST · Dec 10, 2020

Would a construct such as
PUSH WORD PTR (SEG (FAR PTR (DATA)))
POP ES
be interpreted correctly? It's been a long time...

:D

Chuck(G) · Dec 11, 2020

I don't think you need to be that complicated; just a simple push seg should suffice:

Code:

                                        .model  small,c
                                        .286

 0000                                   .data

 0000 0000                      One     dw      0

 0000                                   .code

 0000  68 ---- R                        push    seg One

                                        end

Note that (1) it's restricted to 186 and better and (2) it relies on the linker to fill in the segment value.

WBST · Dec 19, 2020

Chuck(G) said:
I don't think you need to be that complicated; just a simple push seg should suffice:

Note that (1) it's restricted to 186 and better and (2) it relies on the linker to fill in the segment value.

I was thinking about a model and segment-independent macro though, hence the (MASM-specific, yuck!) explicit type casting and generalised x86 immediate value word push. The linker-dependent segment relocation fixup is the same.

Chuck(G) · Dec 19, 2020

Of course, all of this relates to the simple assembly time question of "What segment did I declare this variable in?"

I can see only two possible answers that an optimal MASM might return:

1. You declared it in the segment named (fill in the blank)
2. You haven't declared it yet, so I don't know.

Curiously, MASM can't even answer the question for a variable declared in an absolute segment (i.e. declared in a "SEGMENT AT xxxx").

It would seem like a no-brainer.
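
For readers who haven't met absolute segments, here is a minimal sketch of one, assuming the well-known BIOS data area at segment 0040h. The names BIOSDATA, timer_ticks, CODESEG and read_ticks are made up for illustration. The point is only that the segment's paragraph address is written right there in the source.

Code:

BIOSDATA segment at 40h         ; absolute segment: defines addresses only, emits no data
        org     6Ch
timer_ticks dw  ?               ; low word of the BIOS tick counter at 0040:006C
BIOSDATA ends

CODESEG segment word 'CODE'     ; hypothetical code segment
        assume  cs:CODESEG
read_ticks proc near
        mov     ax,BIOSDATA     ; paragraph address given after AT
        mov     es,ax
        assume  es:BIOSDATA     ; tell the assembler ES now addresses BIOSDATA
        mov     ax,es:timer_ticks
        assume  es:nothing
        ret
read_ticks endp
CODESEG ends
        end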

Mills32 · Jan 26, 2021

A bit off topic, but still related to this.

Using the large model, how can I force an array to be located at, for example, segment 2, offset 0?

I managed to program a simple code which loads sound samples to a buffer, and plays them from there. I want that buffer to use an entire segment (exactly 64K) so that the sound blaster can read all samples stored there using DMA.

For the moment I just used malloc and code from root42 (https://www.youtube.com/watch?v=hn-9oL-ClCE). The code allocates a 32K buffer that lies inside one segment, but I can't make it allocate 64K inside one segment. The start address is never zero, so the last part of the buffer spills into the next 64K block in memory (resulting in no sound when playing that part).

I thought I could define a 64k array in assembly and declare it like this:

Code:

org 0
_sound_data	label	word
dw 00000h, 00000h, 00000h, 00000h, ...

The org directive does work: it moves the array inside the segment. But the compiler places the arrays just after the program code, so "org" advances from a non-zero offset. How could I define a free segment to store the array?

Thanks!

daver2 · Jan 26, 2021

Is this a C compiler question as opposed to a MASM one?

Anyhow, providing the compiler/assembler locates the start of the array on a 16-byte boundary, you can determine the segment of the start of the array and the offset is automatically 0.

I would, however, put my data in a separate segment and use the tools to locate that segment on a paragraph boundary. This can usually be specified as a parameter to the segment (or group) directive of the assembler.

I am using the Intel ASM86 tools these days on a DOS box and not MASM. I need to locate my segments absolutely (to match the hardware) and I am doing this via directives to the locator. I am writing embedded code for communications cards.

Dave

Mills32 · Jan 26, 2021

daver2 said:
Is this a C compiler question as opposed to a MASM one?

Both. I just wanted to know the best, simplest way to do it. I'm using tasm, and I could not find any docs about storing data at particular addresses or segments. For example, Game Boy compilers will place assembly data/code at a specific location in ROM if you use "_CODE_X" flags (or #pragma x in C). I thought there was something similar for MS-DOS.

daver2 · Jan 26, 2021

I have just looked up the MASM documentation and you can define a specific segment name specifying paragraph (segment) alignment and optionally locating the segment at a specific address if you really want to.

See MASM documentation related to SEGMENT and ENDS. Check the documentation for: alignment PARA and location AT.

You can then identify (at run-time) where the segment is located by using the SEG operator.
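
As a hedged illustration of the above, a sketch along these lines should be accepted by MASM and by Borland TASM in its MASM-compatible mode. The segment name SOUNDSEG, the class 'SOUND', the code segment and procedure names, and the 32K placeholder size are all assumptions for the example. With PARA alignment and no combine type (so the segment is private and not merged with others), the first label in the segment sits at offset 0.

Code:

SOUNDSEG segment para 'SOUND'   ; hypothetical name; paragraph aligned, private
        public  _sound_data
_sound_data label word          ; first item in the segment, so offset 0
        dw      16384 dup (0)   ; placeholder size (32K); adjust as needed
SOUNDSEG ends

CODESEG segment word 'CODE'     ; hypothetical code segment
        assume  cs:CODESEG
point_es_at_buffer proc near
        mov     ax,seg _sound_data      ; actual segment value is filled in at load time
        mov     es,ax
        mov     di,offset _sound_data   ; 0 for the first label in the segment
        ret
point_es_at_buffer endp
CODESEG ends
        end

This only addresses getting the array to start at offset 0 of its own segment. Whether that also satisfies the DMA controller's own boundary requirements is a separate question this sketch does not address.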

I will look up the tasm documentation now.

Which 'tasm' are you referring to by the way? There are a number of assembler products identified as 'tasm'.

The Telemark TASM doesn't appear to support these directives (on a very quick look at least).

Borland Turbo Assembler (TASM) seems to support similar directives to MASM.

Dave

Mills32 · Jan 26, 2021

daver2 said:
Borland Turbo Assembler (TASM) seems to support similar directives to MASM.

Dave

Sorry, I forgot to specify, I'm using that one, "Borland Turbo Assembler". It's good to know there are similar directives, (I should have read more turbo assembler docs).

Thanks.

daver2 · Jan 26, 2021

The Borland Turbo Assembler supports the same syntax as the MASM assembler (according to the documentation).

SEGMENT, ENDS, PARA and AT keywords apply in a similar manner.

I can't give you an exact syntax - but if you have a go and post the results (I suggest you create a separate thread for your discussion though) I may be able to help further.

Dave
