FLEX for the 6502 or building a compatible operating system...

BillGee · Apr 15, 2020

I discovered this several months ago:

https://www.corshamtech.com/ss-50-65c02-board-experiment/

So I contacted Bob Applegate and told him that I had been working on a debugger for the 6502 and would convert it into a monitor.

We were talking about operating systems. Most of the popular ones on the 6502 are wrapped tightly around their host hardware. None of them is anything like CP/M or MS-DOS, which can be readily adapted to a new system.

I brought up the OSI DOS. He said that he had looked at it and thought it was “clunky.” I had to agree because the user had to manually allocate disk tracks.

There is DOS/65, a CP/M clone, but there is not much information available on how to adapt it and build a boot disk.

While working on adding support for his SD card “disk system” in my debugger/monitor, I had a thought. Why not build a clone of FLEX for the 6502?

I started working on it and have most of the command line interface of the DOS portion working and can boot it off of a virtual disk in my emulator. Next up are disk drivers then the file system.

This morning, I reached out to Dave, local Ohio Scientific expert and owner of http://www.osiweb.org/, for his input. We have been having some interesting discussions.

More to come…

commodorejohn · Apr 15, 2020

Yeah, it's an interesting topic; it certainly seems that most of your options for 6502 homebrew are either bespoke systems or straight ROM-based language interpreters. It'd be interesting to see something quasi-standardized, but not necessarily as tightly-coupled to the CP/M model as DOS/65 (let's face it, user areas were a terrible substitute for directories, nobody needs that.) It'd be cool if the core OS could be ROM-resident, as well...for a lot of the simple 32KB/32KB RAM/ROM split designs, it'd be nice to have as much of the RAM free for user applications as possible.

BillGee · Apr 15, 2020

Indeed. I am early in the design process with the freedom to specify (dictate) the particulars of the ABI.

FLEX on the 6800 and 6809 was fixed in its location in memory: $A000 to $BFFF on the 6800 and $C000 to $DFFF on the 6809.

User RAM starts at address 0 and goes up from there. During initialization, FLEX tests memory from address 0 and up in 1K increments to determine the amount of available RAM.
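As a rough illustration of that sizing pass, here is a sketch in Python rather than 6502 code. It assumes the test writes two patterns to the first byte of each 1K block and reads them back; the actual FLEX test may well differ. `FakeBus` and `probe_ram_top` are made-up names for this example.

```python
class FakeBus:
    """Hypothetical 64K address space; only addresses below ram_size hold RAM."""

    def __init__(self, ram_size):
        self.ram_size = ram_size
        self.cells = bytearray(0x10000)

    def read(self, addr):
        # In this model, unpopulated addresses read back as $FF.
        return self.cells[addr] if addr < self.ram_size else 0xFF

    def write(self, addr, value):
        # Writes to unpopulated addresses are silently lost.
        if addr < self.ram_size:
            self.cells[addr] = value & 0xFF


def probe_ram_top(bus, step=0x400, limit=0x10000):
    """Return the number of contiguous RAM bytes found from address 0 upward."""
    addr = 0
    while addr < limit:
        saved = bus.read(addr)
        for pattern in (0x55, 0xAA):
            bus.write(addr, pattern)
            if bus.read(addr) != pattern:
                return addr          # first block that fails ends the scan
        bus.write(addr, saved)       # put the original byte back
        addr += step
    return addr
```

Because the scan stops at the first block that fails, any RAM sitting above a gap is never counted.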

I originally decided on $B000 to $CFFF on the 6502 to give the maximum available user memory on Bob's computer. His I/O is at $E000 and his monitor uses RAM at $DF00.

In my conversations with Dave today, I learned that some models of OSI computers have BASIC in ROM from $A000 to $BFFF and I/O at $C000.

An easy compromise is to place FLEX at $8000 to $9FFF, but that severely and artificially limits the amount of available user RAM on Bob's computer. You see, FLEX was not designed to handle disjoint memory.

CP/M system calls are made via a jump vector at address 5, meaning that application programs need not be aware of where the CP/M image resides in memory.

On the other hand, FLEX programs make direct calls into jump tables within the FLEX image. It will not do to have FLEX reside at different addresses on different machines as we would then need versions of application programs specific to each machine. Those “magic” locations varied between the 6800 and the 6809, but that was considered to be reasonable because the two processors were not binary code compatible; we had to have different versions anyway.

Since there are no FLEX application programs today for the 6502, I suppose that I have the freedom to reserve some locations in low memory to serve as trampolines for FLEX access. Heck, even DOS/65 made that kind of tradeoff since the traditional CP/M TPA start address of 100h is not possible on the 6502.

If I do that, I can also move system variables to that low memory area; then the FLEX system itself can be ROMable.
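A sketch of what such a trampoline table could look like, generated in Python for illustration. The table base of $0200, the entry names and the offsets into the FLEX image are all placeholders, not a proposed ABI; the only hard fact used is that $4C is the 6502 opcode for JMP absolute.

```python
JMP_ABS = 0x4C  # 6502 opcode: JMP absolute (3 bytes, little-endian operand)


def build_trampolines(table_base, flex_base, routines):
    """Build a table of JMP instructions into the FLEX image.

    routines: list of (name, offset_into_flex_image) pairs (hypothetical).
    Returns (table_bytes, {name: fixed_entry_address}).
    """
    table = bytearray()
    entries = {}
    for index, (name, offset) in enumerate(routines):
        target = flex_base + offset
        entries[name] = table_base + 3 * index
        table += bytes([JMP_ABS, target & 0xFF, target >> 8])
    return bytes(table), entries


# Placeholder entry points; real names and offsets are not decided here.
ROUTINES = [("ENTRY_A", 0x0003), ("ENTRY_B", 0x0015), ("ENTRY_C", 0x0018)]

# Same table location, FLEX image at two different addresses.
table_b, entries_b = build_trampolines(0x0200, 0xB000, ROUTINES)
table_8, entries_8 = build_trampolines(0x0200, 0x8000, ROUTINES)
```

The entry addresses that applications call are identical in both builds; only the JMP operands inside the table change with the location of the image.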

resman · Apr 15, 2020

Given that 6502 memory maps were all over the place, doing something like Apple SOS and using BRK to call into the OS sounds like a reasonable solution. Actually mimicking the file layout of SOS/ProDOS wouldn't be a terrible idea, either.

commodorejohn · Apr 15, 2020

Yeah, that would make the most sense - no point in making unnecessary tradeoffs if it's not even going to get you compatibility. A jump table and system variables in low memory would allow the OS to reside wherever is most convenient for any particular system.

BRK is a convenient solution when you know that the IRQ handler is relatively uncluttered and easily patchable, but depending on the system that may not be the case. Plus, a BRK handler for system calls is probably just going to end up using the jump-table approach anyway.

BillGee · Apr 15, 2020

resman said:
Given that 6502 memory maps were all over the place, doing something like Apple SOS and using BRK to call into the OS sounds like a reasonable solution.

I could do that, but since FLEX currently has a unique entry point for each function call instead of going into a central dispatcher like CP/M, MS-DOS or Linux, I am inclined to keep doing it the FLEX way, but moving that stuff out of the system image to a fixed location in low memory. I'm not saying I won't, but that is not the current intent.

http://www.flexusergroup.com/flexusergroup/pdfs/flexapg.pdf

resman said:
Actually mimicking the file layout of SOS/ProDOS wouldn't be a terrible idea, either.

Because Bob's system is essentially a clone of a SWTPC 6809 computer but with a 6502 CPU, disk format compatibility with its 6800 and 6809 brethren is paramount.

BillGee · Apr 15, 2020

commodorejohn said:
BRK is a convenient solution when you know that the IRQ handler is relatively uncluttered and easily patchable, but depending on the system that may not be the case. Plus, a BRK handler for system calls is probably just going to end up using the jump-table approach anyway.

Now that I think of it, BRK is very useful for implementing breakpoints and single stepping in a debugger. Using it for a system call trap as well would make a debugger even more complicated.

BillGee · Apr 15, 2020

commodorejohn said:
It'd be cool if the core OS could be ROM-resident, as well...for a lot of the simple 32KB/32KB RAM/ROM split designs, it'd be nice to have as much of the RAM free for user applications as possible.

I just realized a complication. There are two types of FLEX programs: regular programs and utilities.

Regular programs load and run in the main user RAM area.

Utilities are small programs which load and run in a 1.5 K hole within the 8K FLEX area called Utility Command Space. That will also need to be moved to a fixed location in low memory.

In addition, FLEX uses File Control Blocks to interface with the file system. Each FCB contains a 256-byte sector buffer and 64 bytes of "housekeeping" information for a total of 320 bytes per FCB. The FLEX image includes space for three FCBs: one for loading binaries, one for input redirection and one for output redirection. The address of the one used for program loading is documented (user programs may reuse it once they have been loaded). The other two are supposedly "secret." The public one needs to be at a documented location. The other two just need to be out of the FLEX image if it is to be ROMable.
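A rough tally of the RAM that has to live outside a ROM image, using only the sizes mentioned in this post and assuming 1K = 1024 bytes. System variables, the stack and the input buffer are not counted.

```python
SECTOR_BUFFER = 256                        # bytes per FCB sector buffer
HOUSEKEEPING = 64                          # bytes of FCB bookkeeping
FCB_SIZE = SECTOR_BUFFER + HOUSEKEEPING    # 320 bytes
FCB_COUNT = 3                              # loader, input and output redirection
UTILITY_SPACE = 3 * 1024 // 2              # 1.5K Utility Command Space

fcb_total = FCB_COUNT * FCB_SIZE
ram_needed = fcb_total + UTILITY_SPACE
pages_needed = -(-ram_needed // 256)       # round up to whole 256-byte pages

print(fcb_total, ram_needed, pages_needed)
```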

resman · Apr 15, 2020

I didn't know anything about FLEX, so I looked it up. The first thing that jumped out at me was that the company, Technical Systems Consultants, was from my home town. So it *must* be good.

As for dispatch, everything is going to be a compromise. Having a jump table in low memory is probably going to chew up some memory used for something in the system. Low memory is used for so many things in 6502 designs. Using a single entry point, whether BRK or vector, relieves a lot of pressure, regardless of how you implement the kernel. How you pass parameters is an interesting design decision. A pointer to a parameter block or inlining parameters both have pluses and minuses.

Another thought: are programs relocatable or is there a fixed address where they run? I ask because, again, 6502 designs use low memory for so many things.

commodorejohn · Apr 15, 2020

Yeah, that would work. It might be nice to have add-in routines fill down from the top of memory so you could have an arbitrary number of them while the program resides at a fixed address, but then you'd need a relocating loader for them anyway...the 6502 is not great for position-independent code...

BillGee · Apr 15, 2020

resman said:
I didn't know anything about FLEX, so I looked it up. The first thing that jumped out at me was that the company, Technical Systems Consultants, was from my home town. So it *must* be good.

Would that be West Lafayette or Chapel Hill? They moved during their time.

resman said:
As for dispatch, everything is going to be a compromise. Having a jump table in low memory is probably going to chew up some memory used for something in the system. Low memory is used for so many things in 6502 designs. Using a single entry point, whether BRK or vector, relieves a lot of pressure, regardless of how you implement the kernel. How you pass parameters is an interesting design decision. A pointer to a parameter block or inlining parameters both have pluses and minuses.

Another thought: are programs relocatable or is there a fixed address where they run? I ask because, again, 6502 designs use low memory for so many things.

The FLEX binary file format contains load addresses. They are not like CP/M .COM files which are nothing but a flat array of bytes to be loaded at 100h. But they are not relocatable either; absolute addresses are determined when the file is created.
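For illustration, here is a small Python reader for that kind of file. It assumes the record layout as I understand it from the 680x FLEX documentation: $02 starts a data record (load address high byte, low byte, byte count, data) and $16 starts a transfer address record (high byte, low byte), with other bytes between records treated as padding. Treat those details as assumptions; a 6502 version could just as well choose a different byte order. `parse_flex_binary` is a made-up name.

```python
def parse_flex_binary(data):
    """Split a FLEX-style binary into (load_address, payload) segments.

    Returns (segments, transfer_address); transfer_address is None if absent.
    """
    segments = []
    transfer = None
    i = 0
    while i < len(data):
        tag = data[i]
        if tag == 0x02:                       # data record
            if i + 4 > len(data):
                raise ValueError("truncated record header")
            addr = (data[i + 1] << 8) | data[i + 2]   # big-endian, 680x style
            count = data[i + 3]
            payload = bytes(data[i + 4:i + 4 + count])
            if len(payload) != count:
                raise ValueError("truncated record data")
            segments.append((addr, payload))
            i += 4 + count
        elif tag == 0x16:                     # transfer (start) address record
            if i + 3 > len(data):
                raise ValueError("truncated transfer record")
            transfer = (data[i + 1] << 8) | data[i + 2]
            i += 3
        else:
            i += 1                            # padding between records
    return segments, transfer


# Three bytes loaded at $0100, two padding bytes, then start address $0100.
example = bytes([0x02, 0x01, 0x00, 0x03, 0xA9, 0x01, 0x60,
                 0x00, 0x00,
                 0x16, 0x01, 0x00])
segments, transfer = parse_flex_binary(example)
```

Each segment carries its own absolute load address, which is why the loader needs no relocation step and also why the file cannot simply be moved elsewhere.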

By low memory, I am talking about page 2 and up for the purpose of having things in the same place on every system regardless of how much RAM is installed. On the 680x, the zero page is merely fast memory. On the 6502, there are essential things which can only be done with zero page locations. One thing I really like about FLEX on the 680x is that the zero page is left for programs to use; the system does not claim it. The 6502 zero page is too precious to waste for most system variables or jump tables. Since FLEX is usually dealing with slow disks or an even slower human, it does not really need the added speed of zero page. The only need is to dereference a pointer and FLEX only needs a few of those.

BillGee · Apr 15, 2020

commodorejohn said:
Yeah, that would work. It might be nice to have add-in routines fill down from the top of memory so you could have an arbitrary number of them while the program resides at a fixed address, but then you'd need a relocating loader for them anyway...the 6502 is not great for position-independent code...

There is a trick I learned while working on CP/M system software: build the code ORGed at two different locations and compare them. The bytes which differ are the ones needing adjustment when the code is relocated. Store that information in a bitmap as input to the relocator. That is how things like MOVCPM work.
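A minimal Python sketch of the trick, under one stated assumption: the two builds are ORGed exactly one page apart ($0000 and $0100), so only the high bytes of addresses differ and the relocator can add a page number to each flagged byte. The toy `assemble` function stands in for a real assembler and hand-encodes a short 6502 fragment; all function names are made up for the example.

```python
def assemble(org):
    """Toy 'assembler': lda #1 / jsr DoA / lda #2 / jsr DoB / rts / DoA: rts / DoB: rts."""
    do_a = org + 11
    do_b = org + 12
    return bytes([
        0xA9, 0x01,                          # lda #1
        0x20, do_a & 0xFF, do_a >> 8,        # jsr DoA
        0xA9, 0x02,                          # lda #2
        0x20, do_b & 0xFF, do_b >> 8,        # jsr DoB
        0x60,                                # rts
        0x60,                                # DoA: rts
        0x60,                                # DoB: rts
    ])


def build_reloc_bitmap(image_a, image_b):
    """One bit per image byte, MSB first; a set bit marks a byte that differs."""
    if len(image_a) != len(image_b):
        raise ValueError("builds must be the same length")
    bitmap = bytearray((len(image_a) + 7) // 8)
    for i, (a, b) in enumerate(zip(image_a, image_b)):
        if a != b:
            bitmap[i // 8] |= 0x80 >> (i % 8)
    return bytes(bitmap)


def relocate(image, bitmap, load_page):
    """Add load_page to every flagged byte of an image built at origin $0000."""
    out = bytearray(image)
    for i in range(len(out)):
        if bitmap[i // 8] & (0x80 >> (i % 8)):
            out[i] = (out[i] + load_page) & 0xFF
    return bytes(out)


base = assemble(0x0000)
bitmap = build_reloc_bitmap(base, assemble(0x0100))
moved = relocate(base, bitmap, 0xB0)         # place the code at $B000
```

Here the relocated image is byte-for-byte what the toy assembler produces when ORGed at $B000. The cost is one bit per image byte, and the code can only be placed on a page boundary.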

resman · Apr 15, 2020

BillGee said:
Would that be West Lafayette or Chapel Hill? They moved during their time.

West Lafayette. Purdue EE grads. Can't really blame them for relocating, though. I left as soon as I could, too. I like to say: "Indiana is a great place to be from".

BillGee said:
The FLEX binary file format contains load addresses. They are not like CP/M .COM files which are nothing but a flat array of bytes to be loaded at 100h. But they are not relocatable either; absolute addresses are determined when the file is created.

By low memory, I am talking about page 2 and up for the purpose of having things in the same place on every system regardless of how much RAM is installed.

Case in point: page 2 is used for the input buffer on a lot of machines (assuming you want to take advantage of existing ROM routines). Memory mapped video likes to live around page 4 on many, if that is a concern.

BillGee said:
On the 680x, the zero page is merely fast memory. On the 6502, there are essential things which can only be done with zero page locations. One thing I really like about FLEX on the 680x is that the zero page is left for programs to use; the system does not claim it. The 6502 zero page is too precious to waste for most system variables or jump tables. Since FLEX is usually dealing with slow disks or an even slower human, it does not really need the added speed of zero page. The only need is to dereference a pointer and FLEX only needs a few of those.

Good. Looking for available holes in ZP is always a pain. A proper OS would leave it to the app, saving and restoring those it uses.

BillGee · Apr 16, 2020

resman said:
Case in point: page 2 is used for the input buffer on a lot of machines (assuming you want to take advantage of existing ROM routines). Memory mapped video likes to live around page 4 on many, if that is a concern.

For greatest portability, I am staying away from functionality in various ROMs.

FLEX for the 680x allocates 128 bytes for the input buffer and another 128 bytes for the stack.

It looks natural to me for them to share page 1. With the buffer growing up from $100 and the stack growing down from $1FF, the chance of a collision is very low. Only a program run wild is likely to cause that.

dfnr2 · Apr 16, 2020

BillGee said:
There is a trick I learned while working on CP/M system software: build the code ORGed at two different locations and compare them. The bytes which differ are the ones needing adjustment when the code is relocated. Store that information in a bitmap as input to the relocator. That is how things like MOVCPM work.

That's a great idea.

Instead of a bitmap, you could store an offset to the first relocatable address. That address would contain a pointer to the next address, and so on, like a linked list. That takes up much less space than a bitmap, and the code is much simpler as well.

commodorejohn · Apr 16, 2020

You don't even really need a linked-list structure, just a flat table of addresses and offsets (or separate tables thereof) and an entry count.

dfnr2 · Apr 16, 2020

commodorejohn said:
You don't even really need a linked-list structure, just a flat table of addresses and offsets (or separate tables thereof) and an entry count.

The advantage of the linked list is that it doesn't take up any extra space (except the 2-byte pointer to the first relocatable address). A bitmap or a table takes up a lot of space. And the code to relocate from the linked list is simpler as well.

BillGee · Apr 16, 2020

dfnr2 said:
The advantage of the linked list is that it doesn't take up any extra space (except the 2-byte pointer to the first relocatable address). A bitmap or a table takes up a lot of space. And the code to relocate from the linked list is simpler as well.

I have encountered the linked list idea in some object code file formats. The problem is that the links leave no room for the original contents of the item to be adjusted, so to what do I add the load offset? The 16-bit Windows (NE executable) format has a list for the references to each system entry point, such as CreateBitmap. Each NE file has to list all of its imports anyway, so that is not a problem.

Consider a 6502 example:

Code:

    lda #1
    jsr DoA
    lda #2
    jsr DoB

Suppose the address field of the first jsr is linked to the address field of the second, and the second one holds 0 to signify the end of the list.

When it is time to relocate the code, how do you know which one referenced DoA?

bakemono · Apr 16, 2020

dfnr2 said:
The advantage of the linked list is that it doesn't take up any extra space (except the 2-byte pointer to the first relocatable address). A bitmap or a table takes up a lot of space. And the code to relocate from the linked list is simpler as well.

Not sure I understand this plan, how would you know what address to patch into each location?

Human68K executables have one of the most compact relocation tables that I've seen. There is a pointer to the first location, then an array of 16-bit words (at the end of the binary) that encode the distance from one relocation to the next. There's a special case for distances greater than 64KB: a '1' will appear, followed by a 32-bit word. For each relocation the loader simply goes through and adds the load address to the value already existing in the binary.

For doing relocations on a 6502 system, maybe it would be wise to always load on a 256-byte boundary, and then only the high byte of each address would need to be patched. And maybe the code is small enough that it would be worth changing the Human68K array of 16-bit distances to an array of 8-bit distances (with a special case for longer ones).
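A Python sketch of that 8-bit variant, to make the idea concrete. The encoding details are my own assumptions, not Human68K's: distances are counted from a starting position of -1 so that a distance is never 0, and a 0 byte is then used as the escape, followed by a 16-bit little-endian distance. The image is assumed to be built at origin $0000 and loaded on a page boundary.

```python
def encode_relocs(offsets):
    """Encode ascending byte offsets of address high bytes as a distance table."""
    table = bytearray()
    prev = -1
    for off in offsets:
        dist = off - prev
        if dist <= 0:
            raise ValueError("offsets must be strictly ascending")
        if dist <= 0xFF:
            table.append(dist)
        else:
            # Escape: 0, then the distance as 16 bits, low byte first.
            table += bytes([0x00, dist & 0xFF, dist >> 8])
        prev = off
    return bytes(table)


def apply_relocs(image, table, load_page):
    """Add load_page to each high byte named by the distance table."""
    out = bytearray(image)
    pos = -1
    i = 0
    while i < len(table):
        dist = table[i]
        i += 1
        if dist == 0:
            dist = table[i] | (table[i + 1] << 8)
            i += 2
        pos += dist
        out[pos] = (out[pos] + load_page) & 0xFF
    return bytes(out)


# A 400-byte image with address high bytes at offsets 4 and 304.
image = bytearray(400)
image[4] = 0x00      # high byte of an address in page 0 of the image
image[304] = 0x01    # high byte of an address in page 1 of the image
table = encode_relocs([4, 304])
loaded = apply_relocs(image, table, 0xB0)
```

With closely spaced fixups the table costs one byte per relocation, and the loader only ever touches a single byte per entry.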

dfnr2 · Apr 16, 2020

BillGee said:
I have encountered the linked list idea in some object code file formats. The problem is that the links leave no room for the original contents of the item to be adjusted, so to what do I add the load offset? The 16-bit Windows (NE executable) format has a list for the references to each system entry point, such as CreateBitmap. Each NE file has to list all of its imports anyway, so that is not a problem.

Consider a 6502 example:

Code:

    lda #1
    jsr DoA
    lda #2
    jsr DoB

Suppose the address field of the first jsr is linked to the address field of the second, and the second one holds 0 to signify the end of the list.

When it is time to relocate the code, how do you know which one referenced DoA?

Yes, you are right. That is why I should not be posting before I've had my coffee.

I do love the idea of assembling twice and comparing the outputs to create a relocatable format. I've never seen that trick before.
