I know a thing or two about this.
I'm trying to program a rotozoomer for slow computers like an 8 MHz 8086. I got it working in Mode X by copying data from an image in RAM to VRAM in assembly, but it is still too slow: even with precalculated sin/cos stuff, the effect runs at about 10 fps.
Rotozoomers are all about bandwidth and not CPU speed, since (as you discovered) you can use lookup tables and code generation to take calculation out of the picture. So answering your question becomes:
- Write code that does what you want it to do (REP MOVSW, or MOVSB; INC; etc.)
- Benchmark the code
(Benchmarking is easier than people realize; I have links to a few Zen Timer packages on
https://trixter.oldskool.org/2013/0...88-and-8086-cpu-part-3-a-case-study-in-speed/ if you have trouble finding that info.)
Minimizing bandwidth = doing less. You mentioned VGA; are you trying to rotozoom 320x200x256? That's 64K to update every frame. So if VGA is a requirement, you can try things like:
- Reduce the width of your effect from 320 pixels to 256
- Enable all four write planes in unchained mode so that a single write will fill four pixels instead of one
- Change the cell height using the CRTC (index port 3D4h, Maximum Scan Line register 09h) to halve the vertical resolution
Doing this will trade resolution for speed: instead of trying to update 64K each frame, you'll be updating (256/4)*100 = 6,400 bytes (about 6K) per frame, with an effective onscreen resolution of 64x100.
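To make the arithmetic and the register values concrete, here is a rough sketch in C. The port and index numbers are the standard VGA ones, but the function names are my own invention for illustration, and it assumes the Maximum Scan Line register reads back 41h, which is what I'd expect in a 320x200 mode. It only computes values; it does not touch the hardware.

```c
#include <stdint.h>

/* Standard VGA register locations (color modes). */
#define SEQ_INDEX_PORT   0x3C4  /* Sequencer index; data port is 0x3C5 */
#define SEQ_MAP_MASK     0x02   /* Map Mask: which planes a write hits */
#define CRTC_INDEX_PORT  0x3D4  /* CRTC index; data port is 0x3D5      */
#define CRTC_MAX_SCAN    0x09   /* Maximum Scan Line (cell height)     */

#define ALL_PLANES       0x0F   /* Map Mask value enabling planes 0-3  */

/* Hypothetical helper: build a new Maximum Scan Line value that repeats
   each row for scanlines_per_row scanlines (1..32). The top three bits
   hold unrelated settings, so they are carried over from old_value. */
uint8_t max_scan_line_value(uint8_t old_value, unsigned scanlines_per_row)
{
    return (uint8_t)((old_value & 0xE0) | ((scanlines_per_row - 1) & 0x1F));
}

/* Hypothetical helper: bytes written per frame when each byte written
   covers pixels_per_write pixels (4 with all planes enabled). */
unsigned long bytes_per_frame(unsigned width_pixels, unsigned rows,
                              unsigned pixels_per_write)
{
    return (unsigned long)(width_pixels / pixels_per_write) * rows;
}
```

With those assumptions, max_scan_line_value(0x41, 4) gives 43h (four scanlines per row, so 100 rows on a 400-scanline display), and bytes_per_frame(256, 100, 4) gives 6400 versus 64000 for bytes_per_frame(320, 200, 1). On real hardware you would write the index and then the value through whatever port I/O routine your compiler or assembler provides.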
So I decided to do it the easy way: create a tiny video (80x60, 60 fps, 3 seconds), so it is about 800 KB in size, and then copy it line by line to VRAM using REP MOVSW.
That's not a good way to think about optimizing your rotozoomer, since that's not what your rotozoomer will be doing. You'll be benchmarking the speed of REP MOVSW, not an actual rotozoomer.
Would a real computer be able to read 80x60 bytes from a file every frame?
Again, I'd use a disk benchmark on your own hardware to verify what the actual read speed is. Speeds range from 90KB/s to 150KB/s on most MFM/RLL subsystems, and you can hit speeds of 300KB/s (or higher) if using an XT-IDE card with flash storage. Some bus-mastering SCSI adapters can match and exceed those speeds, but those are not typical for today's hobbyists.
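For a quick sanity check before you benchmark, you can compare what the video needs against those transfer rates. This is only a back-of-the-envelope sketch in C: the function names are made up for illustration, it assumes 1 KB = 1024 bytes and one byte per pixel, and it ignores DOS overhead and the time spent copying to VRAM.

```c
/* Hypothetical helper: bytes per second needed to stream uncompressed
   frames of width x height bytes at the given frame rate. */
unsigned long required_bytes_per_sec(unsigned width, unsigned height,
                                     unsigned fps)
{
    return (unsigned long)width * height * fps;
}

/* Hypothetical helper: whole frames per second a disk could deliver at
   a sustained rate given in KB/s (assuming 1 KB = 1024 bytes). */
unsigned long max_whole_fps(unsigned long disk_kb_per_sec,
                            unsigned width, unsigned height)
{
    unsigned long frame_bytes = (unsigned long)width * height;
    return (disk_kb_per_sec * 1024UL) / frame_bytes;
}
```

Under those assumptions, 80x60 at 60 fps needs 288,000 bytes per second (about 281 KB/s). A 90 KB/s disk tops out around 19 fps, a 150 KB/s disk around 32 fps, and only something in the 300 KB/s class could keep up with 60 fps, with nothing left over for anything else.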
I found some info about read speeds on a 4.77 MHz 8088:
- Access time = 40 ms.
- Read speed = 140 KB/s.
Those are averages. In real life, it varies. If you want to optimize for the worst case, assume a sustained transfer rate of 90KB/s. Don't worry about "access time" (seek) speeds, because if you're seeking a lot to play back a video file, you're doing something wrong.
But video playback and rotozoomers are different things, so I'd abandon the video file idea if you're trying to optimize your rotozoomer.