
Thread: Maker4D the RPG Game Maker Engine

  1. Default

    Here are the results so far:
    Optimizing the animation system: 0.7 fps ---> 1.2 fps
    Better shadow code + eliminating sqrtf calls: 1.2 fps ---> 1.4 fps
    Optimizing the 3D renderer for the weak FPU: 1.4 fps ---> 1.6 fps
    CPU: Cyrix 6x86MX 200 MHz
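
    The thread does not show the shadow code itself, so the following is only a minimal sketch of one common way to eliminate a sqrtf call, assuming the square root was needed only to compare a distance against a limit. The function name and the integer coordinate format are hypothetical, not taken from Maker4D.

```c
#include <stdint.h>

/* Hypothetical helper, not from Maker4D: returns 1 if the point (px,py,pz)
   lies within `radius` of (cx,cy,cz), else 0.
   Instead of computing sqrtf(dx*dx + dy*dy + dz*dz) and comparing it with
   the radius, both sides are compared in squared form, so neither a square
   root nor the FPU is needed.
   Assumes radius >= 0 and coordinates small enough (roughly |v| < 2^30)
   that the squared sum fits in 64 bits. */
int within_radius(int32_t px, int32_t py, int32_t pz,
                  int32_t cx, int32_t cy, int32_t cz, int32_t radius)
{
    int64_t dx = (int64_t)px - cx;
    int64_t dy = (int64_t)py - cy;
    int64_t dz = (int64_t)pz - cz;
    int64_t dist_sq = dx * dx + dy * dy + dz * dz;
    return dist_sq <= (int64_t)radius * radius;
}
```

    The comparison gives the same answer as the sqrtf version because both sides are non-negative, so squaring preserves their order.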

    I will continue the optimizations later, as right now I don't have any more ideas for how to speed it up further.

    I will soon test the engine on Intel Pentium MMX (200 MHz), Intel Pentium (90 MHz), Cyrix 6x86L (150 MHz), AMD K5 (90 MHz), VIA C3 (533 MHz), and Vortex86DX (800 MHz) CPUs, just for comparison, and I will post the graph here.

    (On Pentium 4, AMD Athlon XP, and Pentium III, I already know that the engine is fast enough, so there is no point testing on those.)

  2. #12
    Join Date
    Apr 2015
    Location
    Austin, Texas
    Posts
    1,491

    Default

    Quote Originally Posted by Geri View Post
    software rendering from this era (like the quake for example) can process a couple of 1000 polygons on playable frame rates.
    The DOS version of Quake cannot display thousands of world polygons at any resolution with a playable frame rate. The BSP tree would be so large that the engine would more than likely crash on period hardware and be a slide show on something slightly more modern. 300-500 world polygons would be as high as you'd want to go, and even then, that'd only run on the fastest machines of the day.

    Later Windows Quake ports and engine enhancements changed that limit; DarkPlaces can handle something like 10,000 world polygons without breaking a sweat.

    Quote Originally Posted by Geri View Post
    i animate polygons and not pixels, but its right, it should be more playable. today i had some time so i continued this strange task to optimize the engine on the cyrix.
    If you plan on making even moderate use of the FPU, I'd switch the target processor to a Pentium or an AMD K6. The FPU on the 6x86 is very weak, Cyrix basically took the FPU from their 486, changed a few things and integrated it into the 6x86. The FPU is not pipelined and is considerably slower than even the AMD K5's FPU. You may as well target a 486, because there's no floating-point performance to be had from a 6x86.

  3. #13
    Join Date
    Jan 2007
    Location
    Pacific Northwest, USA
    Posts
    31,481
    Blog Entries
    20

    Default

    Looking at this thread title, I said to myself "This should be interesting--I don't think I've ever encountered a game written in RPG."

  4. Default

    Quote Originally Posted by GiGaBiTe View Post
    If you plan on making even moderate use of the FPU, I'd switch the target processor to a Pentium or an AMD K6. The FPU on the 6x86 is very weak, Cyrix basically took the FPU from their 486, changed a few things and integrated it into the 6x86.
    That's right. Now that I have removed floating point from even more places in the code, I don't think the P1 will be significantly faster clock for clock, but actually I have no idea, as I haven't tested it on a Pentium so far. I don't have a K6 at the moment (I gave my K6 to someone years ago). Currently I have a Cyrix 6x86MX, 6x86L, AMD K5, Pentium 1, and Pentium MMX to test. I will do the benchmarking later.
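
    The post does not say how the floating point was removed. A common replacement on CPUs with a weak FPU is fixed-point arithmetic, so here is a minimal sketch of 16.16 fixed point in C. All names (fx16, fx_mul, fx_lerp and so on) are hypothetical and are not taken from Maker4D.

```c
#include <stdint.h>

/* 16.16 fixed point: the upper 16 bits hold the integer part, the lower
   16 bits hold the fraction. All names here are illustrative only. */
typedef int32_t fx16;
#define FX_ONE ((fx16)65536)

/* Convert an integer to fixed point (assumes |v| < 32768). */
fx16 fx_from_int(int v) { return (fx16)(v * 65536); }

/* Convert back to an integer by dropping the fraction. For negative values
   this relies on an arithmetic right shift, which is implementation-defined
   in C but is what the usual x86 compilers do. */
int fx_to_int(fx16 v) { return (int)(v >> 16); }

/* Multiply two fixed-point values. The product is formed in 64 bits so the
   intermediate result cannot overflow, then scaled back down by 2^16. */
fx16 fx_mul(fx16 a, fx16 b) { return (fx16)(((int64_t)a * b) >> 16); }

/* Linear interpolation between a and b, with t in the range [0, FX_ONE]. */
fx16 fx_lerp(fx16 a, fx16 b, fx16 t) { return a + fx_mul(b - a, t); }
```

    For example, fx_lerp(fx_from_int(10), fx_from_int(20), FX_ONE / 2) yields the fixed-point representation of 15 using integer instructions only.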

    Quote Originally Posted by Chuck(G)
    This should be interesting
    It was probably very surprising to see this many colors, then.

    By the way, I updated Maker4D. I did the following:

    2019, April 30.
    -Background menu music can now be in .wav format.
    -50% speed-up when generating 3D characters, resulting in faster teleportation.
    -10% speed optimization in rendering.
    -Fixed a graphics glitch on Cyrix, Vortex86DX, IDT WinChip, and AMD K5 processors.
    -Shadow rendering has been optimized.
    -The model loader is optimized; overall loading is now almost twice as fast.
    -Fixed an input-doubling bug with Windows 9x.
    -The background was mistakenly not rendered when the NPC mode battle system engaged automatically on entering a map.
    -If there is only one hero on the scene, the battle system now still displays the avatar.
    -Automatic teleport chain now properly processes scene settings.
    -Fixed a segmentation fault with a missing attackname1.txt when the battle system engages with flawed text parameters.
    -Hero avatars were not properly restored after loading a game.
    -Fixed a bug that kept the collision engine engaged on dead enemies.
    -Fixed an ID bug in object management.
    -A new menu element allows a graceful exit from the battle system.

  5. #15
    Join Date
    Apr 2015
    Location
    Austin, Texas
    Posts
    1,491

    Default

    Quote Originally Posted by Geri View Post
    That's right. Now that I have removed floating point from even more places in the code, I don't think the P1 will be significantly faster clock for clock, but actually I have no idea, as I haven't tested it on a Pentium so far. I don't have a K6 at the moment (I gave my K6 to someone years ago). Currently I have a Cyrix 6x86MX, 6x86L, AMD K5, Pentium 1, and Pentium MMX to test. I will do the benchmarking later.
    The Cyrix is generally faster at integer math than the Pentium at a lower clock speed, hence why Cyrix used a PR scheme to rate the speed of their processors, as they couldn't compete in raw clock speeds. Although the PR scheme had merit on integer performance, people quickly found out the FPU was junk and gamers avoided it like the plague.

    Early Cyrix 6x86 chips had numerous issues that caused system instability. The first is that they ran at weird bus speeds like 75 and 83 MHz, which many motherboards didn't support, and those that did had problems with cascaded bus clocks. The 83 MHz chips would run the PCI bus alarmingly out of spec at 41.5 MHz with a 1/2 divider, which was the common divider at the time since most CPUs had a 60/66 MHz FSB. Memory was also run out of spec, as SDRAM generally ran at the same speed as the FSB, and memory back then generally didn't take to overclocking well.

    Also exacerbating matters is that the 6x86 is missing several instructions that are available on both the Pentium and K6, causing programs to act erratically.

  6. #16
    Join Date
    Aug 2006
    Location
    Chicagoland, Illinois, USA
    Posts
    5,856
    Blog Entries
    1

    Default

    Quote Originally Posted by Geri View Post
    Here are the results so far:
    I will continue the optimizations later, as right now I don't have any more ideas for how to speed it up further.
    You're still off by orders of magnitude. Are you copying the entire screen on every refresh, maybe? If it's a RE-style game, the backgrounds don't change, only the characters and dialog, right? If only a small portion of the screen updates, you should copy only that portion of the screen, not the entire screen. Look up "dirty rectangles".
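
    A minimal sketch of the dirty-rectangle idea mentioned above, assuming a 32-bit back buffer and screen buffer of the same size. The Rect type, the function name, and the buffer layout are hypothetical and are not taken from Maker4D.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical rectangle type: top-left corner plus size, in pixels. */
typedef struct { int x, y, w, h; } Rect;

/* Copy only the pixels inside `r` from the back buffer to the screen.
   Both buffers are assumed to hold 32-bit pixels, `pitch` pixels per row
   and `height` rows. The rectangle is clipped to the buffer first, so a
   sprite that is partly off-screen cannot cause an out-of-bounds copy. */
void copy_dirty_rect(uint32_t *screen, const uint32_t *back,
                     int pitch, int height, Rect r)
{
    int x0 = r.x < 0 ? 0 : r.x;
    int y0 = r.y < 0 ? 0 : r.y;
    int x1 = r.x + r.w > pitch ? pitch : r.x + r.w;
    int y1 = r.y + r.h > height ? height : r.y + r.h;

    if (x0 >= x1 || y0 >= y1)
        return; /* nothing visible after clipping */

    for (int y = y0; y < y1; y++)
        memcpy(screen + (size_t)y * pitch + x0,
               back + (size_t)y * pitch + x0,
               (size_t)(x1 - x0) * sizeof(uint32_t));
}
```

    Per frame, the caller would collect one rectangle for each character or dialog box that changed (old position and new position) and call this once per rectangle instead of copying the whole screen.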

    If this code is written in C, you should be able to profile it somehow, to see which portions of your game/render loop are taking up so much time. An overall FPS counter isn't good enough; you could be spending time on areas that don't actually matter that much. Is there a profiler for the language you're using?
    Offering a bounty for:
    - The software "Overhead Express" (doesn't have to be original, can be a copy)
    - A working Sanyo MBC-775, Olivetti M24, or Logabax 1600
    - Music Construction Set, IBM Music Feature edition (has red sticker on front stating IBM Music Feature)

  7. Default

    GiGaBiTe: Hopefully this will not be a problem, as I build an i586-compatible (-march=i586) binary for this era of computers. Years ago I already determined that this works on the Pentium 1, Pentium MMX, Cyrix 6x86L and Cyrix 6x86MX, but I have forgotten whether it works on the AMD K5. Previously (more than a decade ago) I had issues generating compatible binaries for the AMD K5, but I don't remember how or why any more...

    Trixter: I'm not just copying it pixel by pixel; I am completely re-rendering the whole background and re-displaying the image. The wallpaper is actually a 3D plane model in the background, so it's not even just a pixel copy; it goes through the whole 3D engine just like anything else. This is done on purpose (as the background sometimes "follows" you, depending on the map settings), so the whole thing must be re-rendered again and again.

    This engine is modern software serving modern gamedev purposes; features cannot be sacrificed for the sake of running it on 20-year-old machines.

    And the fun fact is that this is not even the reason for the slowness... it contributes to it, but the real speed demon is elsewhere (in multiple locations, not yet determined)... I can and will printf parts of the code to see what's going on. However, compiling new code over and over and copying it to the Cyrix computer is very unergonomic and frustrating. I must find a way to do this more ergonomically. By that I mean that having two keyboards on my desk is frustrating, even though I have a TV capture card to get the picture from the Cyrix. Now it's sort of okay, I have sort of gotten used to it, but it took a week. To find and measure the time of every single bottleneck of the engine will take days, and the steps to counter it will be much longer, so this will be a very very very long story...

  8. #18
    Join Date
    Aug 2006
    Location
    Chicagoland, Illinois, USA
    Posts
    5,856
    Blog Entries
    1

    Default

    Quote Originally Posted by Geri View Post
    The wallpaper is actually a 3D plane model in the background, so it's not even just a pixel copy; it goes through the whole 3D engine just like anything else. This is done on purpose (as the background sometimes "follows" you, depending on the map settings), so the whole thing must be re-rendered again and again.
    But is the background always at a fixed angle and scale? If so, you can use a faster bitmap routine to copy it, rather than wasting time going through the textured polygon engine.
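
    A minimal sketch of such a bitmap routine, under the assumption stated in the question: the background keeps a fixed angle and scale and at most slides sideways to "follow" the player. The function name and the 32-bit pixel layout are hypothetical, not taken from Maker4D.

```c
#include <stdint.h>
#include <string.h>

/* Copy a w x h background bitmap into dst, shifted left by scroll_x pixels
   with wrap-around, using two row-wise memcpy calls per scanline instead of
   sending the image through a textured-polygon rasterizer.
   Assumes w > 0, 32-bit pixels, and that dst and bg are both w x h and do
   not overlap. */
void blit_background_scrolled(uint32_t *dst, const uint32_t *bg,
                              int w, int h, int scroll_x)
{
    int off = scroll_x % w;
    if (off < 0)
        off += w; /* normalize negative scroll into [0, w) */

    for (int y = 0; y < h; y++) {
        const uint32_t *src = bg + (size_t)y * w;
        uint32_t *out = dst + (size_t)y * w;
        /* right-hand part of the source row goes to the start of the output row */
        memcpy(out, src + off, (size_t)(w - off) * sizeof *out);
        /* the wrapped-around left-hand part fills the rest */
        memcpy(out + (w - off), src, (size_t)off * sizeof *out);
    }
}
```

    This only applies when the fixed-angle assumption holds; a background whose angle changes still needs the full 3D path.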

    This engine is modern software serving modern gamedev purposes; features cannot be sacrificed for the sake of running it on 20-year-old machines.
    Then why are you wasting your time trying to get it running on 20 year old machines?

    To find and measure the time of every single bottleneck of the engine will take days, and the steps to counter it will be much longer, so this will be a very very very long story...
    Most languages have a way to profile code without using printf. You should learn how to use your language's profiling features, and you can do a lot of optimizing in a modern environment before you even test on old hardware. A profiler library can give you a summary after the game exits what procedures took the longest time, sorted by runtime, or number of calls, or percentage of total running time...
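
    To illustrate the kind of summary described above, here is a minimal sketch of hand-rolled section timing in standard C that accumulates time per named section and prints the sections sorted by total time at exit. It is not a real profiler and not Maker4D code; every name (prof_register, prof_add, prof_report) is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX_SECTIONS 16

typedef struct {
    const char *name;    /* label shown in the report */
    clock_t total;       /* accumulated clock() ticks */
    unsigned long calls; /* number of measured intervals */
} Section;

static Section sections[MAX_SECTIONS];
static int section_count = 0;

/* Register a named section; returns its id, or -1 if the table is full. */
int prof_register(const char *name)
{
    if (section_count >= MAX_SECTIONS)
        return -1;
    sections[section_count].name = name;
    sections[section_count].total = 0;
    sections[section_count].calls = 0;
    return section_count++;
}

/* Add one measured interval (in clock() ticks) to a section. */
void prof_add(int id, clock_t start, clock_t end)
{
    if (id < 0 || id >= section_count)
        return;
    sections[id].total += end - start;
    sections[id].calls++;
}

/* qsort comparator: larger totals sort first. */
static int by_total_desc(const void *a, const void *b)
{
    const Section *sa = a, *sb = b;
    return (sa->total < sb->total) - (sa->total > sb->total);
}

/* Print all sections, slowest first. Sorting reorders the table and so
   invalidates the ids; call this once, after the game loop has exited. */
void prof_report(FILE *out)
{
    qsort(sections, (size_t)section_count, sizeof sections[0], by_total_desc);
    for (int i = 0; i < section_count; i++)
        fprintf(out, "%-12s %8.3f s  %lu calls\n", sections[i].name,
                (double)sections[i].total / CLOCKS_PER_SEC, sections[i].calls);
}
```

    In the render loop, a pass would be wrapped as `clock_t t0 = clock(); draw_shadows(); prof_add(id_shadows, t0, clock());`, where draw_shadows and id_shadows are placeholder names. The resolution of clock() is coarse, so this is only useful for passes that take a noticeable share of a frame.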

  9. Default

    Sadly, the background is not at a fixed angle; it can change if you walk in front of it, or if you change the angle of the phone in your hands.

    Quote Originally Posted by Trixter View Post
    Then why are you wasting your time trying to get it running on 20 year old machines?
    I don't agree with the development approach that is so popular nowadays, where a disco-snob copy-pastes some shoddy C# library with a random shader convention that will not even exist within the next 3 years, to make a 2D platform scroller that can probably only run on a computer that has a $1000 graphics card (so only on his own computer).

    But I will also not make structural changes just so it runs better on a Pentium 1 in 2019.

    All the optimization I do is general optimization that speeds up overall performance on all machines. Thanks to this optimization spree, the engine became playable on low-end Android devices, it reached 400 fps on Intel i9 CPUs, and it became sort of fluid on Intel Atoms.

    I think with the approach I follow, everyone wins. I win, because I get clean, fast and reliable code that can run on many generations of hardware. The users win, because they can produce games that run on a wider variety of hardware. The gamers win, because they can run the games on more hardware.

    And the retro people win too, because now it's above 2 fps and not 0.7.

    Quote Originally Posted by Trixter View Post
    Most languages have a way to profile code without using printf. You should learn how to use your language's profiling features, and you can do a lot of optimizing in a modern environment before you even test on old hardware. A profiler library can give you a summary after the game exits what procedures took the longest time, sorted by runtime, or number of calls, or percentage of total running time...
    I can measure the run length of all the passes in the engine just by printf-ing it. Anything beyond this can't be reliably measured in a heavily superscalar and multithreaded environment with multiple megabytes of cache. You can profile script languages like Python, Java or C#, where a[x]=b[y+x] probably takes 100 clock cycles, but you can't do it with real code such as C. This might be different in the Socket 7 environment, where these factors are much weaker, but I will certainly not even bother to try it, and I will only continue with the optimization conventions I used earlier.


    Now I did further optimizations: I was able to save more time in the animation system, I removed some duplicated code from various places, and I also manually re-ordered if-chains to allow more early exits. The result is another 5% speed-up.

    I have found some time to run multiple tests on multiple Socket 7 and other processors.

    First of all, GiGaBiTe was RIGHT about the 6x86 and missing instructions.
    The Linux kernel I use is compiled for 586 and refused to boot on the 6x86L and on the K5. It seems the 6x86L and the AMD K5 are more like a 486 than a Pentium. I don't want to reinstall Linux just for this, so the 6x86L and K5 will not be tested for now.

    Here are the results:



    Now it has finally turned out that the Cyrix 6x86MX at 200 MHz is approximately equal in performance to a Pentium 1 at 133 MHz under this engine. This is something I didn't expect; I thought the performance of the two CPUs would be more equal.

    The Pentium MMX at 250 MHz (overclocked) runs the engine at 2.45 fps; the Cyrix 6x86MX at 250 MHz (overclocked) runs it at 2 fps.

    The initial goal is sort of reached: I exceeded 2 fps on the Pentium 1 generation.

    Now I'm sort of happy. My next goal will be to reach 3 fps, but I am pausing this task for now, as I will spend my time on other things.

    btw i LOVE this cpu generation

  10. #20
    Join Date
    Apr 2015
    Location
    Austin, Texas
    Posts
    1,491

    Default

    Quote Originally Posted by Geri View Post
    This engine is modern software serving modern gamedev purposes; features cannot be sacrificed for the sake of running it on 20-year-old machines.
    The original Unreal engine released in 1998 is far more advanced than your engine and runs on 20 year old hardware at 15-60 FPS depending on the graphics acceleration being used. Unreal had possibly the best software renderer of the day that enabled 2D only cards at the time to play the game at the lower end of that framerate spectrum depending on how strong your CPU was.

    The problem is not old hardware, it's something in your rendering code that isn't right.

    Are you using an existing rendering library like SDL? If so, this would explain why the engine is so slow; SDL is a massively bloated mess designed to be cross-platform portable, not efficient. If you want an efficient software renderer on 20-year-old hardware, it needs to be written from scratch in C, assembly, or both. This is what game developers of the day did.
