Thread: PDP-11 π benchmark

  1. #31


    Quote Originally Posted by bqt View Post
    The raw numbers suggest that the ARM would be very comparable. Actual tests show that this is not at all the case.
    Where can we find those actual tests?
    Quote Originally Posted by bqt View Post
    I've already had long discussions about this, but the author really thinks that it is better when the time for I/O and other artifacts is included. So, no compensation for that.
    The last number is just the elapsed wall-clock time, which is the "performance".
    Dear bqt, you just don't want to understand very simple things. It is really very sad for me. The last number printed is just the duration of program execution. A faster machine produces a lower number, a slower machine a larger one. It is not a general performance meter. It is just a test for a particular set of instructions, and you know this very well. I have told you this more than 10 times... I am just gathering data for different computers. Indeed, Dhrystone is a more general benchmark than the pi-spigot, and even Dhrystone is far from perfect, especially for modern computers.

  2. #32
    Join Date
    Sep 2019
    Location
    Zurich, CH
    Posts
    157


    Quote Originally Posted by vol.litwr View Post
    Where can we find those actual tests?
    If you read the full article you linked to, you have a whole bunch of those tests... Didn't you even read that thing yourself before posting the link?
    Dear bqt, you just don't want to understand very simple things. It is really very sad for me. The last number printed is just the duration of program execution. A faster machine produces a lower number, a slower machine a larger one. It is not a general performance meter. It is just a test for a particular set of instructions, and you know this very well. I have told you this more than 10 times... I am just gathering data for different computers. Indeed, Dhrystone is a more general benchmark than the pi-spigot, and even Dhrystone is far from perfect, especially for modern computers.
    And you don't understand that a faster machine does not necessarily produce a lower number, since the number you get includes a whole lot of things that are unrelated to processor speed, and which can affect the outcome significantly depending on lots of other factors. Not to mention the fact that the same machine will produce different numbers depending on OS and various rules you enforce. Does that mean the processor speed is different for these different OSes, or implementations? Of course not. The processor speed does not change. The same machine always has the same speed. The fact that your test does give different numbers for the same machine just highlights the severe problem with your test. But you obviously do not see that.

    dhrystones are indeed another test to try and determine processor speed. That test is by far better than your setup if we want to figure out the speed of a processor, but there are several known problems with that one as well, yes.

  3. #33


    Quote Originally Posted by cbscpe View Post
    Finally got around to running the pi benchmark on my PDP-11/Hack. What I get is

    I was a little bit surprised that digits were printed during the calculation; I assume this is compensated for when calculating the performance. And I assume the last number gives an indication of speed
    Thank you very much. The digits printed are digits of the famous pi number. Your results showed that the PDP11/Hack is about 10% faster than the 11/70 for the pi-spigot.
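A spigot algorithm emits digits while it is still computing, which is why digits appear on the terminal during the run. The following is a minimal Python sketch of the classic compact pi-spigot that emits one group of 4 digits per pass. It is an illustration only, under the assumption that the benchmark follows the same general scheme; it is not the PDP-11 assembly code measured in this thread, and the name `pi_spigot` is hypothetical.

```python
def pi_spigot(n_digits):
    """Return the first n_digits decimal digits of pi as a string ("3141...").

    Sketch of the classic compact spigot: 14 array cells per 4 output
    digits, one 4-digit group emitted per outer pass.
    n_digits is rounded down to a multiple of 4.
    Known limitation of this compact form: it does not propagate a carry
    if a group ever reaches 10000.
    """
    a = 10000                      # each pass yields one base-10000 "digit"
    c = (n_digits // 4) * 14       # number of series terms kept
    f = [a // 5] * (c + 1)         # remainders, all initialised to 2000
    e = 0                          # low part carried over from the previous pass
    groups = []
    while c > 0:
        d = 0
        b = c
        while True:
            g = 2 * b - 1          # mixed-radix denominator for cell b
            d += f[b] * a
            f[b] = d % g           # keep the remainder for the next pass
            d //= g
            b -= 1
            if b == 0:
                break
            d *= b                 # mixed-radix numerator
        groups.append("%04d" % (e + d // a))
        e = d % a
        c -= 14                    # later passes need fewer terms
    return "".join(groups)


if __name__ == "__main__":
    # A real benchmark would print each group as soon as it is computed;
    # here the groups are collected and printed once.
    print(pi_spigot(40))
```

Because every pass walks the whole remaining array, the work grows roughly with the square of the digit count, while the amount of output grows only linearly.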

    Quote Originally Posted by bqt View Post
    So, stop trying to compare CPUs through clock frequency. It's not meaningful. Clock speed, and what is actually done in one clock cycle are very different on different machines.
    Dear bqt. I wrote above "It is not all about the performance dependence on a processor frequency, it is rather about electronics efficiency". Have you read this?

    Quote Originally Posted by bqt View Post
    If you read the full article you linked to, you have a whole bunch of those tests... Didn't you even read that thing yourself before posting the link?
    You have written about something different. You wrote that benchmarking for different architectures is meaningless. I have given you a link which IMHO shows that it is quite meaningful. It shows that the x86 is still a bit faster, but not always. It is quite interesting for me. With better software the ARM could show much better results. BTW I have worked with ARM servers. Some branches of modern software development can't work without them.


    Quote Originally Posted by bqt View Post
    And you don't understand that a faster machine does not necessarily produce a lower number, since the number you get includes a whole lot of things that are unrelated to processor speed, and which can affect the outcome significantly depending on lots of other factors. Not to mention the fact that the same machine will produce different numbers depending on OS and various rules you enforce. Does that mean the processor speed is different for these different OSes, or implementations? Of course not. The processor speed does not change. The same machine always has the same speed. The fact that your test does give different numbers for the same machine just highlights the severe problem with your test. But you obviously do not see that.
    How can you define a faster machine?! The pi-spigot implementation includes some factors which only indirectly depend on processor speed, but those factors are minor and almost invisible for a 3000-digit run. Sorry, but I have to repeat: it is not a general benchmark, it is a project which gathers data related to processor speed. You are completely wrong about the OS. Different OSes produce different results for the same hardware. It is a fact.

    Quote Originally Posted by bqt View Post
    dhrystones are indeed another test to try and determine processor speed. That test is by far better than your setup if we want to figure out the speed of a processor, but there are several known problems with that one as well, yes.
    I agree with you completely.

  4. #34
    Join Date
    Sep 2019
    Location
    Zurich, CH
    Posts
    157


    Quote Originally Posted by vol.litwr View Post
    Thank you very much. The digits printed are digits of the famous pi number. Your results showed that the PDP11/Hack is about 10% faster than the 11/70 for the pi-spigot.
    For some instance of 11/70, running some OS, with some kind of connection. For other variants, the difference might be much larger, even if we still talk 11/70.
    Dear bqt. I wrote above "It is not all about the performance dependence on a processor frequency, it is rather about electronics efficiency". Have you read this?
    Electronics efficiency? I'd say your test is more about OS behavior as well as what the actual connection you are having to the machine.
    You have written about something different. You wrote that benchmarking for different architectures is meaningless. I have given you a link which IMHO shows that it is quite meaningful. It shows that the x86 is still a bit faster, but not always. It is quite interesting for me. With better software the ARM could show much better results. BTW I have worked with ARM servers. Some branches of modern software development can't work without them.
    I have repeatedly pointed out that comparing machines based on frequency is meaningless. It should not be so hard to just read through this thread.
    And the article you linked to again shows exactly that.

    But to explicitly quote a couple of replies up on this exact thread that you are arguing now... I said: "The most obvious example of the problems with your thinking is to take, for example, an extreme CISC and compare with an extreme RISC. The clocks are totally meaningless to compare."

    You replied: "You are totally wrong. Such measurements have been used for ages and they are quite respectable. There are numerous tables with such measurements on the net. Do you just claim all of them are false?!
    What is wrong with comparing the electronics efficiency of the x86 and ARM? The ARM instructions are generally more powerful than instructions of the 80386 but the 80386 has a larger variety of instructions. So the MIPS/MHz ratio is much better for the ARM but the 80486 got almost the same ratio as the ARM. It reflects quite accurately the real hardware performance."

    So, I am pointing out that comparing machines based on frequency is meaningless. You claim that in the end, it reflects quite accurately the real hardware performance.

    So I replied: "No doubt people have tried to make such comparisons for ages. And every time other people have pointed out that it is meaningless. And gives you useless information, from which then people make all kind of deductions, which are incorrect."

    To which you replied: "So such kind of texts is meaningless? "

    And I then said: "If you were to just take the CPU frequency numbers and compare them from this article, you would indeed get a very meaningless piece of information. Which the author obviously understands as well, and which is why he starts with giving those numbers for people who think they are interesting, and then goes on to do actual tests on a lot of different problems."

    Again. The frequency numbers do not at all give any kind of useful comparison, and that article actually illustrates this very well.
    But you seem to not understand this detail. That article even observes that, even though the frequencies of the two (three) processors are almost the same, the chips turn out to have very different performance numbers, which also differ depending on the exact software they are trying to run. Not to mention that the number of cores is rather different between the processors, which skews the results even more. But the bottom line is that they are not churning out the same amount of work, even though they are running at similar frequencies.

    The amount of work done has little connection to the frequency. Stop trying to compare processors by comparing frequencies.

    How can you define a faster machine?! The pi-spigot implementation includes some factors which only indirectly depend on processor speed, but those factors are minor and almost invisible for a 3000-digit run. Sorry, but I have to repeat: it is not a general benchmark, it is a project which gathers data related to processor speed. You are completely wrong about the OS. Different OSes produce different results for the same hardware. It is a fact.
    I don't know how to define a "faster" machine. I guess it depends on what specific problem you want to solve. For a specific problem you can compare the time used by different processors. Does that make one machine "faster"? I don't know.

    *Your* pi-spigot implementation is measuring a whole bunch of factors which are sometimes not even closely related to actual processor speed.
    The basic problem is an interesting one, with some potential of giving interesting information. But with your baking of all kind of factors into it, it becomes much less interesting.
    You are essentially not seeing that much of the processor speed aspect as one would have hoped.

    Let me tell this story from a different angle.

    When I first saw your note and table, I became interested. I was curious about how a PDP-11 would compare to other systems, so I went there and checked.

    That made me notice that there was no implementation for RSX. Since I like RSX, I wanted an implementation I could run on RSX. The existing implementation would have been ugly to port. In addition, I felt that I could probably improve some on the code. And since you had claimed on that page that these were the most efficient implementations known, and if someone had a better implementation, you would like to hear from them, I figured I should try and do an implementation for RSX, and try to do a better one than what you had.

    So I did that. My first attempt was pretty bad, and it was slow. So I sat down and tried to figure out how I really could make it fast, and did a really serious improvement. However, I still came out slower than what you had in your table. And this was when running on a real 11/70, so I was expecting to be closer than I was. Next I started checking your code to see how it could still be faster. I saw a couple of nice tricks, but I also noticed that if I just included those tricks, my code would then really be better than what you currently had. Doing all the tricks, I was still slower.
    At this point I had to reexamine things at a deeper level. How could my solution be slower, when it was fewer instructions, and better at every point?
    The answer was that I/O was slower. So even though I had a better implementation, the number for your table gave a longer time.
    So the next thing to do was to solve the slow I/O problem. But this is where I hit trouble with you. You were essentially not interested in the faster solution, because it was "cheating". First in not printing while computing, next by doing I/O asynchronously, to make it behave in a similar way to Unix. I also tried solutions that worked for a smaller range, but still covered your runs of 100, 1000 and 3000 digits.
    All such attempts at making faster solutions were met with disapproval for various reasons, all of which seemed to not apply so much to other solutions you had in your table.

    So in the end, if you want to have a fast solution for your condition, the first thing is to find the OS that gives you the most advantage, because otherwise any better implementation will essentially not really show that it is better, since other factors drag it down. So this whole problem becomes a question of finding the most optimal environment, and not the most optimal implementation.

    Had I just accepted that, you would never have gotten any number that was faster than what you already had in your tables, and would never have tried to improve your code either, and you would still have claimed that the old code was the most efficient implementation known, ignoring the code I had written which was in fact faster. Because the end number was slower because of factors outside of the implementation. And that is still the case.

    We both know that my variant is still a bunch of instructions less than yours, but the "official" results still show your implementation being the faster.

    This all begs the question - are you really interested in more efficient implementations? Or is it only about some implementation that is good enough, that anything done with some other OS will never be able to beat it, no matter what? I think the answer is obviously that you are not that interested in the most efficient implementation.

  5. #35
    Join Date
    Sep 2019
    Location
    Zurich, CH
    Posts
    157


    Here are my final runs of this exercise.

    This is all run on the same PDP-11/93. Each number is the best achieved after three runs. Except for once or twice, all numbers of the different runs for the same number of digits have been within less than 1% of each other. Many times I get the exact same number when running the program multiple times.

    Code:
                    100     1000    3000
    2.11BSD (patch 465):  
            Number pi calculator v9(EIS-of) - max 7168 digits
    
            Telnet: 0.14    13.08   97.44
            Serial: 0.16    10.28   88.62
    
    RSX-11M-PLUS V4.6:
            number pi calculator v1 (EIS-of) - max 9104 digits
    
            Telnet: 0.60    14.76   103.00
            Serial: 0.18    10.44   89.86
    Worth pointing out that this is running the exact same code, with the exception of how I/O is done.
    Does anyone think such variance is adding something to the exercise?
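To put a number on that variance, here is a small Python sketch that computes how much slower each telnet run is than the corresponding serial run. The timings are copied from the table above; the function names and the dictionary layout are only illustrative choices.

```python
# Wall-clock seconds from the table above (same PDP-11/93, best of three runs).
TIMINGS = {
    "2.11BSD": {
        "telnet": {100: 0.14, 1000: 13.08, 3000: 97.44},
        "serial": {100: 0.16, 1000: 10.28, 3000: 88.62},
    },
    "RSX-11M-PLUS": {
        "telnet": {100: 0.60, 1000: 14.76, 3000: 103.00},
        "serial": {100: 0.18, 1000: 10.44, 3000: 89.86},
    },
}


def percent_over(value, baseline):
    """How many percent `value` exceeds `baseline` (negative if it is lower)."""
    return 100.0 * (value - baseline) / baseline


def telnet_overhead(os_name, digits):
    """Telnet time relative to serial time for one OS and digit count."""
    runs = TIMINGS[os_name]
    return percent_over(runs["telnet"][digits], runs["serial"][digits])


if __name__ == "__main__":
    for os_name in TIMINGS:
        for digits in (100, 1000, 3000):
            print(f"{os_name:13s} {digits:5d} digits: "
                  f"telnet vs serial {telnet_overhead(os_name, digits):+.1f}%")
```

For the 3000-digit runs this gives a telnet penalty of roughly 10% on 2.11BSD and roughly 15% on RSX-11M-PLUS, on identical hardware and with the same computation.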

    Also, note that the Unix version handles significantly fewer digits. But that is ok, even though the problem statement says it should use all available memory, and if doing that, it should be able to hit around 9000 digits. Because the author says it is ok in this case.

    I won't even bother presenting the numbers from my code, because it's all just cheating anyway. Even though it also computes the correct result.

  6. #36


    Quote Originally Posted by bqt View Post
    For some instance of 11/70, running some OS, with some kind of connection. For other variants, the difference might be much larger, even if we still talk 11/70.
    It is just faster. It is a fact, not your speculation. BTW it shows that a program under RT-11 runs notably faster than under RSX-11 or 2.11BSD. You didn't believe that the 11/83 under RT-11 is faster than the 11/93 under RSX-11 for the pi-spigot. Thanks to cbscpe we have proof.

    Quote Originally Posted by bqt View Post
    Electronics efficiency? I'd say your test is more about OS behavior as well as what the actual connection you are having to the machine.
    Again, this is just your speculation. The ER values in the second table are exactly about the electronic efficiency.

    Quote Originally Posted by bqt View Post
    I have repeatedly pointed out that comparing machines based on frequency is meaningless. It should not be so hard to just read through this thread.
    And the article you linked to again shows exactly that.
    Excuse me, but can you read? I repeat: I never claimed that I compare machine speeds based on the frequencies of their CPUs. Again, it is just your fantasy. However, the mentioned article about processors with similar electronic efficiency shows such a comparison, and it is quite meaningful.

    Quote Originally Posted by bqt View Post
    But to explicitly quote a couple of replies up on this exact thread that you are arguing now... I said: "The most obvious example of the problems with your thinking is to take, for example, an extreme CISC and compare with an extreme RISC. The clocks are totally meaningless to compare."
    So you just repeat your point missing the idea that it is pointless.

    Quote Originally Posted by bqt View Post
    You replied: "You are totally wrong. Such measurements have been used for ages and they are quite respectable. There are numerous tables with such measurements on the net. Do you just claim all of them are false?!
    What is wrong with comparing the electronics efficiency of the x86 and ARM? The ARM instructions are generally more powerful than instructions of the 80386 but the 80386 has a larger variety of instructions. So the MIPS/MHz ratio is much better for the ARM but the 80486 got almost the same ratio as the ARM. It reflects quite accurately the real hardware performance."
    So, I am pointing out that comparing machines based on frequency is meaningless. You claim that in the end, it reflects quite accurately the real hardware performance.
    The first paragraph was about a particular case for processors with similar electronic efficiency. The second paragraph was not about the performance of processors but about their electronic efficiency. So your paragraph is just a wrong logical conclusion.

    Quote Originally Posted by bqt View Post
    So I replied: "No doubt people have tried to make such comparisons for ages. And every time other people have pointed out that it is meaningless. And gives you useless information, from which then people make all kind of deductions, which are incorrect."

    To which you replied: "So such kind of texts is meaningless? "

    And I then said: "If you were to just take the CPU frequency numbers and compare them from this article, you would indeed get a very meaningless piece of information. Which the author obviously understands as well, and which is why he starts with giving those numbers for people who think they are interesting, and then goes on to do actual tests on a lot of different problems."
    The article is about processors whose frequencies are between 2.1 and 2.5 GHz, so the values are very close. It shows that those processors often show close speed results. What is wrong? Indeed the processor with 2.5 GHz is not the fastest. So you are formally correct, but it is close to the leaders, so your correctness is not very significant. Moreover, why should we discuss the ARM and x86 thoroughly when we are discussing a more general matter? I have to tell you again: "I have never claimed that I compare machine speeds based on the frequencies of their CPUs". All this discussion about the article content is about particular cases when such a comparison is not completely meaningless. I can even write a more detailed text for you about my point. It is generally meaningless, but for some particular cases it has its reasons.

    Quote Originally Posted by bqt View Post
    Again. The frequency numbers do not at all give any kind of useful comparison, and that article actually illustrates this very well.
    But you seem to not understand this detail. That article even observes that, even though the frequencies of the two (three) processors are almost the same, the chips turn out to have very different performance numbers, which also differ depending on the exact software they are trying to run. Not to mention that the number of cores is rather different between the processors, which skews the results even more. But the bottom line is that they are not churning out the same amount of work, even though they are running at similar frequencies.

    The amount of work done has little connection to the frequency. Stop trying to compare processors by comparing frequencies.
    Stop declaring your fantasies. Your previous paragraph is formally correct, but it has no direct connection to our discussion, so it is rather meaningless.

    Quote Originally Posted by bqt View Post
    I don't know how to define a "faster" machine. I guess it depends on what specific problem you want to solve. For a specific problem you can compare the time used by different processors. Does that make one machine "faster"? I don't know.
    You have written: "And you don't understand that a faster machine does not necessarily produce a lower number, since the number you get includes a whole lot of things that are unrelated to processor speed, and which can affect the outcome significantly depending on lots of other factors."

    It implies that you know what a faster machine is in general. You wrote "a machine", which means any faster machine.

    Quote Originally Posted by bqt View Post
    *Your* pi-spigot implementation is measuring a whole bunch of factors which are sometimes not even closely related to actual processor speed.
    The basic problem is an interesting one, with some potential of giving interesting information. But with your baking of all kind of factors into it, it becomes much less interesting.
    You are essentially not seeing that much of the processor speed aspect as one would have hoped.
    I can again say to you that for 3000 digits we have almost pure processor speed. Moreover you know that there is a formula which allows you to get a number about the processor speed only. Why do you ignore that fact? Of course the formula is not perfect but for the worst known cases its deviation from the perfect numbers is less than 5%.

    Quote Originally Posted by bqt View Post
    Let me tell this story from a different angle.

    When I first saw your note and table, I became interested. I was curious about how a PDP-11 would compare to other systems, so I went there and checked.

    That made me notice that there was no implementation for RSX. Since I like RSX, I wanted an implementation I could run on RSX. The existing implementation would have been ugly to port. In addition, I felt that I could probably improve some on the code. And since you had claimed on that page that these were the most efficient implementations known, and if someone had a better implementation, you would like to hear from them, I figured I should try and do an implementation for RSX, and try to do a better one than what you had.

    So I did that. My first attempt was pretty bad, and it was slow. So I sat down and tried to figure out how I really could make it fast, and did a really serious improvement. However, I still came out slower than what you had in your table. And this was when running on a real 11/70, so I was expecting to be closer than I was. Next I started checking your code to see how it could still be faster. I saw a couple of nice tricks, but I also noticed that if I just included those tricks, my code would then really be better than what you currently had. Doing all the tricks, I was still slower.
    Those tricks make the code much faster. Without them your code would be very slow. Anyway, thank you very much for your efforts. Cooperation with you has made the project in general, and the pi-spigot code for the PDP-11 in particular, much better.

    Quote Originally Posted by bqt View Post
    At this point I had to reexamine things at a deeper level. How could my solution be slower, when it was fewer instructions, and better at every point?
    The answer was that I/O was slower. So even though I had a better implementation, the number for your table gave a longer time.
    So the next thing to do was to solve the slow I/O problem. But this is where I hit trouble with you. You were essentially not interested in the faster solution, because it was "cheating". First in not printing while computing, next by doing I/O asynchronously, to make it behave in a similar way to Unix. I also tried solutions that worked for a smaller range, but still covered your runs of 100, 1000 and 3000 digits.
    All such attempts at making faster solutions were met with disapproval for various reasons, all of which seemed to not apply so much to other solutions you had in your table.
    Yes. There are rules 1-4 against cheating. You constantly tried to deny or bend them. IMHO you just tried to get unfair advantages for your beloved RSX-11. That can't be allowed. I don't want crazy optimizations such as writing directly to the screen, or using smaller fonts (for instance 4x4), or similar things. All existing programs follow these rules.

    Quote Originally Posted by bqt View Post
    So in the end, if you want to have a fast solution for your condition, the first thing is to find the OS that gives you the most advantage, because otherwise any better implementation will essentially not really show that it is better, since other factors drag it down. So this whole problem becomes a question of finding the most optimal environment, and not the most optimal implementation.
    The OS influence is insignificant for the 3000 digit case. How many times should I repeat this phrase?

    Quote Originally Posted by bqt View Post
    Had I just accepted that, you would never have gotten any number that was faster than what you already had in your tables, and would never have tried to improve your code either, and you would still have claimed that the old code was the most efficient implementation known, ignoring the code I had written which was in fact faster. Because the end number was slower because of factors outside of the implementation. And that is still the case.
    Your code is mentioned as the fastest by 0.01%. This value can't be detected with a timer whose precision is only 0.20ms. The links to your code have been given. You know that your code has other issues. For example, it requires your personal library.

    Quote Originally Posted by bqt View Post
    We both know that my variant is still a bunch of instructions less than yours, but the "official" results still show your implementation being the faster.
    Please read my previous reply to you carefully.

    Quote Originally Posted by bqt View Post
    This all begs the question - are you really interested in more efficient implementations? Or is it only about some implementation that is good enough, that anything done with some other OS will never be able to beat it, no matter what? I think the answer is obviously that you are not that interested in the most efficient implementation.
    I hope we are both interested in a fair competition. Just follow the rules. Anyway, thank you very much for your participation. Without your help the version for RSX-11 would not have appeared. Thank you again.

    Quote Originally Posted by bqt View Post
    Here are my final runs of this exercise.

    This is all run on the same PDP-11/93. Each number is the best achieved after three runs. Except for once or twice, all numbers of the different runs for the same number of digits have been within less than 1% of each other. Many times I get the exact same number when running the program multiple times.

    Code:
                    100     1000    3000
    2.11BSD (patch 465):  
            Number pi calculator v9(EIS-of) - max 7168 digits
    
            Telnet: 0.14    13.08   97.44
            Serial: 0.16    10.28   88.62
    
    RSX-11M-PLUS V4.6:
            number pi calculator v1 (EIS-of) - max 9104 digits
    
            Telnet: 0.60    14.76   103.00
            Serial: 0.18    10.44   89.86
    Worth pointing out that this is running the exact same code, with the exception of how I/O is done.
    Does anyone think such variance is adding something to the exercise?

    Also, note that the Unix version handles significantly fewer digits. But that is ok, even though the problem statement says it should use all available memory, and if doing that, it should be able to hit around 9000 digits. Because the author says it is ok in this case.

    I won't even bother presenting the numbers from my code, because it's all just cheating anyway. Even though it also computes the correct result.
    Thank you very much for your precious results. They are really very interesting. The tables are updated and ready for new results.

    I dare to add several remarks about the interpretation of your results. The results show that the timings for the serial versions for RSX-11 and 2.11BSD are almost identical. So the more noticeable difference between the timings for the telnet versions is definitely caused by the lower quality of the telnet server for RSX-11 compared with 2.11BSD.

    Let's compare the results for 2.11BSD. We can get separate I/O and CPU timings using the mentioned formula. The result of this separation is available on the web page. For 3000 digits, we get 86.82s for a serial connection and 87.42s for a telnet connection. So the difference is less than 0.7%.

    Let's also compare the results for RSX-11. For 3000 digits, we get 87.93s for a serial connection and 88.2s for a telnet connection. So the difference is about 0.3%.

    It seems quite good. I have only to say: "Dear bqt, thank you very much".

    The CPU timings for RSX-11 are generally a bit higher than for 2.11BSD. But they are only slightly higher, and it is quite plausible that RSX-11 is a bit heavier than 2.11BSD. RT-11, on the contrary, is much lighter. It would be nice to get results from the 11/83, 11/84, 11/93, or 11/94 under RT-11 to check this accurately.
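The percentages quoted in this post can be checked with a few lines of Python. This sketch only redoes the arithmetic on the CPU-only timings stated above; it does not reproduce the separation formula itself, which is not given in this thread. The names `CPU_ONLY` and `percent_diff` are illustrative.

```python
# CPU-only timings in seconds for 3000 digits, as quoted in the post above.
CPU_ONLY = {
    "2.11BSD": {"serial": 86.82, "telnet": 87.42},
    "RSX-11": {"serial": 87.93, "telnet": 88.2},
}


def percent_diff(a, b):
    """Absolute difference between a and b, in percent of the smaller value."""
    return 100.0 * abs(a - b) / min(a, b)


if __name__ == "__main__":
    # Serial versus telnet on the same OS.
    for os_name, t in CPU_ONLY.items():
        print(f"{os_name}: serial vs telnet "
              f"{percent_diff(t['serial'], t['telnet']):.2f}%")
    # RSX-11 versus 2.11BSD on the same connection type.
    for link in ("serial", "telnet"):
        d = percent_diff(CPU_ONLY["RSX-11"][link], CPU_ONLY["2.11BSD"][link])
        print(f"{link}: RSX-11 vs 2.11BSD {d:.2f}%")
```

With these inputs the serial/telnet gap is about 0.69% on 2.11BSD and about 0.31% on RSX-11, and the gap between the two operating systems is on the order of 1%.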

  7. #37
    Join Date
    Sep 2019
    Location
    Zurich, CH
    Posts
    157


    I won't even bother commenting on most parts of what you write. I will say that you can't both claim that differences between different OSes are insignificant, and then go on observing that one OS might be "heavier" than another. Either you think they are insignificant, in which case there are none that are "heavier", or else you think that different OSes cause different overhead and extra load, in which case there is some significance to it...

    Anyway, the below needs another comment...

    Quote Originally Posted by vol.litwr View Post
    I would like to add a few remarks on the interpretation of your results. The results show that the timings for the serial versions on RSX-11 and 2.11BSD are almost identical. So the more noticeable difference between the timings for the telnet versions is definitely caused by the lower quality of the telnet server for RSX-11 compared with 2.11BSD.
    Your claim of "lower" quality is both unsubstantiated and wrong. I got very curious about my numbers and had to investigate. What happens is that 2.11BSD is actually not printing anything at all until after the program finishes. Again, yet another layer of buffering is happening in 2.11BSD here. In RSX, by contrast, when you print those 4 digits, they are actually sent out at that point in time. That leads to a lot more TCP traffic, since you end up with lots of small TCP packets and the corresponding ACKs. Nagle, unfortunately, does not help here, because the communication is so fast that the ACKs come before the next 4 bytes are to be sent. 2.11BSD is doing another layer of buffering here that I haven't investigated. I suspect that TCP is buffering things for you here, because I know that on the terminal line each chunk is written out more immediately, so this does not appear to be buffering in the terminal driver. I guess that TCP buffers for some limited time, which gives you much larger chunks of text, but at longer intervals. I explicitly decided not to do such a thing in RSX, since I wanted good interactive behavior.

    So there you have it, plenty of layers of buffering going on, which your program benefits from hugely.
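
    To illustrate the general effect described above, here is a minimal sketch in Python. It is only a model of one buffering layer, not the actual behaviour of the 2.11BSD or RSX TCP stacks. The names `CountingSink` and `emit_digits` and the 1024-byte buffer size are assumptions made for the example. The sink stands in for whatever sits below the program (a terminal driver or a TCP connection) and counts how many separate writes reach it.

```python
import io


class CountingSink(io.RawIOBase):
    """Raw stream that discards data but counts the writes that reach it."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        self.bytes_written += len(data)
        return len(data)


def emit_digits(stream, total_digits=3000, chunk=4):
    """Write `total_digits` placeholder digits, `chunk` digits per call."""
    for _ in range(total_digits // chunk):
        stream.write(b"0" * chunk)
    stream.flush()


# No buffering layer: every 4-digit group reaches the sink on its own.
direct = CountingSink()
emit_digits(direct)

# One buffering layer (assumed 1024 bytes) between the program and the sink.
buffered_sink = CountingSink()
emit_digits(io.BufferedWriter(buffered_sink, buffer_size=1024))

print("writes without buffering:", direct.calls)
print("writes with buffering:   ", buffered_sink.calls)
```

    Both runs deliver the same 3000 bytes, but the unbuffered one hands them over in 750 separate writes, while the buffered one needs only a handful. Over a network, each of those small writes can become its own packet with its own ACK.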

  8. #38

    Default

    Quote Originally Posted by bqt View Post
    I won't even bother commenting on most parts of what you write. I will say that you can't both claim that differences between different OSes are insignificant, and then go on observing that one OS might be "heavier" than another. Either you think they are insignificant, in which case there are none that are "heavier", or else you think that different OSes cause different overhead and extra load, in which case there is some significance to it...
    However, the difference is actually almost invisible, and the RSX-11 results are a bit larger. What is wrong with that? Both things can be true at the same time.

    Quote Originally Posted by bqt View Post
    Your claim of "lower" quality is both unsubstantiated and wrong. I got very curious about my numbers and had to investigate. What happens is that 2.11BSD is actually not printing anything at all until after the program finishes. Again, yet another layer of buffering is happening in 2.11BSD here. In RSX, by contrast, when you print those 4 digits, they are actually sent out at that point in time. That leads to a lot more TCP traffic, since you end up with lots of small TCP packets and the corresponding ACKs. Nagle, unfortunately, does not help here, because the communication is so fast that the ACKs come before the next 4 bytes are to be sent. 2.11BSD is doing another layer of buffering here that I haven't investigated. I suspect that TCP is buffering things for you here, because I know that on the terminal line each chunk is written out more immediately, so this does not appear to be buffering in the terminal driver. I guess that TCP buffers for some limited time, which gives you much larger chunks of text, but at longer intervals. I explicitly decided not to do such a thing in RSX, since I wanted good interactive behavior.

    So there you have it, plenty of layers of buffering going on, which your program benefits from hugely.
    Thank you. I chose a rather improper word; I should have used the word "slower". Indeed, it applies only to this case.

  9. #39
    Join Date
    Sep 2019
    Location
    Zurich, CH
    Posts
    157

    Default

    Quote Originally Posted by vol.litwr View Post
    However, the difference is actually almost invisible, and the RSX-11 results are a bit larger. What is wrong with that? Both things can be true at the same time.
    1s at 3000 digits. So, which way is it? Insignificant, or is there a difference between the OSes?
    We are talking about the identical code running in both cases. The difference should ideally be zero. When it is not zero, there is a problem, I'd say, if you want to look at something you call "electronic efficiency", which for you then comes out to different results for the same CPU depending on the OS. Is the "electronic efficiency" really dependent on OS? So the "electronic efficiency" of a CPU can be improved by changing OS? That's a weird number for any kind of efficiency of a CPU.

    Thank you. I chose a rather improper word; I should have used the word "slower". Indeed, it applies only to this case.
    Well, if I were to buffer at the TCP layer, I would again have been accused of cheating anyway.

  10. #40

    Default

    Quote Originally Posted by bqt View Post
    1s at 3000 digits. So, which way is it? Insignificant, or is there a difference between the OSes?
    We are talking about the identical code running in both cases. The difference should ideally be zero. When it is not zero, there is a problem, I'd say, if you want to look at something you call "electronic efficiency", which for you then comes out to different results for the same CPU depending on the OS. Is the "electronic efficiency" really dependent on OS? So the "electronic efficiency" of a CPU can be improved by changing OS? That's a weird number for any kind of efficiency of a CPU.
    The ER value uses only the separated CPU timing; the I/O timing is ignored. That is thanks to your comments. However, the I/O timing for the case where the 11/93 runs under RSX-11 and is connected via telnet is above 14% of the total timing. IMHO it is rather a curious record. Even for 2.11BSD it is less than 10%. Other cases usually have this value below 5%.

    Quote Originally Posted by bqt View Post
    Well, if I were to buffer at the TCP layer, I would again have been accused of cheating anyway.
    Yes, it is stealing digits from a program. It is against rule #4.
