Sin and Cos Slow?
| ||
If I use these every frame, will it be slower than storing a precalculated array of values? Slower as in: is the lookup table worth doing because it gives a good-ish speed boost? :) |
| ||
Yes, it will definitely be slower. However, whether a lookup table is "worth it" compared with calculating the values in realtime depends on your program and target platform. If you call it a lot, I would say it would behoove you to use lookup tables. It's not hard, it's accurate enough, and it's faster. What could be wrong with that? Alternatively, you can just use the Sin and Cos commands, see if it's fast enough for your needs, then change it later if you want. |
| ||
Measure, measure and measure again. There is no point in implementing things like lookup tables and other optimizations if that isn't what's slowing your program down. It's like spending hours tweaking the starter on your car. If it starts up like lightning but runs like crap, was it time well spent? |
| ||
Ok, cool, thanks. This is all for the single surface stuff. I use that to rotate my particles, and it gets called very often. Just trying to pack in as many optimisations as possible. And I was a bit too lazy to test the speed :) Thanks for your replies. Appreciated! |
| ||
Use a lookup table like soja said. Try this:

Dim sintable#(360)
For x = 0 To 360
    sintable(x) = Sin(x) ; Sin takes its angle in degrees
Next

Then, later in your code, use sintable(60) instead of Sin(60). It ends up being much faster. -Later |
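A table like this only covers whole-degree angles inside its own range, so a rotating particle whose angle keeps growing (or goes negative) needs wrapping before the lookup. A minimal sketch of one way to do that, assuming whole-degree precision is acceptable; FastSin is a hypothetical helper name, not a built-in command:

```blitzbasic
; Build a 360-entry table, one entry per whole degree (0..359).
Dim sintable#(359)
For a = 0 To 359
    sintable(a) = Sin(a)
Next

Print FastSin(450)   ; uses the same table entry as 90 degrees
Print FastSin(-90)   ; uses the same table entry as 270 degrees
WaitKey

; Hypothetical helper: sine of any whole-degree angle via the table.
Function FastSin#(angle)
    a = angle Mod 360             ; may be negative for negative input
    If a < 0 Then a = a + 360     ; wrap into 0..359
    Return sintable(a)
End Function
```

The extra Mod and comparison eat into whatever the table saves, so this is worth timing against plain Sin() before committing to it.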
| ||
IMHO it's almost completely pointless on a reasonably modern processor (say, 400MHz+); you don't generally use tight loops like the above in a real game, and will usually have more than enough spare processor time per frame so that pre-calc'ing isn't worth the effort. |
| ||
Well, I'm gonna have maybe 300-odd particles going at the same time, looping through them once every frame to rotate them and stuff. Maybe I'll save some fps by updating once every 2 frames or something. I'll try both methods for speed :) Thanks again! |
| ||
Yeah, you're right, James. It doesn't seem to be any slower. It is slower than a lookup table if you call it 100,000 times, but that's not a true test. |
| ||
Most of the old optimizations are somewhat pointless on today's PCs ... unrolling loops, lookup tables, fixed point math, etc. These things are still recommended to people on forums, but I think it's more out of knee-jerk reflex than any kind of actual thought process or testing. |
| ||
Optimizing techniques are *always* worthwhile if they're applied to the *right* piece of code. The point is you need to figure out what to optimize. Optimizing for its own sake just makes your code worse. That said, Blitz could use profiling tools ;) -- not a priority, it's easy enough to roll your own. |
| ||
Yes, if only there was a preprocessor that had a code profiling option. |
| ||
"Optimizing techniques are *always* worthwhile if they're applied to the *right* piece of code." Yes, but my point is that optimization techniques DO go out of style and usefulness. Things like lookup tables are becoming less and less useful with today's processors. So, as podperson said, determine if the piece of code you're looking at is slow to begin with. After that, carefully consider if it's executed often enough to be worth optimizing. THEN start looking into techniques. |
| ||
Optimizations are the key to a fast game ... I remember playing Quake 3 five years ago (I think) ... I don't recall seeing 3GHz systems out at that time, and yet I haven't seen anything like Quake 3 done in Blitz yet. So if anyone tries telling you that optimizations are irrelevant because of processor speed, I wouldn't waste my time listening to them. I would waste it on optimizing :) |
| ||
Ok, maybe not five years ago, but anyway, I think optimization is always a key factor regardless of how fast your target system is. Anything that saves on CPU time is always great. Then you can spend more CPU time on more important things such as path finding for AI ... Just remember that as system specs grow, that means more RAM, and more RAM means room for larger lookup tables, so why not use them if you can? |
| ||
Are you even reading what I'm writing? |
| ||
"So if anyone tries telling you that optimizations are irrelevant because of processor speed," Did you hear EpicBoy say that? The fastest methods of doing things usually stop being the fastest with new technology. Things like lookup tables save infinitesimal amounts of time on older CPUs, and are *SLOWER* on new CPUs. You would be much better off looking at why your game has to call Sin/Cos millions of times a second rather than speeding up the calls themselves, i.e. optimising your algorithm rather than its implementation. And Sin/Cos lookup tables aren't the only thing that is out of date, far from it; they are just a classic example. |
| ||
Yeah, from my tests, I'd say there is absolutely no real difference between the two. I tried setting up a loop with an fps counter, doing 700 Sin calculations in one run and 700 lookup table reads in another. Didn't see any fps difference. Then I tried using MilliSecs to see how long, say, 5000 Sin calculations took, and how long 5000 lookup table reads took. No noticeable difference. So sorry for wasting your time with this question when I could have easily done the test myself. ^_^ |
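For anyone wanting to repeat that kind of test, here is a rough sketch of a timing harness. It is not the code used in the post above; the iteration count and variable names are assumptions. MilliSecs() only resolves whole milliseconds, so a run of a few thousand calls may finish too quickly to register, which is why the loop count here is much larger:

```blitzbasic
; Assumed iteration count, large enough to register on a millisecond timer.
Const ITERATIONS = 1000000

; One table entry per whole degree (0..359).
Dim sintable#(359)
For i = 0 To 359
    sintable(i) = Sin(i)
Next

; Time direct Sin() calls.
t0 = MilliSecs()
For i = 1 To ITERATIONS
    v# = Sin(i Mod 360)
Next
directTime = MilliSecs() - t0

; Time table lookups over the same angles.
t0 = MilliSecs()
For i = 1 To ITERATIONS
    v# = sintable(i Mod 360)
Next
tableTime = MilliSecs() - t0

Print "Sin():  " + directTime + " ms"
Print "Table:  " + tableTime + " ms"
WaitKey
```

Both loops pay for the same Mod and loop overhead, so only the difference between the two times says anything about Sin() versus the table, and a tight loop like this still isn't the same as measuring inside the real game.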
| ||
This is all pretty interesting to me. This is really the first time I've paid much attention to this kind of optimization. (I haven't really felt the need to do any actual tests yet.) Why do you suppose that they take the same amount of time? Are the values calculated once and then just stored in L2 cache for quick retrieval? |
| ||
I have a 2700+ Athlon XP and Sin and Cos lookup tables are considerably faster with my single surface stuff. More so if you can precalculate the multiplier too, but even without that, it's a worthwhile optimisation and takes about 5 minutes to do. |
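One possible reading of "precalculate the multiplier" is folding a constant scale factor into the table so the per-vertex multiply disappears. The sketch below assumes every particle shares one fixed half-size; HALF_SIZE, scaledSin and scaledCos are hypothetical names, and the original poster's code may well have done something different:

```blitzbasic
; Assumption: all particles use the same fixed half-size.
Const HALF_SIZE# = 0.5

; Tables hold HALF_SIZE * Sin/Cos, so no multiply is needed per lookup.
Dim scaledSin#(359)
Dim scaledCos#(359)
For a = 0 To 359
    scaledSin(a) = HALF_SIZE * Sin(a)
    scaledCos(a) = HALF_SIZE * Cos(a)
Next

; Offset of one quad corner for a particle rotated by 45 degrees.
angle = 45
dx# = scaledCos(angle)
dy# = scaledSin(angle)
Print "dx = " + dx + ", dy = " + dy
WaitKey
```

If particle sizes vary, this trick no longer applies directly and the plain Sin/Cos table plus a multiply is the fallback.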
| ||
How many times are you calling your lookup tables per frame? |
| ||
Ok, sorry guys, I was a little hard on the optimization bit. Had a long night Saturday at a wedding, still feeling the effects. :) Doing massive number crunching on super high-end CPUs for testing purposes is almost irrelevant, because the number of cycles the processor can do in a second is huge ... but remember that when you have other things in your game that start to slow things down, like traversing a BSP tree, a quadtree, or a surface, or hiding/unhiding objects etc., those massive amounts of CPU cycles get eaten up pretty quickly. You are then left with few CPU cycles compared to what you started with, and that is where lookup tables help out, so you can spend those cycles doing other things like AI, hit detection, physics etc. @Michael: Agreed, there are better ways, but Joker's question was about Sin/Cos and whether a lookup table offered any speed increase. Otherwise you are right; maybe we could help Joker optimize his code by cutting it down and removing unneeded code. |
| ||
If you wanna help me out :o) go here: http://www.blitzbasic.com/bbs/posts.php?topic=25932 thanks! |
| ||
"How many times are you calling your lookup tables per frame?" It was for a set of functions rather than a game. I tested it with a few hundred vertices, each requiring, I guess, six calls to the lookup tables. Then I cranked it up so high it was creating more than 65535 vertices at one point (which is the DirectX per-surface vertex limit, I believe). It was worth doing in either case. That's not to say there aren't better ways to do this. But it's still worth doing, and I've yet to find any high-end PC for which it was slower. I tested on a lot of different machines. |
| ||
Right, well, I'm working on some optimisations that will be instantly visible, and then I'll try out the different methods once it's all integrated into the game :) Thanks! |
| ||
I've only found two things that actually slow down anything I write.

1. Lots of maths (such as Picks, TerrainY#(), TForms) in long loops (hundreds or thousands per frame).
2. Rendering loads of little fiddly objects or objects with funny blending modes (alpha < 1, add/multiply blending).

The second one is generally by *far* the worse problem.