What do you do to increase the framerate?

Derron(Posted 2007) [#1]
Over the last few weeks I have been trying to improve the framerate of my game on slower/older machines.

At first I tried to reduce the number of graphics drawn each frame: when a big sprite always had some smaller sprites drawn on it at the same relative position, I locked the big sprite and drew the smaller sprites into it once, to save some DrawImage commands.

On older PCs (with older gfx cards, mainly onboard ones like the SiS or VIA chips) the increase was not that big. So I thought about dirty rects, but I don't believe BlitzMax can work this way (e.g. multiple flips per frame with viewports set to the dirty rects?). Can anyone out there convince me that dirty rects are possible?

Another attempt was a type "TDirtyImage" I created yesterday... every big sprite (interface backgrounds, or a skyscraper which is the "level" of my game) is split into many small frames. Every DirtyImage is updated each frame and checked for being dirty. E.g. it is checked whether the mouse image overlaps the dirty rectangle; if so, the image is marked dirty and drawn with the DirtyRect.Draw(x,y) method.

Although this works and gives a small increase, the performance gain was no more than 20%. And since I am struggling with less than 70 fps on the main screen (on my older PC), it's too dangerous to count on staying constantly above 60 fps.
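
A minimal sketch of the overlap check described above, assuming axis-aligned rectangles. TDirtyRect, its fields and the Overlaps method are hypothetical names made up for illustration; this is not the original TDirtyImage code.

```blitzmax
SuperStrict

' Hypothetical helper type, for illustration only.
Type TDirtyRect
	Field x:Int, y:Int, w:Int, h:Int
	Field dirty:Int = True   ' start dirty so the tile is drawn at least once

	Function Create:TDirtyRect(x:Int, y:Int, w:Int, h:Int)
		Local r:TDirtyRect = New TDirtyRect
		r.x = x; r.y = y; r.w = w; r.h = h
		Return r
	End Function

	' True if the rectangle (ox, oy, ow, oh) overlaps this tile.
	Method Overlaps:Int(ox:Int, oy:Int, ow:Int, oh:Int)
		Return ox < x + w And ox + ow > x And oy < y + h And oy + oh > y
	End Method
End Type

' A 256x256 tile at the origin and a 32x32 cursor touching its corner.
Local tile:TDirtyRect = TDirtyRect.Create(0, 0, 256, 256)
tile.dirty = tile.Overlaps(250, 250, 32, 32)
Print "dirty: " + tile.dirty
```

Only tiles whose flag is set would then be redrawn in that frame.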


So, to wrap this post up: what do you do to increase the (possible) framerate of your application?

I know that newer PCs and newer graphics cards will not have problems, but for a 2D game running on a >2GHz CPU with a graphics card having 64MB (shared) VRAM I'm not really convinced by Max2D (I know it uses 3D to get alpha blending and so on... but it still scares me).

Are there any tips, especially for huge sprites (800x200, 500x1000)? Using SetViewport, loading them as an animated image (and drawing the frames separately), or splitting the images and loading them separately has almost no effect. The only really measurable increase came from removing "cls" and drawing only the things which are marked dirty.


bye and thanks for your help,

MB


GfK(Posted 2007) [#2]
My solution is easy. Code for 30FPS.

My game logic and drawing update at 30FPS, which gives me a 33ms window to do everything. If it takes longer than this, then all my calculations are adjusted to compensate so that everything still moves at the same speed; everything from sprite movement/alpha to audio triggers/fades.
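
A minimal sketch of this kind of compensation, assuming simple delta timing: movement is scaled by how long the last frame took relative to a 33ms reference frame. The names TARGET_MS, speed and delta and the value 4.0 are made up for illustration; this is not GfK's actual code.

```blitzmax
SuperStrict

Graphics 800, 600, 0

Const TARGET_MS:Float = 1000.0 / 30.0   ' one reference frame at 30 FPS, about 33ms
Local speed:Float = 4.0                   ' pixels per reference frame (made-up value)
Local x:Float = 0
Local lastTime:Int = MilliSecs()

Repeat
	Local now:Int = MilliSecs()
	' 1.0 means the last frame took exactly one reference frame, 2.0 twice as long
	Local delta:Float = (now - lastTime) / TARGET_MS
	lastTime = now

	x :+ speed * delta        ' same on-screen speed regardless of frame time
	If x > 800 Then x = 0

	Cls
	DrawRect x, 300, 32, 32
	Flip 1
Until KeyHit(KEY_ESCAPE) Or AppTerminate()
```

The same factor can be applied to alpha fades or timers so that everything stays in step when a frame takes longer than its window.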


Derron(Posted 2007) [#3]
Yeah, sure it's possible to decrease the framerate.

But at 30 fps somewhat faster sprite movement just looks jerky (in my game you can use an elevator to go to different floors of the house... the house is then scrolled up or down at a dY speed which does not look smooth at 30 fps).

So this is not the solution I would use - and with some more complex gfx the framerate even dropped below 30 fps (some users reported 5-10 fps while others reported >200 fps... the differences in the hardware my/our users have are too big for such a simple solution).

Me: "hey, you will have to buy a new graphics card, your 5-year-old thingy doesn't have the performance needed to play an 800x600 2D game"
Someone: "Yeah, but your SDL sample some years ago achieved 500fps in the same scenario, while now even 70fps are barely possible."

bye
MB


tonyg(Posted 2007) [#4]
First thing to do is identify exactly what is taking up the time in your main loop. Gather simple stats and check which functions/methods take up a lot of CPU time, or are called often and have high average times. Concentrate on reducing these times or post them here.
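
A minimal sketch of gathering such stats, assuming it is enough to accumulate MilliSecs() differences per section of the main loop and print the totals once a second. The variable names are made up for illustration.

```blitzmax
SuperStrict

Graphics 800, 600, 0

Local updateMs:Int, drawMs:Int, flipMs:Int, frames:Int
Local reportAt:Int = MilliSecs() + 1000

Repeat
	Local t:Int = MilliSecs()
	' ... game logic would go here ...
	updateMs :+ MilliSecs() - t

	t = MilliSecs()
	Cls
	DrawText "profiling...", 10, 10
	drawMs :+ MilliSecs() - t

	t = MilliSecs()
	Flip 0
	flipMs :+ MilliSecs() - t
	frames :+ 1

	' once a second: print the accumulated totals and reset them
	If MilliSecs() >= reportAt
		Print "frames=" + frames + " update=" + updateMs + "ms draw=" + drawMs + "ms flip=" + flipMs + "ms"
		updateMs = 0; drawMs = 0; flipMs = 0; frames = 0
		reportAt = MilliSecs() + 1000
	EndIf
Until KeyHit(KEY_ESCAPE) Or AppTerminate()
```

MilliSecs() only has millisecond resolution, so summing over many frames gives more useful numbers than timing a single call.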


Derron(Posted 2007) [#5]
Flip is the function that takes so long... I reduced my tests to just drawing some sprites (like the ones mentioned above: the bigger ones, clipped only by the SetViewport command, or the tests with "animimage").

The physics/logic functions - changing coordinates, calculating events and so on - do not limit the game (they could be called several times more often than they are now). It is the flip time that limits the older graphics cards to such low fps.


bye
MB


tonyg(Posted 2007) [#6]
What happens with flip 0?


Derron(Posted 2007) [#7]
I'm using flip 0 ... otherwise I would not be able to measure the maximum framerate.

At the moment I'm writing some sample code (creating images on the fly - so I could post the sample without an additional download of mediafiles).


bye
MB


tonyg(Posted 2007) [#8]
OK, that would help. I'm struggling to see how 'flip' will be taking the time.


Derron(Posted 2007) [#9]
I know this code is not the shortest one, but it does not need external mediafiles.

At the moment, the graphics are drawn ten times per flip... if you increase that to 100, you will see the time the flip needs go up as well...
I think "DrawImage" does not really mean "draw the image into the backbuffer's memory right away" - if it did, the flip time would be nearly constant.

Some results:
Athlon 1800xp+, 1024mb ram, Radeon 9600Pro, 32Bit, 1280x1024, 60Hz TFT:
    AvgFlipTime: 8ms
    FPS: 96

Pentium IV 2.4GHz, 1024mb ram, SiS 651 (64mb shared memory), 32Bit, 1024x768, 75Hz CRT:
    AvgFlipTime: 32ms
    FPS: 27

(slight increase possible, when in fullscreen and 16bit on the sis-chip)

As I am drawing many small sprites (figures, doors, plants, clouds, parallax-scrolling background buildings,...), the number of loop iterations in the code could be halved to match what my game actually draws.


SuperStrict

Framework brl.GLMax2D
Import brl.Max2D
Import brl.d3d7max2d
Import brl.glmax2d
Import BRL.StandardIO
Import brl.Random

SetGraphicsDriver GLMax2DDriver()
Global g:TGraphics = Graphics(800, 600, 0, 60, 0)

SetBlend ALPHABLEND     

Global ExitGame:Int = 0
Global GrafikFPS:Int = 0
Global tempGrafikFPS:Int = 0
Global LastSecondTime:Int = 0
Global FlipTime:Int =0
Global FlipTimeSum:Int =0
Global FlipCount:Int =0
Global AvgFlipTime:Int =0

'creating images
'===============================================

Global gfx_interface_left:TImage   = CreateImage(20,373) 
SetColor 255,0,255; DrawRect(0,0,20,373);  GrabImage(gfx_Interface_left,0,0);Cls

Global gfx_interface_top:TImage    = CreateImage(800,10) 
SetColor 205,0,105; DrawRect(0,0,800,10);  GrabImage(gfx_Interface_top,0,0);Cls

Global gfx_interface_right:TImage  = CreateImage(20,373) 
SetColor 255,0,255; DrawRect(0,0,20,373);  GrabImage(gfx_Interface_right,0,0);Cls

Global gfx_interface_bottom:TImage = CreateImage(800,217)
SetColor 155,0,205; DrawRect(0,0,800,217); GrabImage(gfx_Interface_bottom,0,0);Cls

Global gfx_mousecursor:TImage      = CreateImage(32,32)  
SetColor 0,100,0;   DrawRect(0,0,32,32);   GrabImage(gfx_mousecursor,0,0);Cls

'normally an image with alpha component  
Global gfx_interface_antenna:TImage= CreateImage(100,60)
Local Pix:TPixmap = LockImage(gfx_interface_antenna)
For Local i:Int = 0 To 100 -1
  For Local j:Int = 0 To 60 -1
    WritePixel(Pix, i, j, (Int(Rand(255)*$1000000)+Int(255*$10000)+Int(Rand(255)*$100)+Int(0))) 
  Next
Next
UnlockImage(gfx_interface_antenna)

Global gfx_building_house:TImage      = CreateImage(512,1080)  
Pix = LockImage(gfx_building_house)
For Local i:Int = 0 To 512 -1
  For Local j:Int = 0 To 1080 -1
    WritePixel(Pix, i, j, (Int(255*$1000000)+Int(0*$10000)+Int(Rand(255)*$100)+Int(0))) 
  Next
Next
UnlockImage(gfx_building_house)
'===============================================



Function DrawMain()    
  Cls   
  SetColor 255,255,255

  'try increasing the loop-amount here... it affects the FlipTime-value
  For Local i:Int = 0 To 10
    SetViewport(10,10,780,373)   
    DrawImage(gfx_building_house, 127, -420)
    SetViewport(0,0,800,600)
    DrawImage(gfx_interface_top,0,0)
    DrawImage(gfx_interface_left,0,10)
    DrawImage(gfx_interface_right,780,10)
    DrawImage(gfx_interface_bottom, 0, 383)
    DrawImage(gfx_interface_antenna, 98,328)
    DrawImage(gfx_mousecursor, MouseX(),MouseY())
  Next
  SetColor 0,0,0;  DrawRect(10,15,170,60);  SetColor 255,255,255
  DrawText ("FlipTime: "+FlipTime + "ms", 15,20)             
  DrawText ("AvgFlipTime: "+AvgFlipTime + "ms", 15,35)             
  DrawText ("FPS: "+GrafikFPS, 15,50)             
  FlipTime = MilliSecs()
  Flip 0
  FlipTime = MilliSecs()-FlipTime
  FlipTimeSum:+ FlipTime
  FlipCount:+1
  Delay(1)
EndFunction

Repeat
  Local NowSecondTime:Int = MilliSecs()
  If KeyHit(KEY_ESCAPE) ExitGame = 1;Exit
  DrawMain()
  tempGrafikFPS:+1

  'calculating FPS and average FlipTime
  If NowSecondTime-LastSecondTime > 1000 Then 
    GrafikFPS = Ceil((tempGrafikFPS*1000) / (NowSecondTime-LastSecondTime-tempGrafikFPS))
    LastSecondTime = NowSecondTime
    If FlipCount > 0 Then AvgFlipTime = Ceil(FlipTimeSum*100 / FlipCount)/100
    FlipTimeSum = 0
    FlipCount = 0
    tempGrafikFPS = 0
  EndIf

Until AppTerminate() Or ExitGame = 1



bye
MB


H&K(Posted 2007) [#10]
16bit (I would say lower, but it's BMax)

Smaller screen

Although I agree with you: an approx. 2GHz CPU with a graphics card having 64MB (shared) VRAM should run things like the dog's bollocks, UNLESS the graphics card is a 2D-only card... in which case you would have said so.


Derron(Posted 2007) [#11]
I know that the SiS is no real 3D card, and I know that Max2D maps its 2D functions onto 3D. But if 3D were that slow, why do other applications (mainly DX7 and DX8) run so fast and smooth on the SiS PC while this BMax application cries: slowdown?!

And before someone says it: the big images could be a reason for slight slowdowns - but in that case Max2D should slice them automatically, so that the amount of VRAM used is kept as low as possible.


bye
MB


tonyg(Posted 2007) [#12]
My machine shows a big difference as well using the standard max.png image. Not so bad with DX but still noticeable and debug mode doesn't affect it. I *thought* the flip would be pretty constant as (I thought) it's simply switching pointers as the drawing is already done.
Might have another look at the sourcecode.


Derron(Posted 2007) [#13]
About my suggestion for improving the procedure:

Split images into smaller parts and only draw things which are marked for redrawing. Then leaving out the "cls" will also leave the screen as it was after the last flip - this would speed up the process enormously (when no movement or drawing is needed and therefore the screen does not have to be updated).

I tried that - and was also surprised that Max restored the screen after minimizing/restoring the application. I had expected some kind of garbage to show up at the app's position in VRAM.

So you (I hope) see the problem I have... and now on to your much appreciated solutions ;D...


bye
MB


tonyg(Posted 2007) [#14]
My results seem to be affected by my internet connection.
What was taking 20ms for flip takes 8-9 without an internet connection. The flip time varies between 0 and 18ms when drawing the same info (a single image 1000 times).
<edit> In fact, I'm now getting 4ms so reckon I had something else running when I did my first test.
I'll have another look at your code MB.


H&K(Posted 2007) [#15]
Split images into smaller parts and only draw things which are marked for redrawing. Then leaving out the "cls" will also leave the screen as it was after the last flip - this would speed up the process enormously
OK, don't bite my head off for this.

But although this would seem to work, you have no guarantee that the buffers will keep the flipped screen in the same place all the time. That is, although it seems that the backbuffer/front buffer are always the same if you just flip, it is not always so.

Result? Well, at some point the backbuffer will move, and the only things drawn on it will be the "dirty images".


ImaginaryHuman(Posted 2007) [#16]
You cannot depend on using dirty rects at all. As soon as you flip the backbuffer to the front, the contents of what used to be the front can now be potentially totally trashed. The graphics card might use an entirely different portion of memory for the next frame which could contain any kind of junk data. That's why you have to either redraw the whole screen or do a cls. You might be getting what seems to be maintained backgrounds when you flip, but this is in no way reliable and on many systems it's not going to work. Some systems have multiple buffers beyond double-buffering which would totally mess up your system, and especially in OpenGL you have zero guarantee that the buffer content is preserved. There is no point at all doing dirty rects because you have to manually preserve the backbuffer, e.g. copy it into an image, or redraw the whole frame anyway.

Also when you issue commands to the graphics card ie via the OpenGL or DirectX driver, for the most part using Max2D you are in what's called `immediate` mode whereby what you draw is sent to the driver right away, at the time you ask for it, but it is not necessarily DRAWN at that time. The card/driver can buffer a whole bunch of commands and in some cases may not draw *anything* until you do a flip(). So if you are seeing your flip time be different depending on how much is drawn, you can be fairly sure that *your system* is caching the drawing commands and doing them all at once when you call flip(). To remedy this you might try to get all your graphics commands done as early as possible in your game loop, and then try glFlush() for OpenGL, which tells it to start drawing if it hasn't already. That might then allow the driver to do the drawing in parallel to other stuff you're doing with the cpu, which might make it faster. It sounds as though on your system it is waiting for the flip request before it starts to draw, meaning that for the remainder of your game loop where the cpu is doing stuff the gpu is doing nothing.
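
A minimal sketch of that ordering, assuming the OpenGL driver is in use: all drawing is issued first, then glFlush() asks the driver to start working, and CPU-side work follows before the flip. Whether this helps at all depends on the driver, so treat it as something to measure rather than a guaranteed gain.

```blitzmax
SuperStrict

' No Framework is used here, so pub.opengl is assumed to be pulled in
' automatically; with a Framework, add: Import pub.opengl
SetGraphicsDriver GLMax2DDriver()
Graphics 800, 600, 0

Repeat
	' 1. Issue all drawing commands as early as possible.
	Cls
	For Local i:Int = 0 Until 100
		DrawRect i * 8, i * 4, 64, 64
	Next

	' 2. Ask the driver to start executing the queued commands now.
	glFlush()

	' 3. CPU-side work (game logic, AI, ...) would go here and may
	'    overlap with the GPU's drawing.

	Flip 0
Until KeyHit(KEY_ESCAPE) Or AppTerminate()
```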

I would also recommend making sure that your textures are not bigger than the maximum supported texture size on your computer, which in OpenGL you can query with glGetIntegerv(GL_MAX_TEXTURE_SIZE,Varptr(MySize)). It will be at least 64 but it is limited on some cards. If the card's max texture size is smaller than the size you're asking for, it is going to create extra slowdown. You'd be better off breaking your images down into, say, chunks of 256x256. Bear in mind that this also adds overhead, but it might be less than what you save.
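
A minimal sketch of querying that limit with the standard OpenGL call glGetIntegerv and the constant GL_MAX_TEXTURE_SIZE. It assumes the OpenGL driver is selected and that a graphics context already exists when the query is made.

```blitzmax
SuperStrict

' No Framework is used here, so pub.opengl is assumed to be pulled in
' automatically; with a Framework, add: Import pub.opengl
SetGraphicsDriver GLMax2DDriver()
Graphics 800, 600, 0          ' a GL context must exist before querying

Local maxSize:Int
glGetIntegerv(GL_MAX_TEXTURE_SIZE, Varptr maxSize)
Print "GL_MAX_TEXTURE_SIZE = " + maxSize
```

The printed value is the largest width or height, in texels, that the driver accepts for a single texture.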

Also bear in mind that the graphics drawing speed is *highly* determined by the speed of your graphics card. 800x600 is quite a high resolution, compared to when games used to be 320x240. You might be just running into the limitations of your graphics card and how fast it can draw and how fast it can flip.

If you don't need to use alphablending for all objects *switch it off*. It will switch off the blending hardware which takes more time.

If you don't need to rotate or scale your objects smoothly, DO NOT load them as FILTEREDIMAGE. You don't need the extra overhead of interpolation.

If you don't need to use floating point coordinates for image placement, use ints; it may be faster to draw.
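
A minimal sketch combining the three tips above. CreateImage stands in for LoadImage so that no media files are needed; the image sizes, colours and coordinates are made up for illustration.

```blitzmax
SuperStrict

Graphics 800, 600, 0

' Flags = 0: neither MASKEDIMAGE nor FILTEREDIMAGE is requested.
Local background:TImage = CreateImage(256, 256, 1, 0)
LockImage(background).ClearPixels($FF204060)
UnlockImage background

' The sprite keeps MASKEDIMAGE but is also created without FILTEREDIMAGE.
Local sprite:TImage = CreateImage(32, 32, 1, MASKEDIMAGE)
LockImage(sprite).ClearPixels($FFFFCC00)
UnlockImage sprite

Local fx:Float = 100.5, fy:Float = 200.25

Repeat
	Cls
	SetBlend SOLIDBLEND                  ' opaque background: no blending needed
	DrawImage background, 0, 0
	SetBlend ALPHABLEND                  ' only where transparency is actually required
	DrawImage sprite, Int(fx), Int(fy)   ' snap to whole pixels
	Flip 1
Until KeyHit(KEY_ESCAPE) Or AppTerminate()
```

How much each of these saves depends on the card and driver, so it is worth timing them individually.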

You might implement, instead of dirty rects, a `draw only the parts of the images that show`, so that you don't `overdraw` on top of stuff you've drawn already. You'd need to use a drawimagerect type of thing.

16-bit color can be faster for the backbuffer but the textures might still be stored as RGBA8888 (32-bit). I find on my system that 16 is no faster than, and sometimes slower than, 32-bit. The graphics card is optimized for best performance in a particular mode that they think is most popular.

Writing your own OpenGL/DX code will likely let you use features or switch on/off features that you can't with Max2D that may give you more speed. Also if you know you're drawing several objects that are similar or related, you can gain speed with less overhead by not having a glBegin() glEnd() wrapped around every single quad which is the case with Max2D.


Derron(Posted 2007) [#17]
@H&K
I know this already... that's why I'm still searching for other methods to optimize the whole thing.

And there is still the question of why the flip duration gets longer the more DrawImage commands are issued.

If every DrawImage wrote into a special "pixmap" (the size of the screen, e.g. 800x600) which is then pushed to the graphics card's buffer when Flip runs, the duration would not change, because the amount of memory transferred would be constant.


bye
MB


H&K(Posted 2007) [#18]
And there is still the question why the flip-duration is the longer the more drawimage-commands are passed?
Oh sorry, I thought this had been a rhetorical question. I won't answer it, because Angel's 'teen post answers it better than I could.


Derron(Posted 2007) [#19]
Another thing for AngelDaniel: do you know of graphics cards that behave differently? All my test PCs here (mainly ATI cards) are struggling with growing flip durations.


As I believe I mentioned... splitting images into smaller parts made almost no difference (split as anim images).

Edit: I have now also measured the time the DrawImage commands need... Using GLMax2D they (the loop) need 3ms - using DX they need 50ms and the flip time is halved.
Using smallsize-sprites (all 256x256) made no measurable difference.

bye
MB


Derron(Posted 2007) [#20]
Edit: SetBlend - setting it to solid for images that definitely have no alpha (mainly rectangles) brings a slight improvement... but it is not always usable, since my game's data and images are meant to be moddable... still worth a try to optimize things a bit.


The code below shows almost no difference on my PCs, whether the big image is drawn or the smaller parts (256x256 each).

SuperStrict

Framework brl.GLMax2D
Import brl.Max2D
Import brl.d3d7max2d
Import brl.glmax2d
Import BRL.StandardIO
Import brl.Random

'SetGraphicsDriver D3D7Max2DDriver(),0
SetGraphicsDriver GLMax2DDriver()
Global g:TGraphics = Graphics(800, 600, 0, 60, 0)

SetBlend ALPHABLEND     

Global ExitGame:Int = 0
Global GrafikFPS:Int = 0
Global tempGrafikFPS:Int = 0
Global LastSecondTime:Int = 0
Global FlipTime:Int =0
Global FlipTimeSum:Int =0
Global FlipCount:Int =0
Global AvgFlipTime:Int =0
Global DrawTime:Int =0
Global DrawTimeSum:Int =0
Global DrawCount:Int =0
Global AvgDrawTime:Int =0

'creating images
'===============================================
Global gfx_building_house:TImage      = CreateImage(512,1080)  
Local Pix:TPixmap = LockImage(gfx_building_house)
For Local i:Int = 0 To 512 -1
  For Local j:Int = 0 To 1080 -1
    WritePixel(Pix, i, j, (Int(255*$1000000)+Int(0*$10000)+Int(Rand(255)*$100)+Int(0))) 
  Next
Next
UnlockImage(gfx_building_house)

Global gfx_building_houseA:TImage[8]
For Local i:Int = 0 To 7 
  gfx_building_houseA[i] = CreateImage(256,256)  
  Pix = LockImage(gfx_building_houseA[i])
  'x/y instead of a second "i" so the inner loop does not shadow the outer index
  For Local x:Int = 0 To 255
    For Local y:Int = 0 To 255
      WritePixel(Pix, x, y, (Int(255*$1000000)+Int(0*$10000)+Int(Rand(255)*$100)+Int(0))) 
    Next
  Next
  UnlockImage(gfx_building_houseA[i])
Next


'===============================================


Global UseBigImage:Byte =0
Function DrawMain()    
  Cls   
  SetColor 255,255,255

 'try increasing the loop-amount here... it affects the FlipTime-value
  DrawTime = MilliSecs()
 For Local i:Int = 0 To 100
    SetViewport(10,10,780,373)   
    If UseBigImage
      DrawImage(gfx_building_house, 127, -420)
    Else
      DrawImage(gfx_building_houseA[0], 127    , -420)
      DrawImage(gfx_building_houseA[1], 127+256, -420)
      DrawImage(gfx_building_houseA[2], 127    , -420+1*256)
      DrawImage(gfx_building_houseA[3], 127+256, -420+1*256)
      DrawImage(gfx_building_houseA[4], 127    , -420+2*256)
      DrawImage(gfx_building_houseA[5], 127+256, -420+2*256)
      DrawImage(gfx_building_houseA[6], 127    , -420+3*256)
      DrawImage(gfx_building_houseA[7], 127+256, -420+3*256)
    EndIf
    SetViewport(0,0,800,600)

 Next
  DrawTime = MilliSecs()-DrawTime
  DrawTimeSum:+ DrawTime
  DrawCount:+1
  SetColor 0,0,0;  DrawRect(10,15,170,120);  SetColor 255,255,255
  DrawText ("FlipTime: "+FlipTime + "ms", 15,20)             
  DrawText ("AvgFlipTime: "+AvgFlipTime + "ms", 15,35)             
  DrawText ("DrawTime: "+DrawTime + "ms", 15,50)             
  DrawText ("AvgDrawTime: "+AvgDrawTime + "ms", 15,65)             
  DrawText ("FPS: "+GrafikFPS, 15,80)     
  DrawText ("<b> to switch", 15,110)        
  If UseBigImage Then DrawText ("using: big Image", 15,95) Else DrawText ("using: 256x256", 15,95)             
  FlipTime = MilliSecs()
  Flip 0
  
  FlipTime = MilliSecs()-FlipTime
  FlipTimeSum:+ FlipTime
  FlipCount:+1
  Delay(1)
EndFunction


Repeat
  Local NowSecondTime:Int = MilliSecs()
  If KeyHit(KEY_ESCAPE) ExitGame = 1;Exit
  If KeyHit(KEY_B) UseBigImage = 1-UseBigImage
  DrawMain()
  tempGrafikFPS:+1

  'calculating FPS and average FlipTime
  If NowSecondTime-LastSecondTime > 1000 Then 
    GrafikFPS = Ceil((tempGrafikFPS*1000) / (NowSecondTime-LastSecondTime-tempGrafikFPS))
    LastSecondTime = NowSecondTime
    If FlipCount > 0 Then AvgFlipTime = Ceil(FlipTimeSum*100 / FlipCount)/100
    FlipTimeSum = 0
    FlipCount = 0
    If DrawCount > 0 Then AvgDrawTime = Ceil(DrawTimeSum*100 / DrawCount)/100
    DrawTimeSum = 0
    DrawCount = 0
    tempGrafikFPS = 0
  EndIf

Until AppTerminate() Or ExitGame = 1


bye
MB