Flash performance for game dev: native render VS BitmapData framebuffer

5

20

I develop a 2D shooter game with lots of objects and aggressive scrolling.

QUESTION: which way is better?

CHOICE 1 - use native Flash rendering:

  • derive game objects from Bitmap, use existing x, y, width, height, bitmapData
  • add all objects to the screen as children via UIComponent.addChild(...)
  • clip visible area using "scrollRect"

CHOICE 2 - write custom rendering using "bitmap + copyPixels"

  • use own game object with x, y, width, height, bitmapData
  • add a Bitmap to a screen, take bitmapData from it
  • redraw every ENTER_FRAME: bitmapData.lock(), iterate over game objects and copyPixels() into bitmapData, then bitmapData.unlock()
  • custom clipping: do not render out of screen objects
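The per-frame loop of CHOICE 2 can be sketched roughly as follows. This is an illustration in TypeScript rather than ActionScript so it can be run on its own; the `Sprite` and `FrameBuffer` types and the `blit` and `renderFrame` functions are hypothetical names, not the asker's code. In AS3 the row-copy loop would presumably be a single `BitmapData.copyPixels()` call, with the whole frame wrapped in `lock()` / `unlock()`.

```typescript
// Hypothetical stand-ins for the question's game object and screen bitmap.
interface Sprite {
  x: number; y: number;          // world position of the top-left corner
  width: number; height: number;
  pixels: Uint32Array;           // width * height packed ARGB values
}

interface FrameBuffer {
  width: number; height: number;
  pixels: Uint32Array;           // width * height packed ARGB values
}

// Copies the visible part of one sprite; returns false if it was culled.
function blit(fb: FrameBuffer, s: Sprite, camX: number, camY: number): boolean {
  const sx = Math.floor(s.x - camX);   // screen-space position
  const sy = Math.floor(s.y - camY);
  // Custom clipping: skip objects that lie entirely outside the screen.
  if (sx >= fb.width || sy >= fb.height || sx + s.width <= 0 || sy + s.height <= 0) {
    return false;
  }
  // Clamp the copied rectangle to the screen edges.
  const x0 = Math.max(sx, 0);
  const x1 = Math.min(sx + s.width, fb.width);
  const y0 = Math.max(sy, 0);
  const y1 = Math.min(sy + s.height, fb.height);
  for (let y = y0; y < y1; y++) {
    const srcStart = (y - sy) * s.width + (x0 - sx);
    const row = s.pixels.subarray(srcStart, srcStart + (x1 - x0));
    fb.pixels.set(row, y * fb.width + x0);   // overwrite, no alpha blending
  }
  return true;
}

// One frame: clear the buffer, then draw every object that survives culling.
// Returns how many objects were actually drawn.
function renderFrame(fb: FrameBuffer, sprites: Sprite[], camX: number, camY: number): number {
  fb.pixels.fill(0);
  let drawn = 0;
  for (const s of sprites) {
    if (blit(fb, s, camX, camY)) drawn++;
  }
  return drawn;
}
```

The culling test costs four comparisons per object, so objects far from the camera cost almost nothing, which is the main saving the last bullet describes.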

Here in this question some people complain that "bitmap + copyPixels()" is slow.

EXPERIMENT: I have implemented both techniques:

Please, try them and tell which one is better (faster, smoother, eats less CPU).

Wait until there are at least 250 enemies (see the counter above the screen).
UPDATE: Try opening Task Manager (or $top) and watching overall CPU usage.

UPDATE 2: I've changed the code, now creeps spawn much faster.

Lixivium answered 19/6, 2009 at 13:52 Comment(1)
If you use Mr. Doob's in-SWF profiler, you won't have to guess at memory usage or framerate: code.google.com/p/mrdoob/wiki/stats - Dink
4

Update: thanks for the high-stress version. Again, I couldn't really see a difference just running around. But I cleverly figured out that "r" drops turrets, and when I dropped 20-30 turrets, the native version was somewhat slower than the manual one, so maybe I was wrong. (I saw no difference in memory usage.) It still seems like doing things natively ought to have the potential to be faster, but it may well be that it would require specialized handling of some opaque sort.

Since this was accepted I'll add a note to make explicit what I said in a comment to a different answer: If all your assets are bitmaps themselves, then as HanClinto points out it's not surprising to find that compositing them manually can be faster than making native objects and letting Flash do the work, since it eliminates the overhead associated with display objects, like event structures.

However, there are probably situations where doing things natively might win out, such as if you have vector content that needs to be rendered into bitmaps, or lots of animated sprites, or if you need to detect mouse events on your actors (which you'd need to do manually, perhaps painfully, if you do your own compositing).

So if you don't need to do anything that would slow down manual compositing, it appears to definitely be the best answer, and if you do, then trying both approaches is the best way to be absolutely sure. (A hybrid model is also possible, where you make one layer of native objects that need mouse events, and overlay or underlay it with a layer of manually composited bitmaps.)

Inappropriate answered 19/6, 2009 at 14:3 Comment(0)
4

I found the custom (bitmap rendered) version to be much faster, and would have expected that.

Flash's DisplayList is designed to account for a huge number of variances in the DisplayObjects and as a result will not be the most efficient route to go (unless you end up accounting for all those variances yourself in AS3, in which case you'll lose to the native code).

For example, for tile rendering (where you're doing copyPixels for tiles), a custom bitmap renderer will be far faster than having hundreds of DisplayObjects on the DisplayList. You can also likely use a specialized clipper to toss out tiles, whereas Flash ends up doing very generic bounding-box calculations and tests.
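One way such a specialized clipper might look, assuming a uniform grid of square tiles: instead of testing every tile's bounding box, compute the range of visible tile indices directly from the camera position. This is a TypeScript sketch (not ActionScript, so it can be run standalone), and `visibleTiles` with all its parameters is a hypothetical name.

```typescript
interface TileRange {
  colStart: number; colEnd: number;   // colEnd is exclusive
  rowStart: number; rowEnd: number;   // rowEnd is exclusive
}

// Returns the half-open range of tile columns and rows that overlap the view.
// Assumes square tiles of tileSize pixels and a map of mapCols x mapRows tiles
// whose top-left corner sits at world position (0, 0).
function visibleTiles(
  camX: number, camY: number,
  viewW: number, viewH: number,
  tileSize: number, mapCols: number, mapRows: number
): TileRange {
  const colStart = Math.max(0, Math.floor(camX / tileSize));
  const rowStart = Math.max(0, Math.floor(camY / tileSize));
  // Clamp to the map size, and never let the end fall before the start.
  const colEnd = Math.max(colStart, Math.min(mapCols, Math.ceil((camX + viewW) / tileSize)));
  const rowEnd = Math.max(rowStart, Math.min(mapRows, Math.ceil((camY + viewH) / tileSize)));
  return { colStart, colEnd, rowStart, rowEnd };
}
```

A renderer would then loop only over `rowStart..rowEnd` and `colStart..colEnd`, so the cost per frame depends on the screen size rather than on the size of the map.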

In re: variances, for example, in your custom version the "building" sprite wobbles as the character moves around, probably due to a float-to-int conversion or round-up instead of round-down in your code.
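The answer only guesses at the cause, so the following is an illustration of how such a wobble can arise, not a diagnosis of the asker's code. It assumes, hypothetically, that the background is positioned with floor while the building is positioned with round. The function names are made up for this TypeScript sketch.

```typescript
// Mixed rounding: the ground uses floor, the building uses round.
// Returns the on-screen distance in pixels between building and ground.
function mixedOffset(camX: number, groundX: number, buildingX: number): number {
  const groundScreen = Math.floor(groundX - camX);
  const buildingScreen = Math.round(buildingX - camX);
  return buildingScreen - groundScreen;
}

// Consistent rounding: snap the camera to a whole pixel once per frame,
// then position everything with integer arithmetic.
function snappedOffset(camX: number, groundX: number, buildingX: number): number {
  const cam = Math.floor(camX);
  const groundScreen = Math.floor(groundX) - cam;
  const buildingScreen = Math.floor(buildingX) - cam;
  return buildingScreen - groundScreen;
}

// Ground at x = 0, building at x = 100, camera scrolling in sub-pixel steps.
const cameraPositions = [10, 10.25, 10.75];
const mixed = cameraPositions.map((c) => mixedOffset(c, 0, 100));
const snapped = cameraPositions.map((c) => snappedOffset(c, 0, 100));
console.log(mixed);    // distance changes between frames: visible wobble
console.log(snapped);  // distance stays constant
```

With mixed rounding the building sits 100 pixels from the ground on some frames and 101 on others, which reads as a one-pixel jitter while scrolling. Snapping the camera once keeps the distance fixed.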

Daune answered 21/7, 2009 at 18:36 Comment(0)
2

If you are drawing hundreds or thousands of objects on screen (such as with intense particle effects), then you will get better performance with copyPixels().

A lot of this just depends on what you're trying to do, right?

Dink answered 19/6, 2009 at 14:51 Comment(5)
Well, someone has to do "the math": either your code or flash player. Both versions of SWF use bitmaps inside. By the way which SWF was better? - Lixivium
HanClinto, why would you expect CopyPixels to perform better? - Inappropriate
@oshyshko: #2 performed much better for me, but it's hard to give you good metrics without an FPS counter. Have you tried using a performance-measuring widget like Mr. Doob's? code.google.com/p/mrdoob/wiki/stats @fenomas: Just because DisplayObject is a pretty heavyweight class. It has a lot of checks for fancy things that you are probably not using, and if you manage all of your rendering code yourself, you're free to cut out a lot of the cruft. In my experience, the Flash display stack really isn't all that efficient (especially for thousands of objects, such as a big tilemap). - Dink
I see what you mean han... if all your assets are bitmaps you would be bypassing a lot. The last time I did something similar I had many assets that were general displayobjects, and needed to be rendered before compositing, so as I recall the native approach came out ahead. - Inappropriate
@fenomas: Sorry for the late reply -- not sure why I didn't think of this before, but one thing that you might have been missing in your example is bitmap caching. DisplayObject can do caching of its objects, so that it doesn't have to recalculate filters every frame, but something about the way you were doing things might have been bypassing the caching, and causing a slowdown. I'm not positive, it's just a thought. - Dink
1

Flash can natively handle hundreds of sprites on a modern PC or Mac without losing performance, so I'd vote for going with display objects.

Psychoactive answered 19/6, 2009 at 14:24 Comment(0)
0

I have a low-end laptop: Intel Mobile 1.6 GHz / 512 MB, Firefox 3.5.x, Flash 10.0.32.18, WinXP.

I can clearly see a big difference.

Native version: in less than 10 seconds CPU usage goes up to 99% and movement is jerky. Custom version: stays below

By the way, is there any chance of getting the example code as an exercise?

Relay answered 7/10, 2009 at 11:30 Comment(0)
