mikecann.co.uk: 5,000,000 Chrome Crawlers? Why not [haXe & WebGL]

Thursday, 11 August 2011

Following on from my previous experiments with haXe and HTML5, I have been playing around again, trying to get as many 2D sprites on screen as I can.

I started by reading some posts by Google on how to render things fast in HTML5, and it got me thinking. Where I was likely going wrong with my haXe + Three.js experiments was that I was making a separate draw call to WebGL for each and every crawler. Draw calls are expensive, and hence I was hitting the draw call bottleneck with just 2,000 sprites being rendered at once.

What I needed to do was batch the draw calls together and render them all at once. I knew from my work with XNA that you could group sprite render calls together quite nicely, so I started coding up a WebGL equivalent of XNA's SpriteBatch. I managed to get it kind of working but, as is often the way, another thought struck me and I decided to tackle that instead.
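As a rough illustration of the batching idea (a minimal JavaScript sketch, not the SpriteBatch port mentioned above), many sprite quads can be packed into one typed array so that they could be uploaded and drawn with a single call. The function name `buildSpriteBatch` and the sprite fields `x`, `y`, `w` and `h` are hypothetical.

```javascript
// Hypothetical helper: pack every sprite into one vertex array so the whole
// batch can be uploaded once and drawn with a single draw call.
// Each sprite becomes two triangles (6 vertices, 2 floats per vertex).
function buildSpriteBatch(sprites) {
  const FLOATS_PER_SPRITE = 12;
  const vertices = new Float32Array(sprites.length * FLOATS_PER_SPRITE);
  sprites.forEach((s, i) => {
    const x0 = s.x, y0 = s.y, x1 = s.x + s.w, y1 = s.y + s.h;
    vertices.set([
      x0, y0,  x1, y0,  x0, y1,   // first triangle
      x0, y1,  x1, y0,  x1, y1,   // second triangle
    ], i * FLOATS_PER_SPRITE);
  });
  return vertices;
}

const batch = buildSpriteBatch([
  { x: 0, y: 0, w: 16, h: 16 },
  { x: 32, y: 8, w: 16, h: 16 },
]);
console.log(batch.length); // 24 floats for 2 sprites

// With a real WebGL context the batch would then be drawn in one go, e.g.
//   gl.bufferData(gl.ARRAY_BUFFER, batch, gl.DYNAMIC_DRAW);
//   gl.drawArrays(gl.TRIANGLES, 0, batch.length / 2);
```

The cost of building the array still scales with the number of sprites, but the number of calls into WebGL no longer does.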

My thought was: why not just render everything as 2D point sprites? I remembered from my XNA days that you could achieve quite staggering numbers of 2D sprites in DirectX by using point sprites.

So after a little bit of fiddling and hacking I managed to get point sprites rendering correctly.

You can play with it here: http://mikecann.co.uk/projects/HTML5SpeedTests/HaXeWebGL/bin/

The great thing about point sprites is that I only use one draw call per frame and the GPU is very good at rendering them. The only real bottleneck is the number of pixels you need to draw. With that in mind, if you drop the size of the point sprite down to 1x1 you can render a very large number of points (5 million) at interactive frame rates (18fps).
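To make the "one draw call, pixel-bound" point concrete, here is a small hedged sketch. `drawPoints` assumes a WebGL context whose program and attribute buffers are already bound, and `estimateFragments` is only a rough upper bound; both names and the `locations` object are made up for illustration.

```javascript
// Sketch: issue the single per-frame draw call for all points.
// `gl` is assumed to be a WebGLRenderingContext with the program and buffers
// already bound; `locations` holds uniform locations looked up earlier.
function drawPoints(gl, locations, time, pointSize, pointCount) {
  gl.uniform1f(locations.uTime, time);
  gl.uniform1f(locations.uPointSize, pointSize);
  gl.drawArrays(gl.POINTS, 0, pointCount); // one call, however many points
}

// Rough upper bound on fragments shaded per frame: every point covers
// pointSize x pointSize pixels (ignores clipping at the screen edges).
function estimateFragments(pointCount, pointSize) {
  return pointCount * pointSize * pointSize;
}

console.log(estimateFragments(5000000, 1)); // 5000000
console.log(estimateFragments(2000, 64));   // 8192000
```

By this estimate, 2,000 sprites at 64x64 touch more pixels than 5 million points at 1x1, which is consistent with pixel count rather than point count being the limiting factor.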

I added a "dont use texture" option just out of interest, to see how expensive the texture lookup in the fragment shader was. It didn't seem to have much of an effect.

There are a few caveats to using point sprites:

Firstly, WebGL has a limit on how large you can make them. Currently it differs between graphics cards and browsers; the largest size that seems safe everywhere is 64x64, so you can't make them any bigger and still expect it to work in all situations.
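The supported range can be queried at runtime rather than guessed. A minimal sketch, assuming `gl` is a WebGL context (the helper name `clampPointSize` is hypothetical, and the stand-in object below exists only so the snippet runs outside a browser):

```javascript
// Sketch: clamp a requested point size to what the current context supports.
// gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE) returns a two-element
// [min, max] array in WebGL.
function clampPointSize(gl, requested) {
  const range = gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE);
  return Math.min(Math.max(requested, range[0]), range[1]);
}

// Stand-in for a real context; the 64 is just the figure quoted above,
// not a measured value.
const fakeGl = {
  ALIASED_POINT_SIZE_RANGE: 0x846D,
  getParameter: () => new Float32Array([1, 64]),
};
console.log(clampPointSize(fakeGl, 128)); // 64
console.log(clampPointSize(fakeGl, 16));  // 16
```

In a browser the same function would be called with the context returned by the canvas.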

Secondly, and this one is more important: I cheated to get the numbers above. In my other experiments with haXe and WebGL I was using the CPU to update the positions of the crawlers each frame, having them bounce off the screen edges. In this point sprites demo, however, I have the points flowing out of a fountain, the simulation of which is calculated entirely on the GPU. I talked about the reason for this in a paper I wrote for a university project 4 years ago.

If I didn't perform the updates on the GPU and instead used the CPU to update the crawlers, the JavaScript (CPU) would need to update 5 million crawlers each frame and then re-upload the point sprite positions to the GPU for rendering. This would obviously be a bad idea.
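A back-of-the-envelope estimate shows the scale of the problem. The sketch below assumes two 32-bit floats (x and y) per crawler, which is an assumption about the buffer layout rather than something taken from the demo; `uploadBytesPerFrame` is a made-up helper name.

```javascript
// Bytes that would have to be re-uploaded every frame if positions lived on
// the CPU. Assumes two 32-bit floats (x, y) per crawler; the real layout
// may differ.
function uploadBytesPerFrame(crawlerCount, floatsPerCrawler = 2) {
  return crawlerCount * floatsPerCrawler * Float32Array.BYTES_PER_ELEMENT;
}

const perFrame = uploadBytesPerFrame(5000000); // 40,000,000 bytes
const perSecond = perFrame * 18;               // 720,000,000 bytes at 18fps
console.log(perFrame / 1e6, "MB per frame");
console.log(perSecond / 1e6, "MB per second at 18fps");
```

That is the transfer alone, before counting the time JavaScript would spend updating 5 million positions each frame.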

So I kind of cheated. The fountain simulation on the GPU isn't the same as my other examples: the crawlers don't bounce off the sides of the screen. To make that happen, each crawler needs to somehow preserve its state between frames.

Currently each crawler's position is calculated in the vertex shader each frame like so:

[codesyntax lang="glsl"]

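// Per-crawler attributes.
attribute float maxAge;      // age at which the crawler wraps back to the start
attribute float startingAge; // offset added to the time for this crawler
attribute float velX;        // horizontal velocity factor
attribute float velY;        // vertical velocity factor

uniform float uTime;      // current time, shared by all crawlers
uniform float uPointSize; // point sprite size in pixels

// Passed on to the fragment shader.
varying float invAgeRatioSq;

void main(void)
{
	// Age wraps back to zero once it reaches maxAge.
	float age = mod(uTime+startingAge, maxAge);
	float ageRatio = age / maxAge;
	invAgeRatioSq = 1.0 - (ageRatio * ageRatio);

	// Position depends only on the attributes and uTime; no state is kept between frames.
	gl_Position = vec4((-velX*ageRatio*0.8), (velY*ageRatio)+(-0.4*age*ageRatio)-0.5, 0., 1.);

	gl_PointSize = uPointSize;
}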

[/codesyntax]
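The same arithmetic can be mirrored on the CPU to show why no state is needed: the position is a pure function of the attributes and the time. This is a JavaScript sketch for illustration only; `crawlerPosition`, `glslMod` and the example attribute values are made up.

```javascript
// GLSL-style mod: x - y * floor(x / y) (differs from JS % for negative x).
function glslMod(x, y) {
  return x - y * Math.floor(x / y);
}

// Mirrors the vertex shader: the position depends only on the per-crawler
// attributes and the current time, never on the previous frame.
function crawlerPosition(uTime, c) {
  const age = glslMod(uTime + c.startingAge, c.maxAge);
  const ageRatio = age / c.maxAge;
  return {
    x: -c.velX * ageRatio * 0.8,
    y: c.velY * ageRatio + -0.4 * age * ageRatio - 0.5,
  };
}

// Example attribute values (made up for illustration).
const crawler = { maxAge: 4, startingAge: 1, velX: 0.5, velY: 1.0 };
const first = crawlerPosition(1, crawler);
const later = crawlerPosition(5, crawler); // exactly one maxAge later
console.log(first.x === later.x && first.y === later.y); // true
```

Because the age wraps at `maxAge`, the crawler is in the same place one full lifetime later, which is what makes the fountain loop without any per-frame bookkeeping.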

To preserve state between frames I need to use textures to record each crawler's position and velocity; then, using Vertex Texture Fetch, each vertex's position can be read back in the vertex shader.
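One common way to arrange this is to keep two state buffers and alternate between them each frame. The sketch below is only a CPU analogy in JavaScript, with typed arrays standing in for the textures; it is not WebGL code, and the `[x, y, velX, velY]` layout and the name `stepCrawlers` are assumptions for illustration.

```javascript
// CPU analogy of ping-pong state buffers: read last frame's state from one
// array, write the new state to the other, then swap. On the GPU the two
// arrays would be textures. Layout per crawler: [x, y, velX, velY].
function stepCrawlers(src, dst, dt) {
  for (let i = 0; i < src.length; i += 4) {
    let x = src[i] + src[i + 2] * dt;
    let y = src[i + 1] + src[i + 3] * dt;
    let vx = src[i + 2];
    let vy = src[i + 3];
    // Bounce off the edges of clip space (-1..1) by reflecting the velocity.
    if (x < -1 || x > 1) { vx = -vx; x = Math.max(-1, Math.min(1, x)); }
    if (y < -1 || y > 1) { vy = -vy; y = Math.max(-1, Math.min(1, y)); }
    dst[i] = x; dst[i + 1] = y; dst[i + 2] = vx; dst[i + 3] = vy;
  }
}

let readState = new Float32Array([0.5, 0, 1, 0]); // one crawler heading right
let writeState = new Float32Array(readState.length);
for (let frame = 0; frame < 4; frame++) {
  stepCrawlers(readState, writeState, 0.25);
  [readState, writeState] = [writeState, readState]; // swap roles for next frame
}
console.log(Array.from(readState)); // [0.75, 0, -1, 0]
```

Unlike the stateless fountain, the result after four frames depends on the bounce that happened along the way, which is exactly the information that has to be stored somewhere between frames.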

That however will have to wait for another evening ;)

I have uploaded the source for this project here in case anyone is interested:

http://mikecann.co.uk/projects/HTML5SpeedTests/HTML5SpeedTests_2.zip

I warn you, it's ugly and uncommented; however, it should be enough of a start for anyone looking to do something similar.

18 comments:

Ali Jaya Meilio, 12 August 2011 at 08:30
Wow... it's just wow... nice work XD... keep it up XD

Anonymous, 12 August 2011 at 08:31
:) Thanks Ali, I will do!

Skial Bainn, 13 August 2011 at 06:01
Very impressive, awesome work!

Jonathan Dunlap, 15 August 2011 at 11:25
amazing, thanks for posting!

theRemix, 16 August 2011 at 12:32
you are awesome! keep up the great work!

WebGL around the net, 15 September 2011 | Learning WebGL, 15 September 2011 at 06:57
[...] Useful hints and tips on particle animations from Mike Cann. [...]

Graphics Noob, 20 September 2011 at 00:24
interestingly at 1,000,000 particles of size 1, checking "dont use texture" drops my framerate from 30 to 15fps..

Anonymous, 20 September 2011 at 01:37
Really? Hmm, that's really weird, I would have thought a texture lookup would have been MORE expensive, not less.

nemo, 23 September 2011 at 15:47
Hm. On this machine (Firefox 9 20110922, Radeon HD 4670, Ubuntu) I had ~33fps both with and without the don't use texture option. Floated around 32-36fps so was a little hard to be sure, but it certainly didn't move noticeably.

nemo, 23 September 2011 at 15:52
Yeah... Tried 10,000,000... steady 10fps. With or without that option enabled. Seemed to make no difference. Different cards do different things?

Anonymous, 24 September 2011 at 02:35
Nice! 10fps for 10million! You got a nice gfx card there mate :)

nemo, 1 October 2011 at 13:49
Heh. Retested on my home machine.
Nvidia this time, instead of ATI. GeForce 9800 GT.
Ubuntu 10.10 instead of 11.04, but still Firefox Nightly.
13 FPS for 10 million. With or without texture option.

Anonymous, 2 October 2011 at 02:47
Interesting, I think I'm going to have to upgrade my gfx card ;)

nemo, 2 October 2011 at 07:09
So anyway, apart from as a speed test, I was wondering, is it possible to do a combined approach?
Like, if you wanted to still have the bouncing physics of the particles on some part of the page that you mentioned, trace approximate paths in JS for a few hundred particles at a low-ish resolution, like moving 5px per interval, and then use that to update the GPU particles?

Anonymous, 2 October 2011 at 08:18
No need, you can do collisions with images all on the GPU like I did in the past with my LieroXNA project.

I'm working on doing this in WebGL, just need more time to spend on it ;)

MikeCann.co.uk » Blog Archive » GPU State Preserving Particle Systems with WebGL & HaXe, 20 October 2011 at 05:02
[...] the idea was to build upon what I had been working with previously with my stateless particles systems with WebGL and HaXe. The intention from the start was to replicate some of my very early work (from 2007) on state [...]

Kenneth Hurley, 13 August 2012 at 21:26
Micheal,

There is no reason to store any of the data in a texture for stateless particle systems. Verlet integration will do it for you.

This was done back in 2002, in DirectX 8 with simple pixel shaders 1.1 and vertex shaders without any vertex texture reading necessary:
http://www.realistic3d.com/Explosions.htm

Also, take a look at a follow up article explaining more on achieving what you are after.
http://www.gamasutra.com/view/feature/130535/buil...

mikecann, 14 August 2012 at 01:46
EDIT: I apologise. I misread your post. Yes, I know textures aren't required. If you look at the shader code above that I used for this fountain simulation, it makes no mention of state or textures. It is a Verlet integration simulation in the sense that it calculates a particle's position given a number of starting conditions and a time delta T.