hunar4321/particle-life

200+ FPS (on a simple laptop)

ker2x opened this issue · 12 comments

ker2x commented

I'm still busy refactoring, and I'm currently relying on Intel's oneAPI and TBB for multithreading.
But you can check the code here https://github.com/ker2x/particle-life/tree/oneapi-dpl/particle_life/src and perhaps backport the modifications to a normal compiler and normal libraries (or I'll do it myself some day, I guess).

It's not fully optimized yet, but here are the notable changes:

  • Using a vertex buffer (VBO) instead of brute-forcing one draw call per circle.
	void Draw(colorGroup group)
	{
		ofSetColor(group.color);
		vbo.setVertexData(group.pos.data(), group.pos.size(), GL_DYNAMIC_DRAW);
		vbo.draw(GL_POINTS, 0, group.pos.size());

	}
  • Using SoA (structure of arrays) instead of AoS (array of structures). This allows better vectorization, and it was needed to use the VBO efficiently anyway.
struct colorGroup {
	std::vector<ofVec2f> pos;
	std::vector<float> vx;
	std::vector<float> vy;
	ofColor color;
};
  • It should also make it easier to add more colors (I hope).

  • Major cleanup of the interaction code.

void ofApp::interaction(colorGroup& Group1, const colorGroup& Group2, 
		const float G, const float radius, bool boundsToggle) const
{
	
	assert(Group1.pos.size() % 64 == 0);
	assert(Group2.pos.size() % 64 == 0);
	
	const float g = G / -100;	// attraction coefficient

//		oneapi::tbb::parallel_for(
//			oneapi::tbb::blocked_range<size_t>(0, group1size), 
//			[&Group1, &Group2, group1size, group2size, radius, g, this]
//			(const oneapi::tbb::blocked_range<size_t>& r) {

	for (size_t i = 0; i < Group1.pos.size(); i++)
	{
		float fx = 0;	// force on x
		float fy = 0;	// force on y
		
		for (size_t j = 0; j < Group2.pos.size(); j++)
		{
			const float distance = Group1.pos[i].distance(Group2.pos[j]);
			if ((distance < radius)) {
				const float force = 1 / std::max(std::numeric_limits<float>::epsilon(), distance);	// avoid dividing by zero
				fx += ((Group1.pos[i].x - Group2.pos[j].x) * force);
				fy += ((Group1.pos[i].y - Group2.pos[j].y) * force);
			}
		}

		// Wall Repel
		if (wallRepel > 0.0F)
		{
			if (Group1.pos[i].x < wallRepel) Group1.vx[i] += (wallRepel - Group1.pos[i].x) * 0.1F;
			if (Group1.pos[i].x > boundWidth - wallRepel) Group1.vx[i] += (boundWidth - wallRepel - Group1.pos[i].x) * 0.1F;
			if (Group1.pos[i].y < wallRepel) Group1.vy[i] += (wallRepel - Group1.pos[i].y) * 0.1F;
			if (Group1.pos[i].y > boundHeight - wallRepel) Group1.vy[i] += (boundHeight - wallRepel - Group1.pos[i].y) * 0.1F;
		}

		// Viscosity & gravity
		Group1.vx[i] = (Group1.vx[i] + (fx * g)) * (1.0F - viscosity);
		Group1.vy[i] = (Group1.vy[i] + (fy * g)) * (1.0F - viscosity) + worldGravity;
//		Group1.vx[i] = std::fmaf(Group1.vx[i], (1.0F - viscosity), std::fmaf(fx, g, 0.0F));
//		Group1.vy[i] = std::fmaf(Group1.vy[i], (1.0F - viscosity), std::fmaf(fy, g, worldGravity));

		//Update position
		Group1.pos[i].x += Group1.vx[i];
		Group1.pos[i].y += Group1.vy[i];
	}

	if (boundsToggle) {
		for (auto& p : Group1.pos)
		{
			p.x = std::min(std::max(p.x, 0.0F), static_cast<float>(boundWidth));
			p.y = std::min(std::max(p.y, 0.0F), static_cast<float>(boundHeight));
		}
	}	
}

I still have some crap to clean up :)

  • Using oneapi::tbb::parallel_invoke for parallelization
	oneapi::tbb::parallel_invoke(
		[&] { interaction(red,   red,   powerSliderRR, vSliderRR, boundsToggle); },
		[&] { interaction(red,   green, powerSliderRG, vSliderRG, boundsToggle); },
		[&] { interaction(red,   blue,  powerSliderRB, vSliderRB, boundsToggle); },
		[&] { interaction(red,   white, powerSliderRW, vSliderRW, boundsToggle); },
		[&] { interaction(green, red,   powerSliderGR, vSliderGR, boundsToggle); },
		[&] { interaction(green, green, powerSliderGG, vSliderGG, boundsToggle); },
		[&] { interaction(green, blue,  powerSliderGB, vSliderGB, boundsToggle); },
		[&] { interaction(green, white, powerSliderGW, vSliderGW, boundsToggle); },
		[&] { interaction(blue,  red,   powerSliderBR, vSliderBR, boundsToggle); },
		[&] { interaction(blue,  green, powerSliderBG, vSliderBG, boundsToggle); },
		[&] { interaction(blue,  blue,  powerSliderBB, vSliderBB, boundsToggle); },
		[&] { interaction(blue,  white, powerSliderBW, vSliderBW, boundsToggle); },
		[&] { interaction(white, red,   powerSliderWR, vSliderWR, boundsToggle); },
		[&] { interaction(white, green, powerSliderWG, vSliderWG, boundsToggle); },
		[&] { interaction(white, blue,  powerSliderWB, vSliderWB, boundsToggle); },
		[&] { interaction(white, white, powerSliderWW, vSliderWW, boundsToggle); }
	);

This is me slowly learning to use oneAPI and SYCL in order to offload all the parallel code to the GPU in the future (in a new project).

The biggest performance improvement comes from the use of SoA and the VBO.
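
To illustrate why the SoA layout pairs well with a vertex buffer, here is a minimal, self-contained sketch. It is not the project's code: `Vec2` is a stand-in for `ofVec2f`, and `ParticleAoS`, `GroupSoA`, `gatherPositions`, `positionBuffer` and `integrate` are hypothetical names used only for this comparison. The point is that with AoS the positions have to be copied into a contiguous buffer before they can be uploaded, while with SoA the position array can be handed over as-is.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for ofVec2f (assumption: two packed floats).
struct Vec2 { float x, y; };

// AoS: one struct per particle, positions interleaved with other fields.
struct ParticleAoS {
    Vec2 pos;
    float vx, vy;
    unsigned char r, g, b;
};

// SoA: one array per field, in the spirit of colorGroup above.
struct GroupSoA {
    std::vector<Vec2> pos;
    std::vector<float> vx;
    std::vector<float> vy;
};

// With AoS, positions must be gathered into a contiguous buffer
// every frame before they can be uploaded as vertex data.
std::vector<Vec2> gatherPositions(const std::vector<ParticleAoS>& particles)
{
    std::vector<Vec2> out;
    out.reserve(particles.size());
    for (const auto& p : particles) out.push_back(p.pos);
    return out;
}

// With SoA, the position array is already contiguous, so its pointer
// can be passed directly to an upload call (no per-frame copy).
const Vec2* positionBuffer(const GroupSoA& group)
{
    return group.pos.data();
}

// A simple loop over parallel arrays; loops of this shape are usually
// easier for a compiler to vectorize than loops over mixed structs.
void integrate(GroupSoA& group)
{
    for (std::size_t i = 0; i < group.pos.size(); ++i) {
        group.pos[i].x += group.vx[i];
        group.pos[i].y += group.vy[i];
    }
}
```

In the real application, the pointer returned by `positionBuffer` corresponds to the `group.pos.data()` argument passed to `vbo.setVertexData` in the `Draw` function above.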

Great job, looking forward to it 👍 👍 💯

It is not working for me

ker2x commented

It is not working for me

Yes. I'll try to make something mergeable with the main project and independent of oneAPI.

ker2x commented

It took me a while. The code is unfortunately much slower on MSVC than on the Intel compiler, but it is still faster than the previous version, of course.

I also removed the dependency on Intel TBB, so no 200 FPS (it can still be seen in the commented-out code, however).
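
For readers who want multithreading without TBB, the force accumulation could in principle be split across threads with the C++ standard library alone. The following is only a sketch under stated assumptions, not what the merged code does: `Vec2` stands in for `ofVec2f`, and `accumulateForces` and its `workers` parameter are hypothetical. It only reads the positions and writes each result to its own slot, so velocities and positions would be updated in a separate pass afterwards.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <future>
#include <limits>
#include <vector>

// Stand-in for ofVec2f (assumption).
struct Vec2 { float x, y; };

// For every particle in `a`, sum the (direction / distance) terms from all
// particles of `b` closer than `radius`, as in the inner loop of interaction().
// The index range of `a` is split into chunks, one std::async task per chunk.
std::vector<Vec2> accumulateForces(const std::vector<Vec2>& a,
                                   const std::vector<Vec2>& b,
                                   float radius, unsigned workers)
{
    std::vector<Vec2> out(a.size(), Vec2{0.0F, 0.0F});
    workers = std::max(1u, workers);
    const std::size_t chunk = (a.size() + workers - 1) / workers;

    std::vector<std::future<void>> tasks;
    for (std::size_t begin = 0; begin < a.size(); begin += chunk) {
        const std::size_t end = std::min(a.size(), begin + chunk);
        tasks.push_back(std::async(std::launch::async, [&a, &b, &out, radius, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                float fx = 0.0F;
                float fy = 0.0F;
                for (std::size_t j = 0; j < b.size(); ++j) {
                    const float dx = a[i].x - b[j].x;
                    const float dy = a[i].y - b[j].y;
                    const float distance = std::sqrt(dx * dx + dy * dy);
                    if (distance < radius) {
                        // avoid dividing by zero
                        const float force = 1.0F / std::max(std::numeric_limits<float>::epsilon(), distance);
                        fx += dx * force;
                        fy += dy * force;
                    }
                }
                out[i] = Vec2{fx, fy};  // each task writes only its own indices
            }
        }));
    }
    for (auto& task : tasks) task.get();  // wait for all chunks to finish
    return out;
}
```

Keeping the force pass read-only and applying the velocity and position update afterwards avoids threads reading positions that another thread is modifying at the same time.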

Hi! I want to set different particle sizes depending on their type. For example, red should be 1.0 pixels, green 1.2 pixels, blue 1.4 pixels, and so on. I saw in the code that a size is defined for all particles in ofApp.h. How could I condition this particle size with an "if"?
I'm referring to this piece of code:
void draw() const
{
	ofSetColor(r, g, b, 100); //set particle color + some alpha
	ofDrawCircle(x, y, 1.5F); //draws a circle at x,y with a radius of 1.5 pixels
}
My colors are defined by generic names:
void ofApp::restart()
{
	if (numberSliderα > 0) { alpha = CreatePoints(numberSliderα, 0, 0, ofRandom(64, 255)); }
	if (numberSliderβ > 0) { betha = CreatePoints(numberSliderβ, 0, ofRandom(64, 255), 0); }
	if (numberSliderγ > 0) { gamma = CreatePoints(numberSliderγ, ofRandom(64, 255), 0, 0); }
	if (numberSliderδ > 0) { elta = CreatePoints(numberSliderδ, ofRandom(64, 255), ofRandom(64, 255), 0); }
	if (numberSliderε > 0) { epsilon = CreatePoints(numberSliderε, ofRandom(64, 255), 0, ofRandom(64, 255)); }
	if (numberSliderζ > 0) { zeta = CreatePoints(numberSliderζ, 0, ofRandom(64, 255), ofRandom(64, 255)); }
	if (numberSliderη > 0) { eta = CreatePoints(numberSliderη, ofRandom(64, 255), ofRandom(64, 255), ofRandom(64, 255)); }
	if (numberSliderθ > 0) { teta = CreatePoints(numberSliderθ, 0, 0, 0); }
}

I would like to define something like:
void draw() const
{
	ofSetColor(r, g, b, 100); //set particle color + some alpha
	if (numberSliderα > 0)
	{
		ofDrawCircle(x, y, 1.0F); //draw a circle at x,y with a radius of 1.0 pixels
	}
	if (numberSliderβ > 0)
	{
		ofDrawCircle(x, y, 1.2F); //draw a circle at x,y with a radius of 1.2 pixels
	}
	if (numberSliderγ > 0)
	{
		ofDrawCircle(x, y, 1.4F); //draw a circle at x,y with a radius of 1.4 pixels
	}
	if (numberSliderδ > 0)
	{
		ofDrawCircle(x, y, 1.6F); //draw a circle at x,y with a radius of 1.6 pixels
	}
	if (numberSliderε > 0)
	{
		ofDrawCircle(x, y, 1.8F); //draw a circle at x,y with a radius of 1.8 pixels
	}
	if (numberSliderζ > 0)
	{
		ofDrawCircle(x, y, 2.0F); //draw a circle at x,y with a radius of 2.0 pixels
	}
	if (numberSliderη > 0)
	{
		ofDrawCircle(x, y, 2.2F); //draw a circle at x,y with a radius of 2.2 pixels
	}
	if (numberSliderθ > 0)
	{
		ofDrawCircle(x, y, 2.4F); //draw a circle at x,y with a radius of 2.4 pixels
	}
}
But Visual Studio gives me errors because these sliders are not defined here (they are defined in the GUI, class ofApp final : public ofBaseApp).
Please help me!

ker2x commented

I assume you are referring to the old codebase. Can you post a link to your code?

The easiest way to do this would be to add a radius property to the point struct, and then you would just call ofDrawCircle(x, y, radius).
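
A minimal sketch of that suggestion, under stated assumptions: the field layout of `point` is guessed from the `draw()` snippet above, `radius` is the new hypothetical member, and `DrawCall` / `drawAll` are stand-ins that record what would be drawn so the example compiles without openFrameworks. In the real app, the loop body would call `ofSetColor(p.r, p.g, p.b, 100)` followed by `ofDrawCircle(p.x, p.y, p.radius)`.

```cpp
#include <vector>

// Old-style particle with one extra member (assumed layout).
struct point {
    float x, y;     // position
    float vx, vy;   // velocity
    int r, g, b;    // color
    float radius;   // hypothetical new field: per-particle draw radius
};

// Stand-in for a call to ofDrawCircle(x, y, radius).
struct DrawCall {
    float x, y, radius;
};

// Each particle carries its own radius, so drawing needs no "if" on the
// GUI sliders: the size is decided once, when the particle is created.
void drawAll(const std::vector<point>& points, std::vector<DrawCall>& calls)
{
    for (const auto& p : points) {
        calls.push_back(DrawCall{p.x, p.y, p.radius});
    }
}
```

With this approach, the radius (1.0F, 1.2F, 1.4F, ...) would be passed once per color when the group is created, for example as an extra argument to a function like CreatePoints, so the draw code never needs access to the sliders.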

Here is my last version of the code.
Manuel_src.zip

I assume you are referring to the old codebase. Can you post a link to your code?

The easiest way to do this would be to add a radius property to the point struct, and then you would just call ofDrawCircle(x, y, radius).

I tried the new code, but it runs extremely slowly. I couldn't get more than 8 fps with 8 colors of 1000 particles each. Another shortcoming of the new code is that the particles look extremely small, like little dots whose color shades you can't really distinguish. The structures that form don't look good at all because of this.

I also tried to insert a fullscreen button but it didn't work. I used ofToggleFullscreen(), but the screen kept blinking without showing anything.
I'm also trying to figure out how to introduce the 3D vision function (with 3D glasses).
Also, I don't know how to save the generated color palette when saving the model. The particle colors are generated when pressing the buttons that trigger the restart. But when I save the model, the existing color palette is not saved with it, so I cannot load it later. Do you have any idea how I could save this color palette?

ker2x commented

I'll take a look at your code, and also patch my code to allow drawing circles.
My code shouldn't be slower; this is weird. Are you using OpenMP in your code? I might have forgotten to re-enable it.

ker2x commented

I'll check the fullscreen problem as well.

I'll take a look at your code, and also patch my code to allow drawing circles. My code shouldn't be slower; this is weird. Are you using OpenMP in your code? I might have forgotten to re-enable it.

I took your exact code and just added more colors and those extra buttons. After compiling, the code ran extremely slowly. I had to set a small number of particles (under 1000 of each color) to reach 10 fps.

Here is your code with my additions, done the old way, which works fine with 17313 big particles at 61 fps:
src 1.7.6.5.zip

Here is a proof screenshot:
202301060731

Here is your code with my additions, done the new way, which runs slowly and draws very small particles (nearly dots), with 9600 particles at 4 fps:
src 1.8.5.zip

Here is a proof screenshot:
202301060705