Introducing Frock, a flocking chicken simulation written in Lua with Löve

I recently learned about LÖVE, a 2D game framework for Lua, off HN. It looked simple enough, and when I ran the particle system demo, it was pretty fast. It seemed faster than what Ruby Shoes would be able to do. I got rather excited because

  1. I’ve always wanted to make my own game. A lawnmowing game comes to mind.
  2. I wanted to see if I could create a large flocking simulation

Also, I decided a while back to spend more time writing projects and less time reading tech news garbage. While #1 will stay on the back burner for the time being, #2 was fairly easy to do in a couple of hours (maybe a weekend’s worth of hours total, tweaking things).

I’ve been fascinated with decentralized systems since right after college, and flocks are one of them. But how do they work? For a long time, ornithologists (bird scientists) had no clue how birds flocked. Birds seem to move in unison, as a super-organism: swooping, expanding and contracting, splitting. When you see massive waves of them (those are starlings), it’s something to behold. Who is the leader? How do they coordinate these flocks?

We still don’t know exactly, since we don’t yet have the means to tap into a bird’s decision-making in real time. However, in the late 1980s, Craig Reynolds demonstrated that you can generate very convincing flock-like behavior procedurally. It turns out you don’t need a leader to get flocking: all interactions can be local, and each flock member (or boid, as he called it) just needs to follow three simple rules:

  1. Attraction: Move towards the perceived center of the flock according to your neighbors
  2. Repulsion: Avoid colliding with your neighbors
  3. Alignment: Move in the same direction and speed (velocity) as your neighbors.

Add all these vectors together and you get a resultant velocity vector. Different weightings of the three influences lead to different-looking flocks. As an aside, particle swarm optimization works on the same sort of principles.
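Roughly, a single boid’s update looks something like this. Frock itself is written in Lua, but since it started life as a port of the Ruby Shoes boids, here’s a sketch in Ruby; the weights and the repulsion radius are made-up illustrative numbers, not Frock’s actual values.

# Sketch of one boid's velocity update. Vector math is done componentwise;
# the weights and the 25px repulsion radius are illustrative only.
Boid = Struct.new(:x, :y, :vx, :vy)

def dist(a, b)
  Math.hypot(a.x - b.x, a.y - b.y)
end

def steer(boid, neighbors, w_attract = 0.01, w_repel = 0.05, w_align = 0.1)
  return [boid.vx, boid.vy] if neighbors.empty?

  cx = cy = avx = avy = rx = ry = 0.0
  neighbors.each do |o|
    cx  += o.x;  cy  += o.y    # sum positions for the perceived flock center
    avx += o.vx; avy += o.vy   # sum velocities for the average heading
    d = dist(boid, o)
    if d > 0 && d < 25         # repulsion gets stronger the closer they are (1/d falloff)
      rx += (boid.x - o.x) / (d * d)
      ry += (boid.y - o.y) / (d * d)
    end
  end
  n = neighbors.size.to_f

  new_vx = boid.vx +
           w_attract * (cx / n - boid.x) +    # 1. attraction toward the center
           w_repel   * rx +                   # 2. repulsion from close neighbors
           w_align   * (avx / n - boid.vx)    # 3. alignment with the average velocity
  new_vy = boid.vy +
           w_attract * (cy / n - boid.y) +
           w_repel   * ry +
           w_align   * (avy / n - boid.vy)
  [new_vx, new_vy]
end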

So here it is. Introducing Frock, a flocking simulator written in Lua with LÖVE.


I’m releasing it now because, while it’s primitive, it works (release early, release often!). The screenshot doesn’t really show it well: the chickens fly about in a flock, hunting for plants to eat. It’s rather mesmerizing, and I find I just stare at it, the same way I stare at fish tanks.

It was originally a port of the Ruby Shoes hungry boids, but I used flying chickens lifted from Harvest Moon instead. I originally had cows flying about, but without flapping it just wasn’t the same. I also made the repulsion vector grow as the distance shrinks; otherwise, chickens on the inside of the flock didn’t mind being right on top of each other.

My immediate goal is to make it support more chickens, so I can get a whole swarm of them. Right now, I’m using an inefficient algorithm to figure out which chickens are neighbors (basically n^2 comparisons). So if any of you have good culling techniques applicable here, I’d love to hear them. I’m currently looking at R-trees.
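Another common option is a uniform grid (spatial hash): bucket the chickens by cell each frame and only do exact distance checks against the same and adjacent cells. A rough sketch in Ruby, reusing the Boid and dist from the sketch above (the cell size is an arbitrary number I picked for illustration):

# Uniform-grid spatial hash: only boids in the same or adjacent cells are
# candidate neighbors, so most of the n^2 comparisons are skipped.
# CELL should be at least the neighbor radius.
CELL = 50.0

def build_grid(boids)
  grid = Hash.new { |h, k| h[k] = [] }
  boids.each { |b| grid[[(b.x / CELL).floor, (b.y / CELL).floor]] << b }
  grid
end

def nearby(boid, grid, radius)
  cx, cy = (boid.x / CELL).floor, (boid.y / CELL).floor
  candidates = []
  (-1..1).each do |dx|
    (-1..1).each { |dy| candidates.concat(grid[[cx + dx, cy + dy]]) }
  end
  candidates.reject { |o| o.equal?(boid) }
            .select { |o| dist(boid, o) <= radius }
end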

There are different possibilities for where it could go. Lots of people have written boid simulations, but they usually haven’t taken it much further than that. While I’ve seen ones with predators, I haven’t seen anyone try to evolve the flock parameters, or scale it up to a large number of chickens. One could also run experiments on whether different flock parameters give different coverage of the field, or which set of parameters minimizes how long a plant stays alive.

If, at the very least, it becomes a framework for me to write ‘me and my neighbors’ decentralized algorithms, that’d be useful too. And since Lua is supposed to be easy to embed in other languages, that makes it an even more exciting possibility. Later on, I’ll write a little something about Lua.

Well, if you decide to try it out yourself, head to the Frock github repo and follow the readme. Patches are welcome, but note that I haven’t decided on a license yet (it will be open source). If you have questions, feel free to contact me or leave a comment. Have fun!

Algorithm always matters, even when parallel

I’ve finally had some time to start looking into things that interest me. One of them was parallel algorithms.

I was reading Hacker News and saw a link that led me to discover Cilk, a C extension that makes it easy to write parallel programs. Here’s the fib example:

cilk int fib(int n) {
    if (n < 2)
        return n;
    else {
        int x, y;
        x = spawn fib(n - 2);
        y = spawn fib(n - 1);
        sync;
        return x + y;
    }
}

This was the example given for calculating Fibonacci numbers in parallel. It’s the standard mathematical way to define the sequence, and it looks clean enough. Instead of trying it out in Cilk, I fired up Erlang to try my hand at a port. I found it a bit difficult because, while you can easily spawn processes in Erlang, there was no quick way for the parent process to wait/sync/join on the child processes and collect their results. Since that was beside the point of the exercise, I fired up Ruby instead, even though its threading library is slow (which is supposed to be fixed in 1.9, plus Fibers!). I’ll do it in Erlang some other time.

First I wrote the threaded version to mimic the Cilk code:

def fib_threaded(n)
  if n < 2
    return 1
  else
    threads = []
    x = 0
    y = 0
    threads << Thread.new(n - 2) { |i| x = fib_threaded(i) }
    threads << Thread.new(n - 1) { |i| y = fib_threaded(i) }
    threads.each { |thread| thread.join }
    return x + y
  end
end

I don’t have a multi-core processor; I’m on a three-year-old 1 GHz laptop. At a mere fib(18), the threaded version took about 21 seconds to run. To see if there was a difference, I wrote a serial version.


def fib_serial(n)
  n < 2 ? 1 : fib_serial(n - 1) + fib_serial(n - 2)
end

This one ran much, much faster: about 0.02594 seconds. At this point, it’s probably the overhead of thread creation that’s making the threaded version take so long to run. Maybe with green threads or lightweight threads, it would run much faster, which makes me want to try it in Erlang just to compare. But wtf, adding shouldn’t take that long, even if it is 0.025 seconds.

When I thought about it, it wasn’t an efficient algorithm at all: there are a lot of wasted cycles. In order to compute f(n), you have to calculate f(n - 1) and f(n - 2) in separate threads.

  • The f(n - 1) thread has to spawn two more threads to compute f(n - 2) and f(n - 3).
  • The f(n - 2) thread has to spawn two more threads to compute f(n - 3) and f(n - 4).

Notice that the threads for f(n - 1) and f(n - 2) each have to spawn their own thread to calculate f(n - 3). And since this algorithm gives threads no way to share their results, they recompute the same values over and over again. The higher n is, the worse the problem gets, exponentially so.
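For contrast, the usual serial fix is to let the calls share their results, for example by memoizing. This one isn’t in the snippet at the end; it’s just a quick sketch:

# Memoized version: each fib(k) is computed once and then reused, so the
# exponential recomputation disappears (O(n) work, like the loop version later).
def fib_memo(n, cache = {})
  return 1 if n < 2
  cache[n] ||= fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
end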

To estimate the speedup an algorithm can get from adding more processors, you take the total amount of work required and divide it by the span (the longest chain of dependent operations) in the parallel computation graph. If that didn’t make sense, read lecture one for Cilk, which is where the diagram comes from. So for fib(n):

Twork = O(φ^n)

The total amount of work is the total number of calls spawned. Every f(n) spawns two more calls, and the recursion only bottoms out at f(1) and f(0), so the call tree has on the order of φ^n nodes (φ ≈ 1.618, the golden ratio). In other words, the work is exponential in n.

Tspan = O(n)

The span is the longest chain of dependent calls; per the diagram, it’s the height of the tree. The deepest path goes f(n) → f(n - 1) → f(n - 2) → … → f(1), so that’s about n nodes (the tree is lopsided rather than balanced, which is why the height is n and not log n).

Therefore, for fib(n), the potential processor speedup is about:

Tw / Ts = O(φ^n) / O(n)

which is huge. On one hand, that means the algorithm has plenty of parallelism: you could keep adding processors and keep getting speedups. On the other hand, the total work is exponential to begin with, so you’d need an exponential number of processors just to keep pace with a sensible serial algorithm that only does O(n) work. Not so good for a parallel program that’s just doing addition.
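You can see the wasted work directly by counting calls. A quick sanity check (again, not part of the snippet at the end):

# Count how many calls the naive recursion makes. With this convention
# (fib(0) = fib(1) = 1), the count works out to 2 * fib(n) - 1, so it grows
# exponentially right along with the result.
def fib_calls(n)
  return 1 if n < 2
  1 + fib_calls(n - 1) + fib_calls(n - 2)
end

puts fib_calls(18)   # => 8361 calls just to compute fib(18) = 4181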

As a last version, I wrote one that computes the sequence from 0 up to n, keeping the running values as it goes, instead of recursing from n back down to zero.


def fib_loop(n)
  sequence = [1, 1]
  if n < 2
    sequence[n]
  else
    sequence = (1..n-1).inject(sequence) do |t, e|
      t << (t[-2] + t[-1])
    end
  end
  sequence[-1]
end

It’s not space-efficient, since I wrote it quickly, but it beat the pants off the other two, running in about 0.00014 seconds. As you can see, you’re not recalculating any f(n) more times than you need to.

I wish Cilk had a better first example of a parallel program. Given that the guy behind Cilk is the same guy who co-wrote the famous MIT algorithms textbook, I was a bit surprised. However, it was a fun exercise, and it made me think about algorithms again.

I’ll find some other little project that requires me to write in Erlang, rather than falling back on comfortable Ruby. The full snippet is below if you want to run it yourself.

#!/usr/bin/ruby
# rubinacci.rb by Wilhelm

def fib_threaded(n)
  if n < 2
    return 1
  else
    threads = []
    x = 0
    y = 0
    threads << Thread.new(n - 2) { |i| x = fib_threaded(i) }
    threads << Thread.new(n - 1) { |i| y = fib_threaded(i) }
    threads.each { |thread| thread.join }
    return x + y
  end
end

def fib_serial(n)
  n < 2 ? 1 : fib_serial(n - 1) + fib_serial(n - 2)
end

def fib_loop(n)
  sequence = [1, 1]
  if n < 2
    sequence[n]
  else
    sequence = (1..n-1).inject(sequence) do |t, e|
      t << (t[-2] + t[-1])
    end
  end
  sequence[-1]
end

def run(&algorithm)
  start_time = Time.now
  integer = 18
  answer = algorithm.call(integer)
  elapsed = Time.now - start_time

  puts "Fib(#{integer}) = #{answer}"
  puts "Total elapsed time: #{elapsed} seconds"
end

puts "threaded:"
run do |integer|
  fib_threaded(integer)
end

puts "serial:"
run do |integer|
  fib_serial(integer)
end

puts "loop:"
run do |integer|
  fib_loop(integer)
end