Benchmarking row multiply in Ruby’s Linalg

While using LinAlg to write and understand/review information gain, I found that I wanted the equivalent of a row multiply, notated as [*] (it’s non-standard. I made up the notation), where:

A [*] x = B

[[a00, ..., a0M] [[x0] [[a00 * x0, ..., a0M * x0]
[... ...] [*] [x1] = [... * x1, .... ...]
[aN0, ..., aNM]] [x2]] [aN0 * x2, ..., aNM * x2]]

I want to map each row with each corresponding value of x. This operation is not standard matrix multiplication, though I feel like it should be!

I admit I slept in Linear Algebra class. Thus I was a bit dumbfounded about how to express it using normal matrix multiplication. While I could do it using LinAlg’s mapping functions, I knew that it could be done because it was all linear transformations.

Luckily James Lawrence, who maintains LinAlg was emailing me, and asked me about it (thanks!). He gave me some code that did it both ways. After I read it, I slapped myself.


# using straight up matrix multiplication
def calculate1(con)
DMatrix.diagonal(con*DMatrix.new(con.hsize,1,1)).inverse*con
end

# using map
def calculate2(con)
DMatrix.diagonal((con*DMatrix.new(con.hsize,1,1)).map { |e| 1.0/e })*con
end

I wasn’t sure which one was faster. Maybe they’d be about the same. Maybe doing matrix multiply would be slower because you’d have to go through all the multiplications. Per James’s suggestion, I benchmarked them, in the fine form of yak shaving.

Run on a laptop 1.1GHz Pentium M, 758MBs, I did two experiments. For one experiment, I kept the matrix at 800 rows, then I grew the columns. Let’s see what it looks like:

time (secs) vs column size

I should have labeled the axis in GnuPlot, but you’ll live. The y axis is the number of seconds, and the x axis is the column size. Uninteresting, but nice to know it’s linear. The two functions aren’t significantly different from each other in the amount of time that it takes. I expected the map (calculate2) to be much faster since it didn’t have to do all the multiplies. oh well.

I almost didn’t run the second test, but it proved to be a bit more interesting. This time I kept 800 columns and grew the number of rows. Same axis. Different graph:

time (secs) vs row size

Whoa! It’s exponential. Or quadratic. I can’t tell. Anyway, anything curving up is bad news. I suspect this might have something to do with row major/column major ordering. C stores matrices row by row, whereas Fortran stores it column by column.. Update: As corrected by James L., in the first experiment, growing columns will create more multiplies inside the diag() call, but the size of the diagonal will stay the same. Growing the rows, however, will create less multiplies inside the diag() call, but each increase in row size will increase both dimensions of the resulting diagonal matrix, giving us n^2. So it’s quadratic.

So what I went looking for wasn’t what I meant to find. But given that I’ve read about it before, it wasn’t super interestingit would have made sense had I thought about it, I guess it’s not super surprising. Let’s see if we can use transpose to cut down on the time. For one part, we’ll grow the rows as before, and compare it to growing rows, but transposing the input then transposing the output, to get the same deal. What’s it look like:

time(secs) vs row size

This is good news. Even if transpose is an extra couple of manipulations, it saves us computation for bigger matrix sizes. The most interesting part of the graph is the crossing of the two graphs. If somehow, LinAlg (or any other package for that matter) can detect where that inflection point is going to be, it can switch off between the two. The only thing I can think of is another package lying underneath doing sampling of each implementation randomly whenever a user calls the function to do interpolation of its growth curve, and then calculate the crossing analytically. I don’t currently know of any package that does this (or if it does, I don’t know about it, cuz it already performs so well by doing the switch!)

This was a nice little diversion from my side projects…a side project of side projects. Back to learning about information gain and its ilk. At least something came out of it. I have a nice little experiment module that I can use to do other experiments. And I spent way too much time on it not to post something…

I share my code below, and you can run it if you want. snippet!

# These are the experiments that I ran
#!/usr/bin/ruby

require 'experiment'

require 'linalg'
include Linalg

def calculate1(con)
DMatrix.diagonal(con*DMatrix.new(con.hsize,1,1)).inverse*con
end

def calculate2(con)
DMatrix.diagonal((con*DMatrix.new(con.hsize,1,1)).map { |e| 1.0/e })*con
end

DATA_PTS = 10
MAX = 800
MID = MAX / 2
STEP = MAX / DATA_PTS

# Growing columns
Experiment::compare(1..MAX, STEP,
proc { |col_size| DMatrix.rand(MID, col_size) },
proc { |matrix, col_size|
calculate1(matrix)
},
proc { |matrix, col_size|
calculate2(matrix)
})

# Growing rows
Experiment::compare(1..MAX, STEP,
proc { |row_size| DMatrix.rand(row_size, MID) },
proc { |matrix, row_size|
calculate1(matrix)
},
proc { |matrix, row_size|
calculate2(matrix)
})

# Growing rows, transposing
Experiment::compare(1..MAX, STEP,
proc { |row_size| DMatrix.rand(row_size, MID) },
proc { |matrix, col_size|
calculate1(matrix)
},
proc { |matrix, col_size|
calculate1(matrix.transpose).transpose
})
# Experiment.rb - perform timed experiments across on one dimension
require 'rubygems'
require 'gnuplot'

# Experiment is a module where you can use to plot and benchmark pieces of code
# and plot it on a gnuplot
#
#
module Experiment
class << self

def timer
start_time = Time.now
yield
diff_time = Time.now - start_time
puts "That run took #{diff_time} seconds"
return diff_time
end

# takes range to try and the step resolution of the range and runs
# the benchmark_block with each of the different ranges.
# the init_proc only gets run once during setup.
#
# Experiment::benchmark((1..50), 10,
# proc { |col_size| DMatrix.rand(500, col_size)},
# proc { |matrix, col_size| ProbSpace.new(matrix).entropy })
def benchmark(range, resolution, init_proc, benchmark_block)
xs, ts = [], []
range.step(resolution) do |x_size|
object = init_proc.call(x_size)
xs << x_size
ts << timer { benchmark_block.call(object, x_size) }
end
plot(xs, ts)
end

# same idea as benchmark, but does two at the same time. So
# it takes an init_proc to initialize the experiment,
# and two different procs, standard and comparison, to run
# for each step in the range at the given resolution.
#
# Experiment::benchmark((1..50), 10,
# proc { |col_size| DMatrix.rand(500, col_size)},
# proc { |matrix, col_size| ProbSpace.new(matrix).entropy1 },
# proc { |matrix, col_size| ProbSpace.new(matrix).entropy2 })
def compare(range, resolution, init_proc,
standard_block, comparison_block)
xs, s_ts, c_ts = [], [], []
range.step(resolution) do |x_size|
object = init_proc.call(x_size)
xs << x_size
s_ts << timer { standard_block.call(object, x_size) }
c_ts << timer { comparison_block.call(object, x_size) }
puts "#{x_size} = standard : comparison :: #{s_ts.last} : #{c_ts.last} secs"
end
plot(xs, s_ts, c_ts)
end

def plot(x, *ys)
Gnuplot.open do |gp|
Gnuplot::Plot.new(gp) do |plot|
ys.each do |y|
plot.data << Gnuplot::DataSet.new([x, y]) do |ds|
ds.with = "linespoints"
ds.notitle
end
end
end
end
end

end
end