Painite teaser

It seemed like “What the heck is a Ycombinator” struck a chord with people. In retrospect, I probably should have done less rambling. So after a silence of a week, I’ll get to the point. Here’s a little teaser of what I’ve been working on over the holidays.


require 'painite'

p_es = Prob::EventSpace.new do |es|
es << Prob::RandVar.new("cancer") do |pdf|
pdf["cancer = :sick"] = 0.01
pdf["cancer = :healthy"] = 0.99
end
es << Prob::RandVar.new("test | cancer") do |pdf|
pdf["test = :pos | cancer = :sick"] = 0.8
pdf["test = :neg | cancer = :sick"] = 0.2
pdf["test = :pos | cancer = :healthy"] = 0.096
pdf["test = :neg | cancer = :healthy"] = 0.904
end
end

p_es["cancer = :sick | test = :pos"] # => 0.8 * 0.01 / (0.8 * 0.01 + 0.096 * 0.99)

It’s still not quite complete, as it’s harder than I expected to generalize it. The interface is still a little bit in flux, but I hope to nail it down soon.

Advertisements

What the heck is a Ycombinator

I woke up at 4am this morning for no reason and decided to go through Ycombinator in Ruby. Given that I read Hacker News daily, it’s a shame I didn’t know what a Ycombinator was. I thought the article was pretty clear, at least compared to Raganwald’s article. As an aside, from purely a learning experience, it was kinda fun and eye opening. At least I can see why geeks think it’s neat.

I had written a reply to a comment on hacker news about it, and it turned out to be really long, so I just extracted it out here. I was suppose to start some coding, but I got wrapped up in writing up the comment. My comment ended up to be pretty long, but it was shorter than it could have been. You guys get to suffer through the long version. Haha. But I’m not going to go into details. You can read that in the linked articles. Instead, I’ll paint some broad strokes.


Y = λf·(λx·f (x x)) (λx·f (x x))

or in Ruby, copied from the aforementioned article:


y = proc { |generator|
proc { |x|
proc { |*args|
generator.call(x.call(x)).call(*args)
}
}.call(proc { |x|
proc { |*args|
generator.call(x.call(x)).call(*args)
}
})
}

or if you prefer elegance, Tom’s solution in response to Raganwald


def y(&f)
lambda { |x| x[x] } [
lambda { |yf| lambda { |*args| f[yf[yf]][*args] } } ]
end

I found lambda calculus above hard to read. However, if you go through the code in Y Combinator in Ruby, you’ll find it’s not too bad. I find that this lecture is also pretty good, as it takes you step by step, with a little bit of humor as well.

If I had to take a stab at it, A Ycombo is a way to implement recursion mechanism when the language doesn’t provide named recursion, loops, or iterators, and all you get are first-class functions and a few substitution rules.

Functions are just mappings from one set of things to another set of things–ie. give it a number, and it’ll point to another number. Ycombo relies on a property of functions that sometimes, when you put something into a function, you get the exact same thing out, i.e. something gets mapped to itself, like f(x) = x^2, where f(1) = 1 is an example of this property. They call this the fixed point of a function.

The thing about functions is that they don’t just have to map numbers to numbers. They can map functions to other functions. A derivative is an example of a function that takes one function as an input, and spits another function back out. like d(x^2) = 2x. Where is the fixed-point of a derivative? One of them is when you put e^x into it. d(e^x) = e^x. I’m sure there are more.

This is important because if you can find the point in which a function can return a function unchanged, you can use that to call the function again, which is what we call recursion. And all the trickiness you see in ycombinator that you see is mainly a result of functional programming not keeping state, so you have to pass everything you need into a function. So if you have to do recursion, you have to have a mechanism pass the function itself, so you can call it again. And this mechanism kinda bends back onto itself, and that’s why you see a part of the ycombinator repeated twice in the code above, and in the lambda calculus.

It seems pretty inane to use ycombo given that modern high level languages provide named recursions, and if anything, for loops and iterators with closures. But what if you don’t? How do you process lists/arrays without loops or named recursion? Generally, you’d have to make your own recursion mechanism.

When would you ever have languages that don’t have those things? Probably not often. But Forth comes to mind. I’ve never used Forth, but from what little I know about it, the language starts off with some basic primitives and that’s it. No loops, no if statements. How do you get anything done? Well, you build your own control structures. People use Forth because it’s possible to build your own very small compiler from the ground up written in Forth itself, and still understand the entire thing. I suspect it’s used in embedded programming because of that reason.

So you’d pretty much use it when you want to have a powerful system from first principles. I’m guessing it’s possible to build your own computer by implementing only functions and substitution rules in hardware, and then you can derive everything else, numbers, pairings, and even recursions in software. That way, you keep your hardware costs down, while retaining the power of a Turing machine.

Speculating here…but another thing that might be interesting is that ycombinator might be a way to compress code. Software size should be reducible if the code is compressible. And recursion can be thought of as a compressing code, and expanding it when the recursion is run. I wonder if there’s other ways to reduce software bloat with the use of a ycombinator besides recursion?

(Aha!) Part of the reason why great hackers are 10 times as productive

I knew in college that some dudes were faster than I was in terms of programming. Since peer programming wasn’t exactly encouraged in college, and at work I did mostly prototyping work, I never really knew how fast other programmers worked.

So when I read Paul Graham (and Joel’s) claim that great hackers are at least ten times as productive as average programmers (too lazy to cite right now), I was kinda shocked. Surely, ten times is an order of magnitude! Something that takes an average programmer a 40 hour week to do the great hacker can do in a 4 hour afternoon?

I wondered about that, since there are times when I get stuck on something, then I just start stabbing around randomly out of frustration. I had assumed that great hackers were faster only because they had either the experience or insight to side-step whatever I was doing wrong.

But lately, I’ve been re-reading everyone’s essays that write about programming productivity. And one thing that caught my eye the second time around was when Paul Graham was talking about bottom up programming and how he didn’t really believe in objects, but rather, he bent the language to his will. He was building new blocks for himself so he could think about the problem at a higher level of abstraction.

This is basic stuff! I mean, that’s the whole point of higher-level programming. When you refactor methods out, you’re creating a vernacular so that you can express the problem in terms of the problem domain, not in terms of general computing. This is much like if you want to drive a car, you’d want to be able to step on the gas, rather than time the firings of the pistons. And if you want to control traffic in a city, you’d rather tell all cars to go to a destination, rather than stepping on the gas and turning the steering wheel for each one.

But taken into the light of bending a language to your will, it makes it more clear for me as to how great hackers are ten times as productive. Great hackers are productive not only because they know what problems to sidestep and can problem solve systematically and quickly, but they also build a set of tools for the problem domain as they go along. They are very good pattern recognizers and will be able to generalize a particular pattern of code, so that they can use it again. But not only that, great hackers will create an implicit understanding attached to the abstraction, ie. what we might call common sense.

A case in point. Before Ruby, I’d used for loops over and over again, never really thinking that I could abstract a for loop. It wasn’t until they were taken away in Ruby did I realize that map, inject, and their cousins are all abstractions of the for loop. When I see “map” I know that it performs a transformation on every element. But I also know that the array I get back will be the same size, that each element operation doesn’t affect other elements, among other things. These are implicitly stated, and they allow for shorter code.

When that happens, you can simply read “map”, and get all the connotations it comes with, and hence it comes with meaning. It also becomes easier to remember, since it’s a generalized concept that you can apply in different places in the code. The more times you use it, the easier it is to remember, instead of having specialized cases of the same kind of code where the behavior is different in different parts of the code.

A great hacker will take the initial time upfront to create this generalized code, and will save in the long run being able to use it. Done over and over again, it all adds up. So it’s not that for any given problem, a great hacker will be done in 4 hours what it takes an average programmer 40 hours, but that over time, a great hacker will invest the time to create a tools and vocabulary that lets him express things easier. That leads to substantial savings in time over the long haul.

I hesitated writing about it, as it’s nothing I (nor you) haven’t heard before. But I noticed that until recently, I almost never lifted my level abstraction beyond what the library gave me. I would always be programming at the level of the framework, not at the level of the domain. It wasn’t until I started writing plugins for rails extracted from my own work and reading the Paul Graham article that a light went off for me. It was easier to plug things like act_as_votable together, rather than to still mess around with associations (at the level of the framework). I still believe you should know how things work underneath the hood, but afterwards, but all means, go up to the level of abstraction appropriate for the problem domain.

DSLs (Domain specific languages) are really just tool-making and vernacular creation taken to the level of bending the language itself. It’s especially powerful if you can add implicit meaning to the vernacular in your DSL. It’s not only a way of giving your client power in their expression, but it’s also a refactoring tool so that you can better express your problem in the language of the problem domain. Instead of only adding methods to your vernacular, you can change how the language works. It was with this in mind that I did a talk on DSLs this past weekend at the local Ruby meetup. First part is on Dwemthy’s Array, and the second is using pattern matching to parse logo. Both seemed pretty neat when I first read about it. Enjoy!


DSL in Ruby through metaprogramming and pattern matching

Getting rid of emacs 23 splash screen

Pretty Emacs Reloaded

The latest emacs 23 is pretty neat, but it noticeably has that splash screen in the beginning. It’s generally good practice for an app not to present new users with just a blank screen. There’s should be something there to say, ‘hey, these are the next steps’.

That said, I’ve found it annoying, probably because I’m use to it not being there. So I’ve found that you can turn it off through this comment.


(setq inhibit-splash-screen t)

Just put that in your .emacs file in your home directory, and you’re all set. tip~

Generating rdoc for gems that refuse to generate docs

I recently upgraded to capistrano 2.1, and it’s woefully lacking in documentation. Jamis Buck had already picked a documentation coordinator about a month ago, but nothing seemed to be happening since.

So it was time to go dumpster diving in cappy code. I at least wanted to see what the standard variables were. To my surprise, there were some docs in code, but I couldn’t generate it with


gem rdoc capistrano

For those of us that had never made a gem, all you have to do to force it to is to edit the associated specification for the gem (/usr/lib/ruby/gems/1.8/specifications/), and add


s.has_rdoc = true

Maybe if I dig enough stuff out of it, I’ll pose some prelim documentation for cappy 2.1 and donate it to the documentation effort.

As an aside and musing, ideally, the code itself would be documentation. However, just because you’re reading code, doesn’t mean that you can get the overall picture of how to use it. Even if you filtered out the details, and saw just the class and method declarations, that still wouldn’t be enough since you can’t see how things fit together. I don’t know what a good solution would be. The simplest API is often complex enough to be nearly useless without good documentation.

Small tip!

The user owns this and that

In most any community-based website, your users are creating objects. And sometimes, you’d only like to display them when they’re owned by the user. Usually, it’s easy enough to do the check based on the association in the template views.



"story"

However, for the sake of perhaps over-reaching, but something more readable, we try:


class Story < ActiveRecord::Base
belongs_to :account

def owned_by?(user)
self.account == user
end
end



"story"

But this gets to suck, because you have duplicate code in different classes that are just slightly different due to association names. One way to solve it is to put it in a module, and include it in different classes. After all, that’s what mixins are for.


module Ownership
def owned_by?(user)
self.account == user
end
end

class Story < ActiveRecord::Base
include Ownership
belongs_to :account
# blah blah blah
end

class Post < ActiveRecord::Base
include Ownership
belongs_to :account
end

Or alternatively, you can put it in the account, but in reverse. But now, you have to search through the associations of the owned ActiveRecord object.


class Account < ActiveRecord::Base
def owns?(ar_object)
ar_object.class.reflect_on_all_associations(:belongs_to).detect { |assoc|
ar_object.send(assoc.name) == self
} ? true : false
end
end

I find the tertiary operator to be kinda ugly there, but it doesn’t make sense to return the reflection object. Regardless, this lets you do:



"story"

However, this doesn’t work for self-associated AR objects, or objects that have two or more belongs_to to the same account. It relies on a unique belongs_to association for every object belonging to account. I’m not sure yet, which way’s the better way to go, and in the end it probably doesn’t matter as much, but I do like being able to say user.owns?(anything) for any object without really thinking about what the associations are. half-tip.