April 13, 2010

Gateway to the West

Filed under: collabora; posted by alsuren @ 12:16 pm

It has occurred to me that I should probably write a blog post about the project that I’ve been working on recently. We have named it Fargo. Its job is to act as a gateway between the XMPP and SIP networks. The aim is to let you make voice calls from your XMPP client to your SIP contacts without needing to run a SIP-capable client on your local machine. It also lets you receive incoming calls from SIP contacts in your XMPP client.

The program runs as a standard gateway, so if Fargo were running on, then you would tell the details of your SIP account to Once registered, whenever you log into your XMPP account, the gateway will sign you into your SIP account so that you can send/receive calls. For example, if I were to call you, you would see an incoming call from, and it would behave exactly like a call from one of your XMPP contacts. If you wanted to call me then you would just add “” to your XMPP contacts, and then click the call button.

Now, if you wanted to make a one-off call to me, but you didn’t want to have to add me to your contact list and then remove me again afterwards, Empathy/Gabble already lets you do this (thanks to an extension that we wrote). Just open up the “New Call” dialog, and type in

So what goes on behind the scenes? When you make a call, your XMPP client opens up a pair of ports in your router (most likely using STUN or UPnP). It then sends your public IP address and port numbers to along with a list of which codecs it supports (this is pretty much the same as what it would do if you were calling someone who is using Google’s official client, or the echo service). Fargo then reads this information and sends it over dbus to telepathy-sofiasip, which forwards it on to me via SIP. My SIP client then sends back some IP/port pairs, and a list of supported codecs (as if it were talking to an X-Lite user, or the echo service).

Once both sides are aware of each other’s addresses and codecs, they can start sending media directly to each other using the Real-time Transport Protocol. The gateway is only involved in setting up and tearing down the call, so I can keep you on the phone chatting about the meaning of life for an hour, and the gateway will only be woken up to say “bye” when one of us hangs up. This means that the cost of making an hour long call through the gateway is remarkably small.

Fargo has been extensively tested with telepathy-gabble, both on the N900 and under Empathy. There are some corner cases which don’t always work as well as they could (e.g. NAT boxes which don’t support hairpin routing, which only work if the SIP server provides a media relay) because many SIP implementations (including telepathy-sofiasip) don’t support ICE candidate negotiation. Also, if the XMPP client and the SIP client have no codec in common then the call will fail. All the other cases which we tested seem to work fine though.
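To make the codec failure case concrete, here is a minimal sketch of the kind of check that decides whether a call can go ahead. It is not Fargo's code: the function names, the case-insensitive comparison and the "caller's preference order wins" rule are all assumptions made for illustration.

```python
def negotiate_codecs(offered, answered):
    """Return the codecs both sides support, in the caller's preference order.

    Assumption: codec names are compared case-insensitively.
    """
    supported = {name.lower() for name in answered}
    return [name for name in offered if name.lower() in supported]


def call_can_proceed(offered, answered):
    """A call needs at least one codec that both clients understand."""
    return bool(negotiate_codecs(offered, answered))


# Hypothetical codec lists for an XMPP client and a SIP client.
xmpp_codecs = ["speex", "PCMU", "PCMA"]
sip_codecs = ["PCMA", "pcmu", "G729"]
shared = negotiate_codecs(xmpp_codecs, sip_codecs)
```

With these made-up lists the two clients share PCMU and PCMA, so the call can be set up. If the SIP side only offered G729, the shared list would be empty and the call would fail.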

In addition, we have some test scripts which demonstrate a Fargo + ejabberd combination comfortably sustaining a rate of two new logins and two new calls per second on a single-core VM. Calls were limited to 10 seconds each in our tests, but since Fargo is only involved in setup and teardown, the length of calls shouldn’t be very important anyway. Fargo also plays nicely with ejabberd’s “bare_source” load balancing method, so if your gateway becomes too popular, you can just add a few servers and cluster.

Currently, Fargo doesn’t relay IMs, and only supports negotiating raw-udp transports, so it’s probably best used in conjunction with a SIP server that includes a media relay. It also relies on your XMPP server to store your roster rather than trying to store one itself. The program is written in Python (using twisted) and utilises telepathy-sofiasip for the SIP side of things (this could in theory be replaced with another connection manager to handle other protocols). Don’t hesitate to get in touch if you want to know more, or get involved.


November 14, 2008

Versioned Home Directory, and other Ideas for Projects.

I have started using version control (bzr) on my home directory. I hope this will eventually solve a few problems:

1) Sharing settings with other people. This is something that I’ve been looking for a solution to for a while (there are standard ways to share apps and themes ( and pals) but not configs). If everyone keeps their configs versioned, then it should be possible to cherry-pick changes more easily.

2) Creating consistent settings across many different Linux machines (as discussed in my colourhash post). (Side note relating to colourhash: many graph plotting programs have ways of automatically assigning distinct colours. I will look at that at some point too.)

3) As a backup framework: If I have all of my settings under distributed version control on 4 machines, then when I accidentally delete large chunks of my home directory (like the other day, when cmake created a folder called ‘$HOME’ which I wanted to delete…) then I don’t lose all of my rss feeds, proxy settings (which I still haven’t managed to get working again, thanks to KDE’s incredibly fragile socks[lack-of] implementation/configuration) and email settings (resulting in me not being alerted about emails for 2 days (>20 emails))[/rant]

The progress so far is a ~/.bzrignore file as long as your arm.

Eventually, I plan to host it on launchpad (as soon as I’ve verified that it doesn’t contain any security-critical information (I don’t think I have anything else that I would have a problem hosting on launchpad. Reading the content of my other posts might give you an idea of my views on privacy.))

==== Technobabble-filled braindump below this line ====

If anyone knows how to do nested repositories (eg. so I can get bzr to manage my ~/src as well, and so that I can have sensitive information like ~/.ssh and ~/.gpg versioned in some way that lets *me* merge them between computers, but doesn’t expose them on launchpad) give me a shout.

In other news, I taught myself a bit of perl last night, when trying to add sed-style text replacement to pidgin (by hacking apart a script called whose interface was arbitrarily horrible, and only allows output replacement). I’m currently fighting with pidgin’s settings management to get persistent rules. If anyone wants it, get in touch. Otherwise, it will be in launchpad under ~/.purple/ when I get my home directory on there :D.

If anyone wants to give me input on the interface, it would be muchly appreciated. Currently, we have:
/sed foo-to-bar s/foo/bar/g
(to add a replacement rule)
/sed foo-to-bar s///
(to replace the old rule with an existing rule)

I’m thinking something more like:
/sed s/foo/bar/g
(to add a rule; a number will be assigned to each rule as an identifier)
/sed -l
(to list all rules, and associated numbers)
/sed -d #number
(to delete rules)
/sed -o s/foo/bar/g
(to only correct outgoing text)
/sed -i s/foo/bar/g
(to only correct incoming text)

Unfortunately, perl is a *horrible* language (doesn’t even have a concept of named function arguments) so the resulting code is unlikely to be anything I’m proud of.

While I think about it, I also had a load of ideas for python-based projects:
A man-page parsing command-line completion handler for ipython (and possibly bash, but bash scripts take so much longer to debug, and I get the feeling I will soon be using ipython as my default shell anyway.)
Given that debian policy forces all commands to have a man page, this is a pretty reliable way to write a powerful tab-completer. Also, since you only ever read the man page when the tab completion doesn’t work, you might as well get the tab-completer to read the man page for you.
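A rough sketch of the man-page idea, assuming the page has already been rendered to plain text (for example by capturing the output of the man command). The regular expression and the function names are made up for illustration, and a real completer would need to cope with far messier pages than this sample.

```python
import re

# A dash-prefixed word that is not glued to a preceding word character or dash,
# so "well-known" in running text does not produce a bogus "-known" option.
OPTION_PATTERN = re.compile(r"(?<![\w-])(--?[A-Za-z0-9][\w-]*)")


def options_from_man_text(man_text):
    """Collect the distinct options mentioned in the text, in order of appearance."""
    seen = []
    for match in OPTION_PATTERN.finditer(man_text):
        option = match.group(1)
        if option not in seen:
            seen.append(option)
    return seen


def complete(prefix, man_text):
    """Return the options from the man page that start with what was typed so far."""
    return [opt for opt in options_from_man_text(man_text) if opt.startswith(prefix)]


# Hypothetical man page excerpt, used only to exercise the functions above.
SAMPLE_MAN_TEXT = """
OPTIONS
       -f, --filename FILE
              read input from FILE
       -r, --repeats N
              print the text N times (a well-known trick)
       --dry-run
              do nothing
"""

all_options = options_from_man_text(SAMPLE_MAN_TEXT)
long_options = complete("--", SAMPLE_MAN_TEXT)
```

Hooking `complete` into ipython's or bash's completion machinery is the part that is left out here.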

A callback/decorator library for creating command-line programs, with an interface along the lines of:

# NOTE: clargs is the imagined library; none of these calls exist yet.
@clargs.handles("-f", "--filename")
def input_filename(filename):
    """The filename you want to read."""
    global input
    input = filename

@clargs.arguments("REPEATS", int)
def main(repeats):
    """Reads FILENAME to stdout REPEATS times."""
    text = open(input).read()
    for i in range(repeats):
        print(text)

if __name__ == "__main__":
    clargs.run()  # hypothetical entry point: parse sys.argv and call the handlers

It should also auto-generate help and man pages using the information given. An even more fun thing to do (with python3000) would be to use nose-style runtime inspection to detect a function of the form:

def handle_filename(name: str):
    """The filename you want to read."""
    global input
    input = name

and make that handle --filename input.txt (maybe with an @shortopt('f') decorator).
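The annotation-inspection idea can be approximated today with the standard library. The sketch below is not the clargs interface described above: it uses `inspect.signature` to read a function's annotations and builds an `argparse` parser from them. The helper names `parser_from_function` and `run`, and the example function `describe`, are invented for illustration.

```python
import argparse
import inspect


def parser_from_function(func):
    """Build an argparse parser with one --option per parameter of func."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(func))
    for name, param in inspect.signature(func).parameters.items():
        # Fall back to str when a parameter carries no annotation.
        if param.annotation is inspect.Parameter.empty:
            arg_type = str
        else:
            arg_type = param.annotation
        if param.default is inspect.Parameter.empty:
            parser.add_argument("--" + name, type=arg_type, required=True)
        else:
            parser.add_argument("--" + name, type=arg_type, default=param.default)
    return parser


def run(func, argv=None):
    """Parse argv (sys.argv[1:] when None) and call func with the parsed values."""
    args = parser_from_function(func).parse_args(argv)
    return func(**vars(args))


def describe(filename: str, repeats: int = 1):
    """Report which FILENAME would be read and how many REPEATS were asked for."""
    return (filename, repeats)


result = run(describe, ["--filename", "input.txt", "--repeats", "3"])
```

The docstring doubles as the help text, which is in the spirit of auto-generating help from the information already in the code. Short options and man page generation are not covered by this sketch.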

A subclass of numpy.ndarray that has named axes, and user-specified ranges, so…

likelihoods = semantic_array( ('t', 1000), ('x', -100, 100), ('v', -10, 10) )
# sum over v, and preserve the x and t axes.
position_likelihoods = sum(likelihoods, axis='v')
# get the best guess of x for each time t.
maximum_likelihood_x_estimate = argmax(position_likelihoods, axis='x')
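To show what "named axes with user-specified ranges" could mean in practice, here is a deliberately small pure-Python stand-in. It is not a numpy.ndarray subclass: it stores values in a dict keyed by real coordinates, and the class name `NamedGrid` and its methods are assumptions made for the sake of the example.

```python
from collections import defaultdict


class NamedGrid:
    """Values addressed by named axes whose coordinates are real quantities."""

    def __init__(self, axes, values):
        self.axes = tuple(axes)      # e.g. ("t", "x", "v")
        self.values = dict(values)   # {(t, x, v): likelihood}

    def sum(self, axis):
        """Sum over one named axis and keep the others."""
        drop = self.axes.index(axis)
        totals = defaultdict(float)
        for coords, value in self.values.items():
            totals[coords[:drop] + coords[drop + 1:]] += value
        return NamedGrid(self.axes[:drop] + self.axes[drop + 1:], totals)

    def argmax(self, axis):
        """For each setting of the other axes, the coordinate on `axis` with the largest value."""
        pick = self.axes.index(axis)
        best = {}
        for coords, value in self.values.items():
            rest = coords[:pick] + coords[pick + 1:]
            if rest not in best or value > best[rest][0]:
                best[rest] = (value, coords[pick])
        return {rest: coord for rest, (value, coord) in best.items()}


# Made-up likelihoods over t in {0, 1}, x in {-1, 1}, v in {0, 1}.
likelihoods = NamedGrid(("t", "x", "v"), {
    (0, -1, 0): 0.1, (0, -1, 1): 0.2, (0, 1, 0): 0.5, (0, 1, 1): 0.1,
    (1, -1, 0): 0.4, (1, -1, 1): 0.3, (1, 1, 0): 0.1, (1, 1, 1): 0.1,
})
position_likelihoods = likelihoods.sum(axis="v")
best_x_per_t = position_likelihoods.argmax(axis="x")
```

Note that `argmax` hands back the x coordinate itself (here -1 or 1) rather than an index into the array, which is the main attraction of attaching ranges to the axes.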

A delayed evaluation library (might end up stealing a lot of ideas from scipy and sympy, with a good chunk of twisted to boot)
An interesting feature of python is that its assignment operator is *purely* a pointer-update. When you say "x=y" it just makes x point to the same thing y points to. This means that if you get passed x into a function, you can safely write x = x*10, and it won't modify x in the code that called you. This lack of side-effects (and all manner of other things) makes many python libraries look like pure-functional libraries.
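A small demonstration of the difference, using plain lists so that it runs without numpy. The function names are made up. The second function is a reminder that augmented assignment is not a plain rebinding for mutable objects.

```python
def rebind(values):
    # Builds a new list and points the local name at it; the caller's list is untouched.
    values = values * 2
    return values


def mutate(values):
    # For lists, *= extends the existing object in place, so the caller sees the change.
    values *= 2
    return values


original = [1, 2]
rebound = rebind(original)
after_rebind = list(original)   # snapshot: still [1, 2]

mutated = mutate(original)      # original itself has now been doubled
```

After `rebind`, `original` is still `[1, 2]`. After `mutate`, `original` has grown to `[1, 2, 1, 2]` and `mutated` is the very same object.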

On the down-side, it won't let you override the assignment operator, so when you're dealing with large amounts of data, you can't re-use arrays without jumping through hoops. If X is a 1000000x1000000 matrix, your choices are:
X = multiply(X,10) # The canonical form, but it creates a temporary array for the return value which is alive at the same time as X (potentially taking up twice as much memory as needed)
multiply(X, 10, out=X) # The numpy interface (potentially does the multiplication in place)
X *= 10 # works in-place, and is all very good, but what if I want to do something that's not +=,-=,*= or /=?

Then there are the slightly more hacky options, which involve delayed evaluation:
X == multiply(X, 10) # Override the logical equals operator X.__eq__ (the sympy method for writing symbolic equations). This is the most horrible, because it stops you being able to do X = (X==Y)
context.X = multiply(context.X, 10) # Override the attribute assignment operator context.__setattr__
X[:] = multiply(X, 10) # Override the item assignment operator X.__setitem__  (or sometimes X.__setslice__, I think)
Note that this last one feels quite like fortran, but it might be the least horrible of all the interfaces.

So how would these things work? A simple sympy-style one looks like this:
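import numpy

def multiply(X, Y):
    def deferred_calculation(output):
        # numpy's ufuncs take the destination array as the "out" keyword
        numpy.multiply(X, Y, out=output)
    return deferred_calculation

# In X's class definition:
def __eq__(self, deferred_calculation):
    # Run the deferred calculation, writing the result into this array
    deferred_calculation(self)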

The problem is that if you accidentally do X = multiply(X,Y) then X is just the deferred_calculation function. That’s not very useful. On the other hand, if the returned “deferred_calculation” object can be made to behave like a numpy array, then you’re in for a win.

The fun stuff will start to happen when you start using these “deferred_calculation” objects, and passing them in and out of other functions, so you have a massive chain of deferred calculations. If you then include an interface for inspecting chains of deferred objects, you can start to write deferred-to-$LANGUAGE compilers, which would let you write “say what you mean” algorithmic code in python.
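A chain of deferred calculations with an inspection interface might look something like the sketch below. It works on plain numbers rather than numpy arrays, and the `Lazy` class, `constant` helper, `evaluate` and `describe` are all hypothetical names chosen for this example.

```python
import operator


class Lazy:
    """A value that records how to compute itself instead of computing straight away."""

    def __init__(self, op_name, func, *operands):
        self.op_name = op_name
        self.func = func
        self.operands = operands

    def __add__(self, other):
        return Lazy("add", operator.add, self, other)

    def __mul__(self, other):
        return Lazy("multiply", operator.mul, self, other)

    def evaluate(self):
        """Walk the chain and do the actual work."""
        args = [o.evaluate() if isinstance(o, Lazy) else o for o in self.operands]
        return self.func(*args)

    def describe(self):
        """Render the chain as text; a compiler to another language would start here."""
        parts = [o.describe() if isinstance(o, Lazy) else repr(o) for o in self.operands]
        return "%s(%s)" % (self.op_name, ", ".join(parts))


def constant(value):
    """Wrap a plain value so that it can start a chain."""
    return Lazy("constant", lambda v: v, value)


chain = (constant(3) * 10) + 2
description = chain.describe()   # 'add(multiply(constant(3), 10), 2)'
value = chain.evaluate()         # 32
```

Nothing is computed until `evaluate` is called, so the whole expression is available for inspection, rewriting or translation first.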

A way of writing twisted applications in a blocking style (using generators).
This idea is in some ways quite similar to the idea above. I’m sure I sketched an implementation up somewhere (possibly on the eee), but I don’t seem to have posted about it. The gist of it is as follows:

In the example below, unblockify (implementation omitted) acts like a filter in two ways:
From the caller’s point of view, some_generator produces a sequence, but only things that aren’t deferreds get let out of the filter to the caller.
From the generator’s point of view, “yield” acts like a filter (or in compsci terms, a “map”). Any “deferred” objects sent through it get turned into real objects, and any real objects sent through it disappear. I’m still deciding whether to do something magical when None gets yielded. We’ll see.

def some_generator():
    while True:
        result_of_deferred = yield function_which_returns_deferred()
        yield some_immediate_function(result_of_deferred)

for out in some_generator():
    print(out)

While this program appears to be blocking, it shouldn’t cause unresponsiveness in GUI applications. This is because the filter passes control to the twisted reactor while it’s waiting for each deferred to fire.
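The filtering behaviour described above can be sketched without twisted. In the toy version below, `FakeDeferred` is a stand-in that already knows its result, so nothing is ever handed to a reactor; a real implementation would have to wait for the deferred to fire at the marked line. All names here are assumptions for illustration.

```python
class FakeDeferred:
    """Stand-in for a twisted Deferred whose result is already available."""

    def __init__(self, result):
        self.result = result


def unblockify(generator):
    """Let real objects out to the caller; turn deferreds into results for the generator."""
    to_send = None
    while True:
        try:
            value = generator.send(to_send)
        except StopIteration:
            return
        if isinstance(value, FakeDeferred):
            # A real version would return to the reactor here until the result arrives.
            to_send = value.result
        else:
            # Real objects "disappear" from the generator's point of view: yield gives None.
            to_send = None
            yield value


def word_lengths(words):
    for word in words:
        length = yield FakeDeferred(len(word))
        yield "%s has %d letters" % (word, length)


lines = list(unblockify(word_lengths(["spam", "eggs!"])))
```

From the caller's side only the two strings come out of the filter, while the generator sees each `FakeDeferred` replaced by its result.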

If anyone is interested in any of these projects, please shout.
