Wow, where do I start? To begin with, VizReader is the application in question. It's a pretty weird thing but I love it. I get a small whiff of the same feeling I had when I was a kid playing Dragon's Breath on the Amiga. Dragon's Breath was both weird and wonderful, and it makes me proud to have created something that gives me a little bit of that feeling.
Gradually though, other approaches and languages started to appear on my radar: Python, Ruby, and Ruby's popular web development framework Ruby on Rails were a few of the things I started hearing about and getting interested in. When you're at this stage of language exploration you're bound to hear of Lisp sooner or later, and Common Lisp is where this story begins in earnest.
It really feels weird now to put terms like "language exploration" and Ruby/Python in the same sentence, but that is in fact exactly what it was. These two mainstream languages felt like the cutting edge to me at the time (roughly three years ago). In the end I had to pick one of them, and I picked Ruby (blame superior marketing). Whatever the reason, when you're in the process of learning Ruby you will sooner or later read stuff like "Ruby is a good Lisp" and "Ruby has borrowed so much from Lisp" and so on and so forth.
So I thought, WTF is this Lisp thing people are raving about, and why isn't anyone using it since it seems to be the lightsaber to rule them all? Shit like Paul Graham's Beating the Averages and The Hundred-Year Language didn't exactly help either; it just got me more stoked and wanting to get to the bottom of it all.
It didn't take long for me to find stuff like How Common Lisp Sucks, and my own experiences weren't very pleasant either. At this point I realized that CL wasn't going to be the next step for me; it wasn't going to beat the averages, or rather make me beat the averages. In fact it seemed like an outdated relic in sore need of attention it hadn't been getting.
The fact that Paul was creating his own Lisp didn't reflect positively on CL either; I mean, if CL was the end of the line, why was he putting effort into something new? At that time Arc wasn't yet finished (or named), but the intention had been voiced publicly. Anticipation was high, but I certainly didn't have the patience to hang around for the release. If the love of Lisp as a format was so strong among serious programmers, and at the same time the frustration with CL was so high and so old, then the hundred-year language should already be around, I figured. And I was right.
First of all I needed something that would simply run out of the box; I didn't need some crazy package manager presenting me with half-assed libraries that would maybe install properly if I was lucky. I could do without having to compile, too. This was and is a big one for me ever since I started out with PHP and realized how amazing it is to simply run my code without any bullshit steps in between. So I continued looking and eventually found a link to PicoLisp's download/home page in some comment on some forum.
It's strange how random life is, and how such a small thing as reading some small comment can change your whole life.
At first I wasn't sure that I had found the hundred-year language (I am now, though). It all seemed too good to be true. And it wasn't easy either: I'm basically a self-educated programmer, and a lot of the concepts were beyond me (some still are).
Say hello to Alexander Burger, the creator of PicoLisp. VizReader wouldn't have been possible without his relentlessly dedicated support; one wonders where he finds the time. Given how amazingly effective the default PicoLisp web dev framework is for creating enterprise-style applications, maybe it's not so surprising though: he could easily put together between breakfast and lunch something that would take a year to finish off in a Java framework, and I'm not really joking that much either. That kind of productivity is bound to leave you with some time to kill.
Still to this day I don't know exactly what prompted the creation of PicoLisp. I just know that Alex created the first version in the '80s while working in Japan, and the target hardware was, believe it or not, Macs. Today PicoLisp runs on both Linux and Mac OS X, at least the 32-bit version; last I checked the 64-bit version was only running on Linux.
I've always regarded myself as maybe a notch more clever than average: far from stupid, but no genius either. To learn something like PicoLisp it is extremely helpful to have a CS degree and a solid knowledge of how Linux works as a system, with pipes, forks and more. I had neither, so it took roughly two years of spare time and boatloads of help from Alex to get where I am now.
Apart from the fact that PicoLisp is interpreted (and a Lisp, of course), the other main selling point for me is its object persistence and how it's implemented. I can simply refer to objects on disk as if they were already loaded in memory, and it happens right away. It's like having an ORM but without the speed penalties and the configuration bullshit (the "schema" is the config).
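PicoLisp's actual database syntax isn't shown here, but the "objects on disk behave like objects in memory" idea can be sketched with Python's standard-library shelve module. This is purely a hypothetical stand-in for illustration, not how PicoLisp implements it:

```python
import os
import shelve
import tempfile

# Illustration only: shelve gives dict-like access to objects that
# actually live on disk, loosely analogous to referring to PicoLisp's
# persistent objects as if they were already in memory.
path = os.path.join(tempfile.mkdtemp(), "articles")

with shelve.open(path) as db:
    db["article-1"] = {"title": "Hello PicoLisp", "read": False}

# Later, possibly in another process: the object comes straight off
# disk, with no ORM mapping layer or configuration in between.
with shelve.open(path) as db:
    print(db["article-1"]["title"])  # Hello PicoLisp
```

The difference is that in PicoLisp this transparency is built into the language's object system itself rather than bolted on through a library.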
After a few months of tinkering I decided to try to put PicoLisp to the test by creating a more "normal" web application, you know, something that didn't have anything to do with administrative tasks requiring a distributed system juggling half a billion objects. At this point I had started to get fed up with Google's feed reader, and I thought that an RSS reader might be a good target for PicoLisp. A feed reader should be a fairly straightforward and simple thing to make, shouldn't it? Or so I thought…
Visualizations, feature creep and externalities
As it turns out, making a feed reader is not really that easy, even without the feature creep, which soon enough became rampant. In theory, a feed reader in PicoLisp is simply a matter of downloading XML files, converting the contents to persistent objects, and then having an interface that displays the contents of those objects.
Unfortunately there are a lot of feeds out there that comply with neither the RSS 1.0/2.0 nor the Atom specs, hence any trivial recursive XML parser will fail miserably. This is what I call external issues; it's like external costs in economics. Something you can't control is causing you grief, like smoke-spewing cars or a lot of noise. Comparing the RSS situation on the internet with the real world, it's maybe something like Bangkok traffic: you get nowhere and you get slowly gassed to death while just sitting there, and there's nothing you can do about it once you're stuck, except slug it out.
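To make the problem concrete, here is a hypothetical Python sketch (the real parser is in PicoLisp) of why a strict XML parser fails on such feeds and what a crude fallback might look like. The regex repair below is an assumption for illustration; real-world cleanup needs far more cases:

```python
import re
import xml.etree.ElementTree as ET

# A typical spec violation: an unescaped ampersand in the title.
broken_feed = (
    "<rss><channel><item>"
    "<title>Unescaped & ampersand</title>"
    "</item></channel></rss>"
)

def parse_feed(text):
    """Try a strict parse first; on failure, attempt a crude repair."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        # Hypothetical minimal fix: escape stray ampersands that are
        # not already part of an entity reference, then retry.
        cleaned = re.sub(r"&(?!#?\w+;)", "&amp;", text)
        return ET.fromstring(cleaned)

root = parse_feed(broken_feed)
print(root.find("./channel/item/title").text)  # Unescaped & ampersand
```

A strict parser alone would simply reject the feed above, which in practice means rejecting a large slice of the real-world feed population.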
Anyway, having overcome the XML problems pretty quickly landed me with a fairly simple RSS reader. But wouldn't it be cool with some natural language processing too, to calculate similarities between articles and create a content discovery feature? Awesome, but in the end I had to learn the hard way that in order to get good results you have to commit a fair amount of processing power. No, strike that: a huge fucking amount of processing power. That's why that stuff didn't make it into this initial version of VizReader.
However, a fair amount of content discovery can still be had through the visualizations, albeit at the discretion of the user; no magic here that automatically tells you what you might be interested in based on what you've already read, unfortunately. I'm back in contemplation mode when it comes to that part, refusing to give up.
The visualizations were fairly easy to implement since I was already familiar with ActionScript, though not with Ant and the Linux Flex SDK… Oh, I almost forgot: since I couldn't really use Alexander's web framework for most of my stuff I had to come up with something else, and that something else is mostly AJAX by way of JSON. So I had to write my own JSON encoder/decoder, which wasn't that big a deal either, but you see how stuff is adding up…
Now of course I couldn't live without whole-word search (no partials here, that would be too much). Setting up such an index with persistent objects in PicoLisp is trivial: awesome, done deal, let's move on to the next item on the list… not. Anyone reading this who has actually set up a word index meant to be used with a lot of data and by a non-trivial number of people has already started smirking by now.
This shit takes up a lot of space and requires a massive amount of writing when importing new data. To avoid a lot of write locks during content imports I had to create a distributed system to handle the indexing, and it had to scale up automatically as more content was imported. Queries on these external indexes are handled asynchronously; it would take too long otherwise. That wasn't really hard either in the end, once I knew how to do it… This part really made me fully appreciate the job Google is doing, though. Maintaining a full-text index of the whole internet is not trivial; I really feel for the people tasked with it, I really do.
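A toy version of a whole-word index (a Python sketch, purely illustrative; the names are made up) shows where the write volume comes from: every single word of every imported article touches the index.

```python
from collections import defaultdict

# Toy inverted index: each word maps to the set of article ids that
# contain it. The real VizReader index is persistent and distributed;
# here everything lives in one in-memory dict.
index = defaultdict(set)

def index_article(article_id, text):
    # One index write per distinct word, for every imported article:
    # this is why bulk imports hammer the index so hard.
    for word in text.lower().split():
        index[word].add(article_id)

def search(word):
    # Whole-word lookup only, no partial matches.
    return index.get(word.lower(), set())

index_article(1, "PicoLisp object persistence")
index_article(2, "persistence of vision")
print(search("persistence"))  # {1, 2}
```

Once the index is sharded across several processes, a search has to fan out to all of them and merge the results, which is why doing those remote queries asynchronously matters so much.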
As I was using the application I quickly got hooked on the realtime aspect of Twitter, and I remembered having heard about some new way of getting content in realtime for normal blogs too. There was just one tiny problem: the default HTTP server Alex had already made (and which I was, and mostly still am, using) couldn't handle raw content posts, which is what you get when you want to receive PubSubHubbub updates. Therefore a new specialized HTTP server had to be written. This isn't really a biggie either, if you've done it before… And as it turns out not all pubsub feeds conform to the specs either, so a trivial solution was a no-go here too. At least I haven't encountered any broken pubsub XML yet, but it's probably just a matter of time… If that happens I'll have to rewrite the pubsub import completely. Awesome.
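The real server is written in PicoLisp; this hypothetical Python sketch just shows the core requirement: a PubSubHubbub hub POSTs the raw Atom/RSS payload as the request body, so the subscriber must read Content-Length bytes verbatim instead of expecting form-encoded parameters.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # stand-in for handing the payload to the feed importer

class PushHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The update arrives as the raw request body, not as form fields.
        length = int(self.headers.get("Content-Length", 0))
        raw_feed = self.rfile.read(length)
        received.append(raw_feed)
        self.send_response(204)   # the hub just wants a 2xx back
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

# Self-contained demo on an ephemeral port.
server = HTTPServer(("127.0.0.1", 0), PushHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/push", data=b"<feed/>", method="POST"
)
urllib.request.urlopen(req)
server.shutdown()
print(received)  # [b'<feed/>']
```

A server built around form parameters would try to URL-decode that body and mangle or drop the XML, which is exactly why the default server couldn't be reused as-is.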
Oh, I almost forgot: the original URL and the URL of each article are sometimes different when the content comes by way of pubsub, as opposed to simply reading the source directly. Therefore I had to check the targets of 301 redirects, which is trivial (after you've read up on the ins and outs of curl), and luckily I could use the same asynchronous approach I used for the external word indexes to prevent these checks from taking forever.
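A hedged Python sketch of the redirect check (the real implementation uses curl from PicoLisp, and `canonical_url` is a made-up helper name): urllib follows 301/302 redirects automatically, much like curl with `-L`, so the canonical URL is simply whatever the response reports once the redirect chain has been followed.

```python
import urllib.request

def canonical_url(url, timeout=10):
    """Follow any 301/302 redirects and return the final URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # geturl() reflects the URL after all redirects were followed.
        return resp.geturl()
```

In VizReader these checks run asynchronously, the same way the remote word-index queries do, so a slow redirect chain never stalls a content import.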
Now what was missing was arbitrary tag trees, tagging of articles, read/unread logic here and there, and the ability to like/unlike stuff. And if you like a Twitter post, it gets favorited on twitter.com too. By now you're probably amazed that I got "finished" at all. I know, me too.
I just want to finish off by acknowledging the huge job Alexander has done educating me. Thanks! I could not have done it without you.
PicoLisp is really one of a kind, getting to learn it has been a privilege. The mere existence of such awesome things lying around undetected on the internet is both disturbing and amazing. Amazing because they exist at all and disturbing because they’re not used more than they are.
Now, true to form, I have to give you the breakdown of how to write a big app in an obscure language you don't know, in 10 steps:
1.) Find the homepage of the language in question.
2.) Check the documentation of the language. Is it good? If not, see #3.
3.) Check how responsive the people who do know the language are; if they're not responsive, abandon it.
4.) Check how many bells and whistles there are, i.e. how much will you have to write yourself in order to accomplish what you want to do?
5.) If the answer to #4 was "a lot", try to gauge how easily and quickly you could write the stuff you need yourself. If the answer is negative, you might want to continue looking for something better suited.
6.) Dig in for the long haul and start with the libraries you need to write, writing them will expose the basics of the language so it’s a good place to start.
7.) Ask a lot of questions rather than trying to figure things out yourself by trial and error; that's why #3 is so important.
8.) When writing higher level stuff (an HTTP server, for instance) it really helps to study existing code (an existing HTTP server, for instance).
9.) Find out how to integrate, for instance, C libraries into your code; writing a regular expression library from scratch doesn't make sense.
10.) Realize that you have to put down a lot of effort into it and don’t give up!