Tuesday, February 06, 2007

Features & Bugs oh my!

Being on Slashdot, Reddit and other news sites has its ups & downs. On the positive side, you get a lot of viewers, and for the most part people seemed fairly positive. I also got a lot of useful bug reports about such wonderful things as a cross-site scripting vulnerability and places where things should have been escaped, plus a few feature requests.

Since All The Code has a fairly small database (it only indexes .java files and, for the time being, does not look inside cvs/svn/git or even zip files), a lot of searches didn't return many results. The solution to this is to allow substring matching (i.e., you search for "btree", it finds "tree" and gives you tree). This has the wonderful effect of giving you lots of results, but depending on what you are searching for, this stemming can lead to much sadness, giving not-so-good results at times. My present solution involves some magic for determining when to use stemming and when not to. After being on Slashdot I got an e-mail from someone who asked for the ability to manually turn stemming and some of the other fuzzy matching on or off, so I've added that feature.
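
To make the on/off switch concrete, here is a minimal sketch of a search with an optional fuzzy flag. It is not the code All The Code actually runs: the class name FuzzySearchSketch, the search method, and the plain list standing in for the index are all made up for illustration, and the "either string contains the other" rule is just one assumed way to let "btree" find "tree".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; not the actual All The Code implementation.
final class FuzzySearchSketch {
    private FuzzySearchSketch() {}

    /**
     * Returns the indexed terms that match the query, in index order.
     * With fuzzy == false only exact (case-insensitive) matches are returned.
     * With fuzzy == true a term also matches when either string contains
     * the other, so a query of "btree" can find an indexed "tree".
     */
    static List<String> search(List<String> indexedTerms, String query, boolean fuzzy) {
        String q = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String term : indexedTerms) {
            String t = term.toLowerCase(Locale.ROOT);
            boolean exact = t.equals(q);
            // Skip empty strings in fuzzy mode, since every string contains "".
            boolean loose = fuzzy && !q.isEmpty() && !t.isEmpty()
                    && (t.contains(q) || q.contains(t));
            if (exact || loose) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

With the flag off, a search for "btree" only returns terms that are exactly "btree" (ignoring case); with it on, "tree" comes back as well, which is where both the extra results and the occasional sadness come from.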

Another feature request I got is allowing users to customize how it displays the code. Right now it is using the defaults of vim2html, which I'll admit aren't that readable on a white background. Since I have to replace vim2html, I'm looking around at a number of the alternatives and seeing what it would take to make it user customizable. I also got an e-mail from a fellow who is working on a nice method to embed code inside pages along with the relevant metadata (like license, language, language version, etc.) that I'm going to take a look at.

A few people noticed that I had forgotten to escape all of the special characters between the front end program and the back end program, which was resulting in some less than fun results for certain searches. Fortunately that's now been fixed. Another bug related to escaping is that you could insert arbitrary JavaScript into the pages with a fairly simple query, which has also been fixed (in my defense, since there is no account system it didn't pose a security risk, but it did make certain pages ugly).
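
For the JavaScript injection side of this, the usual fix is to HTML-escape the query before echoing it back into the results page. The sketch below shows that general technique only; the class and method names (HtmlEscapeSketch, escapeHtml) are hypothetical, and it does not cover the separate escaping between the front end and back end programs.

```java
// Hypothetical sketch of HTML-escaping user input before it is written into a page.
final class HtmlEscapeSketch {
    private HtmlEscapeSketch() {}

    /** Replaces the characters that are special in HTML with their entities. */
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Run on a query containing a script tag, this turns the angle brackets into &lt; and &gt;, so the browser shows the query as text instead of running it.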

Anyways, that's enough rambling from me. I've got a lot of code I should be writing and a lot of non-code work I should be doing as well.


Martijn said...

Is all the code searching down? I get a server not found / down message from firefox when attempting a search.

Holden Karau said...

It was down, but it should be back up again. I will probably write a small post about it when I get the time. Sorry about that.
