Backward compatible
Using JavaScript to split text string into word tokens, taking account of punctuation and whitespace and UTF-8 charset

I got an interesting problem today. I had to check an HTML form before submission to see whether the text the user entered in a textarea contains some specific words. Googling around I found a lot of stuff like “how to split text separated by commas” and such, but I simply wanted to extract words from a paragraph like this one.

My instinct was to use String.split(), but with a plain string separator it splits on a single delimiter, and I would have to write a recursive or iterative function to split on all non-word characters. Passing a regex to split() instead still leaves empty strings to clean up. Not being able to predict all the crap users can enter, this did not look like the right choice.
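
For comparison, here is a rough sketch of what the split() route looks like when a regular expression is passed as the separator. splitWords is a made-up name for illustration. The filter step is there because split() returns an empty string wherever the text begins or ends with a separator.

```javascript
// Hypothetical helper: split on runs of non-word characters,
// then drop the empty strings left at the edges.
function splitWords(text) {
    // \W+ matches one or more characters outside [A-Za-z0-9_].
    return text.split(/\W+/).filter(function (token) {
        return token.length > 0;
    });
}

console.log(splitWords('Hello, world! How are you?'));
// ['Hello', 'world', 'How', 'are', 'you']
```

It works, but it needs the extra cleanup pass that match() avoids.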

Luckily, I discovered String.match(), which takes a regex and, with the g flag, returns an array of all matches. That makes it able to split text into an array of words, using something like this:

var arr = inputString.match(/\w+/g);

Cool, eh? Now, this all went fine for ASCII English text. But I need to work with UTF-8 text, or more specifically, the Serbian language, and \w only matches ASCII letters, digits and the underscore. The Serbian Latin script used by my users has only 5 characters outside the ASCII set, so I wrote a small function to replace those 5 with their closest matches. The final code looks like this:


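// Uppercase first, then transliterate, so the rules only need uppercase keys.
var s = srb2lat(inputString.toUpperCase());
// match() returns null when nothing matches, hence the "a &&" guard below.
var a = s.match(/\w+/g);
for (var i = 0; a && i < a.length; i++)
{
    if (a[i] === 'SPECIAL')
        alert('Special word found!');
}

// Replaces the five non-ASCII Serbian Latin letters (uppercase) with ASCII stand-ins.
function srb2lat(str)
{
    var len = str.length;
    var res = '';
    var rules = { 'Đ':'DJ', 'Ž':'Z', 'Ć':'C', 'Č':'C', 'Š':'S'};
    for (var i = 0; i < len; i++)
    {
        var ch = str.charAt(i);
        if (rules[ch])
            res += rules[ch];
        else
            res += ch;
    }
    return res;
}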

If you use some other language, just replace the contents of the rules object with different transliteration rules.
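
An alternative that skips transliteration altogether is to match Unicode letters directly. The sketch below assumes a JavaScript engine with Unicode property escapes (ES2018 or later), which older browsers do not have; tokenize and containsWord are hypothetical names, not part of the code above.

```javascript
// Assumption: the engine supports \p{...} with the u flag (ES2018+).
function tokenize(text) {
    // Normalize to NFC so accented letters are single code points,
    // then match runs of letters (\p{L}), numbers (\p{N}) and underscores.
    return text.normalize('NFC').match(/[\p{L}\p{N}_]+/gu) || [];
}

// Case-insensitive check for a whole word inside the text.
function containsWord(text, word) {
    var target = word.normalize('NFC').toUpperCase();
    return tokenize(text.toUpperCase()).indexOf(target) !== -1;
}

console.log(tokenize('Čaša, đak i žaba!'));
// ['Čaša', 'đak', 'i', 'žaba']
console.log('Čaša'.match(/\w+/g));
// ['a', 'a']  (plain \w breaks the word at Č and š)
```

The transliteration approach still has its place when the target has to stay ASCII-only or has to run in old browsers.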

Disabling alerts stops JavaScript execution in Firefox

Today I learned about an interesting issue with newer versions of Firefox (I use FF7). It has a nice web developer-friendly feature to disable alerts. This is really useful when you place an alert() in some loop by mistake and you can’t get out because as soon as you click OK, you get another one.

The new Firefox has a checkbox to disable future alerts. And this is great. So, what’s the problem? Once you disable alerts and JavaScript code then calls alert(), the script does not keep running: the call throws an exception instead. This does not look like correct behavior to me.

Imagine a web application that alerts the user about something and then keeps running to finish the job. If the user disabled alerts because they were in a hurry and clicked quickly through several message boxes, the script would stop instead of continuing. And there is no way to revert that short of reloading the page (yikes!).

I found a workaround: I created a function called tryalert that wraps the alert in a try..catch block. It looks like this:

function tryalert(message) 
{
    try { alert(message); } catch(e) {}
}

This is a fine workaround. Now I call tryalert() instead of alert(), and although the alert is no longer displayed, the code keeps going as if the user had been alerted.

The problem is introducing tryalert to ALL the applications I’ve written so far. It’s impossible. I hope the Firefox team changes this.
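
One way to soften that, sketched here under assumptions: instead of renaming every call, replace alert itself once with a version that swallows the exception. installSafeAlert is a hypothetical helper name, and this assumes a single script that loads before everything else on the page; I have not verified it against every Firefox version.

```javascript
// Hypothetical helper: swaps target.alert for a wrapper that
// ignores any exception thrown by the original alert.
function installSafeAlert(target) {
    var original = target.alert;
    if (typeof original !== 'function') {
        return false; // nothing to wrap
    }
    target.alert = function (message) {
        try {
            // Call the original with its owner as "this".
            return original.call(target, message);
        } catch (e) {
            // Alerts were disabled; carry on as if the user clicked OK.
        }
    };
    return true;
}

// In a browser page, run this once before any other script.
if (typeof window !== 'undefined') {
    installSafeAlert(window);
}
```

Existing alert() calls then stay as they are, at the cost of one shared include per application.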