Working with collocates

Now that Karen is expecting me to write more Perl scripts to analyse collocates I think it’s time to install the Text::NSP module from CPAN.

Published
Categorized as Uncategorized Tagged

Perl Collocates

Karen and I were talking about linguistics and textual analysis, and how she wanted to analyse the writings of the Perl community. So, to make a start, we decided to write a short Perl script to extract word-level n-grams from some text so we could start looking for interesting collocates.


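use strict;
use warnings;

my $n = 4;      # each slice below spans $n+1 words
undef $/;       # slurp mode: read the whole input in one go
my @txt = grep { length } split /\W+/, lc <>;   # lowercase words, no empty fields
for (my $i = 0; @txt - $i > $n; ++$i) {
    print "@txt[$i..$i+$n]\n";   # one n-gram per line
}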

(N-grams of 5 elements seem to be a good size for collocates. The slice $i..$i+$n covers $n+1 words, so we set $n=4.)

To look for interesting collocates we simply piped the output of that script through sort | uniq -c | sort -n | tail. As test data I ran the script against versions 2 and 3 of the GPL. In version 2 the most common n-gram was “work based on the program”, but for version 3 it was “the gnu general public license”. That isn’t a particularly interesting result, but I’m sure we will find some interesting ones when we look at more than two source documents.
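The counting step can also be done inside Perl rather than in the shell pipeline. The following is only a rough sketch of that idea, not the script we actually used: the helper names ngram_counts and top_ngrams are made up for illustration, the sample sentence is invented, and the output is listed most common first (the pipeline above lists the most common last).

```perl
use strict;
use warnings;

# Count word-level n-grams of $size words in $text (hypothetical helper).
sub ngram_counts {
    my ($text, $size) = @_;
    my @words = grep { length } split /\W+/, lc $text;
    my %count;
    # The range is empty when the text has fewer than $size words.
    for my $i (0 .. @words - $size) {
        $count{ join ' ', @words[$i .. $i + $size - 1] }++;
    }
    return \%count;
}

# Return up to $limit n-grams, most frequent first; ties sort alphabetically.
sub top_ngrams {
    my ($count, $limit) = @_;
    my @sorted = sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %$count;
    $#sorted = $limit - 1 if @sorted > $limit;
    return @sorted;
}

# Invented sample text, just to exercise the helpers.
my $sample = 'the cat sat on the mat and the cat sat on the rug';
my $count  = ngram_counts($sample, 5);
printf "%7d %s\n", $count->{$_}, $_ for top_ngrams($count, 10);
```

Keeping the counts in a hash makes it easy to compare several documents later, for example by building one hash per document and looking for n-grams that appear in only one of them.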


My family’s visit to Tokyo

My parents and sisters came to Tokyo and stayed with us from Christmas to New Year. They returned home today. In such a short time we were not able to see much of the city, but they did get to see a few interesting places. I made a Google map to remind them.


iMon LCD in 3R Systems case

A while ago I bought a PC case made by 3R Systems. It came with a built-in iMon LCD that had been custom-made by Soundgraph for 3R Systems. The case is good, but I discovered that this LCD panel has no Free Software drivers, and Soundgraph do not support Linux or any Free OS. In fact, Soundgraph refuse to even try to understand simple questions if you mention Linux anywhere in the message. That sort of behaviour used to be common, but the growing popularity of Free Software / Open Source, and the Linux kernel in particular, has made many companies change their attitude: some release technical details of their products so we can write our own software; some write and release their own Free Software (or Open Source); and some others release non-Free software for Free OSes. But Soundgraph do nothing except ignore it.

So, I started to reverse engineer the device. I’m not the only one doing this, and I got some hints from ralph.y, Codeka, and someone called “tsuppiduppi” on the iMon user forum. To make it easy to experiment I wrote a quick’n’dirty C program using libusb to allow me to send commands to the device. Now I can do:

  • imon-poke 0x02 0x1c 0x02 to initialise the LCD screen
  • imon-poke 0x0d 0x0f 0x48 0x45 0x4c 0x4c 0x4f 0x00 0x20 0x57 0x4f 0x52 0x4c 0x44 0x20 to display “HELLO WORLD”

Below is a brief summary (well, random notes) of what I have found so far. The first two bytes select the main command. Subsequent bytes may be used as parameters. The 8th byte should always be zero, and the 16th should always be 2.

          0x02, 0x00: eq graph: 1 byte 1=off 2=on
          0x02, 0x01: eq graph bars (no pattern yet)
          0x09, 0x01: eq graph bars: 16 nybbles, value 0 to 6
          0x0d, 0x01: eq graph bars, seems to match previous 0x09, 0x01
          0x02, 0x09: fan icon bits lsb: F1 F2 F3 LMH
          0x02, 0x0a: fan gauge 2 bits lsb: F1 F2 F3
          0x02, 0x0b: 1 byte temp degree c icons
          0x02, 0x0c: 1 byte temp value
          0x02, 0x0d: cpu icons 1 byte on=2 off=1 (or not 2)
          0x02, 0x0e: cpu gauge 1 byte value
          0x02, 0x1b: the colon 1=off 2=on
          0x02, 0x1c: whole display: 0=off 1=auto clock 2=on
          0x02, 0x1d: strange xx/yy zzzz; xx = l nibble;
          0x02, 0x26: fan speed
          0x0d, 0x0f: ASCII text
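The packet layout described in these notes can be sketched in Perl. This is only an illustration under assumptions, not the imon-poke program itself (which is written in C with libusb): the helper names imon_packet and imon_text_packet are made up, and I am assuming that parameter bytes simply fill the 12 free slots in order, that unused slots can be padded, and that text is padded with spaces as in the “HELLO WORLD” example. It only builds and prints the bytes; it does not talk to the device.

```perl
use strict;
use warnings;

# Build a 16-byte packet following the notes above (hypothetical helper).
# Assumptions: byte 8 is always 0x00, byte 16 is always 0x02, and the
# parameter bytes fill the remaining 12 slots in order, padded with $pad.
sub imon_packet {
    my ($cmd1, $cmd2, $params, $pad) = @_;
    $pad = 0x00 unless defined $pad;
    my @p = @$params;
    die "too many parameter bytes\n" if @p > 12;
    push @p, $pad while @p < 12;
    # Bytes 1-2: command; 3-7: params; 8: zero; 9-15: params; 16: two.
    return ($cmd1, $cmd2, @p[0 .. 4], 0x00, @p[5 .. 11], 0x02);
}

# Text command 0x0d 0x0f: characters go in as ASCII codes, space padded.
sub imon_text_packet {
    my ($text) = @_;
    return imon_packet(0x0d, 0x0f, [ unpack 'C*', $text ], 0x20);
}

# Print the packet as hex bytes, in the same style as the imon-poke arguments.
print join(' ', map { sprintf '0x%02x', $_ } imon_text_packet('HELLO WORLD')), "\n";
```

The first 15 bytes printed for “HELLO WORLD” match the imon-poke arguments shown earlier, with the zero landing in the 8th position between “HELLO” and “ WORLD”.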

Shibuya Perl Mongers Technical Talk #8

Karen and I went to the Shibuya.pm technical talk tonight. Most of the talks were in high-speed Japanese so we didn’t understand very much. But we need to start practising, and Perl talks are better than normal conversation because we can, at least, understand the Perl bits.

On the way home we were comparing the Tokyo tech talks to ones we have seen in Europe. There are a lot of similarities, but we noticed one trend: in Europe the focus is on how you can do something, but here it is on what you can do.


Debian GNU/Linux on my Panasonic R6 Jet Black

I had been thinking about getting a new laptop for a while. My old Thinkpad is still mostly working, but I wanted something lighter. I have a desktop for heavy work, so small size and light weight were the most important factors for the new laptop. Once I realised that, there was only one sensible choice for me: the tiny 960 gram Panasonic R6 in Jet Black (ジェットブラック).

The Jet Black model was not available in any shops (that I could find) so I ordered it online. The extra benefit of buying online was the option to get my name (or other nonsense) engraved on a small metal plaque on the bottom of the laptop, so I did:

マーティー・ポーリー
火星の狸

(No, I’m not going to explain the second line.)

I tried to order it without Windoze infection, but Panasonic ignored my emails. After a week I decided it would be easier to buy a standard machine, then wipe the disk and remove the sticker; so I did.

Installing Debian was easy, once I had decided how I wanted to start. I chose to use a bootable USB hard disk with the Debian network installer, and a local Debian mirror. It all mostly just worked, but I haven’t tried everything yet and I suspect there will be a few tricky parts. For now, I’m happy.
