[personal profile] mmcirvin
This thing purports to tell if text was written by a boy or a girl. It has been applied to all the hip kids' blogs.

Now, the authors of the original paper would probably object that it was calibrated with fiction, not blogs and Usenet posts. But given that disclaimer, based on the tallied results that pop up when you submit your evaluation of its guess, I think we can safely say that, at least with regard to Web text samples, it is doing no better than chance. In my own tests, it identified Andy as female and Claudia as male, and the results on my own writing are split about fifty-fifty, with one Usenet post that it actually identified as androgyne. However, it is really good at correctly identifying samantha2074 as female, for some reason.

Pasting in a bunch of blog entries with time stamps probably gives it a female bias, since it seems to regard numbers as feminine for some insane reason.

Further thought: You know, it might be interesting to try to train a Bayesian spam-blocker algorithm to tell males from females. It would probably do better.
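
For the curious, here is roughly what that would look like: a minimal sketch of a naive Bayes text classifier of the kind Bayesian spam filters are built on, with the "spam"/"ham" labels swapped for whatever two labels you like. Everything in it (the `NaiveBayes` class, the whitespace tokenizer, the add-one smoothing) is an assumption made for illustration, not anything taken from the tool above or from any particular spam filter, and whether it would really do better is untested.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude whitespace tokenizer; a real filter would handle punctuation etc.
    return text.lower().split()


class NaiveBayes:
    """Toy two-or-more-class naive Bayes classifier over word counts."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocab = set()           # every word seen during training

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_scores(self, text):
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior: fraction of training texts carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            for w in words:
                # Add-one (Laplace) smoothing so unseen words do not zero out.
                score += math.log((counts[w] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)
```

You would call `train(text, label)` once per sample whose author you already know, then `classify(text)` on anything new. The hard part is not the code but getting a decent pile of labeled samples.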

Date: 2003-08-20 01:03 pm (UTC)
From: [identity profile] mmcirvin.livejournal.com
I might also note that the text of Andy's that it identified as ineluctably female included a long disquisition on the joys of "killing the motherfuckers deader." At least it's not adhering to antique stereotypes.

Date: 2003-08-21 01:29 am (UTC)
From: (Anonymous)
"Pasting in a bunch of blog entries with time stamps probably gives it a female bias, since it seems to regard numbers as feminine for some insane reason."


No it doesn't. Are you sure you're reading it right? The male symbol has the suggestive pointy arrow, and the female one has an upside-down cross, for obvious reasons. Who said that? I've checked your results a bunch of times, and I'm now pleased to announce that Claudia and I are the genders I thought we were prior to this morning, to 87% certainty.

Date: 2003-08-21 01:35 am (UTC)
From: [identity profile] mmcirvin.livejournal.com
It was saying the opposite yesterday. Don't ask me.

Date: 2003-08-20 05:01 pm (UTC)
From: (Anonymous)
Well, I guess our secret's out. Came as a bit of a shock to me, actually, but I've learned long ago that once the internet has spoken, it's useless to protest.

Date: 2003-08-21 02:44 pm (UTC)
From: [personal profile] jwgh
I tried it out on some (relatively) recent ARK posts I made, and although it was right once, it was incorrect several times as well.

I was vaguely interested to know that it thought my troll of wstd.general was pretty feminine.

I was also vaguely interested to note that using 'male' words increases your score and using 'female' words decreases it. Ah, well ...
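
Going only by that description, the scoring seems to amount to something like the sketch below: each keyword carries a signed weight, the weights get summed, and the sign of the total decides the guess. The words and weights here are placeholders invented for illustration, not the tool's real table, and the actual implementation may well differ.

```python
# Hypothetical keyword weights: positive leans "male", negative leans "female".
# These entries are made up; they are NOT the real tool's word list.
WEIGHTS = {
    "alpha": 7,
    "beta": 3,
    "gamma": -5,
    "delta": -6,
}


def score(text):
    # Sum the weight of every recognized word; unknown words count as zero.
    return sum(WEIGHTS.get(word, 0) for word in text.lower().split())


def guess(text):
    total = score(text)
    if total > 0:
        return "male"
    if total < 0:
        return "female"
    return "undetermined"
```

With a scheme like this, a handful of heavily weighted common words can swing the verdict on a short sample, which would fit the erratic results.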

Date: 2003-08-22 12:56 am (UTC)
From: [identity profile] mmcirvin.livejournal.com
I think I was reading the plus-minus thing backwards, which is why I thought it regarded use of numbers as feminine. The results were more unambiguous, though bogus.

Date: 2003-08-21 03:25 pm (UTC)
From: [identity profile] sunburn.livejournal.com
I ran it on a portion of an etext I had handy, Joseph Conrad's _Youth: A Narrative_. Verdict: Woman.

Pfft.

Date: 2003-08-22 01:21 am (UTC)
From: [personal profile] jwgh
As regards your further thought, I thought I would run your spammer message through the wringer. The results:

Your original message: MALE
My followup: FEMALE
grumblepants's followup: FEMALE
astrange's followup: FEMALE
sunburn's followup: MALE

Date: 2003-08-22 10:17 pm (UTC)
From: [personal profile] jwgh
I found a defense of Gender Genie. What the hell?

Date: 2003-08-23 04:31 pm (UTC)
From: [identity profile] mmcirvin.livejournal.com
Ah, I see. The results make perfect sense as long as you define social/psychological "gender" as "whatever the Gender Genie algorithm says."

(By the way, I always wondered what was up with Book Blog, because I could never see the actual blog content. It turns out that you can see it in IE and Safari, but not in anything Gecko-based. Most likely there's some coding irregularity on the site-- I wonder what it is. The W3C validators choke on it pretty fast.)
