Grammarizer - SAY parser

 
mudlab.org Forum Index -> Coding
Author Message
Lared



Joined: 07 Oct 2005
Posts: 26

Posted: Sun Oct 09, 2005 12:23 pm    Post subject: Grammarizer - SAY parser

OK, folks. Time to air a pet peeve of mine--namely, that of players to whom a SHIFT key is merely a passing acquaintance. I know I'm not alone in hating this. The MUD I currently play on is RP-encouraged, but there are a large number of people (who do RP) who simply don't capitalize or use punctuation much at all. I'm designing a set of text-parser rules for English sentences to at least pretty them up a little. The rules I thought of off the top of my head follow:

-Sentences must begin with a capital letter. If the first letter of a SAY argument is not capitalized and it's the first character (not following a number or an ellipsis or the like), cap it.
-Sentences must end in a period. I don't see a lot I can do mid-stream for periods, but a lot of people don't put one at the end of a statement. Add one here.
-Turn emoticons into a different "say" verb. ":)" makes "you say" into "you smile" or something. Gank the trailing emoticon from the SAY itself.
-Proper nouns...I want to do SOMETHING here. Maybe if it's a player's name that's in the room, cap it?
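
A minimal sketch of the first rule, in Python purely for illustration. The function name `capitalize_first` is hypothetical, and this is one reading of the rule, not anyone's actual implementation: the text is capitalized only when its very first character is a lowercase letter, so says that open with a number or an ellipsis are left alone.

```python
def capitalize_first(text: str) -> str:
    """Capitalize the opening letter of a SAY argument (hypothetical helper)."""
    text = text.lstrip()
    if not text:
        return text
    # Only act when the first character itself is a lowercase letter.
    # A leading digit or "..." fails this test, so the text is untouched.
    if text[0].islower():
        return text[0].upper() + text[1:]
    return text
```

With this reading, "hello there" becomes "Hello there", while "...and then" and "3 wolves" pass through unchanged.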

If you've got any suggestions to enhance this list, please, toss 'em out here.

Ed
Spazmatic



Joined: 18 May 2005
Posts: 76
Location: Pittsburgh, PA

Posted: Sun Oct 09, 2005 5:38 pm    Post subject:

Yes. This is a prime opportunity to use machine learning. You can either a) hand label examples (meh), b) use an existing corpus (meh), or c) train on none other than your mud itself. After all, the type of emotes and dialogue you get on a mud is a very specific hypothesis space, so you could simply train your system on logs submitted by players who CAN punctuate and capitalize.
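
One way to get training data out of such logs, sketched in Python under a loud assumption: the learner would be fed pairs of (sloppy text, clean text), and the sloppy side can be faked by stripping case and punctuation from the clean lines. The helper names `degrade` and `training_pairs` are hypothetical, and this covers only the data preparation, not the model.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_NO_PUNCT = str.maketrans("", "", string.punctuation)

def degrade(line: str) -> str:
    """Simulate a careless typist: all lowercase, no punctuation."""
    return line.lower().translate(_NO_PUNCT)

def training_pairs(log_lines):
    """Pair each non-empty clean log line with its degraded version."""
    return [(degrade(line), line) for line in log_lines if line.strip()]
```

Note that this also strips apostrophes ("don't" becomes "dont"), which may or may not match how your players actually type.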

I can imagine Bayes nets would perform well. I'd have to think for a while on a good feature set, though.

Anyhow, I imagine you're going to get MUCH better results from a learning algorithm than from formal rules. You obviously need something else for emoticons and so forth, but this would better capture the need to, say, capitalize player names (with the right feature set) and, even better, do a great job of addressing punctuation.

Feature set, feature set, oh what would be thou feature set? I'm thinking about it.
KaVir



Joined: 11 May 2005
Posts: 565
Location: Munich

Posted: Mon Oct 10, 2005 7:01 am    Post subject:

Quote:
-Sentences must begin with a capital letter. If the first letter of a SAY argument is not capitalized and it's the first character (not following a number or an ellipsis or the like), cap it.


Don't forget to ignore colour codes as well.
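
As a rough Python sketch of what skipping colour codes could look like: the code syntax here (an ampersand followed by one letter, such as "&r") is purely an assumption, since every codebase has its own, and `capitalize_after_codes` is a hypothetical name.

```python
import re

# Hypothetical colour-code syntax: "&" plus one letter, e.g. "&r" for red.
# Substitute whatever pattern your own codebase uses.
COLOUR_CODE = re.compile(r"&[A-Za-z]")

def capitalize_after_codes(text: str) -> str:
    """Capitalize the first visible letter, skipping leading colour codes."""
    pos = 0
    # Step over any run of colour codes at the start of the text.
    while True:
        match = COLOUR_CODE.match(text, pos)
        if match is None:
            break
        pos = match.end()
    # Capitalize only if the first visible character is a lowercase letter.
    if pos < len(text) and text[pos].islower():
        return text[:pos] + text[pos].upper() + text[pos + 1:]
    return text
```

So "&rhello" would come out as "&rHello" instead of being left alone because "&" is not a letter.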

Quote:
-Sentences must end in a period. I don't see a lot I can do mid-stream for periods, but a lot of people don't put one at the end of a statement. Add one here.


Unless the sentence already ends in punctuation - you don't want to add a period after a question mark, for example. This does get irritating when you're trying to tell someone a URL, but if your 'says' are purely IC then that shouldn't be a problem.
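
A small Python sketch of that check; the set of characters treated as terminal punctuation and the name `ensure_terminal_punctuation` are assumptions for illustration.

```python
# Characters that already end a sentence (an assumed, minimal set).
TERMINAL = ".!?"

def ensure_terminal_punctuation(text: str) -> str:
    """Append a period unless the text already ends in punctuation."""
    stripped = text.rstrip()
    if not stripped:
        return stripped
    if stripped[-1] in TERMINAL:
        return stripped
    return stripped + "."
```

A URL at the end of a say would still get a period appended, which is the irritation mentioned above.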

Quote:
-Turn emoticons into a different "say" verb. ":)" makes "you say" into "you smile" or something. Gank the trailing emoticon from the SAY itself.


I made the removal of the trailing emoticon optional (via a config option), as some people seem to really hate having content removed from their says. You'll also need to decide what to display if the entire content of the say is an emoticon (e.g. I have it so that when someone types "say :)" they see "You smile and say nothing.").
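
Roughly, in Python: the emoticon table, the `strip_emoticon` flag standing in for the config option, and the exact message wording other than "You smile and say nothing." are all assumptions made for this sketch.

```python
# Hypothetical emoticon table; a real one would be much longer.
EMOTICONS = {":)": "smile", ":(": "frown"}

def render_say(text: str, strip_emoticon: bool = True) -> str:
    """Build the message shown to the speaker for a SAY command."""
    text = text.strip()
    action = None
    spoken = text
    for icon, verb in EMOTICONS.items():
        if text.endswith(icon):
            action = verb
            spoken = text[: -len(icon)].rstrip()
            break
    if action is None:
        return f"You say, '{text}'"
    if not spoken:
        # The say consisted of nothing but the emoticon.
        return f"You {action} and say nothing."
    # The flag plays the role of the per-player config option.
    shown = spoken if strip_emoticon else text
    return f"You {action} and say, '{shown}'"
```

Here `render_say(":)")` gives "You smile and say nothing.", and passing `strip_emoticon=False` keeps the emoticon in the quoted text while still changing the verb.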

The same logic can apply to punctuation as well - something ending with '?' might be 'ask', something ending with a '!' might be exclaim, etc. Do you want to add extra emphasis for multiple punctuation? Will you strip multiple punctuation?

And what about other special characters? I support an optional additional character at the beginning of the say text: '+' for nod, '-' for shake, '%' for rolling your eyes, '^' for raising an eyebrow and '~' for shrugging your shoulders. For example:

> say ^interesting!! :}

You raise your eyebrow with a chuckle and exclaim in astonishment, 'Interesting!'
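
The prefix part of that could be sketched in Python like so. The five characters come from the list above, but the gesture phrases and the name `split_gesture` are assumptions, and the emoticon and punctuation handling shown in the example output is left out.

```python
# Prefix characters from the list above; the phrases are assumed wording.
GESTURES = {
    "+": "nod",
    "-": "shake your head",
    "%": "roll your eyes",
    "^": "raise your eyebrow",
    "~": "shrug your shoulders",
}

def split_gesture(text: str):
    """Return (gesture or None, remaining say text)."""
    if text and text[0] in GESTURES:
        return GESTURES[text[0]], text[1:].lstrip()
    return None, text
```

For "^interesting!!" this yields the gesture "raise your eyebrow" and the remaining text "interesting!!", which can then go through the usual capitalization and punctuation steps.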


I also chose to combine it with the 'sayto' option that many muds provide - if the first token of your text is the word 'to' and the second token is the name of someone nearby, your 'say' will be directed at them (otherwise it'll assume the 'to' was supposed to be part of the text and just display the lot as part of the say message - although you can also use the '@' shortcut character instead of 'to', in which case the next token MUST be a valid name or the message won't go through).
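
A Python sketch of that targeting logic, with assumptions flagged: `parse_say_target` and the `nearby` list of names are hypothetical, name matching is case-insensitive, and the '@' form accepts the name either glued on ("@bob") or as the next token ("@ bob"), since the exact syntax isn't spelled out above.

```python
from typing import Iterable, Optional, Tuple

def parse_say_target(text: str, nearby: Iterable[str]) -> Tuple[Optional[str], str]:
    """Return (target name or None, message) for a SAY argument."""
    names = {name.lower(): name for name in nearby}
    if text.startswith("@"):
        # '@' is strict: the next token must be a valid nearby name.
        parts = text[1:].split(None, 1)
        if not parts or parts[0].lower() not in names:
            raise ValueError("'@' must be followed by the name of someone nearby")
        return names[parts[0].lower()], parts[1] if len(parts) > 1 else ""
    parts = text.split(None, 2)
    if len(parts) >= 2 and parts[0].lower() == "to" and parts[1].lower() in names:
        return names[parts[1].lower()], parts[2] if len(parts) > 2 else ""
    # 'to' followed by a non-name is treated as ordinary speech.
    return None, text
```

So "to bob hello there" is directed at Bob, "to the shop we go" is just said aloud, and "@nobody hi" is rejected instead of being spoken.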

Quote:
-Proper nouns...I want to do SOMETHING here. Maybe if it's a player's name that's in the room, cap it?


I'd suggest against doing that, unless you have strict naming policies which ensure that players can't use normal words for their names.
Lared



Joined: 07 Oct 2005
Posts: 26

Posted: Mon Oct 10, 2005 3:17 pm    Post subject:

Spazmatic:

I should have really mentioned beforehand that I am a very hack coder. The idea is interesting, but beyond my capabilities. I do appreciate the suggestion, though.

KaVir:

Many good points, here, and I appreciate you taking the time to list them out. I agree in many places, and some of the things you mentioned are great ideas--seeing as how I plan to include all of six socials in my game.

Quote:
Don't forget to ignore colour codes as well.


I'm sure I would have seen this after my first code explosion...

Quote:
Unless the sentence already ends in punctuation - you don't want to add a period after a question mark, for example.


D'oh. I should have qualified that statement. And my SAY command is 100% IC; I'll have other commands for OOC chat.

This grammar tool will be turned on and off by the listener, so if they don't mind not seeing spaces, it'll accommodate them.

Quote:
I made the removal of the trailing emoticon optional (via a config option), as some people seem to really hate having content removed from their says. You'll also need to decide what to display if the entire content of the say is an emoticon (e.g. I have it so that when someone types "say :)" they see "You smile and say nothing.").


Ehh...my original reaction was just to strip emoticons entirely from the SAY command. In RP, they absolutely irk the shit out of me. If they want to use emoticons, they can suffer the effects of having them ganked. :P

Quote:
The same logic can apply to punctuation as well - something ending with '?' might be 'ask', something ending with a '!' might be exclaim, etc. Do you want to add extra emphasis for multiple punctuation? Will you strip multiple punctuation?


I've already thought of this, though I didn't mention it--I'm going to have verb/adverb combinations (if the latter is needed) for "?", "!", "?!", "!?", and "..." (asks, exclaims, asks incredulously, demands, "says, trailing off"). Emoticons also add a facial expression. A-like so:

Lared> say What!? :(

Lared demands, frowning, "What!?"
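
Sketched in Python, that mapping might look like the following. The five endings and their verbs are the ones listed above; the emoticon table, the default verb "says", and the name `format_say` are assumptions. The sketch echoes the speaker's punctuation exactly as typed.

```python
# Longer endings come first so "?!" is not mistaken for a plain "!".
SPEECH_VERBS = [
    ("?!", "asks incredulously"),
    ("!?", "demands"),
    ("...", "says, trailing off"),
    ("?", "asks"),
    ("!", "exclaims"),
]

# Hypothetical emoticon-to-expression table.
EXPRESSIONS = {":(": "frowning", ":)": "smiling"}

def format_say(speaker: str, text: str) -> str:
    """Build the room message for a SAY, picking a verb from the ending."""
    text = text.strip()
    expression = None
    for icon, word in EXPRESSIONS.items():
        if text.endswith(icon):
            expression = word
            text = text[: -len(icon)].rstrip()
            break
    verb = "says"
    for ending, candidate in SPEECH_VERBS:
        if text.endswith(ending):
            verb = candidate
            break
    parts = [f"{speaker} {verb}"]
    if expression:
        parts.append(expression)
    return ", ".join(parts) + f', "{text}"'
```

The ordering of `SPEECH_VERBS` is the main design point: checking the two-character endings before the single ones keeps "!?" from being read as a bare "?".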

Quote:
You raise your eyebrow with a chuckle and exclaim in astonishment, 'Interesting!'


I like the idea, but I think it's more than I want. I've always been a fan (assuming all the involved parties know the language being used) of including speech in emotes. See my RMOTE reply to a post elsewhere for an example of the command I'm going to use.

Quote:
I also chose to combine it with the 'sayto' option that many muds provide - if the first token of your text is the word 'to' and the second token is the name of someone nearby, your 'say' will be directed at them


I just plan to make "sayto" a SCMD branch of my CMD(do_say) function. ;)

Quote:
[Proper Nouns]
I'd suggest against doing that, unless you have strict naming policies which ensure that players can't use normal words for their names.


Yeah...this can be glossed over.
All times are GMT