Implementation: identical PC names?
mudlab.org Forum Index -> Coding
Tyche



Joined: 13 May 2005
Posts: 176
Location: Ohio, USA

Posted: Wed Jun 01, 2005 7:15 pm

Lindahl wrote:
The problem with nicknames (like I had mentioned in my post and Tyche expounded on) is the problem of space and time complexity.


The only thing I check is memory space and I handle that in a very general way. There is a limit to the size of any given object and each player has a limit of objects they can create or own. The former is a hard limit and the latter can be extended by command. The purpose of both is mainly to prevent some sort of spam or dos attack, or programming accident. Everything is an object so the limits apply to character objects as well. If the character object gets too large a catchable error is thrown. So the player would have to start deleting nicknames or command aliases or some other attributes they've added to their character in order to make room for more nicknames or command aliases. It's a 64K limit right now which is pretty huge, the largest single object I have is an 11K help file.

I store things like aliases and nicknames in simple hash tables. It's a softcoded server.
Lindahl



Joined: 29 May 2005
Posts: 56

Posted: Wed Jun 01, 2005 9:41 pm

Quote:
The only thing I check is memory space and I handle that in a very general way. There is a limit to the size of any given object and each player has a limit of objects they can create or own.


Quotas, basically.

Quote:
I store things like aliases and nicknames in simple hash tables.


If they're stored in a hash table, how do you perform a reverse lookup? Have two hash tables? You need to generate the name from the ID and the ID from the name.

64k seems a tad over the top - that can easily be over 1000 nicknames. What bin size are you using for the hash tables? The sheer number of times the hash tables will be accessed just screams scalability.
eiz



Joined: 11 May 2005
Posts: 152
Location: Florida

Posted: Wed Jun 01, 2005 9:52 pm

No, it doesn't. Any reasonable hash table implementation (for the ID-to-nickname lookup, that is) can handle many, many millions of queries per second.

Say there are 500 players in one room. We're already well outside reality here. Say each of them types 'look' every second. Again, the odds of this happening are astronomical. That's still only a quarter million lookups per second. Using the default OCaml hash table implementation (it's what I had handy; it's not the world's fastest) and a table with 4000 entries (auto-sized, as every decent hash table resizes automatically) - again, astronomical - 250k lookups took 0.02 seconds on an Athlon 64 3000+. I cannot imagine any scenario in which a performance problem would arise due to nicknames.
KaVir



Joined: 11 May 2005
Posts: 565
Location: Munich

Posted: Wed Jun 01, 2005 10:07 pm

eiz wrote:
Say there are 500 players in one room. We're already well outside reality here.


Agreed; you're rarely likely to have more than half a dozen people together in the same room. When I implemented an introduction system a few years back I used bumped lists - each time you saw someone, it'd bump that node to the front of the list. As most of the searches would be for the same handful of people, this seriously cut down the number of searches required. If I'd hashed the bumped lists they'd have been even faster still.
Lindahl



Joined: 29 May 2005
Posts: 56

Posted: Thu Jun 02, 2005 2:48 am

KaVir wrote:
When I implemented an introduction system a few years back I used bumped lists - each time you saw someone, it'd bump that node to the front of the list.


You just described an LRU, which is what I suggested using.

Quote:
250k lookups took 0.02 seconds on an Athlon 64 3000+


Big difference between that and what a normal MUD runs on (not to mention being time-shared) - but the even bigger difference is that you're hitting the high-speed on-chip cache. Cache misses will give you a totally different outcome, especially since each linked list node could be on a completely different page (let alone in the L1 or L2 cache). I see your point, but I'm not quite certain it'll be as efficient as you anticipate. An LRU, or as KaVir mentioned, a hash LRU will significantly reduce the processing time you're going to put into name and ID lookups.

Note that it isn't just the 'look' command. Every time you display a short description for ANYTHING, you'll be doing a lookup - and you'll be doing that for every person who sees the display.

I think you'll find, through profiling, name lookups will be up there on the list - so it's probably a good idea to put some thought into it.
Tyche



Joined: 13 May 2005
Posts: 176
Location: Ohio, USA

Posted: Thu Jun 02, 2005 4:53 am

Lindahl wrote:

Quotas, basically.


Correct.

Lindahl wrote:

If they're stored in a hash table, how do you perform a reverse lookup? have two hash tables? You need to generate the name from the ID and the ID from the name.


Mmm, sorry. By hash table what I really meant is a hash as in a Perl/Ruby hash, an associative array in LPC, or a dictionary in ColdC. My actual implementation just happens to be STL's map, but that's opaque. To do a reverse lookup I just search the values. I suppose one could implement two hashes if it were a performance problem, but I don't.


Lindahl wrote:

64k seems a tad over the top - that can easily be over 1000 nicknames what bin size are you using for the hash tables?


An object's size is the sum of its attributes, methods and symbol table sizes. A hash is just another attribute. Its size is the sum of the sizes of its contents. For practical purposes obj.size returns the amount of storage an object occupies in the database, not in memory. The only reason I mention it is its purpose (see above): just as an infinitely expandable network buffer is a feature waiting to be abused, so is an infinitely expandable object (or quota).

Lindahl wrote:

The sheer number of times the hash tables will be accessed just screams scalability.


Depends on what you want in terms of scalability. I'm certainly competitive with LPMuds, MOO and ColdC.
eiz



Joined: 11 May 2005
Posts: 152
Location: Florida

Posted: Thu Jun 02, 2005 10:09 am

Lindahl wrote:

Big difference between that and what a normal MUD runs on (not to mention being time-shared) - but the even bigger difference is that you're hitting the high-speed on-chip cache. Cache misses will give you a totally different outcome, especially since each linked list node could be on a completely different page (let alone in the L1 or L2 cache). I see your point, but I'm not quite certain it'll be as efficient as you anticipate. An LRU, or as KaVir mentioned, a hash LRU will significantly reduce the processing time you're going to put into name and ID lookups.


I knew you were going to say this. :) The truth is that I could put cache flushes at every iteration and it would still be more than fast enough. If you really want me to, I can do that. However, I added two orders of magnitude worth of fudge (putting all players in one room means n^2 lookups, when in reality it's going to be more like 5n lookups, max) just because of this.

Lindahl wrote:
Note that it isn't just the 'look' command. Every time you display a short description for ANYTHING, you'll be doing a lookup - and you'll be doing that for every person who sees the display.


And the number of descriptions you're displaying is strictly limited by the amount of text you can send to the player. To put this into perspective, the above benchmark would have resulted in about 64mbit/sec of network bandwidth usage if implemented with actual object descriptions, and the number of lookups corresponds 1:1 with the amount of output. Now where is the bottleneck here?
Tyche



Joined: 13 May 2005
Posts: 176
Location: Ohio, USA

Posted: Thu Jun 02, 2005 12:07 pm

eiz wrote:

Now where is the bottleneck here?


Agility. That is, not solving problems before their time.
Rendering nicknamed objects is just one small piece of the rendering puzzle.
eiz



Joined: 11 May 2005
Posts: 152
Location: Florida

Posted: Thu Jun 02, 2005 12:26 pm

Tyche wrote:

Agility. That is, not solving problems before their time.
Rendering nicknamed objects is just one small piece of the rendering puzzle.


Glad to see that we agree, then.
Lindahl



Joined: 29 May 2005
Posts: 56

Posted: Thu Jun 02, 2005 2:23 pm

eiz wrote:
To put this into perspective, the above benchmark would have resulted in about 64mbit/sec of network bandwidth usage if implemented with actual object descriptions, and the number of lookups corresponds 1:1 with the amount of output. Now where is the bottleneck here?


You're comparing two different types of throughput, CPU and IO. Considering that you can utilize the CPU WHILE waiting for the network resource to become available, it's just wasting CPU - I abhor the concept. That CPU could be better utilized doing much more important things (for example, stronger AI models or NLP). It's a resource that's essentially being wasted. When I profile, I tend to aim at wasted cycles or wasted IO - I differentiate between the two to waste fewer resources overall. If the response time is adequate, I still try to cut down on waste so I can do more interesting things with the extra resources. I don't spend an overly involved amount of time eliminating waste up front, but if the waste is obvious, I'm going to eliminate it the first time around, regardless of whether it'll significantly affect response time. It's trivial to fix a problem that's obvious and has a (somewhat) obvious solution. I don't like returning to things for tuning, if I can avoid it with little effort in the first place. Call it premature optimization, but if it doesn't really take any extra effort, why not?

But I suppose this is a futile argument, because I know where both of you stand on this topic (I'm pretty sure we've had similar discussions at least twice before).
eiz



Joined: 11 May 2005
Posts: 152
Location: Florida

Posted: Thu Jun 02, 2005 4:15 pm

Lindahl wrote:

You're comparing two different types of throughput, CPU and IO.


No, you're not getting it. That was just to show you how unrealistically large my test was. Just like you'd never see 64mbit/sec, you'd never see 250k lookups/sec or 4000-entry tables. And it was still more than fast enough.

Lindahl wrote:

Call it premature optimization, but if it doesn't really take any extra effort, why not?


It increases code complexity and it doesn't actually win you anything. Going off into the weeds optimizing things that don't need optimizing is not going to help you in the long run. Just like CPU time spent doing something useless is wasted compared to CPU time spent doing something useful, programmer time spent doing something useless is wasted. Except programmer time is much more costly than CPU time. Therefore it is prudent to save programmer time rather than CPU time when practical.

Anyway, that is a topic for another thread.
Lindahl



Joined: 29 May 2005
Posts: 56

Posted: Thu Jun 02, 2005 9:16 pm

Quote:
Just like CPU time spent doing something useless is wasted compared to CPU time spent doing something useful, programmer time spent doing something useless is wasted. Except programmer time is much more costly than CPU time. Therefore it is prudent to save programmer time rather than CPU time when practical.


I think you missed the point where I said that it doesn't take any more effort (nor complexity). I can read and write the code just as easily - only now I don't have to go back and tune it if it IS a bottleneck - it's already tuned - which may save me time. It isn't really premature optimization, since that implies effort; it's really just "look before you leap".

It's like saying you'd rather use a vector than a linked list for an algorithm that does lots of insertions. They both take minimal effort and are equally complex (almost), but one is obviously the right choice.

Quote:
you'd never see 250k lookups/sec or 4000-entry tables. And it was still more than fast enough.


You can't say it was more than fast enough if it's an unrealistic test. But this is neither here nor there, for the above reasons.

On the other hand, maybe it's just that an LRU seems obviously simple and trivial because of my familiarity with them. I must admit that probably few MUD implementors would actually recognize the container as an LRU and understand why it is used - bringing us back to the complexity issue. Oh well, moving on...
eiz



Joined: 11 May 2005
Posts: 152
Location: Florida

Posted: Thu Jun 02, 2005 9:40 pm

Edit: Eh. I wrote a post here, but it's not worth arguing over. However, unless you already have a special purpose data structure which matches this situation (which I'm sure _you_ do), it's best to use something standard and idiomatic, especially if it's unlikely to be a problem. And I think that's good advice in general.
Tyche



Joined: 13 May 2005
Posts: 176
Location: Ohio, USA

Posted: Fri Jun 03, 2005 7:04 am

Lindahl wrote:

It's trivial to fix a problem that's obvious and has a (somewhat) obvious solution. I don't like returning to things for tuning, if I can avoid it with little effort in the first place. Call it premature optimization, but if it doesn't really take any extra effort, why not?


Of course it's premature optimization. There ain't even an adequate problem statement here. I've extracted three requirements from my initial post on nicknames, an additional three or four from KaVir's, eiz's and your posts, and I can think of three, maybe more, requirements that haven't been raised.

Lindahl wrote:

But I suppose this is a futile argument, because I know where both of you stand on this topic (I'm pretty sure we've had similar discussions at least twice before).


The only conversation that I recall was something about me advocating anyone serious about muds ought to be writing everything in Pascal or assembler. ;)

Lindahl wrote:

On the other hand, maybe it's just that an LRU seems obviously simple and trivial because of my familiarity with them. I must admit that probably few MUD implementors would actually recognize the container as an LRU and understand why it is used - bringing us back to the complexity issue. Oh well, moving on...


But given your assumptions about the problem...

1) Nickname tables should be limited and small. (32, 64, 255, 1000 entries?)
2) Every time you'd display a short description for ANYTHING, you'll be doing a lookup - and you'll be doing that many for each person that saw that display.

I think it's obvious one ought to be optimizing for lookup failure, not success.
Which optimizes better for lookup failure, an STL map or an LRU linked list? Something else?
eiz



Joined: 11 May 2005
Posts: 152
Location: Florida

Posted: Fri Jun 03, 2005 9:48 am

Obviously, you should be using Patricia (radix) trees. If it's good enough for routing tables... ;)

(I'm kidding, please don't do this)

Just to pretend this post is kind of on topic, I'd actually prefer to have lots of nicknames. I know when I play Angband or NetHack or whatever, I go around tagging every piece of junk I pick up. This is because almost nothing you get in those games comes with a nice 'here's what this object does' label. You get "a blue potion." And identification of these items is not trivial, especially early in the game. So what happens is the player uses ad-hoc methods (e.g. scratching a wand on the ground will produce different effects) to sort of figure out what the item does, and then records that in the object name. I always thought this was a lot more fun and exciting than picking up 'a potion of healing' or whatever.

Especially when potions of acid are a lot more common...
All times are GMT
Page 2 of 4