Data store and persistence

 
cron0s



Joined: 13 May 2005
Posts: 34
Location: UK

PostPosted: Sat May 28, 2005 10:26 pm    Post subject: Data store and persistence

After a day spent dealing with the vagaries of the ROM 'dump the sodding lot to a text file' (TM) method of data storage, I began musing over some of the alternatives.

I hear MySQL mentioned a fair bit, and while mud data may be well suited to a relational database, I suspect a lot of implementations aren't really using the relational features to the full. I quite like SQLite, and have dabbled with it a couple of times in other things, but I wonder if a plain old hash-table DBM like gdbm might be a lot easier.

If I was going to change things, I would probably want to move to a wholly persistent store where objects are fetched as needed and written back after each change rather than loading everything at boot. I am also not sure whether I would need some sort of object manager to take care of caching and marking objects for storage, or if I could just fetch/store on the fly and leave it to the driver to worry about. I haven't really thought about it in too much detail yet. I suppose I could always learn ColdC, but OOP makes my head hurt
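
Roughly, something like this little gdbm sketch is what I am picturing - fetch a record by key the first time it is needed, and write it straight back after a change. (The key format and the serialisation step are hand-waved here, and the helper names are made up.)
Code:

#include <gdbm.h>
#include <string.h>

/* Fetch one serialised object record by key (e.g. "obj:3001").
   Returns a datum with dptr == NULL on a miss; the caller frees dptr. */
datum fetch_record(GDBM_FILE db, const char *key)
{
    datum k;
    k.dptr  = (char *) key;
    k.dsize = strlen(key) + 1;
    return gdbm_fetch(db, k);
}

/* Write a record straight back after a change. */
int store_record(GDBM_FILE db, const char *key, const char *buf, int len)
{
    datum k, d;
    k.dptr  = (char *) key;
    k.dsize = strlen(key) + 1;
    d.dptr  = (char *) buf;
    d.dsize = len;
    return gdbm_store(db, k, d, GDBM_REPLACE);
}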

I am curious to know what systems other people are using in their muds, whether Diku derivatives or custom codebases or whatever.
Tyche



Joined: 13 May 2005
Posts: 176
Location: Ohio, USA

PostPosted: Sat May 28, 2005 10:38 pm    Post subject: Re: Data store and persistence

cron0s wrote:

I am curious to know what systems other people are using in their muds, whether Diku derivatives or custom codebases or whatever.


If your data layout is fairly static like Diku and children then RDBMSs are ideal. RDBMSs these days also sport OO extensions. Cold uses a hash table indexed with NDBM, although GDBM, DBM or Berkeley DB can be easily swapped in. Both RDBMSs and ODBMSs use static schemas. Since RDBMSs have implemented OO extensions I've never seen any reason at all to use ODBMSs. RDBMSs have in fact become ORDBMSs because of the popularity of OO languages, and do it better IMO.

I needed a storage format where...

* Objects can be arbitrarily extended at runtime.
* Objects can be reparented at runtime.
* Objects can have multiple versions.

At one point I had a translation layer, mapping an abstract object catalog into an RDBMS. Performance was awful because of the join logic required to maintain it. Anyway, I use Berkeley DB now.
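
For what it's worth, the raw Berkeley DB calls are about as simple as it gets. A rough sketch along these lines (using the 4.x-style open call; error handling and the serialisation of the object blob are left out, and the function itself is just for illustration):
Code:

#include <db.h>
#include <string.h>

/* Store one serialised object blob in a hash database, keyed by object id.
   (Illustrative only - no error checking, and the DB handle would normally
   be opened once and kept around, not opened per call.) */
int store_object(const char *file, unsigned int objid, void *blob, unsigned int len)
{
    DB *dbp;
    DBT key, data;

    db_create(&dbp, NULL, 0);
    dbp->open(dbp, NULL, file, NULL, DB_HASH, DB_CREATE, 0664);

    memset(&key, 0, sizeof(key));
    memset(&data, 0, sizeof(data));
    key.data  = &objid;
    key.size  = sizeof(objid);
    data.data = blob;
    data.size = len;

    dbp->put(dbp, NULL, &key, &data, 0);   /* fetching back is dbp->get() */
    return dbp->close(dbp, 0);
}
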
Yui Unifex
Site Admin


Joined: 11 May 2005
Posts: 47
Location: Florida

PostPosted: Sat May 28, 2005 11:01 pm

In Aetas we're using Postgres to store data. The data access is straightforward and boilerplate for the most part, which is why I got frustrated writing it and began looking into object/relational mapping (ORM) tools that would do the work for me. I found that NHibernate did most of what I needed, but it had some serious flaws that prompted me to write my own ORM. In my experience, an ORM is an uncomfortable thing to use if your storage model does not follow the fairly common idioms of one table per type and one row per type instance.

Data access with an ORM is usually super simple. For example, my function to retrieve a player object by name is:
Code:
public Player GetPlayerByName (string name) {
   return (Player) MStore.Instance.EqualsScalar(typeof(Player), "name", name);
}


It can also handle dynamic sets of criteria. I have a search function that allows me to specify criteria with which to retrieve a set of objects.
Code:
public IList GetElements (Type elementType, int elementID, string namePrefix, int regionElementID) {
   Medusa.Criteria criteria = MStore.Config.CreateCriteria(elementType);

   if (elementID != 0)
      criteria.Add("elementid", "=", elementID);

   if (namePrefix != "")
      criteria.Add("name", "ILIKE", namePrefix + "%");

   if (regionElementID != 0)
      criteria.Add("container_elementid", "=", regionElementID);

   // Retrieve results.
   return MStore.Instance.List(criteria);
}


To achieve these results a mapping file is necessary to tell us which types map to which tables, and which members map to which columns. Here is how I map my core object model:
Code:
<!-- Elements -->
<class type="Aetas.Element, AetasLib" table="elements">
   <sequence-key name="ElementID" column="elementid" sequence="elementids" />

   <member name="Name" column="name" />
   <member name="BaseElementID" column="base_elementid" />
   <member name="ElementType" column="element_typeid" />
   <member name="InitialContainer" column="container_elementid" />
   <list name="DynamicProperties" type="Aetas.Properties.Value, AetasLib" column="elementid" />

   <joined-subclass type="Aetas.Region, AetasLib" table="regions" column="region_elementid">
      <joined-subclass type="Aetas.Room, AetasLib" table="rooms" column="room_elementid">      
         <member name="Description" column="description" />
      </joined-subclass>
   </joined-subclass>
   
   <joined-subclass type="Aetas.Character, AetasLib" table="characters" column="character_elementid">      
      <joined-subclass type="Aetas.Player, AetasLib" table="players" column="player_elementid">            
         <member name="Account" column="accountid" />
      </joined-subclass>
         
      <joined-subclass type="Aetas.Npc, AetasLib" table="npcs" column="npc_elementid">            
         <member name="Description" column="description" />
      </joined-subclass>
   </joined-subclass>
      
   <joined-subclass type="Aetas.Item, AetasLib" table="items" column="item_elementid">      
      <member name="Quantity" column="quantity" />
      <member name="Description" column="description" />
   </joined-subclass>
</class>


The ORM handles all of my caching for me. It is not anything intelligent just yet, but it ensures that I'll always get the same object reference if I load an object with the same primary key at different times while the old object reference is still alive. For this purpose it simply uses a hash table of weak references, indexed by the primary key.
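
The idea is nothing fancier than an identity map keyed on the primary key. A minimal sketch of the same structure (written here in C++ with shared/weak pointers purely for illustration - my actual cache is C#):
Code:

#include <memory>
#include <unordered_map>

// Identity map: the same primary key always yields the same live object.
// Weak references let objects nobody holds anymore fall out of the cache.
// (Illustrative sketch only; names are made up.)
template <class T>
class IdentityMap {
public:
    // Return the cached instance if it is still alive, otherwise load a
    // fresh one via `loader` and remember it.
    template <class Loader>
    std::shared_ptr<T> get(long key, Loader loader) {
        auto it = cache_.find(key);
        if (it != cache_.end()) {
            if (std::shared_ptr<T> alive = it->second.lock())
                return alive;                    // old reference still alive
        }
        std::shared_ptr<T> fresh = loader(key);  // hit the database
        cache_[key] = fresh;
        return fresh;
    }

private:
    std::unordered_map<long, std::weak_ptr<T>> cache_;
};
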
cron0s



Joined: 13 May 2005
Posts: 34
Location: UK

PostPosted: Mon May 30, 2005 9:55 pm

Thank you both for your comments, you've definitely given me something to think about.

Unfortunately I don't think it's worth trying to do anything too revolutionary with my Rom codebase. Once you start thinking too much about how your data is stored, how it is related, and how it is utilised, you might as well start over and design your own codebase. Which is exactly what I was trying to avoid by using Rom, ho hum.
Lindahl



Joined: 29 May 2005
Posts: 56

PostPosted: Mon May 30, 2005 10:18 pm

I've been working on an object-oriented database system tailored specifically for MUDs for the past year or two now - when I'm finished, it should revolutionize the way C++ MUDs deal with data storage.

I emailed someone recently (Traithe) about it, and so I'll just copy and paste the email. It describes basically what it provides for MUDs as well as a little bit about the implementation and recovery model.

Quote:
Yes, my development is aimed at C++ MUDs. It's a low-intrusion smart pointer solution that loads everything on demand. It supports virtual functions seamlessly (except for indicating to the database that a class has a virtual function). I call it MudDB (or MDB).

Basically, for all classes that you want to reside in the database, you register the class with a macro and describe the members of the class (for automatic schema evolution, automatic versioning and garbage collection).

All pointers to classes that reside in the database are converted to the smart pointer class. Next, you set up the database, with a database path name, a log path name, a cache size and a log buffer size. Then you manipulate data using a smart pointer and two RAII classes (reader<T> and writer<T>), accessing everything starting at the database root. One can also tell the database to prefetch objects - so that the database can optimize retrieval of multiple objects needed for a single transaction.

Additional forms of data access are allowed, such as converting an object to XML to be later streamed via sockets on-demand for website publication.

On a periodic basis, call the 'checkpoint' method on the database, which provides you with a sequence point for recovery. Checkpoints don't complete right away; they complete on a lazy basis. You can queue up as many checkpoints as you want, and the MUD won't block while a checkpoint is being written; instead, checkpoints complete asynchronously in the background. Note: it is imperative that all RAII objects are destroyed before calling the checkpoint method (however, pointers may persist through the call).

The user process is completely isolated from all disk synchronization. This allows the database system to continue to complete checkpoints even if the user process crashes - meaning that if the user process crashes, the database can be recovered to the state at which the last checkpoint was queued (a very powerful feature).

However, if the computer crashes or the database processes crash, then the database will return to the state of the last checkpoint that had completed (checkpoints are lazy in order to put emphasis on forward progress in the user process). That said, the database processes are isolated enough from the user process that crashes are EXTREMELY rare and usually imply that forward progress is impossible (i.e. the hard disk is full) - and some recovery, seamless to the user process, is possible even then.

Logging and recovery are essential to provide a consistent database state and can provide full recovery, even after a power failure. MDB provides logging and recovery in a different (and more efficient) manner from most databases. Most databases provide recovery through WAL (write-ahead logging). This means that for almost every update a log record must be generated and written to disk. This is essential for concurrent, fine-grained transaction databases; however, a significant optimization can be made. Since MDB only supports a single long-duration transaction (the checkpoint system), a pre-image logging system can be used. This means the only time a log record is generated is when the object is updated for the first time since the last queued checkpoint. This significantly reduces the amount of log activity while significantly increasing the update rate (how many objects can be modified per second). For instance, if an object is modified multiple times, only one log record will be generated by MDB, whereas other database systems will often generate multiple log records. This reduces disk traffic (providing faster reads) and allows for a smaller log buffer (more memory can then be devoted to the cache). The downside is that each transaction commit (checkpoint) queued is not guaranteed to complete before a crash. (There's a rough sketch of this write path a bit further down.)

The system architecture of the database consists of several processes: the user process, the checkpoint process, the logger process and the disk process.

The user process manipulates the data and the database itself. Because the database manipulation occurs in the context of the user process, all database access is extremely efficient and usually avoids communication and synchronization overhead with the other processes. However, there are a few situations when the user process can be blocked by the other processes:

1) When an object is not found in the cache, the user process must request the disk process to load that object from disk (and possibly other objects). The user process continues execution when the object has been read from disk. The disk process gives highest priority to read requests by the user process, so the user process will wait for, at most, two disk accesses.

2) When an object must be logged and the log buffer is full, the user process must wait until the logger process completes its current disk write and signals the user process. This is often a configuration problem and can be resolved by increasing the log buffer size. After a checkpoint the logger process will often see a burst of logging activity from the user process.

3) When the disk process is busy writing an object to disk, the user process requests write access to the object AND the object hasn't been logged since the last queued checkpoint. This is an EXTREMELY rare occurrence and isn't a cause for concern. This scenario is included for completeness.

The communication between the user process and the other processes is very minimal compared to the overhead of the request that required the communication. In most other database systems, communication between the user process and other database processes presents a much larger overhead. For completeness, the user process requires the following communications:
1) For each log request, one synchronized assembly instruction - overhead is negligible compared to the memory copy performed.
2) For each log request on an empty log buffer, one semaphore release - overhead is negligible compared to how often this is required.
3) For each disk request, one pipe write and one semaphore acquire - overhead is negligible compared to disk access.
4) For each checkpoint request, one spinlock operation, two semaphore releases and two semaphore acquires - overhead is negligible compared to the rest of checkpoint processing.

Say, for instance, the MUD process crashes and you checkpoint once a second at the end of the main loop: the game will return to the state it was in at the end of the main loop, during the last second. If the machine crashes, and the last checkpoint that completed was one initiated 3 seconds ago, the game will return to the state it was in at the end of the main loop, 3 seconds ago.

Some additional information:
If you hold most of the game in the database, you'll see a huge improvement in how long it takes to reboot (or copyover) the MUD. This is because the game is loaded on demand, not all at once. Not only that, but you no longer have to perform an enormous amount of disk activity on each copyover - the database can stay alive while the user process exec's itself, because all data resides in shared memory, which stays alive along with the logger process, the disk process and the checkpoint process. The database can then reattach itself to the shared memory and continue execution without any problems (a checkpoint completion isn't even necessary).
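
To make the pre-image logging described above a bit more concrete, the write path is roughly this shape (a simplified sketch, not the actual MDB code - the names and structures are illustrative):
Code:

// Pre-image logging: an object's old image is copied to the log only the
// first time it is modified after a checkpoint is queued.  Later writes to
// the same object generate no further log traffic.  (Illustrative sketch.)
struct ObjectHeader {
    unsigned checkpoint_epoch;   // epoch of the last pre-image taken
    unsigned size;               // object size in bytes
};

extern unsigned g_current_epoch;                        // bumped on each queued checkpoint
void log_append(const void *pre_image, unsigned size);  // copies into the log buffer

// Called whenever write access is requested (e.g. by writer<T> or modify()).
inline void log_before_write(ObjectHeader *hdr, const void *object_image)
{
    if (hdr->checkpoint_epoch != g_current_epoch) {
        log_append(object_image, hdr->size);      // save the pre-image once
        hdr->checkpoint_epoch = g_current_epoch;  // no more logging until the next checkpoint
    }
}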


One of the things I didn't mention was that online backup is trivial. You can back up the entire database while the MUD is updating data. This will give you a consistent snapshot of the database at the time of the last checkpoint completion.

I also didn't explain the interface too well, so this is likely what you'd see:
Code:

#include <mdb.h>

class Foo : public mdb::Object {
// Class Declaration
MDB_DECLARE(Foo, mdb::Object)

public:
  inline Foo( ) {}

  bool        b;
  char        buf[4];
  nat1        n1;
  int1         i1;
  ptr<Foo> p;
  ptr<Foo> pbuf[4];
  char        var[1];
};

// Class Registration
MDB_REGISTER(Foo, mdb::Object)
// Member Declaration
  MDB_MBR(b)
  MDB_ARY(buf)
  MDB_MBR(n1)
  MDB_MBR(i1)
  MDB_MBR(p)
  MDB_ARY(pbuf)
  MDB_VRY(var)
MDB_END

// create a new Foo object in the database
ptr<Foo> pFooObj = database().create<Foo>();

// create a new Foo object in the database
ptr<Foo> pFooObj = dbnew<Foo>();

pFooObj->i1 = 3; // compiler error

modify(pFooObj)->i1 = 3; // OK, modify() enables write-access

int1 a = pFooObj->i1; // OK
 
// using a reader object guarantees memory-speed access
reader<Foo> r(pFooObj);
r->i1 = 3; // compiler error
a = r->i1; // OK

// using a writer object guarantees memory-speed modification and access
writer<Foo> w(pFooObj);
w->i1 = 3; // OK
a = w->i1; // OK

// mnemonic to create a block for memory-speed access
mdbRead(Foo, pFooObj, rd) {
  rd->i1 = 3; // compiler error;
  a = rd->i1; // OK
}

// mnemonic to create a block for memory-speed modification and access
mdbWrite(Foo, pFooObj, wr) {
  wr->i1 = 3; // OK
  a = wr->i1; // OK
}

// queue a checkpoint for recovery to the current database state
database().checkpoint();

// queue the backup procedure to generate a backup to the current database state
database().backup(pathname);

// rollback to the last queued checkpoint
database().rollback();

// reclaim all unused objects in the database (garbage collection)
database().reclaim();

// perform clustering and compaction to improve disk locality (performance)
database().cluster();

// perform schema evolution to update all database objects to the current class format
database().evolve();


The last 3, 'reclaim', 'cluster' and 'evolve', have to be performed during database downtime. I am looking into methods to allow 'reclaim' during database modification and access, though there's no way to do this without impacting runtime performance significantly - just the way it goes. Note that you can explicitly delete objects on your own, however it is dangerous to do so because it can introduce persistent dangling pointers.

A little bit of information about schema evolution. What does it mean? It means that as long as you describe your classes adequately (MDB_MBR, MDB_ARY and MDB_VRY), then the database system will do its best to morph from old versions of a class to newer versions of a class. For example, if you change a variable from a short integer (2 bytes) to a long integer (4 bytes) you don't have to do a thing. The database will update the objects of the old class format to the new class format. This also takes into account deleting from or adding to a class. This class conversion happens the first time the object is accessed using a different class format. You can also describe your own conversion procedure to perform more advanced evolution by overloading the 'evolve' method for an object. For example:

Code:
void Character::evolve( mdb::Object old, mdb::Class cls ) {
  new_member_name = cls["old_member_name"][old];
}


Note that you can access any object's variable using the symbolic name, by indexing the class on the symbolic name and then indexing the result by the object. Unfortunately it breaches encapsulation, so it's best to reserve this sort of access for the evolve functions.

A final feature of the database is that it can detect the first time any object is loaded since the last time the database was opened. This is an essential feature for persistent MUDs. For example, consider the case where a player is in the game, the MUD crashes, and when it comes back up, the player is no longer interested in playing. Unfortunately, the last recovery position places the player's character in the game. Luckily, since we can detect the first time an object is loaded since the last time the database was opened, we can remove a character from the game the first time he or she is accessed. In this way, it's as if the character was never in the game on the reboot. Code to do this, for example, would be:

Code:
void Character::load( ) {
  remove_from_game();
}


In this manner, we can extract the character from the game the first time the character is loaded since the last time the database was opened. Note that there may be better ways to do this, although I haven't explored this problem for persistent MUDs yet.

There are other uses for this feature as well, such as setting up in-memory pointers inside objects (although that's not recommended). For example:

Code:
void Character::load( ) {
  out_buffer = new char[MAX_OUT_LENGTH];
  in_buffer = new char[MAX_IN_LENGTH];
}


Questions? Comments?


Last edited by Lindahl on Wed Jun 01, 2005 2:27 pm; edited 1 time in total
Tyche



Joined: 13 May 2005
Posts: 176
Location: Ohio, USA

PostPosted: Wed Jun 01, 2005 10:18 am

Lindahl wrote:

Questions? Comments?


How does it handle relationships (aggregation, inheritance, association)?

Code:

class Foo : public Bar {
  Baz* q;
  Bub r;
  list<Bing> s;
};
Lindahl



Joined: 29 May 2005
Posts: 56

PostPosted: Wed Jun 01, 2005 2:55 pm

Quote:
How does it handle relationships (aggregation, inheritance, association)?


Inheritance is simple: since all database objects must derive at the highest level from mdb::Object, inheritance is always present, which means you only need one macro parameter list. The MDB_DECLARE and MDB_REGISTER macros take, as the first argument, the class name, and as the second argument, the class's parent name. It's really just that simple. And you only have to describe members of that particular class. Database pointers work the same way as normal pointers - casting works as you'd expect - upcasting is implicit and downcasting is explicit. You just have to downcast with a slightly different syntax. Dynamic casting works as well, but requires a different syntax too.

I only allow single inheritance since multiple inheritance has some serious issues with restoring virtual base pointers and possibly multiple virtual function table pointers. The various implementations of such are highly compiler dependent and vary wildly. Single virtual table pointers are much easier to support since they are (almost) always located in one of two places - either at the beginning of the object, or at the end of the first object to have a virtual function.

An MDB_VIRTUAL notation is needed to designate the first object to include virtual functions (having multiple MDB_VIRTUAL entries in the inheritance chain works fine and doesn't screw up the system - it simply notes the first object in the inheritance chain to have an MDB_VIRTUAL entry).

Aggregation based on pointers is simple as well: you simply use ptr<Type> as opposed to Type*. Aggregation via inclusion is simple too: if the aggregated class is a database object, you don't need to do anything. If it isn't, you can make it one, or avoid the 4 byte overhead and just describe the members with a different macro (MDB_DESCRIBE).

You can also provide a description for the aggregated class at the aggregatee level, by describing it from the point of view of the aggregatee - for example, MDB_MBR(aggregated.member). However, I'm thinking of eliminating this option in order to force good design.

I plan on generating a class for associations, based on a B-Tree. It will be optimized for the architecture of the database system (prefetching, alignment, sizing, etc.).

I also plan on coming out with some other containers in the far future that are optimized for the architecture of MDB. However, until then, you'll have to roll your own for the STL containers.

Quote:
Code:

class Foo : public Bar {
  Baz* q;
  Bub r;
  list<Bing> s;
};


You'd probably see something like:

Code:

class Bub {
  ptr<Bar> p;
};

MDB_DESCRIBE(Bub)
  MDB_MBR(p)
MDB_END


// as you'll see below, unfortunately, T really can't be a base type
// such as int, so you'll need wrappers, similar to Java, if you
// want the base types to be standalone database objects.
template <class T>
class SomeTemplate {
  ptr<T> first;
  ptr<T> last;
  T something;
};

MDB_DESCRIBE(SomeTemplate<T>)
  MDB_MBR(first)
  MDB_MBR(last)
  MDB_MBR(something)
MDB_END

// Bar must be derived at the highest level from mdb::Object
class Foo : public Bar {
MDB_DECLARE(Foo, Bar)

   ptr<Baz> q;
   Bub r;
   SomeTemplate<Bing> s;
};

MDB_REGISTER(Foo, Bar)
  MDB_MBR(q)
  MDB_MBR(r)
  MDB_MBR(s)
MDB_END


Therefore, aggregation by inclusion can avoid the 4 byte overhead for all database objects (inherited from mdb::Object) as well as the extra indirection. However, you still have to describe the class to the database system so it knows where the pointers are for garbage collection, and how to perform proper schema evolution.

One drawback of aggregation by inclusion is that you can't have database pointers to the subobject (it increases the complexity of the garbage collector by a LOT); however, I am looking into efficient ways to support it.

The MDB_REGISTER and MDB_DESCRIBE macros can be placed in '.cpp' files to avoid extra compilation, inclusion and macro expansion. The MDB_REGISTER and MDB_DESCRIBE macros require inclusion of <mdb/class.h> while the MDB_DECLARE macro requires only <mdb/object.h>. Of course you can just ignore inclusion reduction and just include the entire <mdb.h> library (which is only required for the compilation units that access the database object).

The MDB_DESCRIBE macros must be defined in the same file as any MDB_REGISTER macro that uses the class type as an aggregate object. In the example above, this means that the MDB_DESCRIBE macros for SomeTemplate<T> and Bub must be included in the file that has the MDB_REGISTER macro for Foo, which can all (optionally) be in a separate .cpp file to reduce compilation time (highly recommended).

A final limitation of the system is that I haven't quite figured out how to allow for templated database objects - it gets tricky with respect to instantiation and declaration of the class internals. It is on my list of things to do, however (especially for the B-Tree code). For now, you can aggregate them into a dummy database class - which works reliably. So basically, if you want templated database classes, instead of using a typedef you design a wrapper class that aggregates (by inclusion) the template class of your type, similar to the SomeTemplate<T> type above.
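
For example, where you would normally write a typedef, you would instead wrap the template up roughly like this (reusing the macros from the example above; the names here are just illustrative):
Code:

// Stands in for `typedef SomeTemplate<Bing> BingList;` - a concrete database
// class that aggregates the template by inclusion.  (Illustrative names.)
class BingList : public mdb::Object {
MDB_DECLARE(BingList, mdb::Object)

public:
  SomeTemplate<Bing> items;  // aggregation by inclusion
};

MDB_REGISTER(BingList, mdb::Object)
  MDB_MBR(items)
MDB_END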

EDIT: Oh, I also plan on including a string class that's optimized as a database object (size, alignment, etc.), and I also failed to mention that the maximum size of objects is 4096 bytes (the page size). This limitation was made because it allows one to make significant optimizations. Allowing for larger object sizes would severely hamper runtime for the commonly executed parts of the system. You can bypass this by creating large objects from groups of small objects. As a result, this allows you to optimize access for large objects according to their usage - something you'd have to forgo if you just wanted one big object. The trade-off seemed obvious to me, considering the size of almost all MUD objects.
Zygfryd



Joined: 01 Jun 2005
Posts: 6
Location: Poznan, Poland

PostPosted: Fri Jun 03, 2005 8:56 am

My MUD is mostly Lua softcode and I'm using a simple and flexible solution for referencing objects from the database; the actual backend doesn't matter. I used XML files in a directory tree, but that'll change to Lua files in a directory tree.
Lua has one main structured type, the associative table, which is passed by reference during assignments, and all objects are implemented as tables. I have a table constructed for each persistent object that acts as a reference, containing its name and database unit. When an access to an object's component is detected (by means of an overloaded operator), the table is filled with the actual object's data from persistent storage and all references to the object remain intact. After the object is loaded the overloaded operator is removed from it. It's 100% transparent to the rest of the code, as if all the objects were in memory the whole time.
I haven't read the whole of Lindahl's posts, but I got the impression that he's aiming for an analogous access scheme in C++, as far as that's possible.
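
For comparison, a very rough C++ analogue of the same fill-on-first-access idea would be a proxy that loads itself the first time it is dereferenced (everything below is illustrative, not code from either of our systems):
Code:

#include <memory>
#include <string>

struct ObjectData {        // the real fields would live here
    std::string name;
};

// Keeps only the object's name and database unit until something touches it,
// then fills itself in once and behaves like plain memory afterwards.
class LazyRef {
public:
    LazyRef(const std::string &name, const std::string &unit)
        : name_(name), unit_(unit) {}

    ObjectData *operator->() {
        if (!data_) {                    // first access: hit the backend
            data_.reset(new ObjectData);
            data_->name = name_;         // stand-in for the real load of `unit_`
        }
        return data_.get();
    }

private:
    std::string name_, unit_;
    std::unique_ptr<ObjectData> data_;
};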

PS. The memory overhead of a whole table structure allocated for quite a few unneeded objects is not really a problem for me, since I intend to partition the MUD into a cluster. I'm not sure if I'm ever going to need to actually use the distribution across multiple machines, but it lets me think more about flexibility and care less about performance.
eiz



Joined: 11 May 2005
Posts: 152
Location: Florida

PostPosted: Fri Jun 03, 2005 9:43 am

I once used the Prolog fact database (assert/retract) as a data store. Since Prolog implementations typically build in-memory indexes, this is not quite as slow as it sounds, but almost. I just wrote stuff out to files of Prolog terms. If any of you are familiar with Prolog you'll know that, as far as handling source code goes, Prolog is very Lisp-like, so this took the place of S-expressions. In the file the terms looked like this:

Code:

type(thebox, room).
sym(thebox, thebox).
p(thebox, name='The BOX').
p(thebox, desc='You are in a nice, cozy box.').
p(thebox, exit/void=void).


It basically implemented a simple object system (similar to the one unifex is supposed to be describing, but still hasn't). Now once this was loaded into memory there were some operators to talk to the 'database'...

Code:

% Get property Prop = Value from Obj, or the given Default if it's unset.
Obj@Prop $> Default@Value :- ground(Default), pget_def(Obj, Prop = Value, Default), !.
% Get property Prop = Value from Obj, or fail if it's unset.
Obj@Prop $> Value :- pget(Obj, Prop = Value).
% Set property Prop = Value on Obj. This will remove any old binding of
% Prop = _ in the object.
Obj@Prop <$ Value :- pset(Obj, Prop = Value).
% $$ Obj@Prop will unset property Prop on Obj.
$$ Obj@Prop       :- punset(Obj, Prop).
% $/ Obj@Prop will unset all properties matching Prop on Obj.
$/ Obj@Prop       :- punsetall(Obj, Prop).
% Set property Prop on Obj. This is used for setting non-= properties.
% It will not unset any old version (although setting a property
% which is identical to a pre-existing one is a no-op)
$= Obj@Prop       :- pset(Obj, Prop).
% Just @ on its own will succeed for each property that matches Prop.
Obj@Prop          :- pget(Obj, Prop).


Here's a short example that uses most of the operators:

Code:

do(P, edit, ''):- R has P, P@editing <$ R, do(P, list, _).
do(P, edit, L) :- sym(L, I), exists(I), P@editing <$ I, do(P, list, _).
do(P, edit, L) :- exists(L), P@editing <$ L, do(P, list, _).
do(P, edit, L) :- find_keyword(P, L, O), P@editing <$ O, do(P, list, _).
do(P, edit, L) :- R has P, find_keyword(R, L, O), P@editing <$ O, do(P, list, _).
do(P, set, L)  :-
  atom_term(L, Prt), ground(Prt), P@editing $> P@E, $=E@Prt.
do(P, unset, L) :- atom_term(L, Pr), P@editing $> P@E, $/E@Pr.
do(P, list, _) :-
  P@editing $> P@E, send(P,E,_,['Properties for $Ss [$S<i/$Si]:\n']),
  forall(E@Pr, send(P,text(Pr),_, ['$S\n'])).


Now this system had some really bizarre and horrific features. For example property names could be compound values. Thus you had stuff like exit/foo=bar for an exit named foo to bar. / of course is a Prolog operator (used normally for naming predicates, e.g. foo/2). Then you could do stuff like:

Code:

  forall(R@exit/K$>D, send(P, text(K), D, ['  [$S] -> $On\n'])).


To poke at them. Since this was basically a delegation-based system, there were also 'private' properties that wouldn't be 'inherited'. They weren't private in the traditional Smalltalk sense though.

Anyway the real win was using Prolog logic statements as a sort of query language. Unfortunately most of my examples are even more obfuscated than the ones above. Yes, seriously. I'll try to find something.
Lindahl



Joined: 29 May 2005
Posts: 56

PostPosted: Fri Jun 03, 2005 4:37 pm

Zygfryd wrote:
I haven't read the whole of Lindahl's posts, but I got the impression that he's aiming for an analogous access scheme in C++ as far as it's possible.


It's more of a persistent paged memory system for C++ with add-ons for reliability, crash recovery, clustering, garbage collection, schema evolution and conversion to XML. There isn't any sort of associative lookup, so the reference algorithms are all constant-time. It works almost exactly like virtual memory, except the address translation is at the software level instead of the hardware level. You can bypass address translation and logging overhead by using reader<T> and writer<T> objects or by using methods (object functions) - it is completely identical to having the data in memory.
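
To give a feel for the translation step: assuming a persistent reference is stored as a page id plus an offset within the page, the lookup is roughly the following (a simplified sketch - the real resident-page table is a fixed-size, cache-sized hash rather than a standard container, and the names are illustrative):
Code:

#include <cstdint>
#include <unordered_map>

// A persistent reference: which page the object lives on, and where in it.
struct PersistentRef {
    std::uint32_t page;
    std::uint32_t offset;
};

// Pages currently resident in the cache, keyed by page id.
std::unordered_map<std::uint32_t, char *> resident_pages;

char *fault_in_page(std::uint32_t page);  // ask the disk process, block until the read completes

// Software address translation: the moral equivalent of a TLB lookup.
void *translate(const PersistentRef &ref)
{
    char *base;
    auto it = resident_pages.find(ref.page);
    if (it != resident_pages.end()) {
        base = it->second;                  // hit: just a hash probe, no IO
    } else {
        base = fault_in_page(ref.page);     // miss: the disk process loads the page
        resident_pages[ref.page] = base;
    }
    return base + ref.offset;
}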

CPU overhead:
1) extra processes - minimal processing is done in the separate processes, 99% of it is IO plus a few instructions
2) software address translation - page lookup is a cache-sized hash for virtually constant-time lookup
3) log overhead for writes - a few bitwise operations if the object was previously logged, otherwise a memory copy and a bus-lock for assembly synchronized instruction on SMPs
4) some processing at checkpoint queues to clear the old logs and some near constant-time address-sorting (sorts 32 pages at a time) to provide better IO throughput

Memory overhead:
1) page cache - no more than normal paged memory
2) log buffer - minimal compared to the buffer cache, sized to the update rate, several MB at most
3) 36-bytes per page of real memory (<1%)
4) 4-bytes per object
5) 16 bytes per page swapped to disk between checkpoints (zero if you have enough real memory for the entire working set)
6) 8-bytes for each member in each database class
7) 64-bytes for each database class
8) 4k-bytes book-keeping information for classes
9) 512-bytes book-keeping information for the database

IO overhead:
1) log IO overhead - for each set of 128 bytes modified, 136 bytes of IO is generated, ideally should be done on a separate disk to provide essentially zero conflict
2) read IO overhead - no more than normal paged memory
3) write IO overhead - no more than normal paged memory
4) checkpoint IO overhead - synchronization to disk to provide reliable recovery spots, each dirtied page will be written at most once for each queued checkpoint (it's quite possible to achieve a rate of only half the dirtied pages written to disk for each checkpoint), this is invisible most of the time to the application because the IO is performed with low priority compared to the above IOs (but there will be conflict with other IO on the same machine)

As you can see, I've put a great deal of effort into making the overhead as small as possible. So it's a lot different from almost every other database system out there. All overhead-inducing process synchronization is lockless, which is a BIG win.

On the list of things to do is explicit striping and raw disk support. For example, you can get the database to be partitioned across multiple raw disks or file systems (on other disks) and you'll have complete parallel access for each disk. Non-raw striping (file-system striping) can be done at the database system level, as opposed to the file-system level, so that you can ensure optimal parallel access - just make sure the disks are connected to the machine through parallel interfaces. This can also give you redundancy (though it's really not that necessary for MUDs). For raw disk support, the IO throughput is much higher, bypassing file systems completely. But that's all nuts and bolts to most of you. All of this is quite easy to plug-and-play. Since all IO is done in separate processes, you can just switch disk processes or add disk processes to provide the support. Migration tools are on the to-do list - to provide minimal administration for moving from a single file system all the way up to a striped raw disk setup.