Bucket Design and Configuration
Every bucket in Riak is like a hash table (dictionary). Riak can add new buckets on the fly, and they come with no overhead at all as long as you stick with the default properties. There is no need to set up buckets before sending a Set() command from the game servers. If a bucket is missing in the database, Riak will create it the first time anyone tries to store an object in the bucket.
The default bucket properties are good enough for most situations. We recommend keeping this configuration when you start learning uGameDB. The most important parameters for a beginner to know about are the N-value and the allow_multi parameter.
N is 3 by default, which means that Riak stores 3 identical replicas of every stored value (somewhere in the database cluster).
allow_multi is false by default, which means that Riak never returns multiple (different) values when reading an object. In this case Riak lets the “last write win”, even when it resolves a cluster failure in which the cluster was partitioned and a different replica for the same key was stored in each partition. If allow_multi is turned on, the game server programmer has to be aware that the database can return two or more conflicting values when calling Get(key), and that Riak expects the client to resolve the conflict.
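As a rough illustration of what resolving a conflict on the client could look like when allow_multi is on, the Python sketch below merges two conflicting values for one player. The Sibling class, the resolve_player function and the merge rules (highest score, union of the item lists) are assumptions made for this example only; they are not part of uGameDB or Riak, and the right merge rule depends on the game.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sibling:
    """One of several conflicting values for the same key (hypothetical type)."""
    score: int
    items: frozenset


def resolve_player(siblings):
    """Merge conflicting values into one: keep the highest score, union the items."""
    if not siblings:
        raise ValueError("nothing to resolve")
    best_score = max(s.score for s in siblings)
    all_items = frozenset().union(*(s.items for s in siblings))
    return Sibling(score=best_score, items=all_items)


# Two replicas diverged while the cluster was partitioned.
a = Sibling(score=120, items=frozenset({"sword"}))
b = Sibling(score=100, items=frozenset({"sword", "shield"}))
merged = resolve_player([a, b])
```

After merging, the game server would store the merged value back under the same key so that later reads see a single value again.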
Changing the default bucket properties is described here: http://wiki.basho.com/Configuration-Files.html
Choosing Keys And Values
When choosing your database design, the primary goal should be to find a medium size for the values that hold your game server data. This does not include pictures, meshes and other binary data. This discussion is about storing integers, vectors, strings and other data in buckets in a smart way. We will try to explain why this is important with some examples:
Example 1 - not scalable design
Create one bucket called “Player”. Store all data for all players in this bucket. The key is chosen to be the (unique) login name for the player and the value object will contain “everything”: score, hitpoints, level, character name, item list, friend list, pets list + data for all pets, vehicle list + data for all vehicles, payment history details, auction house postings, in-game mail, and so on.
This is not scalable, and the reason is simple. As soon as a player needs to update any value, the game server has to read and parse a large value object for the player, and then store a large value object again once something has changed. This is not too hard on performance, but it is not efficient. The problem becomes more obvious when you want to find all players that have 3 vehicles or more: the map-reduce query will be unnecessarily complex to write. It is even more obvious that this design gets totally out of hand if two players need to share the ownership of a vehicle, because that shared ownership will be really hard to store.
Example 2 - Basic design
This is the recommended bucket design.
Create buckets that are close to the entities used in your gameplay. For example: Player, NPC, Vehicle, Highscore, Team, Guild, Level, Auction, Item, ChatHistory, and so on.
The goal with this basic bucket design is to make the data stored in each bucket not too big and not too small. In an ideal situation the game servers will have to read from or write to only one bucket at a time. Even if this goal is never reached while coding, it is still a good design principle to aim for and “fight” for, because it will keep the risk of getting inconsistent data very low.
Also, it is a good idea to make every value in a bucket of the same type. For example, all values in the Vehicle bucket should be of the class Vehicle. Then it will be much easier to write query operations that work on some or all of the values without having to deal with different data types for each value.
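The basic design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain dicts stand in for the Player and Vehicle buckets, and the field names (character_name, vehicle_keys, owner_keys) are invented for the example. The point is that each bucket holds values of one type, and that shared ownership becomes a list of keys instead of a copy of the vehicle inside each player.

```python
from dataclasses import dataclass, field


@dataclass
class Player:
    character_name: str
    score: int = 0
    vehicle_keys: list = field(default_factory=list)  # references, not copies


@dataclass
class Vehicle:
    model: str
    owner_keys: list = field(default_factory=list)  # shared ownership = two keys


# Plain dicts stand in for the buckets in this sketch.
players = {}
vehicles = {}

vehicles["v-1"] = Vehicle(model="buggy", owner_keys=["alice", "bob"])
players["alice"] = Player(character_name="Alyx", vehicle_keys=["v-1"])
players["bob"] = Player(character_name="Bo", vehicle_keys=["v-1"])

# Because every value in a bucket has the same type, a query is a simple loop.
shared = [key for key, v in vehicles.items() if len(v.owner_keys) > 1]
```

Updating a player's score in this layout touches one small Player value and leaves the vehicle data alone.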
Example 3 - too small buckets
If you create several buckets for one game object, the design is scalable, but potentially dangerous because the risk of inconsistent data will be higher. Let’s say you have five buckets containing player data: one bucket for player login names, one for player license keys, one for the player’s preferred language, one for the player’s preferred server and one for player email addresses. This granularity is too fine. The cost of this design will be high.
The first cost is that the developers will have to update several buckets from one single save operation in the game server, for example when a game server saves input from a client GUI screen that holds all of the player’s preferences. Imagine that this save operation updates five different buckets in the database. This is fast, but the code will be unnecessarily long and thus hard to maintain.
The second cost is the risk of inconsistent data. If nodes fail while a new player’s basic data is being written to the five buckets, some fields could be stored and some could be lost. The impact of this inconsistency could be dangerous if all fields are mandatory and required for a player to be able to play the game. What happens if a player has a login name and a license key, but no email address? In this situation it might be tricky to know whether a customer contacting you via email belongs to a certain login name.
Avoid server code that writes to several buckets in a serial manner. And if you have to do this, write game server code that can handle inconsistent data.
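If several buckets cannot be avoided, the game server can at least detect a half-finished write when it reads the data back. The Python sketch below shows one way to do that. The bucket names, the REQUIRED_BUCKETS tuple and the load_profile function are hypothetical, and plain dicts stand in for the buckets.

```python
# Hypothetical buckets that must all hold a value for a complete player.
REQUIRED_BUCKETS = ("login", "license", "email")


def load_profile(buckets, player_key):
    """Collect one player's fields from several buckets and report the gaps."""
    profile = {}
    missing = []
    for name in REQUIRED_BUCKETS:
        value = buckets[name].get(player_key)
        if value is None:
            missing.append(name)
        else:
            profile[name] = value
    return profile, missing


buckets = {
    "login": {"p1": "alice"},
    "license": {"p1": "ABCD-1234"},
    "email": {},  # this write was lost during a node failure
}
profile, missing = load_profile(buckets, "p1")
```

Here missing contains "email", so the game server can ask the player for the address again instead of treating the account as complete.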
Handling inconsistent data
The risk and impact of inconsistent data are something that a uGameDB developer always needs to keep to a minimum. This is the price you pay for massive scalability.
A good example of inconsistent data is when a value in the bucket Guilds contains a list of all players in one guild, but one of the players has been deleted from the database. When writing data to a database like Riak, which is “eventually consistent”, it is the responsibility of the clients (in this case the game servers) to be prepared for a situation where data in one or several buckets is inconsistent.
To continue the example, it is important to write code in your game servers to handle a situation where the list of players in the guild includes the key (the username) of a player that has been deleted. One simple solution is to detect this in the game server script code and then update the stored guild list, and the repair is finished. It is important to write game server code that can handle the situation where a player was added to the Players bucket and also to the guild’s list in the Guilds bucket, and was later removed from the Players bucket but not from the guild list. This could happen after a node failure, a router failure or a game server failure.
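The repair described above could look roughly like the following Python sketch. It assumes that a guild value is simply a list of player keys and uses plain dicts in place of the Guilds and Players buckets; the repair_guild function is a made-up helper, not a uGameDB call.

```python
def repair_guild(guilds, players, guild_key):
    """Drop member keys that no longer exist in the Players bucket.

    Returns the removed keys; the cleaned list is stored only if needed.
    """
    members = guilds[guild_key]
    alive = [key for key in members if key in players]
    removed = [key for key in members if key not in players]
    if removed:
        guilds[guild_key] = alive
    return removed


# "bob" was deleted from Players, but the guild list still refers to him.
players = {"alice": {"score": 10}, "carol": {"score": 7}}
guilds = {"red-dragons": ["alice", "bob", "carol"]}

removed = repair_guild(guilds, players, "red-dragons")
```

Running the repair a second time removes nothing, so it is safe to call it every time the guild list is loaded.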
If you make a good bucket design and program the game servers to be aware of some inconsistent situations, the future is bright. uGameDB will provide massive scalability and uptime for persistent game data.