MemcacheDB

From Bauman National Library
This page was last modified on 25 June 2016, at 22:56.
Initial release: December 25, 2008
Written in: C
Operating system: Cross-platform
Type: distributed memory caching system
License: BSD License[1]
Website: memcachedb.org

MemcacheDB is a distributed key-value storage system designed for persistence. It is NOT a cache solution, but a persistent storage engine for fast and reliable key-value based object storage and retrieval. It conforms to the memcache protocol (incompletely, see below), so any memcached client can connect to it. MemcacheDB uses Berkeley DB as its storage backend, so many features, including transactions and replication, are supported.

General

In performance, MemcacheDB is comparable with memcached and with other specialized databases such as Apache CouchDB: it handles thousands of operations per second. The developers note that MemcacheDB is not a cache itself, so it cannot be used as a replacement for memcached.

Although its range of functionality is small (reading, writing, updating, and deleting), this is quite enough for many tasks for which a conventional database would usually be chosen. Moving some operations to such a specialized solution offloads the main database, leaving it free for operations that need more serious data-handling tools.

MemcacheDB is very simple to deploy: compile and install the sources, install the database (Berkeley DB), and that's it. Then set the parameters for clients: port, buffer size, storage directory, and in-memory cache size.

memcached is software that implements an in-memory caching service based on a hash table.

Together with client libraries, it allows data to be cached in the RAM of many servers. Distribution is implemented by partitioning the data according to a hash of the key: the client library computes a hash from the key and uses it to choose a server. A server failure is treated as a cache miss, which increases fault tolerance.
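The server-selection step described above can be sketched as follows. This is a minimal illustration, not the scheme of any particular client library: the CRC32 hash, the modulo mapping, the function name `pick_server`, and the server addresses are all assumptions made for the example. Real clients may use other hash functions or consistent hashing.

```python
import zlib


def pick_server(key: str, servers: list[str]) -> str:
    """Choose a server for a key by hashing the key (simple modulo scheme)."""
    if not servers:
        raise ValueError("no servers configured")
    # CRC32 is an assumption here; any stable hash of the key would do.
    digest = zlib.crc32(key.encode("utf-8"))
    return servers[digest % len(servers)]


# Hypothetical server addresses, for illustration only.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
chosen = pick_server("user:42", servers)
```

Because the hash depends only on the key, every client configured with the same server list sends a given key to the same server.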

By default, memcached listens on port 11211.

Features

  • High-performance read/write for key-value based objects
  • Highly reliable persistent storage with transactions
  • Highly available data storage with replication
  • Memcache protocol compatibility

Client API

MemcacheDB is compatible with the memcache protocol, so any client that supports the protocol can connect to it. Note, however, that MemcacheDB currently supports only the following commands:

  • get (also multiple get)
  • set, add, replace
  • append/prepend
  • incr, decr
  • delete
  • stats

Some private commands are also supported:

  • rget
  • db_checkpoint
  • db_archive

Protocol

Clients of memcached communicate with the server through TCP connections (a UDP interface is also available). A given running memcached server listens on some (configurable) port; clients connect to that port, send commands to the server, read responses, and eventually close the connection.

There is no need to send any command to end the session. A client may just close the connection at any moment it no longer needs it. Note, however, that clients are encouraged to cache their connections rather than reopen them every time they need to store or retrieve data. This is because memcached is especially designed to work very efficiently with a very large number (many hundreds, more than a thousand if necessary) of open connections. Caching connections will eliminate the overhead associated with establishing a TCP connection (the overhead of preparing for a new connection on the server side is insignificant compared to this).

There are two kinds of data sent in the memcache protocol: text lines and unstructured data. Text lines are used for commands from clients and responses from servers. Unstructured data is sent when a client wants to store or retrieve data. The server will transmit back unstructured data in exactly the same way it received it, as a byte stream. The server doesn't care about byte order issues in unstructured data and isn't aware of them. There are no limitations on characters that may appear in unstructured data; however, the reader of such data (either a client or a server) will always know, from a preceding text line, the exact length of the data block being transmitted.

Text lines are always terminated by \r\n. Unstructured data is also terminated by \r\n, even though \r, \n or any other 8-bit characters may also appear inside the data. Therefore, when a client retrieves data from a server, it must use the length of the data block (which it will be provided with) to determine where the data block ends, and not the fact that \r\n follows the end of the data block, even though it does.
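The framing rule above can be illustrated with a small sketch. The helper name `read_data_block` is hypothetical, and an in-memory stream stands in for a socket; the point is only that the reader consumes exactly the announced number of bytes and then checks the trailing \r\n, rather than searching for \r\n inside the data.

```python
import io


def read_data_block(stream, length: int) -> bytes:
    """Read exactly `length` bytes, then consume the terminating \\r\\n."""
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("stream ended before the full data block arrived")
    if stream.read(2) != b"\r\n":
        raise ValueError("data block is not followed by the expected terminator")
    return data


# The payload itself contains \r\n, so only the length tells us where it ends.
payload = b"line1\r\nline2"
stream = io.BytesIO(payload + b"\r\n" + b"END\r\n")
block = read_data_block(stream, len(payload))
rest = stream.read()
```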

Keys

Data stored by memcached is identified with the help of a key. A key is a text string which should uniquely identify the data for clients that are interested in storing and retrieving it. Currently the length limit of a key is set at 250 characters (of course, normally clients wouldn't need to use such long keys); the key must not include control characters or whitespace.
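A client could check these key constraints before sending a command. The sketch below is an assumption-laden illustration: the name `is_valid_key` is hypothetical, the 250-character limit is interpreted as 250 bytes, and empty keys are rejected even though the text above does not mention them.

```python
MAX_KEY_LENGTH = 250  # limit quoted in the text, interpreted here as bytes


def is_valid_key(key: bytes) -> bool:
    """Return True if the key fits the length limit and has no control
    characters or whitespace."""
    # Rejecting empty keys is an assumption of this sketch.
    if not key or len(key) > MAX_KEY_LENGTH:
        return False
    # Byte values 0..32 cover ASCII control characters and space; 127 is DEL.
    return all(b > 32 and b != 127 for b in key)
```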

Commands

There are three types of commands.

Storage commands (there are six: set, add, replace, append, prepend and cas) ask the server to store some data identified by a key. The client sends a command line, and then a data block; after that the client expects one line of response, which will indicate success or failure.
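As an illustration of a command line followed by a data block, the sketch below builds the bytes of a storage request. The line layout (`<command> <key> <flags> <exptime> <bytes>`) comes from the memcached text protocol rather than from this article, the helper name `build_storage_request` is hypothetical, and cas is left out because it takes an additional parameter.

```python
# Storage commands covered by this sketch (cas is omitted on purpose).
STORAGE_COMMANDS = {"set", "add", "replace", "append", "prepend"}


def build_storage_request(command: str, key: str, data: bytes,
                          flags: int = 0, exptime: int = 0) -> bytes:
    """Build a command line plus data block for a storage command."""
    # Command names are lower-case and case-sensitive, so "SET" is rejected.
    if command not in STORAGE_COMMANDS:
        raise ValueError(f"unsupported storage command: {command!r}")
    header = f"{command} {key} {flags} {exptime} {len(data)}\r\n"
    # The data block is sent as-is and terminated by \r\n.
    return header.encode("ascii") + data + b"\r\n"


request = build_storage_request("set", "greeting", b"hello")
```

The length field in the command line is what lets the server read the data block without inspecting its contents.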

Retrieval commands (there are two: get and gets) ask the server to retrieve data corresponding to a set of keys (one or more keys in one request). The client sends a command line, which includes all the requested keys; after that, for each item the server finds, it sends to the client one response line with information about the item, and one data block with the item's data; this continues until the server finishes with the END response line.
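The response loop described above might be parsed as in the following sketch. It assumes the `VALUE <key> <flags> <bytes>` response-line layout of the memcached text protocol (not spelled out in this article) and a response that has already been received in full; the name `parse_get_response` is hypothetical.

```python
def parse_get_response(buf: bytes) -> dict[bytes, bytes]:
    """Parse a complete get/gets response into a {key: data} mapping."""
    items: dict[bytes, bytes] = {}
    pos = 0
    while True:
        # Text lines end with \r\n; index() raises ValueError if none is left.
        eol = buf.index(b"\r\n", pos)
        line = buf[pos:eol]
        pos = eol + 2
        if line == b"END":
            return items
        parts = line.split()
        if len(parts) < 4 or parts[0] != b"VALUE":
            raise ValueError(f"unexpected response line: {line!r}")
        key, length = parts[1], int(parts[3])
        # Use the announced length, not a search for \r\n, to find the end.
        data = buf[pos:pos + length]
        if len(data) != length or buf[pos + length:pos + length + 2] != b"\r\n":
            raise ValueError("truncated or malformed data block")
        items[key] = data
        pos += length + 2


response = b"VALUE a 0 3\r\nfoo\r\nVALUE b 0 4\r\nx\r\ny\r\nEND\r\n"
items = parse_get_response(response)
```

Keys that the server did not find simply do not appear in the result, since the server sends nothing for them.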

All other commands don't involve unstructured data. In all of them, the client sends one command line, and expects (depending on the command) either one line of response, or several lines of response ending with END on the last line.

A command line always starts with the name of the command, followed by parameters (if any) delimited by whitespace. Command names are lower-case and are case-sensitive.

Performance [2]

Box: Dell 2950III
OS: Linux CentOS 5
Version: memcachedb-1.0.0-beta
Client API: libmemcached

Non-thread Edition

memcachedb -d -r -u root -H /data1/mdbtest/ -N -v

Write

Key: 16, value: 100B; 8 concurrent processes, each performing 2,000,000 set operations

Process     1    2    3    4    5    6    7    8   avg.
Cost (s)  807  835  840  853  859  857  865  868   848

2000000 * 8 / 848 = 18868 w/s

Read

Key: 16, value: 100B; 8 concurrent processes, each performing 2,000,000 get operations

Process     1    2    3    4    5    6    7    8   avg.
Cost (s)  354  354  359  358  357  364  363  365   360

2000000 * 8 / 360 = 44444 r/s

Thread Edition (4 Threads)

memcachedb -d -r -u root -H /data1/mdbtest/ -N -t 4 -v

Write

Key: 16, value: 100B; 8 concurrent processes, each performing 2,000,000 set operations

Process     1    2    3    4    5    6    7    8   avg.
Cost (s)  663  669  680  680  684  683  687  686   679

2000000 * 8 / 679 = 23564 w/s

Read

Key: 16, value: 100B; 8 concurrent processes, each performing 2,000,000 get operations

Process     1    2    3    4    5    6    7    8   avg.
Cost (s)  245  249  250  248  248  249  251  250   249

2000000 * 8 / 249 = 64257 r/s
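The throughput figures above all follow the same arithmetic: total operations across all processes divided by the average cost. The sketch below reproduces it using the averages quoted in the benchmark; the function name `throughput` is hypothetical, and treating the cost as seconds is an inference from the w/s and r/s units.

```python
def throughput(ops_per_process: int, processes: int, avg_cost_seconds: float) -> int:
    """Operations per second: total operations divided by the average cost."""
    return round(ops_per_process * processes / avg_cost_seconds)


# Average costs as quoted in the benchmark above (assumed to be seconds).
results = {
    "non-thread write": throughput(2_000_000, 8, 848),
    "non-thread read": throughput(2_000_000, 8, 360),
    "4-thread write": throughput(2_000_000, 8, 679),
    "4-thread read": throughput(2_000_000, 8, 249),
}
```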

References

  1. "BSD License".
  2. "Benchmark".