Wednesday 10 July 2013

Memcached used as a session replication manager


What is memcache?

Memcache is an in-memory, distributed cache. The primary API for interacting with it consists of SET(key, value) and GET(key) operations. Memcache is essentially a hashmap (or dictionary) that is spread across multiple servers, where operations are still performed in constant time.
The most common usage of memcache is to cache expensive database queries and HTML renders such that these expensive operations don’t need to happen over and over again.
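For example, the classic cache-aside pattern looks roughly like this. This is only a sketch: it assumes the python-memcached client is installed and a server is running locally, and expensive_db_query is a hypothetical stand-in for the slow work you want to avoid repeating.

import memcache

mc = memcache.Client(['127.0.0.1:11211'])

def expensive_db_query(user_id):
    # hypothetical stand-in for a slow database call
    return {'id': user_id, 'name': 'user%d' % user_id}

def get_user(user_id):
    key = 'user:%d' % user_id
    user = mc.get(key)                      # try the cache first
    if user is None:
        user = expensive_db_query(user_id)  # cache miss: do the slow work
        mc.set(key, user, time=300)         # cache the result for 5 minutes
    return user

print(get_user(42))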
But in this post I will be focusing on how we can use memcache for session replication.
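As a minimal sketch of the idea (again assuming python-memcached; the keys, TTL, and session layout here are made up for illustration, and real session managers such as memcached-session-manager for Tomcat handle this for you), the app servers all write their session data to a shared memcache pool keyed by session ID, so any server behind the load balancer can pick up the session:

import memcache

mc = memcache.Client(['172.16.0.52:11211'])  # one pool shared by all app servers

def save_session(session_id, data, ttl=1800):
    # python-memcached pickles non-string values transparently
    mc.set('session:' + session_id, data, time=ttl)

def load_session(session_id):
    # returns None if the session expired or was evicted
    return mc.get('session:' + session_id)

save_session('abc123', {'user_id': 42, 'cart': [7, 9]})
print(load_session('abc123'))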

How is memcache so fast?

Memcache is so fast for two reasons: the cache is entirely in-memory, and operations are performed in constant time. Memcache avoids writing any data to disk, which results in faster operations but no fault tolerance. If the server running memcache is restarted, the keys on that server will be lost. This is fine for the use case of memcache as a transient caching layer, not a definitive data store. The constant time operations are simply a deployment of all those good old tricks from a computer science algorithms & data structures class. O(1) is a nice property to have.
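The distribution across servers is handled on the client side: the client hashes the key to pick a server, so each lookup still touches exactly one node. Here is a toy sketch of that idea; real clients such as python-memcached use more robust schemes (often consistent hashing) to minimise re-mapping when servers are added or removed.

import zlib

servers = ['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211']

def pick_server(key):
    # naive modulo hashing: a key deterministically maps to one server
    return servers[zlib.crc32(key.encode()) % len(servers)]

print(pick_server('user:42'))
print(pick_server('session:abc123'))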

Eviction, Expiration, and Memory Limits

Besides getting and setting keys, eviction and expiration are major components of memcache too. When a key is SET in the cache, an expiration time is stored along with it. Most memcache clients have a default expiration, which can optionally be overridden on a per-key basis.
Memcache doesn't proactively remove keys the moment their expiration time passes, as doing so isn't possible while still guaranteeing O(1) operation times. Instead, the expiration is more of a way to say how long until a key should be considered stale. When a GET is performed, memcache checks whether the key's expiration time is still valid before returning it.
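You can see this lazy expiration from the client's point of view (a sketch, assuming python-memcached and a local server):

import time
import memcache

mc = memcache.Client(['127.0.0.1:11211'])
mc.set('session:abc', 'payload', time=2)  # expire after 2 seconds
print(mc.get('session:abc'))              # 'payload'
time.sleep(3)
print(mc.get('session:abc'))              # None -- the key is now stale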
A key is evicted from the cache when the total cache size has reached the limit. Most memcache implementations offer a least recently used (LRU) eviction policy, which evicts the key that was least recently used when a new key is added to the cache and the cache has reached its limit.
In general, the higher your cache limit, the fewer evictions you’ll have, and the higher your hit-rate will be, ultimately resulting in better performance and scalability.
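To make the LRU policy concrete, here is a toy sketch in Python. This is not memcached's actual implementation (which tracks recency per slab class, as we'll see below); it just shows the behaviour.

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used key

cache = LRUCache(2)
cache.set('a', 1)
cache.set('b', 2)
cache.get('a')         # touch 'a' so it becomes most recently used
cache.set('c', 3)      # evicts 'b', the least recently used key
print(cache.get('b'))  # None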

Eviction, in simple terms: removing items from the cache to free memory for new items.
Note: a key/value pair stored in memcached can get evicted prior to its expiry even if free space is still available elsewhere, because memory is managed per slab class (more on slabs below).

Memcache allocates space in chunks rather than on demand, and then stores items into the chunks and manages that memory manually. As a result, smaller items can "use" much larger pieces of memory than they would if space were allocated on a per-item basis.

To see stats from the server side, here is a quick way to get memcached status:
[kulshresht@web06 bin]$ echo stats | nc 172.16.0.52 11211
STAT pid 6274
STAT uptime 1807302
STAT time 1373449782
STAT version 1.4.15
STAT libevent 2.0.21-stable
STAT pointer_size 64
STAT rusage_user 3916.447609
STAT rusage_system 7673.810403
STAT curr_connections 28
STAT total_connections 7126
STAT connection_structures 31
STAT reserved_fds 20
STAT cmd_get 60623944
STAT cmd_set 65457134
STAT cmd_flush 0
STAT cmd_touch 0
STAT get_hits 27331
STAT get_misses 60596613
STAT delete_misses 25500664
STAT delete_hits 224497
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
...
[kulshresht@server1 bin]$ telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
stats
STAT pid 6274
STAT uptime 1808371
STAT time 1373450851
STAT version 1.4.15
STAT libevent 2.0.21-stable
STAT pointer_size 64
STAT rusage_user 3919.035216
STAT rusage_system 7677.627823
STAT curr_connections 28
STAT total_connections 7131
STAT connection_structures 31
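The same stats are available programmatically; for example, with python-memcached (a sketch) you can compute the hit rate mentioned above:

import memcache

mc = memcache.Client(['172.16.0.52:11211'])
for server, stats in mc.get_stats():
    hits = int(stats['get_hits'])
    misses = int(stats['get_misses'])
    total = hits + misses
    rate = 100.0 * hits / total if total else 0.0
    print('%s: hit rate %.2f%%' % (server, rate))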

To get a nice UI for memcache, go to the URL below:
https://code.google.com/p/phpmemcacheadmin/

It's worth noting that memcached uses its own slab allocator instead of standard malloc()/free() in order to speed up memory management.

Memcached allocates memory in blocks of 1MB (by default), referred to as slabs. Each slab belongs to a slab class, which is defined by the size of the items stored in that slab. Let's say that we have two slab classes, one at 1KB and the second at 1.25KB (the default growth factor between slab classes is 1.25). An item of 500 bytes, or anything less than 1KB, would get put into a slab of the 1KB class, and any item >1KB and <=1.25KB would go into a slab of the 1.25KB class.
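To see how the growth factor plays out, here is an illustrative calculation. The base chunk size of 96 bytes is an assumption for the sketch; real class sizes include per-item overhead and depend on the -f and -n startup options.

base = 96      # assumed starting chunk size in bytes
factor = 1.25  # memcached's default growth factor (-f)

size = float(base)
for cls in range(1, 11):
    print('slab class %2d: chunk size %5d bytes' % (cls, int(size)))
    size *= factor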

Memcache will create slabs on its own depending on need and demand. Just keep in mind to pass the -I parameter at memcache startup if you need larger items; otherwise you will get "SERVER_ERROR object too large for cache" when an item exceeds the maximum item size, which is 1MB by default.

memcached -I 10m  # Allow objects up to 10MB

For my application, memcache created 52 slab classes, and the maximum chunk size was 10MB.

We can refuse items larger than a particular size too, e.g.: memcached -I 128k  # Refuse items larger than 128KB.

When an item is stored, memcached first checks which class it should fall into based on its size, and then checks that class's free list to see if there are any empty chunks available. If not, a new slab is allocated to store the new item.

All this leads to one conclusion: there is some tradeoff in unused memory. Not every item is going to be exactly the size defined by its slab class, or you could have just one item stored in a class for 1KB items, meaning that a whole 1MB slab is allocated just to store that single 1KB item. But these operations are very fast and very simple.
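You can check how much of this overhead you are paying via the per-slab stats; for example, with python-memcached (a sketch):

import memcache

mc = memcache.Client(['127.0.0.1:11211'])
for server, slabs in mc.get_stats('slabs'):
    print(server, slabs)  # per-class chunk sizes and used/free chunk counts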



