Monday 22 July 2013

Understanding Java Heap Space

Heap|Xms|Xmx|Jconsole|Heap Space|Java Heap Space
Before going in depth, let us first understand the basics of the heap and the stack.

Heap - what exactly is this?

  1. Class instances and arrays are stored in heap memory. The heap is also called shared memory, because it is the place where multiple threads share the same data. Objects and their instance variables live on the heap.
  2. The heap is memory set aside for dynamic allocation. Unlike the stack, there is no enforced pattern to the allocation and deallocation of blocks from the heap; a block can be allocated or freed at any given time. In Java, the freeing is done by the garbage collector.
  3. This includes objects created in the local scope of a method. In that case the reference variable to the local object is stored in the method's frame on the stack, but the actual object lies on the heap.
  4. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically, the process) exits.

Stack - what exactly is this?
  1. Local variables and method calls live on the stack.
  2. Java stacks are private to a thread, and each stack is made up of frames. Every thread has its own program counter (PC) and its own Java stack. The frames on the stack store intermediate values, support dynamic linking, carry return values for methods and are used to dispatch exceptions. Every thread, including the main thread and daemon threads, gets its own stack.
  3. When a thread invokes a method, the JVM pushes a new frame onto that thread's Java stack.
  4. All arguments, local variables, reference variables, intermediate computations and return values, if any, are kept in the frame corresponding to the method invoked.
  5. The memory allocated for the frames of a stack does not need to be contiguous.
  6. The stack is always used in LIFO order; the most recently pushed frame is always the next one to be popped.
  7. The stack is attached to a thread, so when the thread exits the stack is reclaimed.
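
The "one frame per invocation" behaviour can be observed with a small experiment. This is only a rough sketch: the class StackDepthDemo and its methods are made-up names for illustration, and the depth it reports depends on the JVM, the thread stack size (-Xss) and the size of each frame, so no particular number should be expected.

```java
class StackDepthDemo {
    // Counts how many frames were pushed before the stack ran out.
    private static int depth = 0;

    // Every call pushes one more frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurse until the thread's stack is exhausted, then report the depth.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames are popped as the error unwinds the stack.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before overflow: " + measureDepth());
    }
}
```

Running it with a different -Xss value should change the reported depth, since the limit applies to each thread's stack and not to the heap.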

Gist:
  1. Local variables are stored on the stack at runtime.
  2. Static variables are stored in the Method Area.
  3. Arrays are stored in heap memory.
  4. The stack does not need to be contiguous.
  5. The stack is faster because its access pattern makes it trivial to allocate and deallocate memory (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or a free. Also, each byte in the stack tends to be reused very frequently, which means it tends to stay in the processor's cache, making it very fast.
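
The points above can be sketched in a few lines of Java. The class and method names here (StackVsHeap, makeArray, addOne) are invented for the illustration; the comments mark what conceptually sits in a stack frame and what sits on the heap.

```java
class StackVsHeap {
    // 'values' is a local reference kept in this method's frame;
    // the int[] it points to is allocated on the heap.
    static int[] makeArray() {
        int[] values = new int[3];
        values[0] = 42;
        return values; // the frame is popped, but the array lives on
    }

    // 'x' is a copy of the caller's primitive and exists only in this frame.
    static int addOne(int x) {
        x = x + 1;
        return x;
    }

    public static void main(String[] args) {
        int[] a = makeArray();
        int[] b = a;   // a second reference to the same heap object
        b[0] = 7;      // the change is visible through 'a' as well

        int n = 5;
        int m = addOne(n); // 'n' in this frame is untouched

        System.out.println(a[0]);        // prints 7
        System.out.println(n + " " + m); // prints 5 6
    }
}
```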


Heap Regions



[ Offline comment --> Used "sketchboard.me" provided by chrome store to make above snap. Cool stuff from Google. Try that out too :) ]



  1. Eden Space (heap): pool from which memory is initially allocated for most objects.
  2. Survivor Space (heap): pool containing objects that have survived GC of eden space.
  3. Tenured Generation (heap): pool containing objects that have existed for some time in the survivor space.
  4. Permanent Generation (non-heap): holds all the reflective data of the virtual machine itself: class-level details, loaded classes (e.g. those generated from JSPs), methods and the String pool. PermGen contains the metadata of classes and objects, i.e. pointers into the rest of the heap where the objects are allocated. PermGen also holds class loaders, which have to be released explicitly at the end of their use; otherwise they stay in memory and keep holding references to their objects on the heap.
  5. Code Cache (non-heap): the HotSpot JVM also includes a "code cache" containing memory used for compilation and storage of native code.
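
The regions listed above can also be inspected from inside a running JVM through the standard java.lang.management API. The sketch below simply prints whatever pools the current JVM reports; the class name MemoryPools and the helper describe are made up for this example, and the pool names vary with the JVM version and the collector in use, so they may not match the list above exactly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class MemoryPools {
    // Formats one pool as "name, heap or non-heap, used KB".
    static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage(); // null if the pool is not valid
        long usedKb = (usage == null) ? 0 : usage.getUsed() / 1024;
        String kind = (pool.getType() == MemoryType.HEAP) ? "heap" : "non-heap";
        return String.format("%-35s %-9s used=%d KB", pool.getName(), kind, usedKb);
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
    }
}
```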

Understanding Heap allocation and Garbage Collection


Garbage collection (GC) is how the JVM frees the memory occupied by dead objects, that is, objects that are no longer referenced. The algorithms and parameters used by the GC can have dramatic effects on performance.

The Java HotSpot VM defines two generations: the young generation (sometimes called 
the "nursery") and the old generation. The young generation consists of an "Eden space"
and two "survivor spaces." The VM initially assigns all objects to the Eden space, and
most objects die there. When it performs a minor GC, the VM moves any remaining
objects from the Eden space to one of the survivor spaces. The VM moves objects that 
live long enough in the survivor spaces to the "tenured" space in the old generation. When
the tenured generation fills up, there is a full GC that is often much slower because it
involves all live objects. The permanent generation holds all the reflective data of the
virtual machine itself, such as class and method objects.
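
To watch collections happen, one option is to read the collector counters that the JVM exposes through java.lang.management. The following is a rough sketch, not a benchmark: GcCounters and totalCollections are hypothetical names, the loop size is arbitrary, and there is no guarantee that any particular number of collections runs during the loop.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounters {
    // Sums the collection counts of all collectors the JVM reports.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();

        // Allocate many short-lived arrays; most of them should die young.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] garbage = new byte[1024];
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }

        long after = totalCollections();
        System.out.println("checksum (keeps the loop from being optimised away): " + checksum);
        System.out.println("collections during the loop: " + (after - before));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

The collector names printed at the end usually make it clear which one handles the young generation and which one handles the old generation.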













Figure: Generations of Data in Garbage Collection


Points to remember (Tips which can be very useful for tuning Heap):


Garbage collection can become a bottleneck in highly parallel systems. By understanding how GC works, it is possible to use a variety of command-line options to minimize that impact. Java heap allocation starts at the minimum size, -Xms, and grows up to -Xmx. At any point the heap has three sizes: used heap (the heap actually in use), committed heap (the heap allocated at that point, i.e. used + free) and max heap (the most that can ever be allocated). Try to keep -Xms and -Xmx the same, so that the JVM does not have to resize the heap, which reduces the number of full GCs.
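
The used, committed and max figures can be read at runtime with the standard MemoryMXBean. This is a minimal sketch; the class name HeapSnapshot and its helpers are made up for the example, and the mapping of "init" to -Xms and "max" to -Xmx is approximate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapSnapshot {
    // Current heap figures as reported by the JVM.
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        MemoryUsage h = heap();
        // getInit() and getMax() return -1 when the value is undefined.
        System.out.println("init      (roughly -Xms): " + toMb(h.getInit()) + " MB");
        System.out.println("used                    : " + toMb(h.getUsed()) + " MB");
        System.out.println("committed (used + free) : " + toMb(h.getCommitted()) + " MB");
        System.out.println("max       (roughly -Xmx): " + toMb(h.getMax()) + " MB");
    }
}
```

Starting the program with the same value for -Xms and -Xmx should show committed close to max from the beginning.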

The bigger the young generation, the less often minor collections occur. However, for a bounded heap size a larger young generation implies a smaller old generation, which will increase the frequency of major collections (full GCs).


-XX:NewRatio=3 means that the ratio between the young and old generation is 1:3; in other words, the combined size of eden and the survivor spaces will be one fourth of the heap.
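
The arithmetic behind NewRatio is easy to check. The sketch below uses the 4096 MB heap from the example settings later in this post; NewRatioMath and its methods are hypothetical names, and the real JVM also applies alignment and rounding that are ignored here.

```java
class NewRatioMath {
    // young : old = 1 : newRatio, so young is 1 / (newRatio + 1) of the heap.
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    static long oldMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    public static void main(String[] args) {
        long heapMb = 4096;
        int newRatio = 3;
        System.out.println("young = " + youngMb(heapMb, newRatio) + " MB"); // young = 1024 MB
        System.out.println("old   = " + oldMb(heapMb, newRatio) + " MB");   // old   = 3072 MB
    }
}
```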


Recommended JVM parameters (these can differ from application to application; study your application and find the tuning that suits it best):



export CATALINA_OPTS="-Xmx4096m -Xms4096m -Xmn1g -XX:ParallelGCThreads=16 \
 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:SurvivorRatio=8 \
 -XX:TargetSurvivorRatio=80 -XX:PermSize=512M -XX:MaxPermSize=1024M"
  1. Add the -Xmn1g parameter [sets the size of the young generation, where new objects are allocated].
  2. Add -XX:ParallelGCThreads=16 [Formula for the default: we get 1 parallel GC thread per CPU for up to 8 CPUs, and 5/8 of a thread per CPU after that, so for 16 CPUs we get 8 + 5/8 x 8 = 13 GC threads].
  3. Add -XX:SurvivorRatio=8
  4. Add -XX:TargetSurvivorRatio=80
  5. -XX:NewRatio=3 (useful when you size the young generation by ratio; an explicit -Xmn takes precedence over it).
  6. Try to keep -Xms and -Xmx the same to reduce frequent full GCs.
  7. Use CATALINA_OPTS if you are setting options that should be used only by Tomcat; if the options should also be picked up by other Java applications, such as JBoss, put them in JAVA_OPTS. http://stackoverflow.com/questions/11222365/catalina-opts-vs-java-opts-what-is-the-difference
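
The numbers in the list above can be worked out by hand or with a few lines of code. The sketch below implements the GC thread formula exactly as stated in this post (actual defaults may differ between JVM versions) and the usual reading of SurvivorRatio, where eden is SurvivorRatio times the size of one survivor space. TuningMath and its method names are invented for the illustration.

```java
import java.util.Locale;

class TuningMath {
    // 1 thread per CPU up to 8 CPUs, then 5/8 of a thread per extra CPU.
    static int defaultParallelGcThreads(int cpus) {
        return (cpus <= 8) ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    // The young generation is eden plus two survivor spaces, and
    // eden : one survivor = survivorRatio : 1.
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    static double edenMb(double youngMb, int survivorRatio) {
        return youngMb - 2 * survivorMb(youngMb, survivorRatio);
    }

    public static void main(String[] args) {
        System.out.println("GC threads for 16 CPUs: " + defaultParallelGcThreads(16)); // 13

        double youngMb = 1024; // -Xmn1g
        int survivorRatio = 8; // -XX:SurvivorRatio=8
        System.out.println(String.format(Locale.ROOT, "each survivor: %.1f MB",
                survivorMb(youngMb, survivorRatio))); // 102.4 MB
        System.out.println(String.format(Locale.ROOT, "eden: %.1f MB",
                edenMb(youngMb, survivorRatio)));     // 819.2 MB
    }
}
```

With -Xmn1g and SurvivorRatio=8, each survivor space works out to one tenth of the young generation and eden to eight tenths.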


Use JConsole to get the clearest picture of how the different generations of memory are behaving for your application. Below is an example of how to enable a JMX port for monitoring your application with JConsole/JVisualVM.

e.g: 

##### JConsole options added by Kulshresht to analyse heap memory  ----------

CATALINA_OPTS="$CATALINA_OPTS  \
                               -Dcom.sun.management.jmxremote \
                               -Dcom.sun.management.jmxremote.port=15556 \
                               -Dcom.sun.management.jmxremote.ssl=false \
                               -Dcom.sun.management.jmxremote.authenticate=false \
                               -Djava.rmi.server.hostname=19.83.73.82"
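
What JConsole shows over that port comes from the JVM's platform MBeans. As a rough illustration, the sketch below reads the same heap attribute from inside the JVM through the platform MBean server; it does not open a remote connection, and the class name JmxHeapRead and the method readHeapUsage are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

class JmxHeapRead {
    // Reads the HeapMemoryUsage attribute of the standard Memory MBean.
    static MemoryUsage readHeapUsage() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            CompositeData data =
                    (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            return MemoryUsage.from(data);
        } catch (JMException e) {
            throw new IllegalStateException("Could not read heap usage over JMX", e);
        }
    }

    public static void main(String[] args) {
        MemoryUsage heap = readHeapUsage();
        System.out.println("used      = " + heap.getUsed() + " bytes");
        System.out.println("committed = " + heap.getCommitted() + " bytes");
        System.out.println("max       = " + heap.getMax() + " bytes");
    }
}
```

A remote client such as JConsole reads the same attribute, only through a connector to the port configured above.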

JConsole and JVisualVM ship with the JDK, so there is nothing extra to install (assuming you already have the JDK installed :)). JVisualVM has great graphics. The only disadvantage of these tools is that they do not save historical data. So if something goes wrong with your application at midnight, you can't troubleshoot the issue with these graphs unless you have kept the JConsole UI running 24x7 on some machine. Don't get disheartened, there is always a way :). A tool that can be used to store historical data is "Hyperic".


