

Sizing memory and partition count calculation

 



Derive max recommended object count

To size the memory, load application data into a single JVM running verbosegc. When the heap usage reaches 60%, note the number of objects that are used. This number is the maximum recommended object count for each of the JVMs.

Use realistic data and include any defined indexes in the sizing because indexes also consume memory.

You can use the Pattern Modeling and Analysis Tool to derive the number of objects.
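The 60% rule above can be sketched as a small helper. This is only an illustration: the function name, the sample readings, and the 1 GiB heap size are hypothetical, and the sketch assumes you have already collected pairs of (objects loaded, heap bytes in use) during the load test, for example from verbosegc output.

```python
def max_recommended_object_count(samples, max_heap_bytes, target_fraction=0.60):
    """Return the object count at the first sample whose heap usage
    reaches target_fraction of the maximum heap.

    samples: iterable of (objects_loaded, heap_used_bytes) in load order.
    Returns None if the load test never reached the target usage.
    """
    threshold = target_fraction * max_heap_bytes
    for objects_loaded, heap_used in samples:
        if heap_used >= threshold:
            return objects_loaded
    return None


# Hypothetical readings from a load test against a 1 GiB heap.
GIB = 1024 ** 3
readings = [
    (500_000, int(0.20 * GIB)),
    (1_000_000, int(0.35 * GIB)),
    (1_500_000, int(0.48 * GIB)),
    (2_000_000, int(0.61 * GIB)),
]

num_objs_per_jvm = max_recommended_object_count(readings, GIB)
print(num_objs_per_jvm)
```

The value returned here plays the role of the per-JVM object count used in the later formulas.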


Derive numShardsPerPartition

To calculate numShardsPerPartition, add 1 for the primary shard plus the total number of replica shards you want.

numShardsPerPartition = 1 + total_number_of_replicas

As a general rule, plan for 10 shards per grid container JVM.


Number of Java virtual machines (minNumJVMs value)

To scale up the configuration, first decide on the maximum number of objects that need to be stored in total.

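To determine the number of Java virtual machines you need, use the following formula, where numObjs is the total number of objects to store and numObjsPerJVM is the maximum recommended object count for each JVM:

minNumJVMs = (numShardsPerPartition * numObjs) / numObjsPerJVM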

Round this value up to the nearest integer value.
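As a minimal sketch, the two formulas so far can be written as follows. The function names and the example inputs (10 million objects, 3 million objects per JVM, one replica) are hypothetical and chosen only to show the rounding step.

```python
def num_shards_per_partition(total_number_of_replicas):
    # One primary shard plus the requested replica shards.
    return 1 + total_number_of_replicas


def min_num_jvms(num_objs, num_objs_per_jvm, total_number_of_replicas):
    # (numShardsPerPartition * numObjs) / numObjsPerJVM, rounded up.
    stored_objects = num_shards_per_partition(total_number_of_replicas) * num_objs
    # Integer ceiling division avoids floating-point rounding surprises.
    return (stored_objects + num_objs_per_jvm - 1) // num_objs_per_jvm


# Hypothetical inputs: 10 million objects, 1 replica, 3 million objects per JVM.
print(num_shards_per_partition(1))              # 2
print(min_num_jvms(10_000_000, 3_000_000, 1))   # 7 (20 million / 3 million = 6.67, rounded up)
```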


Number of shards (numShards value)

At the final growth size, use 10 shards for each JVM. As described before, each JVM has one primary shard and (N-1) shards for the replicas, or in this case, 9 replicas. Because you already know the number of Java virtual machines needed to store the data, you can multiply that number by 10 to determine the number of shards:

numShards = minNumJVMs * 10 shards/JVM


Number of partitions

If a partition has one primary shard and one replica shard, then the partition has two shards (primary and replica). The number of partitions is the shard count divided by 2, rounded up to the nearest prime number. If the partition has a primary and two replicas, then the number of partitions is the shard count divided by 3, rounded up to the nearest prime number.

numPartitions = numShards / numShardsPerPartition, rounded up to the nearest prime number
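The shard and partition calculations can be sketched as below. This assumes that "rounded up to the nearest prime number" means the smallest prime greater than or equal to the quotient; the function names and the 7-JVM input are hypothetical.

```python
def is_prime(n):
    # Trial division is sufficient for partition counts of this size.
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def next_prime_at_or_above(n):
    while not is_prime(n):
        n += 1
    return n


def num_shards(min_num_jvms, shards_per_jvm=10):
    return min_num_jvms * shards_per_jvm


def num_partitions(total_shards, shards_per_partition):
    # Ceiling division first, then round up to a prime.
    quotient = -(-total_shards // shards_per_partition)
    return next_prime_at_or_above(quotient)


shards = num_shards(7)              # 70 shards for 7 JVMs
print(num_partitions(shards, 2))    # 37 (70 / 2 = 35, next prime is 37)
print(num_partitions(shards, 3))    # 29 (70 / 3 rounds up to 24, next prime is 29)
```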


Example of scaling

In this example, the number of entries begins at 250 million. Each year, the number of entries grows about 14%. After about 5 years, the total number of entries is roughly 500 million, so plan the capacity accordingly. For high availability, a single replica is needed. With a replica, the number of stored entries doubles to 1 billion. In a test, 2 million entries can be stored in each JVM.

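Using the calculations in this scenario, the following configuration is needed:

500 Java virtual machines to store the final number of entries (1 billion entries divided by 2 million entries per JVM).

5000 shards, calculated by multiplying 500 Java virtual machines by 10.

2503 partitions, calculated by dividing the 5000 shards by 2 (primary and replica) to get 2500, then rounding up to the nearest prime number.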
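The arithmetic for this scenario can be checked with a short script. It is a sketch that applies the formulas from the earlier sections to the example's figures (500 million entries, one replica, 2 million entries per JVM, 10 shards per JVM); the variable and function names are illustrative only.

```python
def is_prime(n):
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def next_prime_at_or_above(n):
    while not is_prime(n):
        n += 1
    return n


num_objs = 500_000_000           # final number of entries
total_replicas = 1               # single replica for high availability
num_objs_per_jvm = 2_000_000     # measured in the single-JVM test
shards_per_jvm = 10              # general rule from this topic

num_shards_per_partition = 1 + total_replicas
stored = num_shards_per_partition * num_objs                         # 1 billion
min_num_jvms = (stored + num_objs_per_jvm - 1) // num_objs_per_jvm   # ceiling division
num_shards = min_num_jvms * shards_per_jvm
num_partitions = next_prime_at_or_above(
    -(-num_shards // num_shards_per_partition)
)

print(min_num_jvms, num_shards, num_partitions)   # 500 5000 2503
```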


Start configuration

Based on the previous calculations, you would start with 250 Java virtual machines and grow toward 500 Java virtual machines over 5 years, which allows you to manage incremental growth until you arrive at the final number of entries.

In this configuration, about 200,000 entries are stored per partition (500 million entries divided by 2503 partitions). You should set the numberOfBuckets parameter on the map that holds the entries to the closest higher prime number, in this example 70887, which keeps the ratio of entries to buckets around 3.
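The entries-per-partition figure and the resulting ratio can be verified with a few lines. This sketch only reproduces the division described above using the values quoted in the text; the function names are hypothetical, and the ratio is interpreted here as entries per bucket.

```python
def entries_per_partition(total_entries, num_partitions):
    return total_entries / num_partitions


def entries_per_bucket(entries_in_partition, number_of_buckets):
    return entries_in_partition / number_of_buckets


per_partition = entries_per_partition(500_000_000, 2503)
ratio = entries_per_bucket(per_partition, 70887)

# Roughly 200,000 entries per partition and a ratio a little under 3.
print(round(per_partition), round(ratio, 2))
```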


When the maximum number of Java virtual machines is reached

When you reach the maximum number of 500 Java virtual machines, you can still grow your data grid. As the number of Java virtual machines grows beyond 500, the shard count begins to drop below 10 for each JVM, which is below the recommended number. The shards start to become larger, which can cause problems. You should repeat the sizing process considering future growth again, and reset the partition count. This practice requires a full data grid restart, or an outage of the data grid.
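To see how the shard density falls as JVMs are added, the following sketch divides the fixed shard total (2503 partitions with 2 shards each, from the example) by a few JVM counts. The JVM counts beyond 500 are hypothetical.

```python
def shards_per_jvm(num_partitions, shards_per_partition, num_jvms):
    # The partition count is fixed after startup, so the shard total is constant.
    return num_partitions * shards_per_partition / num_jvms


for jvms in (250, 500, 600, 750):
    print(jvms, round(shards_per_jvm(2503, 2, jvms), 1))
```

With these inputs the density is about 20 shards per JVM at 250 JVMs, about 10 at 500, and below 10 once the JVM count passes 500.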


Number of servers

Attention: Do not use paging on a server under any circumstances. A single JVM uses more memory than the heap size. For example, 1 GB of heap for a JVM actually uses 1.4 GB of real memory. Determine the available free RAM on the server. Divide the amount of RAM by the memory per JVM to get the maximum number of Java virtual machines on the server.
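A rough sketch of this division follows. The 1.4 overhead factor is taken from the 1 GB heap example above and is an assumption for any other heap size; the function name and the 32 GB and 2 GB inputs are hypothetical. Measure real memory use in your own environment before relying on the result.

```python
import math


def max_jvms_per_server(free_ram_gb, heap_gb, overhead_factor=1.4):
    # Real memory per JVM is larger than the configured heap.
    real_memory_per_jvm_gb = heap_gb * overhead_factor
    # Round down so the JVMs never exceed free RAM and force paging.
    return math.floor(free_ram_gb / real_memory_per_jvm_gb)


# Hypothetical server: 32 GB of free RAM, JVMs configured with a 2 GB heap.
print(max_jvms_per_server(32, 2))   # 11 (32 / 2.8 = 11.4, rounded down)
```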

