Zextras Hierarchical Storage Management for Zimbra | Zimbra

Alert! This article is written for Zimbra OSE users. As of December 2023, Synacor will no longer be providing support for Zimbra OSE. You might want to consider trying out Carbonio Community Edition – Zextras’s free and open-source email and collaboration platform.

For additional guidance, check out our community articles detailing the process of migrating from your current platform to Carbonio CE.

For enterprise-level requirements and advanced features, consider checking out Zextras Carbonio – the all-in-one private digital workplace designed for digital sovereignty trusted by the public sector, telcos, and regulated industries.

Currently, the most common way to achieve acceptable fault tolerance in a mail server is to use RAID volumes. For all the stability RAID brings, it has a downside: expanding it is costly. As the disks fill up, new storage must be added. Disk space is an expensive resource, more so in a RAID system, where you have to add the same number of disks to each node, and there are further costs such as technical staff time. The Hierarchical Storage Management technique adopted by Zextras Powerstore can help reduce these costs, and it brings other advantages that are discussed later.

What is Hierarchical Storage Management (HSM)?

Hierarchical Storage Management (HSM) is a data storage technique that moves data between storage devices based on a defined policy. Data is organized according to your needs, and each storage device, depending on its properties and specifications, gets the share of your data that it handles best. For example, suppose you split your data between fast storage (such as solid-state drives, which are faster but more expensive) and slower storage (such as cheaper hard disk drives), so that all frequently used data sits on the faster storage. Only part of the data then occupies the expensive storage, which lowers the cost, and since only the less frequently used data is on the slower storage, users generally won’t experience any slowdown. If a user accesses a file on the slower storage again, it automatically moves back to the faster storage.
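The promote-on-access behavior described above can be illustrated with a toy model. This is a minimal sketch, not how Zextras Powerstore is implemented: the class `TwoTierStore` and its methods are hypothetical names, and two in-memory dictionaries stand in for the fast and slow disks.

```python
class TwoTierStore:
    """Toy two-tier store (hypothetical): dicts stand in for real disks."""

    def __init__(self):
        self.fast = {}  # item id -> data, stands in for the fast tier (SSD)
        self.slow = {}  # item id -> data, stands in for the slow tier (HDD)

    def put(self, item_id, data):
        # New items land on the fast tier.
        self.fast[item_id] = data

    def demote(self, item_id):
        # Move a rarely used item from the fast tier to the slow tier.
        self.slow[item_id] = self.fast.pop(item_id)

    def get(self, item_id):
        # Reading an item that lives on the slow tier promotes it first.
        if item_id in self.slow:
            self.fast[item_id] = self.slow.pop(item_id)
        return self.fast[item_id]

    def tier_of(self, item_id):
        return "fast" if item_id in self.fast else "slow"


store = TwoTierStore()
store.put("msg-1", b"hello")
store.demote("msg-1")                 # the item ages out to the slow tier
tier_before = store.tier_of("msg-1")
store.get("msg-1")                    # the user opens the item again
tier_after = store.tier_of("msg-1")
print(tier_before, "->", tier_after)  # slow -> fast
```

A real system would demote items according to a policy rather than through an explicit call, but the data flow between the two tiers is the same.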

You can think of HSM as analogous to cache memory in a CPU, where frequently used data is kept in a small amount of expensive SRAM running at very high speed, and the least recently used data is kept in the slower but much larger main DRAM.

Why use Hierarchical Storage Management (HSM)?

The basic idea is that not all data is accessed at the same rate and not all storage devices cost the same. If we move the least frequently accessed data to slower, cheaper volumes and the most frequently accessed data to faster, more expensive ones, we can save on costs and improve the user experience.

Recent data is usually the data accessed most often, so the idea above can be read as moving old data to slower volumes (the secondary store) and keeping new data on faster ones (the primary store).

This, however, is not the only applicable policy. For instance, you could assign storage for strategic reasons: the most important data goes on primary storage and the least important data on secondary storage. The policy behavior can be customized to your specific needs.

HSM has a great impact on the user experience while reducing storage costs for the company. It is cost-effective because you spend less on new, fast volumes, and it also reduces structural costs: current mail data stays on small storage units, and expansion proceeds in a safe, automated way at the application level, which eliminates the need to deal with hardware.

HSM Put into Action

As mentioned before, the main function of Zextras storage management is moving elements between different stores based on specific policies. Zimbra keeps three types of data: index, metadata, and blob. The index contains the indices that support search, the metadata is the data stored in the database, and the blobs hold large data that cannot be stored directly in the database and is therefore organized in files.

A Zimbra data store can be either a primary or a secondary data store. Data usually moves from the primary to the secondary store based on a defined policy, although moving data from the secondary back to the primary store is also possible. Policies are sets of rules defined by administrators to move elements between the stores. A move can be scheduled or run manually. The distribution of data in the secondary store can be managed from the administrator console, without any need for complicated logical volume management by technical staff.

The policy specifies the type of data to be moved based on criteria. For example,

  • Moving all messages from a specific mailbox,
  • Moving all calendar elements prior to this year,
  • Moving all messages including an attachment larger than 5 MB.

These examples identify both the type of data (messages and calendar elements) and the criteria (belonging to a specific mailbox, dating from before this year, and including an attachment larger than 5 MB).

Generally, the criteria can be based on priorities, time, and properties. For example,

  • Priorities – Moving all the most important mailboxes into the faster disks and the less active ones into the slower disks.
  • Time – Moving all the elements of the last month into the faster disks and the older ones into the slower disks.
  • Properties – Moving all the elements with attachments larger than 5 MB into the slower disks.
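The three kinds of criteria listed above can be pictured as small predicates that are combined to select the items to move. This is only an illustrative sketch: the `Item` record, its fields, and the helper functions are hypothetical and are not part of Zimbra or Zextras Powerstore.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MB = 1024 * 1024


@dataclass
class Item:
    # Hypothetical record describing one stored element.
    kind: str                    # e.g. "message" or "calendar"
    mailbox: str
    date: datetime
    largest_attachment: int = 0  # size in bytes


def in_mailbox(name):
    # Priority-style criterion: pick items of one specific mailbox.
    return lambda item: item.mailbox == name


def older_than(days, now):
    # Time criterion: pick items dated before now minus `days`.
    cutoff = now - timedelta(days=days)
    return lambda item: item.date < cutoff


def attachment_larger_than(size):
    # Property criterion: pick items with a large attachment.
    return lambda item: item.largest_attachment > size


def select(items, *criteria):
    # An item is selected only if it satisfies every criterion.
    return [i for i in items if all(c(i) for c in criteria)]


now = datetime(2024, 1, 31)
items = [
    Item("message", "alice@example.com", datetime(2024, 1, 30)),
    Item("message", "alice@example.com", datetime(2023, 11, 1), 6 * MB),
    Item("message", "bob@example.com", datetime(2023, 10, 1), 1 * MB),
]

# Items older than 30 days that carry an attachment larger than 5 MB.
to_move = select(items, older_than(30, now), attachment_larger_than(5 * MB))
print(len(to_move))  # 1
```

Keeping each criterion as a separate predicate makes it easy to combine priority, time, and property rules in a single policy.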

The procedure is safe because an element is not considered transferred to the destination disk until the operation has completed correctly. If any issue occurs during the move, the whole transfer is rolled back to ensure data integrity. This minimizes the risk of data loss and also reduces system downtime due to the transfer.

HSM Policies

Applying a policy means running the doMoveBlobs operation to move items between the primary and secondary store according to the defined policy. Zextras Powerstore checks which items in the primary store match the policy, copies them to the secondary store, updates the database entries of the copied items, and then deletes the old blobs from the primary store. Each step is executed only if the previous one has completed, to prevent any risk of data loss.
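The copy, update, delete ordering described above can be sketched as follows. This is a simplified illustration and not the actual doMoveBlobs implementation: the function `move_blob` is hypothetical, two temporary directories stand in for the primary and secondary volumes, and a dictionary stands in for the database.

```python
import shutil
import tempfile
from pathlib import Path


def move_blob(name, primary, secondary, db):
    """Move one blob; each step runs only if the previous one succeeded."""
    src = primary / name
    dst = secondary / name

    # Step 1: copy the blob. The original stays untouched on the primary.
    shutil.copy2(src, dst)
    if dst.read_bytes() != src.read_bytes():
        dst.unlink()  # undo the partial copy; the primary is still intact
        raise IOError(f"copy of {name} could not be verified")

    # Step 2: point the (stand-in) database entry at the new location.
    db[name] = "secondary"

    # Step 3: only now delete the old blob from the primary store.
    src.unlink()


with tempfile.TemporaryDirectory() as tmp:
    primary = Path(tmp) / "primary"
    secondary = Path(tmp) / "secondary"
    primary.mkdir()
    secondary.mkdir()
    (primary / "1.msg").write_bytes(b"mail body")

    db = {"1.msg": "primary"}
    move_blob("1.msg", primary, secondary, db)
    moved_ok = (secondary / "1.msg").exists() and not (primary / "1.msg").exists()

print(db, moved_ok)  # {'1.msg': 'secondary'} True
```

Because the original blob is deleted last, a failure at any earlier step leaves the item readable from the primary store.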

The Zextras Powerstore module lets you define HSM policies in one of these ways:

  • Via the Administration Zimlet
  • Via the CLI

This is an example of a policy:

message,document:before:-50day

which moves all emails and documents older than 50 days. Another example is:

message:before:-10day has:attachment

which moves all emails older than 10 days that contain an attachment.
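To make the structure of these two policy strings explicit, here is a toy parser. It is only a sketch: it handles just the two shapes shown above, the function name `parse_policy` and the dictionary keys are hypothetical, and the real policy grammar accepted by Zextras Powerstore is likely richer, so refer to the official documentation for the actual syntax.

```python
import re


def parse_policy(policy):
    """Parse only the two policy shapes shown in this article (toy example)."""
    head, *extras = policy.split()
    # Expected head: <type>[,<type>...]:before:-<N>day
    match = re.fullmatch(r"([a-z,]+):before:-(\d+)day", head)
    if match is None:
        raise ValueError(f"unsupported policy: {policy!r}")
    return {
        "types": match.group(1).split(","),
        "older_than_days": int(match.group(2)),
        "extras": extras,  # further conditions, kept as-is
    }


p1 = parse_policy("message,document:before:-50day")
p2 = parse_policy("message:before:-10day has:attachment")
print(p1)  # {'types': ['message', 'document'], 'older_than_days': 50, 'extras': []}
print(p2)  # {'types': ['message'], 'older_than_days': 10, 'extras': ['has:attachment']}
```

The first part of a policy names the item types and the age threshold, and any remaining tokens add further conditions such as the presence of an attachment.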

To learn more about defining and applying HSM policies from the Zextras Powerstore tab in the Administration Zimlet or via the CLI, please refer to the Zextras Powerstore documentation.

Zextras Powerstore – Storage Management

Zextras Powerstore is not limited to HSM. Its storage management also includes several useful command-line tools, such as doCheckBlobs, doDeduplicate, doVolumeToVolumeMove, and getVolumeStats. doCheckBlobs performs blob coherency checks on one or more volumes, doDeduplicate starts item deduplication on a volume, doVolumeToVolumeMove moves all items from one volume to another, and getVolumeStats displays a volume’s size and the number of items or blobs it contains. Although these tools are not necessarily part of HSM, they affect the HSM transfer process.

As mentioned above, one feature of Zextras storage management is item deduplication, which is useful when transferring elements to the secondary storage: data can be compressed and deduplicated to save space while items are moved. Moving mailboxes is another Zextras Powerstore feature. It lets you move the mailboxes of a given domain from one mailstore to another within the same Zimbra infrastructure, using the doMailboxMove command. The operation has three stages for each mailbox: blobs (copying all blobs from the source server to the destination server), backup (backing up all entries from the source server to the destination server), and account (moving all database entries as-is and updating LDAP entries, causing the mailbox to move to the destination server). When running doMailboxMove, you can optionally specify, with the hsm true option, that the HSM policies should be applied on the destination host while the blobs are moved.

