In this chapter we will describe how persistence works with HornetQ and how to configure it.
HornetQ ships with a high-performance journal, implemented by the HornetQ team specifically for use in a messaging system. Since HornetQ handles its own persistence, rather than relying on a database or other third-party persistence engine, we have been able to tune the journal for optimal performance when persisting messages and transactions.
A HornetQ journal is an append-only journal. It consists of a set of files on disk. Each file is pre-created to a fixed size and initially filled with padding. As operations are performed on the server (e.g. add message, update message, delete message), records are appended to the journal. When one journal file is full, we move to the next one.
Because records are only appended, i.e. added to the end of the journal, we minimise disk head movement. In other words, we minimise random access operations, which are typically the slowest operations on a disk.
Making the file size configurable means that an optimal size can be chosen, e.g. one that lets each file fit on a disk cylinder. Modern disk topologies are complex and we are not in control of which cylinder(s) the file is mapped onto, so this is not an exact science. But by minimising the number of disk cylinders the file uses, we can minimise the amount of disk head movement, since an entire disk cylinder is accessible simply by the disk rotating; the head does not have to move.
As delete records are added to the journal, HornetQ's file garbage collection algorithm can determine when a particular journal file is no longer needed, i.e. when all of its data has been deleted by records in the same or other files. Such a file can then be reclaimed and re-used.
HornetQ also has a compaction algorithm which removes dead space from the journal and compacts the data so that it takes up fewer files on disk.
The journal also fully supports transactional operation if required, supporting both local and XA transactions.
The majority of the journal is written in Java; however, we abstract out the interaction with the actual file system to allow different pluggable implementations. HornetQ ships with two implementations:
The first implementation uses standard Java NIO to interface with the file system. This provides very good performance and runs on any platform where there's a Java 5+ runtime.
The second implementation uses a thin native code wrapper to talk to the Linux asynchronous IO library (AIO). In a highly concurrent environment, AIO can provide better overall persistent throughput since it does not require each individual transaction boundary to be synced to disk. Most disks can only support a limited number of syncs per second, so a syncing approach does not scale well when the number of concurrent transactions that need to be committed grows too large. With AIO, HornetQ is called back when the data has made it to disk, allowing us to avoid explicit syncs altogether and simply send back confirmation of completion when AIO informs us that the data has been persisted.
The AIO journal is only available when running Linux kernel 2.6 or later and after having installed libaio (if it's not already installed). For instructions on how to install libaio please see Section 15.3, “Installing AIO”.
For more information on libaio please see Chapter 40, Libaio Native Libraries.
libaio is part of the kernel project.
The standard HornetQ core server uses two instances of the journal:
The bindings journal is used to store bindings-related data. This includes the set of queues that are deployed on the server and their attributes. It also stores data such as id sequence counters.
The bindings journal is always a NIO journal as it is typically low throughput compared to the message journal.
The message journal stores all message-related data, including the messages themselves and also duplicate id caches.
By default HornetQ will try to use an AIO journal. If AIO is not available (e.g. the platform is not Linux with the correct kernel version, or AIO has not been installed) then it will automatically fall back to using Java NIO, which is available on any Java platform.
For large messages, HornetQ persists them outside the message journal. This is discussed in Chapter 24, Large Messages.
HornetQ also pages messages to disk in low memory situations. This is discussed in Chapter 25, Paging.
If no persistence is required, HornetQ can also be configured not to persist any data to storage at all, as discussed in Section 15.4, “Configuring HornetQ for Zero Persistence”.
The bindings journal is configured using the following attributes in hornetq-configuration.xml:
bindings-directory is the directory in which the bindings journal lives. The default value is data/bindings.
If create-bindings-dir is set to true then the bindings directory will be automatically created at the location specified in bindings-directory if it does not already exist. The default value is true.
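As a rough illustration, the bindings journal settings might appear in hornetq-configuration.xml as in the fragment below. The values are the defaults stated above. The element name create-bindings-dir and the root element with its namespace are assumptions to verify against the configuration schema shipped with your HornetQ version.

```xml
<!-- Sketch only: root element and namespace are assumed, other settings omitted -->
<configuration xmlns="urn:hornetq">

   <!-- Directory in which the bindings journal lives (default shown) -->
   <bindings-directory>data/bindings</bindings-directory>

   <!-- Assumed element name: create the directory if it does not exist -->
   <create-bindings-dir>true</create-bindings-dir>

</configuration>
```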
The message journal is configured using the following attributes in hornetq-configuration.xml:
journal-directory is the directory in which the message journal lives. The default value is data/journal.
For the best performance, we recommend the journal is located on its own physical volume in order to minimise disk head movement. If the journal is on a volume which is shared with other processes which might be writing other files (e.g. bindings journal, database, or transaction coordinator) then the disk head may well be moving rapidly between these files as it writes them, thus reducing performance.
When the message journal is stored on a SAN we recommend each journal instance that is stored on the SAN is given its own LUN (logical unit).
If create-journal-dir is set to true then the journal directory will be automatically created at the location specified in journal-directory if it does not already exist. The default value is true.
Valid values for journal-type are NIO or ASYNCIO.
Choosing NIO chooses the Java NIO journal. Choosing ASYNCIO chooses the Linux asynchronous IO journal. If you choose ASYNCIO but are not running Linux or you do not have libaio installed then HornetQ will detect this and automatically fall back to using NIO.
If journal-sync-transactional is set to true then HornetQ will wait for all transaction data to be persisted to disk on a commit before sending a commit response OK back to the client. The default value is true.
If journal-sync-non-transactional is set to true then HornetQ will wait for any non-transactional data to be persisted to disk on a send before sending the response back to the client. The default value is false.
journal-file-size is the size of each journal file in bytes. The default value is 10485760 bytes (10MiB).
journal-min-files is the minimum number of files the journal will maintain. When HornetQ starts and there is no initial message data, HornetQ will pre-create journal-min-files files.
Creating journal files and filling them with padding is a fairly expensive operation, and we want to avoid doing this at run-time as files get filled. By pre-creating files, as one is filled the journal can immediately resume with the next one without pausing to create it.
Depending on how much data you expect your queues to contain at steady state, you should tune this number of files to match that total amount of data.
When using an AIO journal, write requests are queued up before being submitted to AIO for execution. When AIO has completed them, it calls HornetQ back. The journal-max-io parameter controls the maximum number of write requests that can be in the AIO queue at any one time. If the queue becomes full then writes will block until space is freed up. This parameter has no meaning when using the NIO journal.
The total maximum number of AIO requests cannot exceed the limit configured at the OS level (/proc/sys/fs/aio-max-nr), which is usually 65536.
The default value for this is 500.
The flush period of the internal AIO timed buffer, configured in nanoseconds. For performance reasons we buffer data before submitting it to the kernel in a single batch. This parameter determines the maximum amount of time to wait before flushing the buffer if it has not filled up by itself in that time.
The default value for this parameter is 20000 nanoseconds (i.e. 20 microseconds).
If this is set to true, the internal buffers are flushed right away when a sync request is performed. Sync requests are performed on transactions if journal-sync-transactional is true, or on sending regular messages if journal-sync-non-transactional is true.
HornetQ was made to scale up to hundreds of producers. We try to make the most of the hardware resources by scheduling multiple writes and syncs in a single OS call.
However, in some use cases it may be better not to wait for any more data and just flush and write to the OS right away, for example if you have a single producer writing small transactions. In this case it would be better to always flush on sync.
The default value for this parameter is false.
The size of the timed buffer on AIO. The default value is 128KiB.
journal-compact-min-files is the minimum number of files before we can consider compacting the journal. The compacting algorithm won't start until you have at least journal-compact-min-files data files in the journal.
The default for this parameter is 10.
journal-compact-percentage is the threshold to start compacting. When less than this percentage of the journal is considered live data, we start compacting. Note also that compacting won't kick in until you have at least journal-compact-min-files data files in the journal.
The default for this parameter is 30.
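Pulling the message journal attributes together, a configuration might look roughly like the fragment below. This is a sketch under assumptions, not a recommended setup. The values are the defaults given above, except journal-min-files, whose value of 10 is purely illustrative. The element names create-journal-dir, journal-type, journal-file-size, journal-max-io and journal-compact-percentage, along with the root element and namespace, are assumptions to check against the schema for your HornetQ version.

```xml
<!-- Sketch only: root element and namespace are assumed, other settings omitted -->
<configuration xmlns="urn:hornetq">

   <!-- Location of the message journal, ideally on its own physical volume -->
   <journal-directory>data/journal</journal-directory>
   <create-journal-dir>true</create-journal-dir>

   <!-- NIO or ASYNCIO; ASYNCIO falls back to NIO if libaio is unavailable -->
   <journal-type>ASYNCIO</journal-type>

   <!-- Wait for data to reach disk before responding to the client -->
   <journal-sync-transactional>true</journal-sync-transactional>
   <journal-sync-non-transactional>false</journal-sync-non-transactional>

   <!-- 10 files x 10485760 bytes = 100 MiB pre-created at first start
        (the value 10 is illustrative, not a stated default) -->
   <journal-file-size>10485760</journal-file-size>
   <journal-min-files>10</journal-min-files>

   <!-- Maximum queued AIO write requests; ignored by the NIO journal -->
   <journal-max-io>500</journal-max-io>

   <!-- Compact once there are at least 10 data files and
        less than 30 percent of the journal is live data -->
   <journal-compact-min-files>10</journal-compact-min-files>
   <journal-compact-percentage>30</journal-compact-percentage>

</configuration>
```

With these values the journal would occupy about 100 MiB on disk before any message is stored, so journal-min-files and journal-file-size are best chosen together against the expected steady-state data volume.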
The Java NIO journal gives great performance, but if you are running HornetQ on Linux kernel 2.6 or later, we highly recommend you use the AIO journal for the best persistence performance, especially under high concurrency.
It's not possible to use the AIO journal under other operating systems or earlier versions of the Linux kernel.
If you are running Linux kernel 2.6 or later and don't already have libaio installed, you can easily install it using the following steps:
Using yum (e.g. on Fedora or Red Hat Enterprise Linux):
sudo yum install libaio
Using apt-get (e.g. on Ubuntu or Debian systems):
sudo apt-get install libaio1
Zero persistence is sometimes required for a messaging system. Configuring HornetQ to perform zero persistence is straightforward. Simply set the parameter persistence-enabled in hornetq-configuration.xml to false.
Please note that if you set this parameter to false, then zero persistence will occur. That means no bindings data, message data, large message data, duplicate id caches or paging data will be persisted.
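For illustration, the setting described above might look like this in hornetq-configuration.xml. The root element and its namespace are assumptions here, and all other settings are omitted.

```xml
<!-- Sketch only: root element and namespace are assumed, other settings omitted -->
<configuration xmlns="urn:hornetq">

   <!-- Disable all persistence: no bindings, message, large message,
        duplicate id cache or paging data is written to storage -->
   <persistence-enabled>false</persistence-enabled>

</configuration>
```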