
eheap Part 1: Configuration

by Ralph Moore, eheap Architect
August 2021

 
Introduction

Many embedded systems use no heap at all. Many of those that do have a heap use a serial heap, which must be searched sequentially and thus can be very slow. Most other embedded systems probably use dlmalloc, which is widely available. dlmalloc provides fast operation at the expense of a fixed structure that cannot be altered.

eheap is a new embedded heap that provides a middle ground between these extremes. It provides high performance, adaptability, and safety for embedded systems running on RTOSs or standalone. eheap is designed to be easily customizable to specific embedded systems. It conforms to the ANSI C Standard for malloc(), free(), realloc(), and calloc() and it offers additional functions for heap control and safety.
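
For application code, this conformance means the familiar calls work unchanged. The fragment below is ordinary ANSI C allocation usage, shown only for orientation; it contains no eheap-specific calls:

    #include <stdlib.h>

    int example(void)
    {
        char *buf = malloc(100);              /* allocate a 100-byte block */
        int  *tbl = calloc(25, sizeof(int));  /* zeroed array of 25 ints */
        if (buf == NULL || tbl == NULL) {
            free(buf);                        /* free(NULL) is safe */
            free(tbl);
            return -1;
        }
        char *bigger = realloc(buf, 200);     /* grow the block to 200 bytes */
        if (bigger != NULL)
            buf = bigger;
        free(buf);
        free(tbl);
        return 0;
    }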

This is the first in a series of three papers on eheap.


eheap is a bin-type heap, like dlmalloc. It is assumed that the reader is familiar with this type of heap structure or is not interested in the details of how these heaps operate. Hence, this paper and subsequent papers discuss what is unique about eheap rather than how it works. For those desiring more detailed information, see eheap vs. dlmalloc.

eheap has been designed for the following embedded system characteristics:
  • Wide range of RAM sizes from very small to large.
  • Expected to run forever.
  • High performance and deterministic operation are required.
  • High priority tasks must be able to preempt and run quickly.
  • Small code size is often necessary.
  • Each embedded system has a relatively narrow range of heap requirements.
  • Embedded systems have significant idle time.
  • There is a growing need for self-healing in embedded systems.

Bin Configuration

Like other bin-type heaps, eheap has a small bin array (SBA) for small chunks and an upper bin array (UBA) for larger chunks. The SBA is like an array of block pools, but not as fast. Unlike dlmalloc, for which the bin sizes or trees in the UBA are successive powers-of-2 bytes (e.g. 256, 512, etc.), eheap permits bin sizes to be chosen to fit the application. eheap bin sizes are determined by a constant bin size array, specified by the programmer. For small bins, the bin size is the chunk size[1] of the bin. For large bins, the bin size is the smallest chunk size in the bin. The maximum chunk size of a bin is the size of the next bin - 8.[2]

The bin size array is usually located in ROM, for safety. It is used by the heap initialization function to create the bins and all related variables needed by eheap. The heap can be quickly reconfigured by changing values in the bin size array, recompiling, and restarting the application. Such rapid reconfiguration supports a heuristic approach to tuning eheap for the best performance in a specific system.

Small System Example

A small system might have a bin size array like the following:
   bin     0    1    2    3    4    5    6
   size   24   32   40   48  128  136  264   -1
For the above, the SBA consists of bins 0, 1, and 2, holding chunk sizes 24, 32, and 40, respectively; bin 3 contains 10 sizes from 48 to 120; bin 4 is a small bin containing only the size 128; bin 5 contains 16 sizes from 136 to 256; and bin 6, the top bin, contains all sizes from 264 on up. The -1 marks the end of the bin size array. In this particular embedded system, there apparently is limited need for small chunks, some need for medium chunks, and little need for large chunks above 264 bytes. Note that a small bin has been added for 128-byte chunks, a size frequently used in this system.
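
As an illustration only, this configuration might be expressed as a constant array in C, located in ROM; the identifier and element type are assumptions, not eheap's actual declarations:

    #include <stdint.h>

    /* Hypothetical bin size array for the small-system example above.
       Each entry is the bin size in bytes; -1 terminates the array.
       The const qualifier lets the linker place it in ROM. */
    const int32_t bin_sizes[] = {24, 32, 40, 48, 128, 136, 264, -1};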

Upper Bin Searches

eheap and dlmalloc are similar with regard to their SBAs. Both are optimized for very fast accesses, as typically needed by object-oriented programs. A major difference is in the UBAs, where dlmalloc builds trees during free() operations to guide best-fit allocations during malloc() operations. Tree node links are stored in the chunks themselves.

For eheap, each bin heads a free list of chunks that fit in the bin. During a free() operation, if the chunk is larger than the first chunk in the bin, it is put at the end of the bin free list; otherwise, it is put at the start of the bin free list. Thus free() is very fast — no tree tracing is required. During idle time, large bin free lists are sorted by increasing chunk size. During a malloc() operation for a large chunk the selected large bin free list is searched for the first big-enough chunk. If the bin is sorted, this will also be the best fit.
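
The sketch below models this queueing policy and the first-fit search. The structures and names are assumptions made for illustration; eheap's actual chunk layout and code differ:

    #include <stddef.h>

    /* Hypothetical free-chunk and bin structures. */
    typedef struct chunk {
        size_t        size;   /* chunk size in bytes */
        struct chunk *fl;     /* forward link in bin free list */
        struct chunk *bl;     /* backward link in bin free list */
    } chunk_t;

    typedef struct {
        chunk_t *head;        /* first free chunk (smallest, once sorted) */
        chunk_t *tail;        /* last free chunk */
    } bin_t;

    /* free() side: a larger chunk goes to the end of the list, any other
       chunk to the front, so no list walk is needed to free a chunk. */
    static void bin_put(bin_t *bin, chunk_t *ck)
    {
        ck->fl = ck->bl = NULL;
        if (bin->head == NULL) {                   /* empty bin */
            bin->head = bin->tail = ck;
        } else if (ck->size > bin->head->size) {   /* append at end */
            ck->bl = bin->tail;
            bin->tail->fl = ck;
            bin->tail = ck;
        } else {                                   /* push at front */
            ck->fl = bin->head;
            bin->head->bl = ck;
            bin->head = ck;
        }
    }

    /* malloc() side: first-fit scan; if the list was sorted during idle
       time, the first big-enough chunk is also the best fit. */
    static chunk_t *bin_get(bin_t *bin, size_t need)
    {
        for (chunk_t *ck = bin->head; ck != NULL; ck = ck->fl) {
            if (ck->size >= need) {
                if (ck->bl) ck->bl->fl = ck->fl; else bin->head = ck->fl;
                if (ck->fl) ck->fl->bl = ck->bl; else bin->tail = ck->bl;
                return ck;                 /* unlinked from the bin */
            }
        }
        return NULL;   /* no fit; caller tries a larger bin or the top chunk */
    }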

eheap best fits systems that have favored chunk sizes and do not use a wide range of chunk sizes. As shown in the example above, a small bin can be put into the UBA to provide a frequently used size. Another way to accomplish this result is to make a large bin size equal to a frequently used chunk size. Then the best fit will always be the first chunk in the bin, if it is sorted, and larger sizes will follow after it.

To some degree, large bins operate like small serial heaps that have been sorted. Excessive search times can be bounded by judicious choice of bin sizes vs. system needs and by judicious use of merging, as described later. A tenet of heap design is that larger chunks are expected to take longer to process, so longer malloc() times for large chunks do not degrade average performance. Of course, a malloc() cannot take so long that higher-priority tasks miss their deadlines.

One Bin Heap Example
   bin    0
   size  24  -1
In this case, there is no SBA and no UBA, only the top bin. This heap is close to a serial-type heap, but still offers advantages over it.

No SBA Heap Example
   bin     0    1    2    3
   size   24  128  136  264   -1
Embedded systems written in C may have little use for small chunks, or, alternatively, block pools may be preferable for small chunks in some systems. In this case, bin 0 starts with 24 because all sizes from 24 bytes and up must be covered, but bin 0 might be used only for chunks of 64 bytes and larger.[3]

No UBA Heap Example
   bin    0   1   2   3   4   5   6
   size  24  32  40  48  56  64  72  -1
In this heap, the SBA covers chunk sizes 24 to 64, and all other chunks go into bin 6, which covers 72 bytes and up.

Obviously, countless heap configurations are possible. Currently eheap is limited to 32 bins, for best performance. This limit could be increased if more bins prove to be beneficial for some embedded systems.
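
To make the bin-selection rule concrete, the sketch below maps a chunk size to a bin by scanning a bin size array like those shown above. The linear scan and function name are assumptions for illustration; in practice a faster lookup would be used, especially for the SBA:

    #include <stdint.h>

    /* Return the bin number whose range contains csize. The array holds
       each bin's size in bytes and is terminated by -1, as in the
       examples above. */
    static int bin_find(const int32_t bin_sizes[], int32_t csize)
    {
        int bn = 0;
        /* advance while the next bin's smallest chunk size still fits csize */
        while (bin_sizes[bn + 1] != -1 && csize >= bin_sizes[bn + 1])
            bn++;
        return bn;
    }

With the small-system array above, for example, a 120-byte chunk maps to bin 3 and a 128-byte chunk maps to bin 4.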

Donor Chunk

eheap has an optional donor chunk (dc), which is similar to dlmalloc's designated victim chunk (dvc), but operates differently and serves different purposes:
  • Initial source of fast small chunk allocations.
  • Segregation of small chunks into the lower heap and larger chunks into the upper heap.
  • Localization of small chunks for cache-based systems.

If enabled, space for dc is allocated below the top chunk (tc) by the programmer. Normally it is a small fraction of total initial free space, which is the combination of dc and tc. If the selected SBA bin for allocation of a small chunk is empty, the chunk is calved from dc, rather than taken from the next larger occupied bin. This is not only faster, but also avoids depopulating larger bins. Since dc is initially the lower part of the heap, allocated and freed small chunks will be in the lower heap. This improves localization[4] for small chunk allocations and also tends to segregate the heap by chunk size, which helps prevent in-use small chunks from blocking the merging of larger chunks.
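
A rough sketch of this allocation decision follows; all helper names are hypothetical and the real eheap logic differs in detail:

    #include <stddef.h>
    #include <stdbool.h>

    /* Hypothetical helpers, for illustration only. */
    extern bool  bin_empty(int bn);             /* is SBA bin bn empty? */
    extern void *bin_take(int bn);              /* pop a chunk from bin bn */
    extern void *dc_calve(size_t csize);        /* carve csize bytes from dc */
    extern void *upper_bins_get(size_t csize);  /* search the larger bins */

    /* Small-chunk allocation: use the exact-fit SBA bin if possible; if it
       is empty, calve from the donor chunk rather than depopulating a
       larger bin. */
    void *small_alloc(int bn, size_t csize)
    {
        if (!bin_empty(bn))
            return bin_take(bn);        /* fast path: exact-fit SBA bin */
        void *ck = dc_calve(csize);     /* calve a chunk from dc */
        if (ck != NULL)
            return ck;
        return upper_bins_get(csize);   /* dc exhausted: fall back to larger bins */
    }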

After the heap has been in use for a while, small chunk allocations will come primarily from SBA bins and seldom from dc. At this point, use of dc can be turned off and what is left of it freed to a bin. Alternatively, dc can continue to be used. As freed chunks below and adjacent to it are merged into it, it will grow; as chunks are allocated from it, it will shrink. This improves performance when SBA bins are empty.

Chunk Merge & Leaky Bins

An important difference between dlmalloc and eheap is that dlmalloc always merges adjacent free chunks, whereas eheap permits deferred merging. eheap merging is controlled by its cmerge mode, which an application can turn ON and OFF. It is OFF following heap initialization.

The reason for deferred merging is that chunk merging produces leaky bins. For example, suppose a 24-byte chunk is freed to bin 0 and a physically adjacent 48-byte free chunk resides in bin 3. If merging is enabled, these chunks will be merged into a 72-byte chunk, which will be put into bin 6. This obviously hurts performance if the application needs 24- and 48-byte chunks and not 72-byte chunks. Additionally, dequeueing the 48-byte chunk and merging it requires additional time during the free() operation. This is wasted time if a subsequent malloc() gets the 72-byte chunk, splits it into 24- and 48-byte chunks, then requeues the remnant.

Unlike general processing, which may use a wide variety of chunk sizes, embedded systems tend to use a limited repertoire of chunk sizes. Hence, leaky bins are not desirable for embedded systems.
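
In code terms, deferred merging amounts to a simple test in free(). The sketch below is only a model of that behavior, with hypothetical helper names:

    #include <stdbool.h>

    /* Hypothetical heap state and helpers, for illustration only. */
    extern bool  cmerge;                      /* merge control mode */
    extern bool  neighbor_free(void *ck);     /* is an adjacent chunk free? */
    extern void *chunk_merge(void *ck);       /* merge ck with free neighbors */
    extern void  bin_put_for_size(void *ck);  /* queue ck in the bin for its size */

    /* free() side: merge only when cmerge is ON; otherwise keep the chunk
       at its current size so that size can be reused from its bin. */
    void chunk_free(void *ck)
    {
        if (cmerge && neighbor_free(ck))
            ck = chunk_merge(ck);   /* coalesce now (may move ck to a larger bin) */
        bin_put_for_size(ck);
    }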

cmerge may be controlled directly or automatically. If the automatic merge mode, amerge, is ON, cmerge will be turned ON when free space drops below a lower limit, and it will be turned OFF when free space rises above an upper limit. The limits are determined by configuration constants. Alternatively, the number of free chunks, the average free chunk size, the total space in bins, or similar measures can be used to control cmerge, depending upon what works best in a specific system.
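
A minimal sketch of such automatic control, assuming hypothetical limit constants and a free-space figure maintained by the heap, is shown below:

    #include <stddef.h>
    #include <stdbool.h>

    /* Hypothetical configuration constants; real limits are system-specific. */
    #define MERGE_ON_LIMIT   2048u   /* turn merging ON below this free space */
    #define MERGE_OFF_LIMIT  8192u   /* turn merging OFF above this free space */

    extern bool cmerge;              /* deferred-merge control mode */

    /* Automatic merge control (amerge): called whenever free space changes,
       e.g. from malloc() and free(). */
    static void amerge_update(size_t free_space)
    {
        if (free_space < MERGE_ON_LIMIT)
            cmerge = true;           /* space is getting tight: merge free chunks */
        else if (free_space > MERGE_OFF_LIMIT)
            cmerge = false;          /* space is ample: defer merging */
    }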

Summary

eheap is a new embedded heap, which can be customized to specific embedded system requirements by adjusting the bin size array, deciding whether or not to use dc and how large it should be, and deciding whether or not to use deferred merging and, if used, how to control it. These options allow tuning eheap to a specific system in order to achieve optimum performance without allocation failures due to fragmentation. The flexibility of eheap is even more useful in partitioned systems requiring multiple heaps.

Notes

[1] Heaps work with chunks, whereas applications work with data blocks, which are contained in chunks. Chunks are larger because they also contain metadata to manage the heap.

[2] All chunks have sizes that are multiples of 8 bytes and they are located on 8-byte boundaries. This is fairly standard among heaps.

[3] Of course splitting of larger chunks could result in chunks smaller than 64 bytes being put into bin 0. However eheap allows specifying a MIN_FRAG constant to prevent this.

[4] Localization means that small chunks that are allocated, in succession, will be physically near each other in memory, thus improving cache performance, if there is a cache.


Copyright © 2016-2021 by Micro Digital, Inc. All rights reserved.
eheap is a trademark of Micro Digital Inc.
