|
DSPs provide special hardware for data memory address computation. These address generation units (AGUs) support data memory access by indirect addressing in parallel to the execution of other machine operations. After accessing a memory location, the address pointer can be modified at no extra cost within a fixed range which we denote as automodify range. However, loading address pointers by immediate addressing require additional instruction cycles. This implies that optimizing the placement of program variables in memory allows to reduce the addressing costs. AGU ModelDSPs include one or more special address generation units that are dedicated to address calculation. AGUs can perform operations without using the data path of the processor. This allows address calculations to take place in parallel with arithmetic operations on data.
Most AGUs provide three register files: the index register file (IRF), the modify register file (MRF), and the length register file (LRF). A specific register is selected either by a register pointer (IRP, MRP, or LRP) or a few bits in the instruction word. In general, maximum processor performance is achieved by register-indirect addressing modes. The index registers contain the actual addresses for data memory access. When data is accessed in register-indirect mode, the address stored in the selected index register becomes the memory address. AGUs employ a postmodify scheme, that is, an increment is added to the index register value after the address is used to access data in memory. The most common increment is plus or minus one. Often postincrement operations by plus/minus one can be carried out without initializing a register with the increment. Most processors also provide the ability to add values other than ±1 to the index registers. These increment values are stored in a special group of registers, the modify registers. After an indirect data access, the increment value in the specified modify register is added to the selected index register. Some processors allow indirect addressing with indexing. Here the values stored in two address registers (index register and modify register) are added together to form the effective address to be used to access data in memory. In this addressing mode, however, the value in the index register is not modified. Often, circular buffer management is supported by modulo addressing where the selected length register contains the buffer size. Offset AssignmentOffset assignment is the problem of finding an optimized placement of program variables in data memory for a given sequence of memory accesses. The goal is to generate a data memory layout such that the number of address offsets of two consecutive memory accesses that are outside the auto-modify range is minimized. In general, this problem has a large solution space and can be optimally solved for small instances only. Let V be a set of program variables. Each variable vi A variable access stream S = [vs(1),vs(2),…,vs(M)] is defined by a function
where M denotes the stream length and N ≤ M. The image s(ℓ) of any
ℓ A memory layout is a permutation
that assigns addresses to all program variables which appear in the access stream S.
Equivalently, a layout can be regarded as a string φ of nodes with each node of V
appearing exactly once. The correspondence between these two definitions
is simply that π(i) = k with vi Let us assume that an address register points to vi and it should be used for
accessing vj. To this end, the address pointer has to be modified by address offset
π(j) -π(i). AGUs of DSPs support zero-cost address pointer updates for a limited set
of offset values. Obviously, costs can be minimized by memory layout optimization.
This problem is denoted as offset assignment (OA). Let cij with i,j
Modifying π such that C1 becomes a minimum in Equ. 3 is a quadratic assignment problem (QAP). Since it is NP-hard, optimum solutions can be found just for small values of N. Simple Offset AssignmentOffset assignment algorithms were originally developed for linear addressing with automodify range ±1. For simple offset assignment (SOA), it is assumed that the target machine has a single index register. SOA heuristics can be used as core procedures for general offset assignment (GOA) that handles multiple index registers. Several SOA heuristics are based on finding a maximum-weighted Hamiltonian path in the access graph by using a greedy edge selection strategy.
General Offset AssignmentGeneral offset assignment (GOA) is the problem of optimizing the memory layout for multiple address pointers. In case of GOA, both memory layout and address pointer assignment are optimized. Assigning address pointers can be regarded as K-coloring the access stream S. For K homogeneous address pointers, an optimum solution has to be found out of KM-1 different colorings by simple exhaustive search. A coloring of S is a partition that decomposes {1,2,…,M} into K disjoint subsets. Each subset defines an access stream
with
For each Sk, all elements appear in the same relative order as in S. We calculate the overall address computation costs by
The goal is to minimize CK by optimizing the memory layout π and the partitioning of S. This optimization problem consists of K interdependent QAPs with a solution space size of N!KM-1.
|