Sparse Arrays

The occurrence of zero elements in a large array is both a computational and storage inconvenience. An array in which a large percentage of elements are zeros is referred to as being sparse.

Because standard linear algebra techniques are highly inefficient when dealing with sparse arrays, IDL incorporates a collection of routines designed to handle them effectively. These routines use the row-indexed sparse storage method, which stores the array in structure form, as a vector of data and a vector of indices. The length of each vector is equal to 1 plus the number of diagonal elements of the array plus the number of off-diagonal elements with an absolute magnitude greater than or equal to a specified threshold value. Diagonal elements of the array are always retained even if their absolute magnitude is less than the specified threshold. Sparse array routines that handle array-vector and array-array multiplication, file input/output, and the solution of systems of simultaneous linear equations are included.

Note
For more information on IDL's sparse array storage method, see section 2.7, "Sparse Linear Systems," in Numerical Recipes in C: The Art of Scientific Computing (Second Edition), published by Cambridge University Press.

When considering using IDL's sparse array routines, remember that the computational savings gained by working in sparse storage format is at least partially offset by the need to first convert the arrays to that format. Although an absolute determination of when to use sparse format is not possible, the example below demonstrates the time savings when solving a 500 by 500 linear system in which approximately 50% of the coefficient array's elements as zeros.

Diagonally-Dominant Array

Create a 500-by-500 element pseudo-random diagonally-dominant floating-point array in which approximately 50% of the elements as zeros. (In a diagonally-dominant array, the diagonal element in a given row is greater than the sum of the absolute values of the non-diagonal elements in that row.)

N = 500L 
A = RANDOMN(SEED, N, N)*10 
; Set elements with absolute magnitude greater than or 
; equal to eight to zero: 
I = WHERE(ABS(A) GE 8) 
A[I] = 0.0 
; Set each diagonal element to the absolute sum of 
; its row elements plus 1.0: 
diag = TOTAL(ABS(A), 1) 
A(INDGEN(N) * (N+1)) = diag + 1.0 
; Create a right-hand side vector, b, in which 40% of 
; the elements are ones and 60% are twos. 
B = [REPLICATE(1.0, 0.4*N), REPLICATE(2.0, 0.6*N)] 

We now calculate a solution to this system using two different methods, measuring the time elapsed. First, we compute the solution using the iterative biconjugate gradient method and a sparse array storage format. Note that we include everything between the start and stop timer commands as a single operation, so that only computation time (as opposed to typing time) is recorded.

; Begin with an initial guess: 
X = REPLICATE(1.0, N_ELEMENTS(B)) 
; Start the timer: 
start = SYSTIME(1) & $ 
; Solve the system: 
result1 = LINBCG(SPRSIN(A), B, X) & $ 
; Stop the timer. 
stop = SYSTIME(1) 
; Print the time taken, in seconds: 
PRINT, 'Time for Iterative Biconjugate Gradient:', stop-start 

IDL prints:

Time for Iterative Biconjugate Gradient       1.1259040 

Remember that your result will depend on your hardware configuration.

Next, we compute the solution using LU decomposition.

; Start the timer: 
start = SYSTIME(1) & $ 
; Compute the LU decomposition of A: 
LUDC, A, index & $ 
; Compute the solution: 
result2 = LUSOL(A, index, B) & $ 
; Stop the timer: 
stop = SYSTIME(1) 
; Print the time taken, in seconds: 
PRINT, 'Time for LU Decomposition:', stop-start 

IDL prints:

Time for LU decomposition       14.871168 

Finally, we can compare the absolute error between result1 and result2. The following commands will print the indices of any elements of the two results that differ by more than 1.0 × 10-5, or a –1 if the two results are identical to within five decimal places.

error = ABS(result1-result2) 
PRINT, WHERE(error GT 1.0e-5) 

IDL prints:

-1 

See the documentation for the WTN function for an example using IDL's sparse array functions with image data.

Note
The times shown here were recorded on a DEC 3000 Alpha workstation running OSF/1; they are shown as examples only. Your times will depend on your specific computing platform.

Routines for Handling Sparse Arrays

See Sparse Arrays (in the functional category "Mathematics" (IDL Quick Reference)) for a brief description of IDL routines for handling sparse arrays.More detailed information is available in the IDL Reference Guide.