Saturday, October 27, 2007

More on bash, arrays, and index values

So, if you "man bash", you will see the following:
Bash provides one-dimensional array variables. Any variable may be
used as an array; the declare builtin will explicitly declare an array.
There is no maximum limit on the size of an array, nor any requirement
that members be indexed or assigned contiguously. Arrays are indexed
using integers and are zero-based.
Any element of an array may be referenced using ${name[subscript]}.
${#name[subscript]} expands to the length of ${name[subscript]}. If subscript is * or @,
the expansion is the number of elements in the array.
The unset builtin is used to destroy arrays. unset name[subscript]
destroys the array element at index subscript. unset name, where name
is an array, or unset name[subscript], where subscript is * or @,
removes the entire array.
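
To make the excerpt concrete, here is a minimal sketch of those rules in action. The array name fruit and its values are made up for illustration and do not come from the man page:

```shell
#!/usr/bin/env bash
# "fruit" is a hypothetical array used only for this illustration.
fruit[0]=apple
fruit[7]=banana            # index values need not be contiguous

count=${#fruit[@]}         # number of elements that are set
len=${#fruit[7]}           # length of the string stored at index 7
echo "elements: $count, length of fruit[7]: $len"

unset 'fruit[0]'           # destroys only the element at index 0
remaining=${#fruit[@]}
echo "after unset fruit[0]: $remaining element(s), fruit[7] is still ${fruit[7]}"

unset fruit                # removes the entire array
echo "after unset fruit: ${#fruit[@]} element(s)"
```

The subscript is quoted in the unset call so the shell cannot treat fruit[0] as a filename pattern.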
What does all that mean? It means that index values don't get reset when you unset array elements. ${#name[*]} counts only the elements that are still set, so it can be smaller than the highest index value plus one, and you shouldn't rely on it as a loop bound unless you know that the index values are contiguous. So, if you want to loop based on the number of elements, you had better reset the index values first by using something like: name=(`echo ${name[*]}`)
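
If resetting the index values is not what you want, another option is to loop over the index values that are actually set. The sketch below assumes bash 3.0 or newer, where the ${!name[@]} expansion is available. COLORS is a hypothetical example array:

```shell
#!/usr/bin/env bash
# COLORS is a hypothetical example array, not taken from the post.
COLORS=(red green blue yellow)
unset 'COLORS[1]'                 # leaves a gap at index 1

# "${!COLORS[@]}" expands to the index values that are currently set,
# so the loop skips the hole and still reaches the last element.
seen=""
for i in "${!COLORS[@]}"; do
    printf '%s\t%s\n' "$i" "${COLORS[$i]}"
    seen="$seen$i "
done
echo "index values visited: $seen"
```

Here the loop never uses ${#COLORS[@]} as a bound, so the gap left by unset does not matter.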

Copy and paste the following into a file, and then run it using your favorite implementation of bash (Cygwin, OS X, Linux, BSD, etc.) to see how it all works:
# Print the element count, then walk index values 0..count-1.
# This loop is deliberately naive: it assumes contiguous index values.
function show_array {
    IDX=0
    echo There are ${#NUMBERS[*]} numbers in the array.
    echo The index positions and values are:
    while [ $IDX -lt ${#NUMBERS[*]} ]; do
        echo -e "$IDX\t${NUMBERS[$IDX]}"
        let IDX+=1
    done
}

# Show a prompt and wait for Enter.
function pause () {
    read -p "$*"
}

echo Bash Shell Array Element Index Handling
echo ----------------------------------------
NUMBERS=(zero one two three four five six seven eight nine)
show_array
pause "Press Enter "
echo ----
echo Now, let\'s unset number five, and see what happens...
# Quote the subscript so the shell cannot glob-expand it.
unset 'NUMBERS[5]'
show_array
echo Eek!, looks like there\'s a NULL element in there.
echo So, the number of elements has been reset, but the indexing has not.
echo Indexing is non-contiguous by design, so this isn\'t a bug. However,
echo this means that we do not get to see all of the elements if we are
echo looping based on the number of elements in the array.
pause "Press Enter "
echo ----
echo Let\'s try to unset number 1 this time...
unset 'NUMBERS[1]'
show_array
echo Looks like there\'s another NULL element in there, now.
echo Again, the number of elements has been reset, but the indexing has not...
echo 'The ${#NUMBERS[*]} does not count NULLs, but index values do not'
echo change. The while loop produces apparent garbage after some elements
echo have been unset.
pause "Press Enter "
echo ----
echo Enter the cheesy work-around:
echo "It looks like this: NUMBERS=(\`echo \${NUMBERS[*]}\`)"
echo This kills the NULL values and resets the indexing so it is
echo contiguous again...
NUMBERS=(`echo ${NUMBERS[*]}`)
show_array
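
One caveat with the echo-based work-around: the unquoted expansion is split on whitespace, so any element that contains a space comes back as several elements. The sketch below compares it with a quoted "${name[@]}" expansion, which should keep each element intact while still making the index values contiguous. ITEMS and SPLIT are hypothetical names used only for this illustration:

```shell
#!/usr/bin/env bash
# ITEMS and SPLIT are hypothetical arrays used only for this illustration.
ITEMS=("first item" "second item" "third item")
unset 'ITEMS[1]'                  # two elements left, at index 0 and 2

# Echo-based work-around: the two remaining elements are re-read as four words.
SPLIT=(`echo ${ITEMS[*]}`)
split_count=${#SPLIT[@]}

# Quoted expansion: one word per element, index values renumbered from 0.
ITEMS=("${ITEMS[@]}")
kept_count=${#ITEMS[@]}

echo "echo work-around: $split_count elements"
echo "quoted expansion: $kept_count elements, index values: ${!ITEMS[*]}"
```

For single-word elements like the NUMBERS array above, both forms give the same result.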
Happy bash-ing...