A *structure* is a heterogeneous container object, i.e., it is
an object with elements whose values do not have to be of the same
data type. The elements or fields of a structure are named, and
one accesses a particular field of the structure via the field
name. This should be contrasted with an array whose values are of
the same type, and whose elements are accessed via array indices.

A *user-defined* data type is a structure with a fixed set of
fields defined by the user.

The `struct`

keyword is used to define a structure. The syntax
for this operation is:

```
struct {
```*field-name-1*, *field-name-2*, ... *field-name-N*};

This creates and returns a structure with `NULL`

.
For example,

```
``` variable t = struct { city_name, population, next };

creates a structure with three fields and assigns it to the
variable `t`

.
Alternatively, a structure may be created by dereferencing
`Struct_Type`

. Using this technique, the above structure may
be created using one of the two forms:

```
``` t = @Struct_Type ("city_name", "population", "next");
t = @Struct_Type (["city_name", "population", "next"]);

This approach is useful when creating structures dynamically where
one does not know the name of the fields until run-time.
Like arrays, structures are passed around by reference. Thus,
in the above example, the value of `t`

is a reference to the
structure. This means that after execution of

```
``` u = t;

`t`

and `u`

refer to the ```
``` variable u = @t;

It create new structure whose field names are identical to the old
and copies the field values to the new structure. If any of the
values are objects that are passed by reference, then only the
references will be copied. In other words,
```
``` t = struct{a};
t.a = [1:10];
u = @t;

will produce a structure `u`

that references the same array as
`t`

.

The dot (`.`

) operator is used to specify the particular
field of structure. If `s`

is a structure and `field_name`

is a field of the structure, then `s.field_name`

specifies
that field of `s`

. This specification can be used in
expressions just like ordinary variables. Again, consider

```
``` t = struct { city_name, population, next };

described in the last section. Then,
```
``` t.city_name = "New York";
t.population = 13000000;
if (t.population > 200) t = t.next;

are all valid statements involving the fields of `t`

.

One of the most important uses of structures is the creation of
*dynamic* data structures such as *linked-lists*.
A linked-list is simply a chain of structures that are linked
together such that one structure in the chain is the value of a
field of the previous structure in the chain. To be concrete,
consider the structure discussed earlier:

```
``` t = struct { city_name, population, next };

and suppose that it is desired to create a linked-list of such
objects to store population data.
The purpose of the `next`

field is to provide the link to the
next structure in the chain. Suppose that there exists a function,
`read_next_city`

, that reads city names and populations from a
file. Then the list may be created using:
```
``` define create_population_list ()
{
variable city_name, population, list_root, list_tail;
variable next;
list_root = NULL;
while (read_next_city (&city_name, &population))
{
next = struct {city_name, population, next };
next.city_name = city_name;
next.population = population;
next.next = NULL;
if (list_root == NULL)
list_root = next;
else
list_tail.next = next;
list_tail = next;
}
return list_root;
}

In this function, the variables `list_root`

and `list_tail`

represent the beginning and end of the list, respectively. As long
as `read_next_city`

returns a non-zero value, a new structure is
created, initialized, and then appended to the list via the
`next`

field of the `list_tail`

structure. On the first
time through the loop, the list is created via the assignment to the
`list_root`

variable.
This function may be used as follows:

```
``` Population_List = create_population_list ();
if (Population_List == NULL)
throw RunTimeError, "List is empty";

Other functions may be created that manipulate the list. Here is one
that finds the city with the largest population:
```
``` define get_largest_city (list)
{
variable largest;
largest = list;
while (list != NULL)
{
if (list.population > largest.population)
largest = list;
list = list.next;
}
return largest.city_name;
}
vmessage ("%s is the largest city in the list",
get_largest_city (Population_List));

The `get_largest_city`

is a typical example of how one traverses
a linear linked-list by starting at the head of the list and
successively moves to the next element of the list via the
`next`

field.
In the previous example, a `while`

loop was used to traverse the
linked list. It is also possible to use a `foreach`

loop for this:

```
``` define get_largest_city (list)
{
variable largest, elem;
largest = list;
foreach elem (list)
{
if (elem.population > largest.population)
largest = elem;
}
return largest.city_name;
}

Here a `foreach`

loop has been used to walk the list via its
`next`

field. If the field name linking the elements was not
called `next`

, then it would have been necessary to use the
`using`

form of the `foreach`

statement. For example, if the
field name implementing the linked list was `next_item`

, then
```
``` foreach item (list) using ("next_item")
{
.
.
}

would have been used. In other words, unless otherwise indicated
via the `using`

clause, `foreach`

walks the list using a field
named `next`

by default.
Now consider a function that sorts the list according to population.
To illustrate the technique, a *bubble-sort* will be used, not
because it is efficient (it is not), but because it is simple,
intuitive, and provides another example of structure manipulation:

```
``` define sort_population_list (list)
{
variable changed;
variable node, next_node, last_node;
do
{
changed = 0;
node = list;
next_node = node.next;
last_node = NULL;
while (next_node != NULL)
{
if (node.population < next_node.population)
{
% swap node and next_node
node.next = next_node.next;
next_node.next = node;
if (last_node != NULL)
last_node.next = next_node;
if (list == node) list = next_node;
node = next_node;
next_node = node.next;
changed++;
}
last_node = node;
node = next_node;
next_node = next_node.next;
}
}
while (changed);
return list;
}

Note the test for equality between `list`

and `node`

, i.e.,
```
``` if (list == node) list = next_node;

It is important to appreciate the fact that the values of these
variables are references to structures, and that the
comparison only compares the references and

A user-defined data type may be defined using the `typedef`

keyword. In the current implementation, a user-defined data type
is essentially a structure with a user-defined set of fields. For
example, in the previous section a structure was used to represent
a city/population pair. We can define a data type called
`Population_Type`

to represent the same information:

```
``` typedef struct
{
city_name,
population
} Population_Type;

This data type can be used like all other data types. For example,
an array of Population_Type types can be created,
```
``` variable a = Population_Type[10];

and `populated' via expressions such as
```
``` a[0].city_name = "Boston";
a[0].population = 2500000;

The new type `Population_Type`

may also be used with the
`typeof`

function:
```
``` if (Population_Type == typeof (a))
city = a.city_name;

The dereference `@`

may be used to create an instance of the
new type:
```
``` a = @Population_Type;
a.city_name = "Calcutta";
a.population = 13000000;

Another feature that user-defined types possess is that the action of the binary and unary operations may be defined for them. This idea is discussed in more detail below.

The binary and unary operators may be extended to user-defined types. To illustrate how this works, consider a data type that represents a vector in 3-space:

```
``` typedef struct { x, y, z } Vector_Type;

and a function that instantiates such an object:
```
``` define vector_new (x, y, z)
{
variable v = @Vector_Type;
v.x = double(x); v.y = double(y); v.z = double(z);
return v;
}

This function may be used to define a function that adds two vectors
together:
```
``` define vector_add (v1, v2)
{
return vector_new (v1.x+v2.x, v1.y+v2.y, v1.z+v2.z);
}

Using these functions, three vectors representing the points
`(2,3,4)`

, `(6,2,1)`

, and `(-3,1,-6)`

may be created using
```
``` V1 = vector_new (2,3,4);
V2 = vector_new (6,2,1);
V3 = vector_new (-3,1,-6);

and then added together via
```
``` V4 = vector_add (V1, vector_add (V2, V3));

The problem with the last statement is that it is not a very natural
way to express the addition of three vectors. It would be far better
to extend the action of the binary `+`

operator to the
`Vector_Type`

objects and then write the above sum more simply as
```
``` V4 = V1 + V2 + V3;

The `__add_binary`

function defines the result of a binary
operation between two data types:

```
__add_binary (
```*op*, *result-type*, *funct*, *typeA*,*typeB*);

Here, `"+"`

, `"-"`

, `"*"`

, `"/"`

, `"=="`

,...),
and This function may be used to extend the `+`

operator to
*Vector_Type* objects:

```
``` __add_binary ("+", Vector_Type, &vector_add, Vector_Type, Vector_Type);

Similarly the subtraction and equality operators may be extended to
`Vector_Type`

via
```
``` define vector_minus (v1, v2)
{
return vector_new (v1.x-v2.x, v1.y-v2.y, v1.z-v2.z);
}
__add_binary ("-", Vector_Type, &vector_minus, Vector_Type, Vector_Type);
define vector_eqs (v1, v2)
{
return (v1.x==v2.x) and (v1.y==v2.y) and (v1.z==v2.z);
}
__add_binary ("==", Char_Type, &vector_eqs, Vector_Type, Vector_Type);

permitting a statement such as
```
``` if (V2 != V1) V3 = V2 - V1;

The `-`

operator is also an unary operator that is customarily
used to change the sign of an object. Unary operations may be
extended to `Vector_Type`

objects using the `__add_unary`

function:
```
``` define vector_chs (v)
{
return vector_new (-v.x, -v.y, -v.z);
}
__add_unary ("-", Vector_Type, &vector_chs, Vector_Type);

A trivial example of the use of the unary minus is `V4 = -V2`

.
It is interesting to consider the extension of the multiplication
operator `*`

to `Vector_Type`

. A vector may be multiplied
by a scalar to produce another vector. This can happen in two ways as
reflected by the following functions:

```
``` define vector_scalar_mul (v, a)
{
return vector_new (a*v.x, a*v.y, a*v.z);
}
define scalar_vector_mul (a, v)
{
return vector_new (a*v.x, a*v.y, a*v.z);
}

Here `a`

represents the scalar, which can be any object that may
be multiplied by a `Double_Type`

, e.g., `Int_Type`

,
`Float_Type`

, etc. Instead of using multiple statements
involving `__add_binary`

to define the action of
`Int_Type+Vector_Type`

, `Float_Type+Vector_Type`

, etc, a
single statement using `Any_Type`

to represent a ``wildcard''
type may be used:
```
``` __add_binary ("*", Vector_Type, &vector_scalar_mul, Vector_Type, Any_Type);
__add_binary ("*", Vector_Type, &scalar_vector_mul, Any_Type, Vector_Type);

There are a couple of natural possibilities for
`Vector_Type*Vector_Type`

: The cross-product defined by
```
``` define crossprod (v1, v2)
{
return vector_new (v1.y*v2.z-v1.z*v2.y,
v1.z*v2.x-v1.x*v2.z,
v1.x*v2.y-v1.y*v2.x);
}

and the dot-product:
```
``` define dotprod (v1, v2)
{
return v1.x*v2.x + v1.y*v2.y + v1.z*v2.z;
}

The binary `*`

operator between two vector types may be defined
to be just one of these functions--- it cannot be extended to both.
If the dot-product is chosen then one would use
```
``` __add_binary ("*", Double_Type, &dotprod, Vector_Type_Type, Vector_Type);

Just because it is possible to define the action of a binary or unary operator on an user-defined type, it is not always wise to do so. A useful rule of thumb is to ask whether defining a particular operation leads to more readable and maintainable code. For example, simply looking at

```
``` c = a + b;

in isolation one can easily overlook the fact that a function such as
`vector_add`

may be getting executed. Moreover, in cases where
the action is ambiguous such as `Vector_Type*Vector_Type`

it may
not be clear what
```
``` c = a*b;

means unless one knows exactly what choice was made when extending
the `*`

operator to the types. For this reason it may
be wise to leave `Vector_Type*Vector_Type`

undefined and use
``old-fashioned'' function calls such as
```
``` c = dotprod (a, b);
d = crossprod (a, b);

to avoid the ambiguity altogether.
Finally, the `__add_string`

function may be used to define the
string representation of an object. Examples involving the string
representation include:

```
``` message ("The value is " + string (V));
vmessage ("The result of %S+%S is %S", V1, V1, V1+V2);
str = "The value of V is $V"$;

For the `Vector_Type`

one might want to use the string
represention generated by
```
``` define vector_string (v)
{
return sprintf ("(%S,%S,%S)", v.x, v.y, v.z);
}
__add_string (Vector_Type, &vector_string);

Next Previous Contents