Implementation in C++

Using object-oriented design principles Milthorpe's algorithm has been implemented using the computer programming language C++[33]. The name `CONVECT' was given to the computer program since this seemed to be an important feature of Milthorpe's algorithm and this is how it will be referred to. Objects have been used to specify the properties and behavior of gas species, computational cells and of regions within the computational domain. Features of C++, such as data-hiding and inheritance, should reduce the time needed for further development of the code. This is because the code modules that have already been tested need not be changed, minimising the introduction of new bugs. Object-oriented programming is a paradigm where data structures and the computer code that operates on those data structures are kept together. Being a modular approach, object-oriented programming allows a programmer, or team of programmers, to deal with one part of a computer program at time.

One of the objectives of the new implementation was to investigate whether this approach to computer programming would aid in the process of computational research - in this case with Milthorpe's algorithm. Milthorpe's computer program had become complex and somewhat difficult for an outside researcher to grapple and add to. An object-oriented version of this computer program would allow other researchers to identify and modify only those parts of the code that are relevant to the problem they are trying to solve; potentially saving them a great deal of time. If this approach did indeed make the further development of Milthorpe's algorithm simpler and less prone to bugs, time could then be concentrated on dealing with the more important fluid dynamics and numerical aspects of the work.

Object-oriented programming

Procedural programming

Procedural programming languages have been widely used for decades. They grew out of a perceived need for computer programs to be more structured than earlier programming languages allowed. The process of writing and re-writing code using procedural languages would be more efficient. In such a language, procedures are used to introduce order to a computer program. A computer program is divided into a set of logical tasks and procedures are written to carry out these tasks. Procedures can have parameters, or arguments, passed to them. They can be grouped together in libraries that can then be used as part of many different computer programs separated in time and space. In this way, the libraries represent a kind of tool box that can be used by computer programmers who do not need to know how a tool functions, just how to use it. This saves a computer programmer time, since time is not wasted `re-inventing the wheel'. In the computer language Fortran, these procedures are called subroutines - in other languages they may be called procedures or functions.

Modular programming

A set of related procedures, together with the data they manipulate, can be grouped together into a `module'. Modular programming is also referred to as `data hiding'. A computer programmer using a pre-written module does not necessarily need to know the structure of the data that the module uses internally. The programmer just needs to deal with the module's `interface' - a set of procedures and the parameters they require. A computer program written to use a particular module need not be modified if the internal data representation of the module is later refined. Another advantage of modular programming is that computer programmers do not have to keep track of data that may be global (accessible by all the procedures within a module) and there is no danger that they would accidentally introduce bugs by modifying this global data unnecessarily.

Data abstraction

A programming language that supports modular programming could also support the idea of `user defined data types', known as `data abstraction'. For example, a language may not know how to deal with complex numbers so a computer programmer could define a new data type for this purpose. This new data type would define how a complex number is stored and what needs to be done when you indicate that you want to add two of them together. Data abstraction has some powerful applications in areas such as linear algebra where complex matrix operations can be expressed in a concise form, analogous to expressions used by mathematicians.

A problem with data abstraction is that an abstract data type defines a sort of black box and that there is no way of adapting it to new uses except by modifying its definition. This can lead to severe inflexibility.

Objects

Objects are program modules that provide data abstraction and also allow `inheritance'. A computer programmer can create a new kind of object by inheriting the properties and behaviour of a parent object and then customising only those properties and behaviour that make the new object different from the parent. This process can be used to create a whole hierarchy of objects that handle a range of different aspects of a computer application. In fact, serious converts to the object-oriented programming paradigm delegate all aspects of their computer application to objects that belong to a single hierarchy.

A good example of a set of objects is the idea of a `shape' in a computer graphics system. Each shape could have a colour and a point defining its location on the screen. This can be represented by an object with actions associated with it such as move, draw and rotate. To implement a circle a computer programmer would inherit a shape and redefine what draw and rotate mean for it. A circle would need to have an extra property defined for it called a radius. Similarly, a square can be set up by redefining only those parts of a shape that are needed to specify a square.

When defining objects in an object-oriented computer language the term `class' is used. This creates a distinction between the definition of a class of objects and actual `instances' of that class.

Inheritance allows an object that has already been developed and debugged to be re-used and enhanced. This feature lends itself to a kind of incremental, or `building block', approach to software development. Object-oriented programming allows different programmers to use the same code and add to it without having to spend too much time analysing the existing code and without the danger of introducing new bugs to the existing code (which is also the case with other styles of modular programming).

In designing an object-oriented computer program, a programmer divides the problem into tasks, each to be handled by a set of objects. A list of behaviours, or `methods', are defined for these objects. During this process, commonality between types of objects are identified so that a hierarchy of objects can be set up using inheritance.

The design of CONVECT

Although the fundamental nature of Milthorpe's algorithm has been preserved, the code has been completely redesigned and rewritten in the object-oriented computer language C++. As described above, the design process involved analysing Milthorpe's algorithm and delegating parts of its functionality to a set of objects. A hierarchy of these objects was set up so that where objects had common features, inheritance could be used.

The original Fortran program stored all fluid quantities in separate arrays. There was also a proliferation of parameters, loop counters and other variables. Using the data abstraction inherent in an object-oriented programming approach, keeping track of variables and arrays when dealing with the program code is vastly simplified.

It was decided to use physical concepts to guide the choice of objects used in CONVECT. Each Species of gas is represented by an object class. The various physical subdivisions of the problem are represented by different object classes. These include Domain, Region and Cell. Objects that represent tasks, not physical quantities have also been defined. Convector is one example and it carries out the process of convection. In most cases, it is the responsibility of a particular object to modify, input or output its own data. A representation of the main objects can be seen in Figure 5.1 and an explanation follows.

In general, Milthorpe's algorithm has been directly translated into this new program structure. Section 5.5 outlines the main differences between the current implementation and Milthorpe's.

**Figure:** A representation of the object hierarchy in *CONVECT*
$\resizebox*{0.9\textwidth}{!}{\includegraphics{objects.eps}}$

The object classes

Domain

A Domain holds most of the properties of the entire computational domain. This includes the overall size and shape of the problem. Domain methods setup and ensure the execution of the whole problem. Many command line parameters to the computer program translate directly into calls to Domain methods.

Region

A Region was originally distinguished from a Domain to allow the code to be implemented in parallel. The idea was that a Domain would contain several Regions that would be executed in parallel. When the parallel version was implemented, however, the method used resulted in single Regions occupying a single Domain. There was in the end no need for these two objects to be separate but the distinction was kept to reduce unnecessary debugging.

Regions are given the responsibility of keeping track of the Cells they contain and looping through those Cells, instructing them to be processed or updated. Regions `know' how to input and output their data to binary or text files. They keep track of Cell dimensions $dx$ and $dy$ as well as the adjustable timestep $\delta t$ . It was envisaged that different Regions in a parallel problem may have different values of $dx$ , $dy$ or $\delta t$ . This would allow the computer algorithm to create finer grids and smaller timesteps, as required, similar to an adaptive mesh. This has not yet been implemented.

Inheritance has been used to implement different kinds of problems with slightly different kinds of Regions and Domains. Two examples of specialised Regions are the flatplate and the shocktube.

Cell

A Cell holds the property of Momentum and a list of the gas Species in it (see Section 5.4.4). Cell methods carry out the actual fluid dynamics by either implementing the various physical processes themselves or calling other objects that do. There are also methods that aid a Region in determining whether to adjust the timestep by checking certain stability criteria. Boundary conditions are implemented by populating a Region with different kinds of Cells. There are upstream Cells, wall Cells and symmetric boundary Cells. Inheritance has been widely used in the implementation of the hierarchy of Cell types.

Species

A Species holds the properties of Mass and Energy. Different kinds of Species represent the different kinds of gas species that are present in a particular problem. Each kind of Species can have its own definitions for certain gas properties such as molecular weight, the Prandtl number and the Schmidt number.

Diffuser

A Diffuser is given the addresses of two Cells as an argument and carries out the process of diffusion between them, updating Cell quantities accordingly. Different kinds of Diffusers are implemented to handle different kinds of diffusion process in certain boundary Cells. For example, there is a solidDiffuser for adiabatic no slip diffusion and a symmDiffuser for diffusion at a symmetric boundary.

Convector

A Convector is given the address of a Cell and convects mass, momentum and energy out of that Cell into neighbouring Cells according to the velocity in the parent Cell. The calculation of the convection velocity factor, a parameter that determines the proportion of material being convected, was delegated back to the Cell object class since special boundary Cells needed to deal with this calculation in different ways. This is a good example of how the identity of an object can determine how a particular method is executed. On the other hand, a more traditional language needs sometimes to go through an elaborate series of conditional statements to determine what needs to be done.

Squeezer

A Squeezer is given the address of a Cell and implements pressure gradient effects on that Cell. The name Squeezer was assigned to this object because a single worded descriptive name was needed.

Transport

A Transport object is used by a Diffuser to encapsulate the local transport properties of viscosity, thermal conductivity and mass diffusivity. It is `contained' within a Diffuser object. When the Diffuser is initialised with the addresses of those Cells it will operate on, the Transport object then calculates the effective local viscosity, thermal conductivity and mass diffusivity, based on the local temperature and species concentrations. These values are then called on by the Diffuser for its calculations.

Implementation differences between Milthorpe's method and CONVECT

Dynamic memory allocation

Memory for the Cell objects was allocated dynamically as this is standard C++ programming practice [33], and an array of pointers to these objects was kept. This meant that adjacent Cells in the computational domain were probably, but not necessarily, adjacent in memory. This can have performance effects, since memory caching schemes on modern computers read in chunks of adjacent memory, assuming that operations are often performed on contiguous data.

Beyond that, modern compilers have built-in optimisation features that use knowledge of particular computer architectures to get the best performance out of computer code. When data structures are abstracted, as they have been in CONVECT, it is much less likely that an optimising compiler will be able to achieve as much speed-up as with an ordinary array storage.

De-referencing

Much of the data in CONVECT is accessed using pointers or a series of pointers. This `de-referencing' of memory can slow down the reading of the data into the processor. An example of this is a Region wanting to access the energy of the first Species in a particular Cell. This would be written something like:

Functions are de-referenced also and this too has a computational overhead. It is for this reason that `inlined' functions were used where possible. Inlined functions are compiled into sections of computer code that are then inserted, as is, in place of a branch instruction. This means that the compiled program has to execute less instructions, making it much more efficient. The C++ compiler gcc was used and extensive use of its explicit and automatic inlining features drastically reduced the function de-referencing overhead.

Momentum and velocity as cell properties

Momentum in CONVECT is stored for each Cell as opposed to each gas species, as in Milthorpe's method. This was done so as to save a little on storage. Velocity is calculated as $Momentum/TotalMass$ and is also a Cell property. This has not had a significant impact on the accuracy of solutions.

Timestep adjustments

Timestep adjustments were essentially carried out in the same way as in Milthorpe's computer program. Log files of program output were examined to determine which adjustments were actually being used. Where it was found that a timestep adjustment was not used it was disabled. Generally, the CFL stability adjustment (Section 5.5.8) and an adjustment that looks at local oscillations seemed to dominate. Some parameters, including the minimum allowed timestep and the minimum density allowed in a diffusion calculation, were also adjusted to improve the stability of the particular problems being implemented.

Inputs file

To save having to re-compile the program for different problem conditions, most of the program parameters can be entered from a kind of free-form inputs file. The inputs file is a text file with a series of parameter names and values. For parameters that are not present in the inputs file, sensible defaults are used (see Appendix B).

Calcinit.pl - calculation of initial conditions

Since the code was being used to model a shock tube, it was necessary to determine initial conditions such that there would be a main shock heading downstream with no upstream shock. The inlet condition was set to be the condition immediately behind the contact surface. This required conditions to be `matched' so that only the primary shock would propagate and there would be no shock or expansion wave travelling upstream. A Perl program was written to calculate initial conditions given the shock tube relations [17] and to produce a valid inputs file for CONVECT (see Appendix C).

Cross platform features

The code was, at different times, developed on Sun workstations or under Linux on a PC, or on a Macintosh. It was therefore necessary to make sure that CONVECT could be compiled and run on the different platforms without major changes. Another reason for implementing CONVECT on different platforms was to allow for parallel processing of the program on a heterogeneous workstation farm. It was necessary to modify the format of the binary data files produced so that they could be read on different platforms. Intel based Linux uses a different byte ordering than Sun so without the new data format data files could not be read on the other platform.

Stability

During testing of CONVECT, it was found that the solution seemed to be less stable when $dx \neq dy$ . A two dimensional, compressible version of the CFL (Courant, Friedrichs and Lewey[6]) condition was briefly presented by Roache [30, p. 229]. This condition was used in place of a one dimensional version that Milthorpe used during the convection step of the calculation. The condition is:

$\begin{displaymath} C=\frac{\left( \left\vert \widetilde{u}\right\vert +a\right) \delta t}{\Delta}\leq \frac{\sqrt{dxdy}}{\sqrt{dx^{2}+dy^{2}}}\end{displaymath}$

Artificial viscosity

During the development and testing of CONVECT, it was found that the boundary layer thickness being produced was not nearly large enough. It was much smaller than that predicted by Mirels [28] (see Section 3.4.1). It was even much smaller than the theoretical laminary boundary layer shown in Figure 3.4. Analysis of this showed that there was not enough diffusion occurring and that this was not due to a programming bug. Furthermore, this discrepancy in boundary layer thickness only seemed to result from flows with high Reynolds numbers or, in other words, slightly viscous flows. For example, a flat plate calculation with a much lower Reynolds number produced the expected boundary layer profile.

Roache [30] discussed artificial viscosity for incompressible flow problems in some detail. He described it as a truncation error that appeared in some finite difference analogs. He used a Taylor series expansion of the governing finite difference equation to develop an expression for this truncation error. In the case of Equation 2.9 the transient artificial viscosity term is:

$\begin{displaymath} \alpha _{e}=-\frac{u^{2}\Delta t}{2} \end{displaymath}$

(24)

This kind of stability analysis is much more difficult for compressible flow since the compressible flow equations cannot be linearised [30, p. 228]. The governing equations could be expressed as a set of simultaneous linearised equations and analysed as such but that is beyond the scope of this work. The conclusion is therefore that there is some kind of artificial viscosity effect of the form shown in Equation 5.1. Since this term is negative, it would have the effect of canceling out some of the explicitly modelled viscosity. This provides one explanation for the smaller than expected boundary layer.

Another change in the viscosity calculation implemented at the same time was a change in the values of $\mu _{0}$ used in Equation 4.2 and the way that viscosity was calculated for a gas mixture. Previous values of $\mu _{0}$ were calculated based on a Reynolds number that was supplied as a parameter for the problem. Further reading about Sutherland's law [16] revealed that this use of $\mu _{0}$ was incorrect. The correct values were obtained from the GASP User's Manual [16]. The value for the helium argon driver gas was taken to be that of helium and the value for air was taken to be that of nitrogen ( $N_{2}$ ). To deal with the change in the use of Sutherland's law, it was necessary to use real flow values to specify the problem. Previously, inlet flow values were scaled to unity to reduce the chance of roundoff error occurring in some calculations.

Previously, an average of $\mu _{0}$ was used in the Sutherland's law calculation when calculating transport between gas species. Wilke's equation is now used to calculate the mixture viscosity [1, p. 24]. Wilke's equation is given by:

$\begin{displaymath} \mu _{mix}=\sum _{i=1}^{n}\frac{x_{i}\mu _{i}}{\sum _{j=1}^{n}x_{j}\Phi _{ij}} \end{displaymath}$

$\begin{displaymath} \Phi _{ij}=\frac{1}{\sqrt{8}}\left( 1+\frac{M_{i}}{M_{j}}\ri... ...ght) ^{1/2}\left( \frac{M_{i}}{M_{j}}\right) ^{1/4}\right] ^{2}\end{displaymath}$

The tuning algorithm

During the development of the code it was found that the amounts of energy and momentum being generated by the pressure gradient effects were incorrect in regions close to shocks. After some investigation, it appeared that Milthorpe's upwinding coefficients were not appropriate for the driver gas contamination problem. A tuning algorithm was developed to tune the upwinding coefficients to produce more accurate results.

The particular tuning algorithm used was given the label `hill climbing' (Figure 5.2). To visualise it imagine that we want to vary two parameters until some function depending on them was maximised. We can depict this function as a graphical surface and the maximum would then be a hill. To find the `top' of the hill, one parameter is varied using a certain increment while we are still climbing the hill. Before we start going down, we change direction - or vary the second parameter. If we can't go any higher, we reduce the size of the steps we are taking and start varying the first parameter again. The top of the hill is reached when we can't go any higher and the step size has reached some minimum value.

This algorithm is applied to the tuning of the upwinding parameters by getting CONVECT to repeatedly calculate the propagation of a one dimensional shock at the conditions of interest. At the end of each calculation, an error function is produced by comparing the shock profile against an analytical solution. The aim is to minimise this error function and, to use the analogy of the hill, when it is minimised we are at the top of the hill. By producing a one dimensional version of CONVECT that is in turn called by a program implementing the hill climbing algorithm, the whole process was automated quite efficiently. There were actually 30 parameters to vary so the hill climbing algorithm was effectively climbing a 31-dimensional hill!

**Figure 5.2:** Illustration of hill climbing algorithm showing contours of a two parameter function
$\includegraphics{hill.eps}$

Parallel implementation

Overview

One of the principal objectives of re-implementing Milthorpe's code was to introduce parallel processing capabilities. A parallel program is able to spread its processing load across two or more computer processors. The extra processors may be part of a multi-processor computer or a set of computers that are networked together or some combination of both. The ideal `speedup' of a parallel program is the number of processors - that is, running an ideal parallel program on three processors would go three times as fast as it would have on one processor. In reality, a somewhat less than ideal speedup is usually achieved due to the finite communication time between processors, the fact that parts of a computer program do not execute in parallel and that there is sometimes extra code to execute to implement the parallel behavior.

Implementation points

Message Passing Interface (MPI)

It was decided to use a freely available programming interface called MPI to implement the parallel nature of CONVECT. It is a set of computer program routines that provide the framework for writing parallel programs. MPI was used rather than other programming interfaces because it is freely available and it has been implemented on most flavours of UNIX. It is even becoming available for the Windows and Macintosh platforms. Using MPI makes CONVECT a fairly portable computer program. A computer program using MPI could technically run on a mixture of high performance multi-processor computers, several UNIX workstations, some Windows based PCs and some Macintoshes, all at once and in parallel. In practice, it is simpler to get programs written with MPI to run on a single multi-processor computer or a set of networked UNIX workstations.

Data structures

When running under MPI, CONVECT runs as a set of separate processes, whether on a multi-processor computer or a set of different workstations. Each process has a single Domain that contains a single Region. These are special parallel versions of a Domain and a Region that were constructed by inheriting the existing non-parallel versions. One of the Domains has the responsibility of orchestrating the input and output of the whole parallel computer program. It instructs the parallel Regions to process themselves, and to exchange boundary information and other parameters between each other.

**Figure 5.3:** How *Region* boundaries are handled
$\resizebox*{0.9\textwidth}{!}{\includegraphics{parregion.eps}}$

Figure 5.3 shows how cells along the boundaries of two neighbouring parallel Regions are associated with each other. The cells adjacent to a parallel boundary are essentially ghost cells that are updated only from their associated cells in the neighbouring Region. The current parallel version of CONVECT synchronises the timestep and some other parameters between all parallel Regions. Care has been taken during the development of this computer program to ensure that the fact splitting amongst several parallel Regions does not affect the solution.

Output file processing

The simplest way to extend the handling of data files when implementing CONVECT in parallel was to simply get each parallel Region to output its results to separate, numbered data files. This meant that some post-processing was required to produce a single, meaningful graphics plot. The ghost cells have to be stripped from the output and the data from the parallel Regions must be pasted together.

Implementation in C++

Overview

Object-oriented programming

Procedural programming

Modular programming

Data abstraction

Objects

The design of CONVECT

The object classes

Domain

Region

Cell

Species

Diffuser

Convector

Squeezer

Transport

Implementation differences between Milthorpe's method and CONVECT

Dynamic memory allocation

De-referencing

Momentum and velocity as cell properties

Timestep adjustments

Inputs file

Calcinit.pl - calculation of initial conditions

Cross platform features

Stability

Artificial viscosity

The tuning algorithm

Parallel implementation

Overview

Implementation points

Message Passing Interface (MPI)

Data structures

Output file processing