Output Format (Python)

A Skywalker program writes all of its data (input parameters and output values) to a text file containing a Python module. The data in the module is structured in a regular way for easy use by postprocessing scripts. The module contains up to three Python variables:

input, an object whose fields list all of the input parameters specified in the YAML input file, in ascending lexicographic order by name
output, an object whose fields list all of the output parameters corresponding to the input parameters, in ascending lexicographic order by name
settings (if settings are present), an object whose fields list all of the driver-specific settings used to process the ensemble. All settings fields are strings, and the settings are sorted in ascending lexicographic order by name.

Consider the input block from the example input file in the Input Format section:

my_settings:
  method: quadrature

input:
  lattice:
    relative_humidity: [0.01, 1.00, 0.01]
    temperature: [230.15, 300.15, 1]
  fixed:
    c_h2so4: 5e8 # [#/cc]
    planetary_boundary_layer_height: 1100
    height: 500
    xi_nh3: 0

With this input, Skywalker constructs an ensemble whose members assume all possible combinations of its lattice parameters. There are 100 values for relative_humidity, 71 values for temperature, and a single value for all other parameters. So there are \(100 \times 71 = 7100\) members in the resulting lattice ensemble.
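The member count above can be checked with a short sketch. It assumes that each lattice entry is read as [start, stop, step] with an inclusive stop, which matches the counts quoted above, and that relative_humidity varies fastest, which is inferred from the example output listing rather than stated by Skywalker. The helper name expand is hypothetical.

```python
from itertools import product

def expand(start, stop, step):
    """Return the values start, start + step, ..., stop (stop included)."""
    count = round((stop - start) / step) + 1
    return [start + i * step for i in range(count)]

relative_humidity = expand(0.01, 1.00, 0.01)
temperature = expand(230.15, 300.15, 1)

# product() varies its last argument fastest, so temperature is passed first
# to make relative_humidity the fastest-varying parameter (an assumption).
members = [(rh, t) for t, rh in product(temperature, relative_humidity)]

print(len(relative_humidity), len(temperature), len(members))
```

Fixed parameters contribute a single value each, so they do not change the member count.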

Suppose our program generates output variables nucleation_rate and nucleation_threshold, both of which depend on some or all of the input parameters. Here's how the resulting Python module might look.

# This file was automatically generated by skywalker.

from math import nan as nan

# Object is just a dynamic container that stores input/output data.
class Object(object):
    pass

# Settings are stored here.
settings = Object()
settings.method = 'quadrature'

# Input is stored here.
input = Object()
input.relative_humidity = [0.01, 0.02, 0.03, ..., 0.98, 0.99, 1.00, ]
input.temperature = [230.15, 230.15, 230.15, ..., 300.15, 300.15, 300.15, ]
input.c_h2so4 = [5e8, 5e8, 5e8, ..., 5e8, 5e8, 5e8, ]
input.planetary_boundary_layer_height = [1100, 1100, 1100, ..., 1100, 1100, 1100, ]
input.height = [500, 500, 500, ..., 500, 500, 500, ]
input.xi_nh3 = [0, 0, 0, ..., 0, 0, 0, ]

# Output data is stored here.
output = Object()
output.nucleation_rate = [1.56031e+06, 4.00182e+06, 6.14731e+06, ..., 2.75652e-08, 3.8792e-08, 5.44561e-08, ]
output.nucleation_threshold = [2.52217e+09, 2.37766e+09, 2.24821e+09, ..., 9.00622e+08, 8.85519e+08, 8.70748e+08, ]

We've used ellipses (...) to omit unnecessary detail. The Object type is just a simple trick to allow us to dynamically create fields for the settings, input, and output variables.

The important thing here is that all input and output lists in this module have 7100 values, and these values all appear in the same order. The first value in each list belongs to the first member of the ensemble, the second value to the second member, and so on, up to the last value in each list, which belongs to the last ensemble member.

Because the lists are aligned by index, postprocessing logic can associate each input value with its output values directly. This makes it straightforward to write a postprocessor for sensitivity analysis, parameter estimation, or comparisons of two or more different algorithms or codes.
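As a minimal sketch of such a postprocessor, the following loads a module from a file path and pairs inputs with outputs by index. The file name sample_output.py, the helper load_module, and the three-member data are all hypothetical stand-ins for a real Skywalker output file.

```python
import importlib.util
import os
import tempfile

# A tiny stand-in for a Skywalker-generated module (hypothetical values).
SAMPLE = """
from math import nan as nan

class Object(object):
    pass

input = Object()
input.relative_humidity = [0.01, 0.02, 0.03]
input.temperature = [230.15, 230.15, 230.15]

output = Object()
output.nucleation_rate = [1.5e6, 4.0e6, 6.1e6]
"""

def load_module(path, name="skywalker_data"):
    """Import a Python source file from an arbitrary path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample_output.py")
    with open(path, "w") as f:
        f.write(SAMPLE)
    data = load_module(path)

# Index i in every list refers to ensemble member i.
rows = list(zip(data.input.relative_humidity,
                data.input.temperature,
                data.output.nucleation_rate))

# Example query: the member with the largest nucleation rate.
best = max(rows, key=lambda row: row[2])
print(best)
```

If the output file sits next to your script, a plain import statement works just as well; loading by path is only a convenience when the file lives elsewhere.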

You can even write a conversion utility that imports the Python module and writes it to another format. For an example of this, take a look at the py2ncl program included with Skywalker. py2ncl converts Skywalker output to a text file that can be used with legacy NCL programs. (The latest version of NCL is adopting Python as its language.)
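To illustrate the idea of a conversion utility, here is a small sketch that writes aligned input and output lists to CSV. It is not how py2ncl works; the function name to_csv and the sample data are hypothetical, and the sketch assumes every field holds a flat list of scalars.

```python
import csv
import io

class Object(object):
    pass

# Hypothetical data shaped like the input and output variables of a module.
inp = Object()
inp.relative_humidity = [0.01, 0.02]
inp.temperature = [230.15, 230.15]
out = Object()
out.nucleation_rate = [1.5e6, 4.0e6]

def to_csv(inp, out, stream):
    """Write one CSV row per ensemble member, inputs first, then outputs."""
    names_in = sorted(vars(inp))
    names_out = sorted(vars(out))
    columns = [getattr(inp, n) for n in names_in]
    columns += [getattr(out, n) for n in names_out]
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(names_in + names_out)
    # zip(*columns) turns per-variable lists into per-member rows.
    writer.writerows(zip(*columns))

buffer = io.StringIO()
to_csv(inp, out, buffer)
print(buffer.getvalue())
```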

NaNs

Sometimes a Skywalker program emits a NaN, either as the result of pathological numeric arithmetic or as an indicator that the value is undefined. In this case, the value is written using Python's nan representation. This ensures a faithful translation for all data, no matter what the circumstance.
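A postprocessor usually needs to skip or flag these values. The sketch below uses hypothetical data; the one point to note is that a NaN never compares equal to anything, including itself, so it must be detected with math.isnan rather than ==.

```python
from math import isnan, nan

# Hypothetical data in the same shape as Skywalker input and output lists.
temperature = [230.15, 231.15, 232.15, 233.15]
nucleation_rate = [1.5e6, nan, 6.1e6, nan]

# Keep only the members whose output is defined.
valid = [(t, r) for t, r in zip(temperature, nucleation_rate) if not isnan(r)]
skipped = len(nucleation_rate) - len(valid)

print(f"kept {len(valid)} members, skipped {skipped}")
```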

Array-Valued Outputs

Just as you can store multiple values in a single input array parameter, you can write outputs with multiple values stored in an array. And just as input array parameters are indicated with two sets of square brackets in a YAML input file, output array values are indicated with two sets of square brackets, even for ensembles having only a single member. The syntax is easy and intuitive, mostly because YAML and Python use the same format for defining lists.

Skywalker doesn't impose any structure on array-valued outputs. For example, it's possible to write arrays with different sizes to a single output variable. Therefore, your program must write array-valued outputs in the most sensible way for your work.
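Since array sizes may differ between members, a postprocessor should check the shape before assuming a rectangular layout. The following sketch uses a hypothetical array-valued output named profile, with one inner list per ensemble member.

```python
# Hypothetical array-valued output: one inner list per ensemble member.
profile = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

# Check whether every member wrote the same number of values.
lengths = [len(values) for values in profile]
is_rectangular = len(set(lengths)) <= 1

# A per-member reduction works whether or not the arrays are ragged.
member_means = [sum(values) / len(values) for values in profile]

print(lengths, is_rectangular, member_means)
```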