Output Format (Python)
A Skywalker program writes all of its data (input parameters and output values) to a text file containing a Python module. The data in the module is structured in a regular way for easy use by postprocessing scripts. The module contains the following Python variables:
input, an object whose fields list all of the input parameters specified in the YAML input file, in ascending lexicographic order by name
output, an object whose fields list all of the output parameters corresponding to the input parameters, in ascending lexicographic order by name
settings (if settings are present), an object whose fields list all of the driver-specific settings used to process the ensemble. All settings fields are strings, and the settings are sorted in ascending lexicographic order by name.
Consider the input block from the example input file in the Input Format section:
my_settings:
  method: quadrature
input:
  lattice:
    relative_humidity: [0.01, 1.00, 0.01]
    temperature: [230.15, 300.15, 1]
  fixed:
    c_h2so4: 5e8 # [#/cc]
    planetary_boundary_layer_height: 1100
    height: 500
    xi_nh3: 0
With this input, Skywalker constructs an ensemble whose members assume all possible combinations of its lattice parameters. There are 100 values for relative_humidity, 71 values for temperature, and a single value for all other parameters. So there are \(100 \times 71 = 7100\) members in the resulting lattice ensemble.
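The member count can be checked with a short sketch. It assumes that each lattice entry is a [first, last, step] triple (which is consistent with the counts above) and that relative_humidity varies fastest, as the listing later in this section suggests. The expand_range helper is hypothetical and is not part of Skywalker.

```python
from itertools import product

def expand_range(first, last, step):
    """Expand a [first, last, step] triple into a list of values (hypothetical helper)."""
    # Round the count so floating-point error does not drop the last value.
    count = round((last - first) / step) + 1
    return [first + i * step for i in range(count)]

relative_humidity = expand_range(0.01, 1.00, 0.01)
temperature = expand_range(230.15, 300.15, 1)

# Assumed ordering: relative_humidity varies fastest, temperature slowest.
members = [(rh, t) for t, rh in product(temperature, relative_humidity)]

print(len(relative_humidity), len(temperature), len(members))  # prints: 100 71 7100
```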
Suppose our program generates output variables nucleation_rate and nucleation_threshold, both of which depend on some or all of the input parameters. Here's how the resulting Python module might look.
# This file was automatically generated by skywalker.
from math import nan as nan
# Object is just a dynamic container that stores input/output data.
class Object(object):
    pass
# Settings are stored here.
settings = Object()
settings.method = 'quadrature'
# Input is stored here.
input = Object()
input.relative_humidity = [0.01, 0.02, 0.03, ..., 0.98, 0.99, 1.00, ]
input.temperature = [230.15, 230.15, 230.15, ..., 300.15, 300.15, 300.15, ]
input.c_h2so4 = [5e8, 5e8, 5e8, ..., 5e8, 5e8, 5e8, ]
input.planetary_boundary_layer_height = [1100, 1100, 1100, ..., 1100, 1100, 1100, ]
input.height = [500, 500, 500, ..., 500, 500, 500, ]
# Output data is stored here.
output = Object()
output.nucleation_rate = [1.56031e+06, 4.00182e+06, 6.14731e+06, ..., 2.75652e-08, 3.8792e-08, 5.44561e-08, ]
output.nucleation_threshold = [2.52217e+09, 2.37766e+09, 2.24821e+09, ..., 9.00622e+08, 8.85519e+08, 8.70748e+08, ]
We've used ellipses to omit unnecessary detail. The Object type is just a simple trick to allow us to dynamically create fields for the settings, input, and output variables.
The important thing here is that all input and output lists in this module have 7100 values, and these values all appear in the same order. The first value in each list belongs to the first member of the ensemble, the second value to the second member, and so on, up to the last value in each list, which belongs to the last ensemble member.
This lets you write postprocessing logic that associates input and output variables by index. For example, a postprocessor can perform sensitivity analysis, parameter estimation, or comparisons of two or more different algorithms or codes.
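As a minimal sketch of such a postprocessor, the code below loads a generated module from a file and pairs inputs with outputs by index. The file name example_output.py and the load_module helper are hypothetical. The stand-in module is truncated to three members, with values copied from the start of the listing above.

```python
import importlib.util
import os
import tempfile

# Stand-in for a Skywalker-generated module, truncated to three members.
MODULE_TEXT = """
class Object(object):
    pass
input = Object()
input.relative_humidity = [0.01, 0.02, 0.03]
input.temperature = [230.15, 230.15, 230.15]
output = Object()
output.nucleation_rate = [1.56031e+06, 4.00182e+06, 6.14731e+06]
"""

def load_module(path, name="skywalker_data"):
    """Import a Python file from an arbitrary path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_output.py")  # hypothetical file name
    with open(path, "w") as f:
        f.write(MODULE_TEXT)
    data = load_module(path)

# Index i in every list refers to ensemble member i, so zip() pairs them up.
members = list(zip(data.input.relative_humidity,
                   data.input.temperature,
                   data.output.nucleation_rate))

# Find the member with the largest nucleation rate.
best_rh, best_temp, best_rate = max(members, key=lambda m: m[2])
print(best_rh, best_temp, best_rate)
```

In real use you would skip the temporary file and pass the path of the module written by your Skywalker program to load_module.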
You can even write a conversion utility that imports the Python module and writes it to another format. For an example of this, take a look at the py2ncl program included with Skywalker. py2ncl converts Skywalker output to a text file that can be used with legacy NCL programs.
(The latest version of NCL is adopting Python as its language.)
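A conversion utility of this kind can be quite small. The sketch below writes one CSV row per ensemble member. It is not how py2ncl works. The write_csv helper and the column naming scheme are assumptions made for illustration, and the stand-in data reuses the first three members of the listing above.

```python
import csv
import io
from types import SimpleNamespace

# Stand-ins for the input and output variables of a generated module.
input_data = SimpleNamespace(
    relative_humidity=[0.01, 0.02, 0.03],
    temperature=[230.15, 230.15, 230.15],
)
output_data = SimpleNamespace(
    nucleation_rate=[1.56031e+06, 4.00182e+06, 6.14731e+06],
)

def write_csv(inputs, outputs, stream):
    """Write one CSV row per ensemble member (hypothetical helper)."""
    columns = {}
    for prefix, obj in (("input", inputs), ("output", outputs)):
        # vars() returns the fields that were attached to the object.
        for name, values in sorted(vars(obj).items()):
            columns[f"{prefix}.{name}"] = values
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(columns.keys())
    # All lists share the same member ordering, so zip() yields one row per member.
    writer.writerows(zip(*columns.values()))

buffer = io.StringIO()
write_csv(input_data, output_data, buffer)
print(buffer.getvalue())
```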
NaNs
Sometimes a Skywalker program emits a NaN, either as the result of pathological numeric arithmetic or as an indicator that the value is undefined. In this case, the value is written using Python's nan representation. This ensures a faithful translation for all data, no matter what the circumstance.
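A postprocessor usually needs to detect these values before computing statistics. The following sketch shows one way to do so with math.isnan. The list contents are illustrative values, not real Skywalker output.

```python
from math import isnan, nan

# Illustrative values only; the second member's result is undefined.
nucleation_rate = [1.5, nan, 3.0]

# nan never compares equal to anything (including itself), so test with isnan().
valid = [(i, v) for i, v in enumerate(nucleation_rate) if not isnan(v)]
skipped = [i for i, v in enumerate(nucleation_rate) if isnan(v)]

# Average only over members with a defined value.
mean_rate = sum(v for _, v in valid) / len(valid)
print(valid, skipped, mean_rate)
```

Keeping the member index alongside each valid value preserves the link back to the corresponding input parameters.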
Array-Valued Outputs
Just as you can store multiple values in a single input array parameter, you can write outputs with multiple values stored in an array. And just as input array parameters are indicated with two sets of square brackets in a YAML input file, output array values are similarly indicated with two sets of square brackets, even for ensembles having only a single member. The syntax is easy and intuitive, mostly because YAML and Python use the same format for defining lists.
Skywalker doesn't impose any structure on array-valued outputs. For example, it's possible to write arrays with different sizes to a single output variable. Therefore, your program must write array-valued outputs in the most sensible way for your work.
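Because array sizes may differ between members, a postprocessor should not assume a rectangular layout. The sketch below checks for this and computes a per-member summary. The variable name particle_sizes and its values are hypothetical.

```python
# Hypothetical array-valued output: one inner list per ensemble member.
# Skywalker does not require the inner lists to have equal length.
particle_sizes = [[1.0, 2.0, 3.0], [4.0, 5.0]]

# Check whether every member wrote the same number of values.
lengths = [len(values) for values in particle_sizes]
is_rectangular = len(set(lengths)) <= 1

# Per-member mean, which works whether or not the lengths agree.
means = [sum(values) / len(values) for values in particle_sizes]
print(lengths, is_rectangular, means)
```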