1.107. mkRms

Full Name: herschel.hifi.pipeline.generic.MkRmsTask
Alias: mkRms
Type: Java Task - Java Task
Import: from herschel.hifi.pipeline.generic import MkRmsTask
Category

HIFI/Pipeline/Level 2 Pipeline

Description

Determines the RMS noise in HIFI data at level 2 or higher (both HRS and WBS).

An essential step is the determination of the baseline and the line masks, which is delegated to the SmoothBaselineTask. Once mask and baseline are determined, the baseline is subtracted from each input spectrum (if 'subtractBase' has been set to True) and the line masks are set accordingly. Optionally, the results can be smoothed (by default in accordance with the smoothing widths configured in the uplink product by the noiseMaxWidth and noiseMinWidth parameters). After this optional smoothing, the statistics are computed on a per-spectrum basis (across the channels), ignoring the line-masked regions.
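The last step, computing the noise over the channels that are not line-masked, can be illustrated with a minimal sketch. It is plain Python and does not use the HIPE classes; the function name `masked_rms` is hypothetical, and taking the RMS as the standard deviation about the mean of the unmasked channels is an assumption, not a statement about what MkRmsTask does internally.

```python
import math

def masked_rms(flux, line_mask):
    """Return the RMS of the channels that are not line-masked.

    flux      -- list of flux values, one per channel
    line_mask -- list of booleans, True where a channel belongs to a line
    """
    kept = [f for f, masked in zip(flux, line_mask) if not masked]
    if not kept:
        raise ValueError("all channels are line-masked")
    mean = sum(kept) / len(kept)
    # Assumption: RMS is taken about the mean of the unmasked channels
    return math.sqrt(sum((f - mean) ** 2 for f in kept) / len(kept))

# A noisy spectrum with one strong line in channel 2
flux = [0.1, -0.1, 5.0, 0.1, -0.1]
mask = [False, False, True, False, False]
print(masked_rms(flux, mask))
```

With the line channel excluded, the result reflects only the noise in the remaining four channels.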

From this raw output, which is stored in the output product under integer keys, we also prepare suitable overviews:

  • index table: summarizes the information of all the raw statistics output datasets by specifying the index (the integer key in the output product), an identifier of the dataset used as basis for computing the statistics, the associated LO frequency if available, the smoothing applied, etc., and the average, standard deviation, min and max of the statistics quantities reported in the raw statistics output datasets.

  • summary table (per smoothing): in contrast to the index table, the summary table reports the average, stdev and median of all the statistics found in all the datasets for a given smoothing.

For point modes, index and summary tables contain the same information. However, for scan modes or mapping modes they differ. When computing the statistics, in particular when determining baseline and mask, spectra with different LO frequency or associated with a different point in the sky should be treated separately. Spectra that belong together in this sense are combined in groups, and each group is treated separately. Hence, we arrive at many different raw statistics datasets in the output product. The raw statistics datasets included in the output product may contain many rows. This is possible when the average at the end of the level 2 pipeline is skipped and the corresponding timeline product is passed as input to the mkRms task. Then, the index table contains the summary statistics per dataset (e.g. the average of the standard deviations found in the different rows of the raw statistics dataset), whereas the summary table contains the summary statistics over all quantities found in the raw statistics datasets for a given smoothing.
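The difference between the two overviews can be sketched in plain Python. The data layout below (a dictionary of integer keys mapping to a smoothing name and a list of per-row standard deviations) is a hypothetical stand-in for the raw statistics datasets, and the use of the population standard deviation is an assumption.

```python
import statistics

# Hypothetical raw statistics: integer key -> smoothing name and per-row stdev values
raw = {
    0: {"smoothing": "noSmoothing", "stdev": [0.10, 0.12]},
    1: {"smoothing": "noSmoothing", "stdev": [0.20, 0.22, 0.24]},
}

def index_table(raw):
    """One row per raw statistics dataset: avg, stdev, min and max of its rows."""
    rows = []
    for key, dataset in sorted(raw.items()):
        values = dataset["stdev"]
        rows.append({
            "index": key,
            "smoothing": dataset["smoothing"],
            "avg": statistics.mean(values),
            "stdev": statistics.pstdev(values),
            "min": min(values),
            "max": max(values),
        })
    return rows

def summary_table(raw, smoothing):
    """One row per smoothing: avg, stdev and median pooled over all datasets."""
    pooled = [v for dataset in raw.values()
              if dataset["smoothing"] == smoothing
              for v in dataset["stdev"]]
    return {
        "avg": statistics.mean(pooled),
        "stdev": statistics.pstdev(pooled),
        "median": statistics.median(pooled),
    }

for row in index_table(raw):
    print(row)
print(summary_table(raw, "noSmoothing"))
```

The index table keeps the datasets apart, while the summary table pools every row of every dataset that shares the same smoothing.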

Various types of input data can be passed and, for a given input data type, the behavior differs according to the observing mode. In the following, we describe the default behavior as intended for the pipeline. See the description of parameters for further options.

  • ObservationContext : By setting the parameters backend and sideband, the associated level 2 timeline product is selected. See the HifiTimelineProduct entry below for further details on how this data type is handled.

  • HifiTimelineProduct :

    • Point modes : We expect that at level 2 we see a single dataset. In case the average in level 2 pipeline has been skipped, the mask is computed from an average over all these spectra. Then, one raw statistics dataset is included in the output product with the statistics for the individual scans.

    • Mapping Modes : We expect that at level 2 we see for each point in the sky a separate dataset. By default, mask and baseline are computed separately for each of these points and the statistics end up in a separate raw statistics dataset. In case the average in the level 2 pipeline has been skipped, the mask is computed from an average over all the spectra per point in the sky. Then, per point in the sky one raw statistics dataset is included in the output product with the statistics for the individual scans.

    • Scan Modes : We expect that at level 2 we see for each LO setting a separate dataset. If the parameter 'scanModeBehavior' has been set to True (which is the default), the statistics are computed for at most three datasets with LO frequency around noiseRefFrequency (as found in the uplink product). Here, the datasets with LO frequencies closest to noiseRefFrequency, noiseRefFrequency + 4GHz and noiseRefFrequency - 4GHz are selected. Three datasets are selected so that, in case the dataset closest to noiseRefFrequency has a large fraction line-masked, an alternative dataset can be picked. The average fraction of the total spectrum / spectra that is line-masked is reported in the meta data of the raw statistics datasets as 'fractionMasked_avg'. Finally, in the summary table, only one single raw statistics dataset is used (usually the one closest to noiseRefFrequency). If the parameter 'scanModeBehavior' is set to False, mask and baseline are computed separately for all LO settings and the raw statistics are computed for each of them. Then the summary table reports, for example, the average over all the stdev values found in the raw statistics datasets. In case the average in the level 2 pipeline has been skipped, the mask is computed from an average over all the spectra of each LO setting. Then again, per LO setting one raw statistics dataset is included in the output product, containing the statistics for the individual scans. Here, the behavior is somewhat different when 'scanModeBehavior' is set to True: single datasets in the input htp are selected, and other datasets that belong to the same group are not included in the calculations.

    Note that, generally, when more than one dataset is found in the original htp for a given group a single identifier is used for the group data in the htp. This is of the form 'htp[k]' where k enumerates the groups. If we apply mkRms to a htp with average applied at the end of the level2 pipeline this matches the dataset ids of the original htp. However, if the average has been skipped this no longer holds true.

  • SpectralSimpleCube : For these data (such as found at level 2.5) raw statistics are computed for each spaxel separately - including the computation of mask and baseline. Here the identifier for the data used for the computation of a raw statistics dataset is denoted by 'spaxel(j,k)' where j,k denote integer row and column number within the raster map.

  • Deconvolved Spectrum Dataset (Spectrum1d, 'scanModeBehavior=True') : For this spectrum (such as found at level 2.5) raw statistics are computed for suitable ranges. These ranges are determined with a rule similar to the one adopted for spectral scan htps: ranges are selected around noiseRefFrequency, noiseRefFrequency + 4GHz and noiseRefFrequency - 4GHz. A range is selected only if it overlaps the input spectrum by at least 50%. The width of the ranges to select depends on

    • Band: 4 GHz for the SIS bands and 2.6 GHz for the HEB bands

    • oneGhzReference-parameter: if set to True, the width is 1 GHz.

    The range is then defined such that the reference frequency is in the center of the range.

  • General Spectrum Container (such as Spectrum2d etc) : A single raw statistics dataset is computed based on a single baseline and mask calculation.
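The selection rules described above for scan modes and for deconvolved spectra can be sketched as follows. This is plain Python under stated assumptions: the function names, the argument names and the tie-breaking (first closest match wins, duplicates dropped) are hypothetical and are not taken from the MkRmsTask implementation.

```python
def pick_reference_datasets(lo_freqs_ghz, noise_ref_ghz, offset_ghz=4.0):
    """Indices of the datasets whose LO frequency is closest to
    noiseRefFrequency, noiseRefFrequency + offset and noiseRefFrequency - offset.
    At most three indices are returned; duplicates are dropped."""
    picked = []
    targets = (noise_ref_ghz, noise_ref_ghz + offset_ghz, noise_ref_ghz - offset_ghz)
    for target in targets:
        best = min(range(len(lo_freqs_ghz)),
                   key=lambda i: abs(lo_freqs_ghz[i] - target))
        if best not in picked:
            picked.append(best)
    return picked

def candidate_ranges(spec_min_ghz, spec_max_ghz, noise_ref_ghz, width_ghz,
                     offset_ghz=4.0, min_overlap=0.5):
    """Ranges of the given width, centred on the reference frequency and on the
    reference frequency +/- offset, that overlap the spectrum by at least
    min_overlap of their width."""
    ranges = []
    for centre in (noise_ref_ghz, noise_ref_ghz + offset_ghz, noise_ref_ghz - offset_ghz):
        low, high = centre - width_ghz / 2.0, centre + width_ghz / 2.0
        overlap = max(0.0, min(high, spec_max_ghz) - max(low, spec_min_ghz))
        if overlap / width_ghz >= min_overlap:
            ranges.append((low, high))
    return ranges

# Scan-mode htp: five LO settings, reference frequency 504 GHz
print(pick_reference_datasets([500.0, 502.0, 504.1, 507.9, 496.2], 504.0))

# Deconvolved spectrum covering 500 to 507 GHz, 4 GHz wide ranges
print(candidate_ranges(500.0, 507.0, 504.0, 4.0))
```

In the second call the range centred on 508 GHz overlaps the spectrum by only 1 GHz out of 4 GHz and is therefore dropped.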

Example

Example 1: in HIPE:

# Run on an observation context; backend and sideband select the level 2 timeline product
obs = getObservation(1342205481, useHsa=True)
stats = mkRms(input=obs, backend="WBS-H", sideband="USB",
              domask=1, base=80.0, rebin=2.0, doglue=False, uplink=None,
              preserveGroups=True, maskFromAvg=False, percentiles=[],
              segmentIndex=2, smoothing=0.0, stitchSegments=False, plot=1,
              subtractBase=False)

# Run on the level 2 HifiTimelineProduct directly
htp = obs.refs["level2"].product.refs["WBS-H-USB"].product
stats = mkRms(input=htp, domask=1, base=80.0, rebin=2.0,
              doglue=False, uplink=None, preserveGroups=True,
              maskFromAvg=False, percentiles=[], segmentIndex=2,
              stitchSegments=False, plot=1, subtractBase=False)

# Run on a single spectrum dataset
ds = obs.refs["level2"].product.refs["WBS-H-USB"].product.refs["box_001"].product["0001"]
stats = mkRms(input=ds, domask=1, base=80.0, rebin=2.0, doglue=False, uplink=None,
              maskFromAvg=False, percentiles=[], segmentIndex=2,
              stitchSegments=False, plot=1, subtractBase=False)

API Summary

Jython Syntax

See the example above.

Properties
ObservationContext | HifiTimelineProduct | input [INPUT, MANDATORY, default=no default value]
String backend [INPUT, OPTIONAL, default=no default value]
String sideband [INPUT, OPTIONAL, default=no default value]
Integer segmentIndex [INPUT, OPTIONAL, default=no default value]
Boolean doglue [INPUT, OPTIONAL, default=False]
Boolean stitchSegments [INPUT, OPTIONAL, default=no default value]
Object smoothing [INPUT, OPTIONAL, default=no default value]
ObservationContext | AuxiliaryContext | uplink [INPUT, OPTIONAL, default=no default value]
Integer domask [INPUT, OPTIONAL, default=1]
Boolean maskFromAvg [INPUT, OPTIONAL, default=True]
Integer flagToIgnore [INPUT, OPTIONAL, default=no default value]
Double base [INPUT, OPTIONAL, default=80.0]
Double rebin [INPUT, OPTIONAL, default=2.0]
Boolean subtractBase [INPUT, OPTIONAL, default=False]
Boolean preserveGroups [INPUT, OPTIONAL, default=False]
Boolean scanModeBehavior [INPUT, OPTIONAL, default=True]
Double lineMaskFraction [INPUT, OPTIONAL, default=0.3]
PipelineConfiguration params [INPUT, OPTIONAL, default=no default value]
Integer plot [INPUT, OPTIONAL, default=0]
Boolean ignore [INPUT, OPTIONAL, default=False]
StatisticsTrendProduct stats [OUTPUT, OPTIONAL, default=no default value]
Product plotsOut [OUTPUT, OPTIONAL, default=no default value]

API details

Properties

ObservationContext | HifiTimelineProduct | input [INPUT, MANDATORY, default=no default value]

Type SpectrumContainer; mandatory, no default value. The input data containing the spectra for which the RMS is computed.

String backend [INPUT, OPTIONAL, default=no default value]

Required if the input is of type ObservationContext; ignored otherwise. Can assume the values "WBS-H", "WBS-V", "HRS-H" or "HRS-V".

String sideband [INPUT, OPTIONAL, default=no default value]

Required if the input is of type ObservationContext; ignored otherwise. Can assume the values "USB" or "LSB".

Integer segmentIndex [INPUT, OPTIONAL, default=no default value]

If set, the calculation of the RMS is restricted to the associated segment (subband).

Boolean doglue [INPUT, OPTIONAL, default=False]

If set to True, the mask is determined on combined sub-band spectra (all segments of a given scan concatenated). Note that the segment selection (segmentIndex) is applied first. This is not desired if there are sub-band jumps, hence the default is False.

Boolean stitchSegments [INPUT, OPTIONAL, default=no default value]

If set to True, the segments are stitched before computing the line mask (if applicable) and computing the RMS. Note that this overrules the segmentIndex parameter. Note also that stitching is only possible when segments overlap. In this sense it is distinct from doglue, which always concatenates the segments and is applied after segment selection.

Object smoothing [INPUT, OPTIONAL, default=no default value]

Smoothing widths [in MHz] used for smoothing the (possibly masked) spectra before computing the statistics. More than one smoothing width can be specified with a dictionary of values; the statistics are then computed for each width in turn. The keys (smoothing configuration names) are included in the index summary, and for each smoothing width a dedicated summary is created (with name 'summary_width1' if 'width1' has been specified as the name of the smoothing configuration). Smoothing is applied only if a smoothing width > 0 has been specified or the uplink parameter has been set. If no smoothing parameter is specified but the uplink parameter has been set, two smoothing widths (with names noiseMaxWidth, noiseMinWidth) are looked up from there and applied. If neither a smoothing width nor an uplink product is set, the name of the smoothing configuration is 'noSmoothing'. If a single double value is passed as the smoothing parameter, the name of the smoothing configuration is 'customSmoothing'. When smoothing is applied, a BOX filter method is used in combination with CUT edge behavior. See the smoothing task for further details.
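The naming rules for the smoothing configurations can be summarized in a short plain-Python sketch. The function `resolve_smoothing` and the dictionary standing in for the uplink product are hypothetical. How the task treats an explicit width of 0 in combination with an uplink product is not stated above; the sketch assumes that an explicit value takes precedence and results in 'noSmoothing'.

```python
def resolve_smoothing(smoothing=None, uplink_widths=None):
    """Return {configuration name: width in MHz}.

    smoothing     -- None, a single number, or a dict {name: width}
    uplink_widths -- None, or a dict with the keys 'noiseMaxWidth' and
                     'noiseMinWidth' (hypothetical stand-in for the uplink product)
    """
    if isinstance(smoothing, dict):
        # Several named widths: the keys become the configuration names
        return dict(smoothing)
    if smoothing is not None and smoothing > 0:
        return {"customSmoothing": float(smoothing)}
    if smoothing is None and uplink_widths is not None:
        return {name: uplink_widths[name]
                for name in ("noiseMaxWidth", "noiseMinWidth")}
    # Assumption: an explicit width of 0 also ends up here
    return {"noSmoothing": 0.0}

def summary_names(configuration):
    """Names of the per-smoothing summaries, following the 'summary_<name>' pattern."""
    return ["summary_" + name for name in configuration]

print(resolve_smoothing())
print(resolve_smoothing(5))
print(resolve_smoothing(uplink_widths={"noiseMaxWidth": 10.0, "noiseMinWidth": 1.0}))
print(summary_names(resolve_smoothing({"width1": 2.0, "width2": 8.0})))
```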

ObservationContext | AuxiliaryContext | uplink [INPUT, OPTIONAL, default=no default value]

Type UplinkProduct; optional, no default value. Product from which to obtain uplink information for the calculation of the 'smoothing' parameter. Note that if this parameter is set, the smoothing widths are computed from uplink information only if the 'smoothing' parameter is not explicitly set. If 'uplink' is not set, the smoothing width is never computed from information found in the observation context, even if an observation context is passed as 'input' to the task.

Integer domask [INPUT, OPTIONAL, default=1]

Configures how the task deals with line masking:

  • domask=0 : no line masking

  • domask=1 : automatically mask lines using a sigma-clip algorithm

  • ... other options to follow
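To illustrate what a sigma-clip line mask does in principle, here is a generic, minimal sketch in plain Python. It is not the algorithm used by the SmoothBaselineTask; the function name, the 3-sigma threshold, the use of the median as centre and the iteration limit are all assumptions chosen for illustration.

```python
import statistics

def sigma_clip_mask(flux, n_sigma=3.0, max_iter=10):
    """Return a list of booleans, True where a channel is flagged as a line.

    Channels deviating by more than n_sigma standard deviations from the median
    of the currently unmasked channels are masked; the estimate is repeated
    until the mask no longer changes or max_iter is reached.
    """
    mask = [False] * len(flux)
    for _ in range(max_iter):
        kept = [f for f, masked in zip(flux, mask) if not masked]
        if len(kept) < 2:
            break
        centre = statistics.median(kept)
        sigma = statistics.pstdev(kept)
        new_mask = [abs(f - centre) > n_sigma * sigma for f in flux]
        if new_mask == mask:
            break
        mask = new_mask
    return mask

# Twelve channels of noise with one strong line in channel 8
flux = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 10.0, 0.0, 0.05, -0.05]
print(sigma_clip_mask(flux))
```

Only the channel holding the strong line ends up masked; the noise channels stay below the clipping threshold once the line has been excluded from the estimate.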

Boolean maskFromAvg [INPUT, OPTIONAL, default=True]

Only applicable if masking is applied (domask > 0). If set to True, the mask is computed from an average over all spectra of the given group (see the preserveGroups parameter). If set to False, a mask is computed separately for each scan.

Integer flagToIgnore [INPUT, OPTIONAL, default=no default value]

By setting this parameter you can configure which flags should be ignored when computing the statistics (after applying all the line masking). For example, if you want to ignore channels that have the flag 1, 4 or 16 set, specify flagToIgnore=21. By default, flagToIgnore is set to 0, which means that any flagged channel is ignored.
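The value 21 in the example is the bitwise OR of the individual flag values 1, 4 and 16. The following plain-Python sketch shows the arithmetic; the helper names are hypothetical, and the per-channel test is an assumed reading of the description above (including the special case for 0), not the task's actual code.

```python
def combine_flags(*flags):
    """OR individual flag values into a single flagToIgnore-style integer."""
    value = 0
    for flag in flags:
        value |= flag
    return value

def channel_is_ignored(channel_flag, flag_to_ignore=0):
    """Decide whether a channel is left out of the statistics.

    Assumption: with flag_to_ignore=0 any flagged channel is ignored,
    otherwise only channels sharing at least one bit with flag_to_ignore.
    """
    if flag_to_ignore == 0:
        return channel_flag != 0
    return (channel_flag & flag_to_ignore) != 0

flag_to_ignore = combine_flags(1, 4, 16)
print(flag_to_ignore)                         # 21
print(channel_is_ignored(4, flag_to_ignore))  # True
print(channel_is_ignored(2, flag_to_ignore))  # False
print(channel_is_ignored(2))                  # True
```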

Double base [INPUT, OPTIONAL, default=80.0]

Width [in MHz] over which periodic structures in the spectra are considered real. Larger numbers will lead to nearly straight baselines, smaller numbers will follow spectral curvatures. Translates to the midcycle in the smoothBaseline task according to midcycle = c / base, where c is the speed of light in SI units.
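As a worked example of the relation midcycle = c / base, the sketch below evaluates it for the default base of 80 MHz. The text above does not say in which unit base enters the formula; the sketch assumes that base is converted from MHz to Hz first, so that the result is a length in metres. The function name is hypothetical.

```python
SPEED_OF_LIGHT = 299792458.0  # m/s, exact SI value

def midcycle_from_base(base_mhz):
    """midcycle = c / base, with base converted from MHz to Hz (assumed)."""
    return SPEED_OF_LIGHT / (base_mhz * 1.0e6)

print(midcycle_from_base(80.0))
```

Under this assumption the default base corresponds to a midcycle of roughly 3.75 m, and a larger base gives a smaller midcycle.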

Double rebin [INPUT, OPTIONAL, default=2.0]

Before determining the mask, the spectrum is resampled to this resolution [in MHz]. Its only purpose is to speed up the calculations. Note, however, that a larger rebin value will lead to more masked channels around lines. It is strongly recommended to use the default value.

Boolean subtractBase [INPUT, OPTIONAL, default=False]

If set to True the flux values in the input data are corrected for the baseline determined as part of this task.

Boolean preserveGroups [INPUT, OPTIONAL, default=False]

Describes whether LO or pointing groups should be treated separately.

Boolean scanModeBehavior [INPUT, OPTIONAL, default=True]

Specifies whether the scan-mode specific settings should be applied. This only has an effect on scan modes: if set to True, the RMS is computed at the noiseRefFrequency. Note that in this case the preserveGroups parameter is not considered (grouping is not considered) and only a single dataset with LO closest to the noiseRefFrequency is picked. This is fine for level 2 data or beyond, but may cause issues when applying the task to data below level 2.

Double lineMaskFraction [INPUT, OPTIONAL, default=0.3]

If more than the given fraction of the total number of pixels considered for computing the RMS are line masked a suitable warning is added to the output. For spectral scans the RMS is computed for a single LO setting as specified by the noiseRefFrequency. If at the given LO frequency the line masking exceeds the given threshold an alternative region which is shifted by +/- 4 GHz is considered.
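The threshold check can be sketched in plain Python as follows. The function names, the ordering of the candidate regions and the fallback when every region exceeds the threshold are assumptions made for illustration; the description above only states that a warning is added and that a region shifted by +/- 4 GHz is considered.

```python
def fraction_masked(line_mask):
    """Fraction of channels that are line-masked (True entries)."""
    return sum(1 for masked in line_mask if masked) / len(line_mask)

def choose_region(masks_by_region, line_mask_fraction=0.3):
    """Pick the first region whose masked fraction does not exceed the threshold.

    masks_by_region -- list of (label, line_mask), reference region first
    Returns (label, fraction, warning); warning is None if the region is fine.
    """
    for label, mask in masks_by_region:
        fraction = fraction_masked(mask)
        if fraction <= line_mask_fraction:
            return label, fraction, None
    # Assumed fallback: keep the reference region and attach a warning
    label, mask = masks_by_region[0]
    fraction = fraction_masked(mask)
    warning = "%.0f%% of the channels are line-masked" % (100.0 * fraction)
    return label, fraction, warning

regions = [
    ("ref", [True] * 4 + [False] * 6),       # 40% masked, above the threshold
    ("ref+4GHz", [True] * 1 + [False] * 9),  # 10% masked
]
print(choose_region(regions))
```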

PipelineConfiguration params [INPUT, OPTIONAL, default=no default value]

Pipeline configuration parameters that can be passed to the task.

Integer plot [INPUT, OPTIONAL, default=0]

Configures how to display diagnostic plots: no plots (0), all in one (1), all in separate plots (2). Note that if preserveGroups has been set to True, all the diagnostics for a given group go into a separate plot.

Boolean ignore [INPUT, OPTIONAL, default=False]

Flag to indicate whether the execution of the module should be ignored.

StatisticsTrendProduct stats [OUTPUT, OPTIONAL, default=no default value]

Statistics output

Product plotsOut [OUTPUT, OPTIONAL, default=no default value]

Statistics plot output

See also

History

  • 2013-01-10 - Melchior: Initial Java version, translated from getRms.py
  • 2013-07-15 - Melchior: Various updates