1.31. doAvg

Full Name: herschel.hifi.pipeline.generic.DoAverageTask
Alias: doAvg
Type: Java Task
Import: from herschel.hifi.pipeline.generic import DoAverageTask

HIFI/Pipeline/Level 2 Pipeline


Averages the datasets within each group, where a group is defined by dataset type ("sds_type") and frequency group.

For this module to run properly, the CheckFreqGrid task should be run beforehand so that the frequency groups are available (they are added as meta data to the datasets). If these groups are not available, all the datasets of a given type are averaged. Most of the work is delegated to the task herschel.ia.toolbox.spectrum.AverageSpectrum. In particular:

  • the different processing modes that define how flags and weights should be included in the processing can be set by the "variant" parameter;

  • the other attribute data (other data columns in the table datasets) are processed according to the rules defined for HIFI in the spectrum toolbox.

Note that this task requires the output to be assigned to a variable (in the form htp=doAvg(...)). The type of the output depends on the value set for return_single_ds.


Example 1: In HIPE:
htp = doAvg(htp=htp, params=params) # the standard pipeline mode
htp = doAvg(htp=htp, return_single_ds=False) # the standard pipeline mode
# the following returns a dataset with just science data. Note that the grouping
# treats the ON and OFF scans in different groups so these are not mixed.
doAvg(htp=htp, selection_meta=["science", "hc"], return_single_ds=True)
doAvg(htp=htp, selection_meta=["scienceOff"], return_single_ds=True)
doAvg(htp=htp, selection_meta={"frequencyGroup":["1","2"]}, return_single_ds=True)
doAvg(htp=htp, selection={"bbtype":[6005,6031]}, return_single_ds=True)
ds = doAvg.result
# The following averages over all the (selected) scans in the htp - irrespective of a grouping
# The result typically is a dataset with a single row.
doAvg(htp=htp, selection={"bbtype":[6005,6031]}, return_single_ds=True, preserveGroups=False)
ds = doAvg.result

API details


HifiTimelineProduct htp [INPUT, MANDATORY, default=no default value]

The timeline product (observation) to be passed to the module.

Object result [OUTPUT, MANDATORY, default=no default value]

The result of the average operation.

PipelineConfiguration params [INPUT, OPTIONAL, default=no default value]

Pipeline configuration parameters that can be passed to the task.

Boolean ignore [INPUT, OPTIONAL, default=no default value]

Flag to indicate whether the execution of the module should be ignored.

Object selection_meta [INPUT, OPTIONAL, default=no default value]

Selects datasets from the timeline product by looking only at specific meta data values. There are two ways to specify the selection:

  • As a py dictionary whose keys are meta data keys and whose values are the admissible values for that key. The admissible values are specified as lists.

  • As a list of admissible 'sds_type's (the type appearing in the summary table). Internally, this is translated into a py dictionary with 'sds_type' as the key. In order to distinguish ON and OFF science datasets, we (artificially) allow the values "scienceOn" and "scienceOff".
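The translation of the list form into the dictionary form can be pictured with a minimal sketch. The helper normalize_selection_meta is hypothetical and not part of the pipeline; it only illustrates the rule stated above and does not model the special handling of "scienceOn" and "scienceOff".

```python
# Hypothetical helper for illustration only; not part of the HIFI pipeline API.
def normalize_selection_meta(selection_meta):
    """Return the selection as a dictionary: meta data key -> list of admissible values."""
    if isinstance(selection_meta, dict):
        # Already in dictionary form: copy the lists of admissible values.
        return {key: list(values) for key, values in selection_meta.items()}
    # A plain list is read as a list of admissible 'sds_type' values.
    return {"sds_type": list(selection_meta)}


print(normalize_selection_meta(["science", "hc"]))
print(normalize_selection_meta({"frequencyGroup": ["1", "2"]}))
```

With this reading, selection_meta=["science", "hc"] and selection_meta={"sds_type":["science", "hc"]} would select the same datasets.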

Object selection [INPUT, OPTIONAL, default=no default value]

Selects a subset of all scans by matching values found in specific columns of the dataset. The functionality of the AverageSpectrum task ('avg') in the spectrum toolbox (ia.toolbox.spectrum) is reused here. See there for further details.

Boolean preserveGroups [INPUT, OPTIONAL, default=no default value]

Specifies that the groups should be preserved, in the sense that datasets associated with different groups are not mixed. Note that the grouping is a concept defined at the level of the meta data of the datasets. This grouping is also used when cleaning up data (see doCleanUp for further information). By default, when this flag is not set, the groups are preserved. If some of the meta data needed to specify the grouping is missing, the average is performed on a per-dataset basis.
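The effect of preserving groups can be sketched in plain Python. Everything here is an assumption made for illustration: datasets are reduced to a meta data dictionary plus a single flux value, the group key is taken to be the pair ("sds_type", "frequencyGroup"), and the fallback for missing meta data is not modelled.

```python
from collections import defaultdict

# Hypothetical stand-ins for datasets: (meta data dictionary, flux value).
datasets = [
    ({"sds_type": "science", "frequencyGroup": "1"}, 2.0),
    ({"sds_type": "science", "frequencyGroup": "1"}, 4.0),
    ({"sds_type": "science", "frequencyGroup": "2"}, 10.0),
    ({"sds_type": "hc", "frequencyGroup": "1"}, 7.0),
]


def average_by_group(datasets, preserve_groups=True):
    """Average the flux per group, or over everything if groups are not preserved."""
    groups = defaultdict(list)
    for meta, flux in datasets:
        # Assumed group key: dataset type plus frequency group.
        key = (meta["sds_type"], meta["frequencyGroup"]) if preserve_groups else "all"
        groups[key].append(flux)
    return {key: sum(values) / len(values) for key, values in groups.items()}


grouped = average_by_group(datasets)                           # one average per group
ungrouped = average_by_group(datasets, preserve_groups=False)  # a single average
print(grouped)
print(ungrouped)
```

With the groups preserved there is one result per group; without them all selected data collapse into a single average, which matches the "single row" result described in the example section.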

String variant [INPUT, OPTIONAL, default="flux-flag-weight".]

Specifies the processing variant to run, i.e. how flags and weights are included in the average. Possible values are "flux" / "flux-wave" / "flux-weight-wave" / "flux-flag-wave" / "flux-weight-flag-wave":

  • "flux": average flux, wave set to wave of the first input spectrum, add weights, propagate flags with OR.

  • "flux-wave": average flux, average wave, add weights, propagate flags with OR.

  • "flux-weight-wave": weighted average flux, weighted average wave, add weights, propagate flags with OR.

  • "flux-flag-wave": filter for non-flagged and average flux, filter for non-flagged and average wave, filter for non-flagged and add weights, set flag to 0; in case there are no unflagged values, straight average of flux and weights and add all weights but set the flag to a value propagated with OR.

  • "flux-weight-flag-wave": the same as flux-flag-wave but using weighted averages.
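As an illustration of the "flux-flag-wave" rule, the following minimal sketch averages a single channel across several spectra. It is not the AverageSpectrum implementation: the function name and argument layout are hypothetical, the wave column is left out, and the output flag follows the rule as worded in the list above (0 when unflagged samples exist, OR of all flags otherwise).

```python
# Hypothetical single-channel sketch of the "flux-flag-wave" rule; not the toolbox code.
def average_channel_flux_flag(fluxes, weights, flags, mask):
    """Return (flux, weight, flag) for one channel averaged over several spectra."""
    # Samples count as non-flagged when none of the masked bits is set.
    good = [i for i, flag in enumerate(flags) if (flag & mask) == 0]
    if good:
        flux = sum(fluxes[i] for i in good) / len(good)  # average of non-flagged flux
        weight = sum(weights[i] for i in good)           # add non-flagged weights
        return flux, weight, 0                           # flag is set to 0
    # No unflagged values: straight average, add all weights, propagate flags with OR.
    combined_flag = 0
    for flag in flags:
        combined_flag |= flag
    return sum(fluxes) / len(fluxes), sum(weights), combined_flag


print(average_channel_flux_flag([1.0, 3.0, 100.0], [1.0, 1.0, 1.0], [0, 0, 1], 1))
print(average_channel_flux_flag([2.0, 4.0], [1.0, 2.0], [1, 4], 5))
```

In the first call the flagged outlier (100.0) is left out of the average; in the second call every sample is flagged, so nothing is dropped and the flags are combined.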

Integer channelFlagMask [INPUT, OPTIONAL, default=2^30+1]

Mask to identify the channels that should be omitted from the average. Applies only if a 'variant' containing the substring 'flag' has been specified. Flagged channels that are not masked by this integer are propagated as informative flags. For example, to ignore channels that have any of the flags 1, 4 or 16 set, specify channelFlagMask=21 (1+4+16). By default, the channelFlagMask is set to 2^30+1, which corresponds to ignoring channels with the HifiMask.IGNORE_DATA or HifiMask.BAD_PIXEL bit set.
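The mask arithmetic can be checked with a few lines of plain Python. The helper names is_omitted and informative_bits are hypothetical and only illustrate the bitwise test; no assumption is made here about which of the two default bits belongs to which HifiMask constant.

```python
# Build a mask from the individual flag values 1, 4 and 16.
mask = 1 | 4 | 16
print(mask)  # 21

# The default mask 2^30+1 has bit 30 and bit 0 set.
default_mask = 2**30 + 1


# Hypothetical helpers for illustration only.
def is_omitted(channel_flag, mask):
    """A channel is omitted if it has at least one of the masked bits set."""
    return (channel_flag & mask) != 0


def informative_bits(channel_flag, mask):
    """Bits that are set on the channel but not covered by the mask."""
    return channel_flag & ~mask


print(is_omitted(6, mask))        # flag 6 = 2 + 4, and 4 is masked
print(is_omitted(2, mask))        # flag 2 is not masked
print(informative_bits(6, mask))  # only the unmasked bit 2 remains
```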

Integer rowFlagMask [INPUT, OPTIONAL, default=RowMask.IGNORE_DATA.getValue() | RowMask.BAD_DATA.getValue()]

Mask to identify the rows that should be omitted from the average. Flagged rows that are not masked by this integer are propagated as informative flags. For example, to ignore rows that have any of the flags 1, 4 or 16 set, specify rowFlagMask=21. By default, the rowFlagMask is set to discard rows with the flags RowMask.IGNORE_DATA or RowMask.BAD_DATA set. If all spectra in a given group (see the preserveGroups parameter) are flagged (i.e. have the configured bits set), no data is discarded for this group, but the flags are propagated to the resulting average. If data is discarded because of the row mask, a badDataDiscarded meta data item is added to the htp; if flagged data finds its way into the resulting average, a badDataInResult meta data item is added. To disregard rows with any flag set, specify rowFlagMask=0. To include all rows irrespective of their row flags, specify a negative rowFlagMask.
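The row selection within one group can be sketched as follows. This is an illustration under stated assumptions, not the task's code: the function select_rows is hypothetical, it handles only a positive rowFlagMask, and the special values 0 and negative described above are not modelled.

```python
# Hypothetical sketch of the row selection for one group (positive masks only).
def select_rows(row_flags, row_flag_mask):
    """Return (kept_indices, bad_data_discarded, bad_data_in_result)."""
    flagged = [(flag & row_flag_mask) != 0 for flag in row_flags]
    if row_flags and all(flagged):
        # Every spectrum in the group is flagged: nothing is discarded,
        # so flagged data ends up in the resulting average.
        return list(range(len(row_flags))), False, True
    kept = [i for i, is_flagged in enumerate(flagged) if not is_flagged]
    # Data was discarded if at least one row matched the mask.
    return kept, any(flagged), False


print(select_rows([0, 1, 4, 2], 21))  # rows 1 and 2 are discarded
print(select_rows([1, 16], 21))       # all rows flagged, all rows kept
```

The two booleans correspond to the situations in which the badDataDiscarded and badDataInResult meta data items would be added.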

Boolean return_single_ds [INPUT, OPTIONAL, default=True]

Return a single dataset with the average or, in case there are several groups to handle in the timeline product, the averages. Each row in the result corresponds to one of the groups. You can identify the groups from the bbtype, LO Frequency, rasterLineNum/rasterColumnNum, or scanLineNum. Note that, typically, comb datasets cannot be included in the average in this mode. Be aware that if you set this option to False, the original timeline product will be overwritten. False is the setting used by the pipeline, since it needs a timeline product back.

Double loThrow_tolerance [INPUT, OPTIONAL, default=no default]

The groups to be averaged are checked for consistent LO throw values. The values are considered consistent if the changes in LO throw within the group are smaller than a tolerance. The tolerance is computed by multiplying the value specified here with the average LO frequency. The value is given in units of km/s.
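One possible reading of this check is sketched below. Several points are assumptions that the text does not state: the velocity is turned into a dimensionless fraction by dividing by the speed of light before it is multiplied with the average LO frequency, the "change" in LO throw is taken as the spread between the largest and smallest value in the group, and the function name and the GHz units are hypothetical.

```python
SPEED_OF_LIGHT_KM_S = 299792.458


# Hypothetical sketch; the division by the speed of light is an assumption.
def lo_throw_is_consistent(lo_throws, lo_frequencies, tolerance_km_s):
    """Check that the LO throw spread in a group stays below the derived tolerance."""
    mean_lo = sum(lo_frequencies) / len(lo_frequencies)
    # Velocity tolerance converted to a frequency tolerance (same units as mean_lo).
    tolerance = tolerance_km_s / SPEED_OF_LIGHT_KM_S * mean_lo
    spread = max(lo_throws) - min(lo_throws)
    return spread < tolerance


# Example values in GHz, chosen for illustration only.
print(lo_throw_is_consistent([0.040, 0.041], [500.0, 500.0], 3.0))
print(lo_throw_is_consistent([0.040, 0.050], [500.0, 500.0], 3.0))
```

With a 3 km/s tolerance at 500 GHz the derived frequency tolerance is about 0.005 GHz, so a spread of 0.001 GHz passes and a spread of 0.010 GHz does not.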

See also


  • 2011-07-17 - Melchior: History added
  • 2011-08-14 - Melchior: Renamed to DoAverageTask