Flags#

class Flags(raw_data=None, copy=False)[source]#

Bases: object

SaQC’s flags container.

This container class holds the quality flags associated with the data. It holds key-value pairs, where the key is the name of the column and the value is a pandas.Series of flags. The index of each series and the key-value pairing can be assumed to be immutable: once a series exists, only its values can be changed. In other words, an existing column cannot be overwritten by a column with a different index.

The flags can be accessed via __getitem__ and __setitem__, in real life known as the []-operator.

For the curious:

Under the hood, the series are stored in a history, which allows the advanced user to retrieve all flags that were ever set in this object, but in most cases this is irrelevant. For simplicity, one can safely assume that this class just stores the flag series one sets.

See also

initFlagsLike

Create a Flags instance with the same dimensions as a reference object.

History

Class that actually stores the flags.

Examples

We create an empty instance by calling Flags without any arguments, and then add a column to it.

>>> import numpy as np
>>> import pandas as pd
>>> from saqc import UNFLAGGED, BAD, DOUBTFUL, Flags
>>> flags = Flags()
>>> flags
Empty Flags
>>> flags['v0'] = pd.Series([BAD,BAD,UNFLAGGED], dtype=float)
>>> flags 
      v0 |
======== |
0  255.0 |
1  255.0 |
2   -inf |

Once the column exists, we cannot overwrite it with a series that has a different index.

>>> flags['v0'] = pd.Series([666.], dtype=float) 
Traceback (most recent call last):
  ...
ValueError: Index does not match
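
The index rule can be pictured with a small stand-alone model. This is only a sketch in plain Python, not SaQC's implementation: ToyFlags is a hypothetical name, and a column is represented here as a dict mapping index labels to flag values instead of a pandas.Series. Only the index check is modelled.

```python
class ToyFlags:
    """Hypothetical stand-in for a flags container (not SaQC code)."""

    def __init__(self):
        self._columns = {}  # column name -> {index label: flag value}

    def __setitem__(self, name, values):
        existing = self._columns.get(name)
        # A new column is stored as-is; an existing one must keep its index.
        if existing is not None and list(existing) != list(values):
            raise ValueError("Index does not match")
        self._columns[name] = dict(values)

    def __getitem__(self, name):
        return dict(self._columns[name])


toy = ToyFlags()
toy["v0"] = {0: 255.0, 1: 255.0, 2: float("-inf")}

try:
    toy["v0"] = {0: 666.0}  # index [0] differs from [0, 1, 2]
    rejected = False
except ValueError:
    rejected = True
```

In this model the rejected assignment leaves the stored column unchanged, which mirrors the behaviour shown in the traceback above.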

But if we pass a series whose index matches, it works, because the series is now interpreted as values to set.

>>> flags['v0'] = pd.Series([DOUBTFUL,np.nan,DOUBTFUL], dtype=float)
>>> flags 
      v0 |
======== |
0   25.0 |
1  255.0 |
2   25.0 |

As we see above, the column now holds a combination of the values from the first and the second set. This is because numpy.nan was used: rows set to NaN keep their previous flag. We can inspect all the updates that were made by looking at the history.

>>> flags['v0'] = pd.Series([DOUBTFUL, np.nan, DOUBTFUL], dtype=float)
>>> flags.history['v0'] 
      0     1     2
0  255.0  25.0  25.0
1  255.0   nan   nan
2   -inf  25.0  25.0

As we see now, the later sets write 25.0 and shadow 255.0 in the first row and -inf in the last row, but in the second row 255.0 is still valid, because it was not touched by the set.
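
One way to picture how the visible flags follow from the history: each set appends a column, and in every row the right-most entry that is not NaN is the one in effect. The helper below is a minimal sketch in plain Python under that assumption. collapse_history is a hypothetical name, not part of SaQC, and nested lists stand in for the history frame printed above.

```python
import math


def collapse_history(rows):
    """Return, per row, the last entry that is not NaN (None if all are NaN)."""
    current = []
    for row in rows:
        valid = [flag for flag in row if not math.isnan(flag)]
        current.append(valid[-1] if valid else None)
    return current


nan = math.nan
# Rows of the history printed above; columns are the successive sets.
history_rows = [
    [255.0, 25.0, 25.0],
    [255.0, nan, nan],
    [float("-inf"), 25.0, 25.0],
]
flags_now = collapse_history(history_rows)
print(flags_now)  # [25.0, 255.0, 25.0]
```

The result matches the flags column shown earlier: 25.0, 255.0, 25.0.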

It is also possible to set values by a mask, which can be interpreted as conditional setting. Imagine we want to reset all flags to 0.0 wherever the existing flag is lower than 255.0.

>>> mask = flags['v0'] < BAD
>>> mask 
0     True
1    False
2     True
dtype: bool
>>> flags[mask, 'v0'] = 0
>>> flags 
      v0 |
======== |
0    0.0 |
1  255.0 |
2    0.0 |
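
The masked assignment can be read as building a new set in which the unselected rows are NaN, so that they leave the existing flags untouched. The following is a minimal sketch in plain Python, assuming exactly that reading. masked_update is a hypothetical helper, not a SaQC function, and lists stand in for the flag series and the boolean mask.

```python
import math


def masked_update(mask, value):
    """Build the values-to-set: `value` where mask is True, NaN elsewhere."""
    return [value if selected else math.nan for selected in mask]


BAD = 255.0  # value of BAD as printed in the examples above
current = [25.0, 255.0, 25.0]

mask = [flag < BAD for flag in current]  # [True, False, True]
update = masked_update(mask, 0.0)        # [0.0, nan, 0.0]

# Apply the update: NaN keeps the old flag, anything else replaces it.
result = [old if math.isnan(new) else new for old, new in zip(current, update)]
print(result)  # [0.0, 255.0, 0.0]
```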

The objects you can pass as a row selector (flags[rows, column]) are:

  • boolean array-like, with or without index. Must have the same length as the underlying series.

  • slices working on the index

  • pd.Index, which must be a subset of the existing index

For example, to set all values to a scalar value, use a Null-slice:

>>> flags[:, 'v0'] = 99.0
>>> flags 
     v0 |
======= |
0  99.0 |
1  99.0 |
2  99.0 |

After all calls presented here, the history looks like this:

>>> flags.history['v0']
      0     1     2    3     4
0  255.0  25.0  25.0  0.0  99.0
1  255.0   nan   nan  nan  99.0
2   -inf  25.0  25.0  0.0  99.0

Attributes Summary

columns

Column index of the flags container

empty

True if flags has no columns.

history

Accessor for the flags history.

Methods Summary

copy([deep])

Copy the flags container.

drop(key)

Delete a flags column.

keys()

Return type:

KeysView

toFrame()

Transform the flags container to a pd.DataFrame.

Attributes Documentation

columns#

Column index of the flags container

Returns:

columns – The columns index

Return type:

pd.Index

empty#

True if flags has no columns.

Returns:

True if the container has no columns, otherwise False.

Return type:

bool

history#

Accessor for the flags history.

Access via flags.history['var']. To set a new history use flags.history['var'] = value. The passed value must be an instance of History or must be convertible to a history.

Returns:

history – Accessor for the flags history

Return type:

History

See also

saqc.core.History

History storage class.

Methods Documentation

copy(deep=True)[source]#

Copy the flags container.

Parameters:

deep (bool, default True) – If False, a new reference to the Flags container is returned, otherwise the underlying data is also copied.

Return type:

copy of flags

drop(key)[source]#

Delete a flags column.

Parameters:

key (str) – column name

Return type:

flags object with dropped column, not a copy

keys()[source]#

Return type:

KeysView

toFrame()[source]#

Transform the flags container to a pd.DataFrame.

Return type:

pd.DataFrame