SaQC - System for automated Quality Control
SaQC is an open-source framework for automated, transparent, and reproducible quality control of time series data. It transforms raw time series data into trustworthy data products by making quality control an explicit step in FAIR-compliant workflows, enabling reliable use in applications such as monitoring, modelling, and decision-making.
Quality control logic in SaQC can be defined using its Python API or through structured, low-code configuration files. The low-code approach enables domain experts to define checks, compound flagging strategies, and processing steps with minimal programming effort, and to apply the same rules consistently to both historical archives and live data streams.
A distinctive feature of SaQC is its flexible quality annotation, which provides a complete, observation-level flag history to ensure end-to-end provenance, traceability, and auditability. Its anomaly detection capabilities range from classical validation methods to advanced techniques. Most components of SaQC, including quality annotation and QC functionality, are easily extensible through well-defined interfaces, enabling hybrid rule-based and machine learning workflows.
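The idea of an observation-level flag history, driven by declaratively defined checks, can be illustrated with a small conceptual sketch. It uses only the Python standard library and does not use or reproduce the SaQC API: the names `FlagHistory`, `flag_range`, `flag_missing`, `RULES`, and `run_qc`, the three flag levels, and the "most recent entry wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical flag levels for this sketch; a higher number means a worse flag.
UNFLAGGED, DOUBTFUL, BAD = 0, 1, 2


@dataclass
class FlagHistory:
    """One column per applied check, one entry per observation.

    An entry of None means the check did not flag that observation.
    """
    n_obs: int
    columns: list = field(default_factory=list)  # [(check_name, [flag or None, ...])]

    def append(self, check_name, flags):
        if len(flags) != self.n_obs:
            raise ValueError("flag column must cover every observation")
        self.columns.append((check_name, list(flags)))

    def current(self):
        """Collapse the history: the most recent non-None entry wins (an assumption)."""
        result = [UNFLAGGED] * self.n_obs
        for _, flags in self.columns:
            for i, flag in enumerate(flags):
                if flag is not None:
                    result[i] = flag
        return result

    def provenance(self, i):
        """List the checks that flagged observation i, in application order."""
        return [(name, flags[i]) for name, flags in self.columns if flags[i] is not None]


def flag_range(values, lo, hi, flag=BAD):
    """Flag present values that fall outside the closed interval [lo, hi]."""
    return [flag if (v is not None and not lo <= v <= hi) else None for v in values]


def flag_missing(values, flag=BAD):
    """Flag observations that carry no value."""
    return [flag if v is None else None for v in values]


# A declarative rule list, standing in for a low-code configuration file.
RULES = [
    {"check": "missing", "func": flag_missing, "kwargs": {}},
    {"check": "range", "func": flag_range, "kwargs": {"lo": 10.0, "hi": 60.0}},
]


def run_qc(values, rules):
    """Apply every rule in order and record each result as a history column."""
    history = FlagHistory(n_obs=len(values))
    for rule in rules:
        history.append(rule["check"], rule["func"](values, **rule["kwargs"]))
    return history


values = [12.5, None, 75.0, 30.1]
history = run_qc(values, RULES)
print(history.current())      # final flag per observation
print(history.provenance(2))  # which checks flagged the third observation
```

Because every check adds its own column instead of overwriting earlier results, the final flag of each observation can be traced back to the checks that produced it, and the same rule list can be re-applied unchanged to archived or newly arriving data.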
installation and setup
first steps
Python API introduction
command-line usage
overview of flagging methods
overview of processing algorithms
overview of tools
configuration-based quality control
global keywords
flags and flagging
customization
outlier detection
frequency alignment
drift detection
data modeling
custom and generic function usage
configure and run SaQC
integrate into larger workflows
publications
users and partners
SaQC turns quality control into an explicit, traceable, and version-controlled step in time series data workflows, enabling the production of AI-ready data and supporting reliable downstream use in research data portals, environmental models, and digital twins.
Beyond stand-alone use, SaQC is designed as a modular building block that can be integrated into various applications. It is, for example, an integral part of Neptoon, is integrated into time.IO - a time series data infrastructure developed at the UFZ - and is also available on Galaxy Europe for workflow-based, low-barrier execution within larger analysis pipelines.
SaQC is developed and maintained by the Research Data Management Team at the Helmholtz Centre for Environmental Research - UFZ. It reflects the requirements and experience gained from implementing and operating fully automated quality control pipelines for environmental sensor data.
The diversity of involved communities, along with the specific demands of scientific data acquisition and provisioning, has shaped SaQC into its current form: inherently consistent yet externally extensible, fully traceable, accessible to non-programmers, and applicable across a wide range of use cases, from exploratory, interactive programming environments to large-scale, fully automated workflows.

