metEAUdata Documentation

A lightweight package for tracking metadata about time series to create repeatable data pipelines.

metEAUdata is a Python library designed for comprehensive management and processing of time series data, particularly focusing on environmental data analytics. It provides tools for detailed metadata handling, data transformations, and serialization of processing steps to ensure reproducibility and clarity in data manipulation workflows.

Key Features

  • 📊 Comprehensive Metadata Management - Track the complete lineage of your time series data
  • 🔄 Reproducible Processing Pipelines - Every transformation is documented and repeatable
  • 🧪 Environmental Data Focus - Built specifically for research and monitoring applications
  • 📈 Built-in Visualization - Generate interactive plots and dependency graphs
  • 💾 Serialization and Export Support - Save and load complete datasets with full metadata
  • 🔗 Processing Step Tracking - Maintain detailed records of all data transformations

Quick Start

import pandas as pd
from meteaudata.types import DataProvenance, Signal

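# Create some sample data with a datetime index (time-based resampling needs one).
# The start date and 5-minute spacing are arbitrary illustrative values.
index = pd.date_range(start="2024-01-01", periods=3, freq="5min")
data = pd.Series([20.1, 21.2, 22.3], index=index, name="RAW")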

# Define data provenance
provenance = DataProvenance(
    source_repository="my_project",
    location="Laboratory A",
    equipment="Temperature Sensor #1",
    parameter="Air Temperature",
    purpose="Environmental monitoring"
)

# Create a signal
signal = Signal(
    input_data=data,
    name="temperature",
    units="°C",
    provenance=provenance
)

# For tracking purposes, signals and time series are numbered automatically.
# The signal 'temperature' becomes 'temperature#1'.
# Each time series name is prefixed with the signal name and suffixed with a number,
# so 'RAW' becomes 'temperature#1_RAW#1'.

# Process the data
from meteaudata.processing_steps.univariate.resample import resample
signal.process(["temperature#1_RAW#1"], resample, "1min", output_signal_names=["RESAMPLED"])

# Visualize
fig = signal.plot(["temperature#1_RAW#1", "temperature#1_RESAMPLED#1"])
fig.show()

Core Concepts

🏗️ Hierarchical Data Structure

metEAUdata organizes your data in a three-level hierarchy:

  • Dataset - A collection of related signals for a project
  • Signal - A collection of time series from the same measurement source
  • TimeSeries - Individual time series with complete processing history
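
The hierarchy, together with the naming convention from the Quick Start, can be pictured with plain dataclasses. This is a minimal, library-independent sketch: `TimeSeriesSketch`, `SignalSketch`, `DatasetSketch`, and `add_series` are hypothetical names chosen for illustration, not metEAUdata's own classes or methods, and values are plain lists rather than pandas objects.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeriesSketch:
    """Hypothetical stand-in for one time series and its history."""
    name: str
    values: list[float]
    processing_steps: list[str] = field(default_factory=list)


@dataclass
class SignalSketch:
    """Hypothetical stand-in for a signal: several series from one source."""
    name: str
    units: str
    time_series: dict[str, TimeSeriesSketch] = field(default_factory=dict)

    def add_series(self, short_name: str, values: list[float]) -> str:
        # Mimic the '<signal>_<series>#<n>' naming shown in the Quick Start.
        prefix = f"{self.name}_{short_name}#"
        n = 1 + sum(1 for key in self.time_series if key.startswith(prefix))
        full_name = f"{prefix}{n}"
        self.time_series[full_name] = TimeSeriesSketch(full_name, list(values))
        return full_name


@dataclass
class DatasetSketch:
    """Hypothetical stand-in for a dataset: related signals for a project."""
    name: str
    signals: dict[str, SignalSketch] = field(default_factory=dict)


signal = SignalSketch(name="temperature#1", units="°C")
raw_name = signal.add_series("RAW", [20.1, 21.2, 22.3])
dataset = DatasetSketch(name="demo", signals={signal.name: signal})
print(raw_name)  # temperature#1_RAW#1
```

Keying each level by its full name is one way to keep lookups such as `dataset.signals["temperature#1"].time_series["temperature#1_RAW#1"]` unambiguous; the real library may organize this differently.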

📋 Metadata-First Approach

Every piece of data includes comprehensive metadata:

  • Data Provenance - Where did this data come from?
  • Processing Steps - What transformations were applied?
  • Function Information - Which functions were used and when?
  • Parameters - What settings were used for each processing step?
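
As a rough illustration of what it means for metadata to travel with the data, the sketch below bundles provenance fields with a list of values and round-trips the bundle through JSON. The field names mirror the `DataProvenance` arguments used in the Quick Start, but `ProvenanceSketch`, `to_json`, and `from_json` are hypothetical helpers written for this example, not part of metEAUdata, and the JSON layout is an assumption rather than the library's actual file format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ProvenanceSketch:
    """Hypothetical record answering 'where did this data come from?'."""
    source_repository: str
    location: str
    equipment: str
    parameter: str
    purpose: str


def to_json(values: list[float], provenance: ProvenanceSketch) -> str:
    # Store the values and their provenance in one document so they stay together.
    return json.dumps({"values": values, "provenance": asdict(provenance)})


def from_json(text: str) -> tuple[list[float], ProvenanceSketch]:
    # Rebuild both the values and the provenance record from the stored document.
    payload = json.loads(text)
    return payload["values"], ProvenanceSketch(**payload["provenance"])


provenance = ProvenanceSketch(
    source_repository="my_project",
    location="Laboratory A",
    equipment="Temperature Sensor #1",
    parameter="Air Temperature",
    purpose="Environmental monitoring",
)
text = to_json([20.1, 21.2, 22.3], provenance)
values, restored = from_json(text)
```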

🔄 Processing Pipeline Tracking

Build processing pipelines while automatically tracking:

  • Input/output relationships between time series
  • Function versions and parameters
  • Processing timestamps
  • Step-by-step transformation history
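
The kind of bookkeeping listed above can be sketched in a few lines of plain Python. This is only an illustration of the idea, under the assumption that each step is an ordinary function applied to one named series: `StepRecord`, `tracked`, and `scale` are hypothetical names, function version tracking is left out for brevity, and none of this reflects how metEAUdata implements it internally.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class StepRecord:
    """Hypothetical record of one transformation."""
    function_name: str
    parameters: dict[str, Any]
    input_names: list[str]
    output_name: str
    executed_at: datetime


def tracked(
    series: dict[str, list[float]],
    history: list[StepRecord],
    input_name: str,
    output_name: str,
    func: Callable[..., list[float]],
    **parameters: Any,
) -> None:
    """Apply func to one series, store the result, and log how it was made."""
    series[output_name] = func(series[input_name], **parameters)
    history.append(
        StepRecord(
            function_name=func.__name__,
            parameters=dict(parameters),
            input_names=[input_name],
            output_name=output_name,
            executed_at=datetime.now(timezone.utc),
        )
    )


def scale(values: list[float], factor: float) -> list[float]:
    # Toy transformation used only to exercise the tracking logic.
    return [v * factor for v in values]


series = {"RAW": [1.0, 2.0, 3.0]}
history: list[StepRecord] = []
tracked(series, history, "RAW", "SCALED", scale, factor=2.0)
print(series["SCALED"])  # [2.0, 4.0, 6.0]
print(history[0].function_name, history[0].parameters)  # scale {'factor': 2.0}
```

Because every record names its inputs and output, the history list is enough to reconstruct which series was derived from which, and with what settings.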

Documentation Sections

🚀 Getting Started

Installation instructions, quick start guide, and basic concepts.

📖 User Guide

Comprehensive guides for working with signals, datasets, and processing pipelines.

📚 Metadata Dictionary

Official definitions for all metadata attributes and data structures.

🔧 API Reference

Complete API documentation generated from source code.

💡 Examples

Real-world examples and use cases.

🛠️ Development

Contributing guidelines and architecture documentation.

Why metEAUdata?

Traditional data analysis often loses track of data lineage and processing steps. metEAUdata solves this by:

  • Preserving Context - Never lose track of where your data came from
  • Allowing Experimentation - Try different combinations of algorithms and parameters and compare the outcomes visually
  • Ensuring Reproducibility - Recreate any analysis with full parameter history
  • Facilitating Collaboration - Share datasets with complete documentation
  • Supporting Quality Assurance - Trace errors back to their source

Installation

pip install meteaudata

Community

License

This project is licensed under the CC-BY-4.0 License - see the LICENSE file for details.