metEAUdata Documentation
A lightweight package for tracking metadata about time series to create repeatable data pipelines.
metEAUdata is a Python library for managing and processing time series data, with a focus on environmental data analytics. It provides tools for metadata handling, data transformations, and serialization of processing steps, so that data manipulation workflows stay reproducible and easy to follow.
Key Features
- Comprehensive Metadata Management - Track the complete lineage of your time series data
- Reproducible Processing Pipelines - Every transformation is documented and repeatable
- Environmental Data Focus - Built specifically for research and monitoring applications
- Built-in Visualization - Generate interactive plots and dependency graphs
- Serialization and Export Support - Save and load complete datasets with full metadata
- Processing Step Tracking - Maintain detailed records of all data transformations
Quick Start
import pandas as pd
from meteaudata.types import DataProvenance, Signal

# Create some sample data.
# The timestamps are arbitrary; a DatetimeIndex is included because
# time-based resampling (used below) needs one.
data = pd.Series(
    [20.1, 21.2, 22.3],
    index=pd.date_range("2024-01-01", periods=3, freq="30s"),
    name="RAW",
)

# Define data provenance
provenance = DataProvenance(
    source_repository="my_project",
    location="Laboratory A",
    equipment="Temperature Sensor #1",
    parameter="Air Temperature",
    purpose="Environmental monitoring"
)

# Create a signal
signal = Signal(
    input_data=data,
    name="temperature",
    units="°C",
    provenance=provenance
)

# For tracking purposes, signals and time series get numbered automatically.
# 'temperature' becomes 'temperature#1'.
# The same is true for data series: names are prefixed with the signal name and suffixed with a number.
# 'RAW' becomes 'temperature#1_RAW#1'.

# Process the data
from meteaudata.processing_steps.univariate.resample import resample
signal.process(["temperature#1_RAW#1"], resample, "1min", output_signal_names=["RESAMPLED"])

# Visualize
fig = signal.plot(["temperature#1_RAW#1", "temperature#1_RESAMPLED#1"])
fig.show()
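The naming convention described in the Quick Start comments can be illustrated with a small standalone helper. This is only a sketch of the pattern: `qualified_series_name` is a hypothetical function written for this page, not part of the meteaudata API, and the library assigns the numbers itself.

```python
def qualified_series_name(signal_name: str, signal_number: int,
                          series_name: str, series_number: int) -> str:
    """Build a name such as 'temperature#1_RAW#1'.

    Hypothetical helper for illustration only; meteaudata numbers
    signals and time series automatically.
    """
    signal_part = f"{signal_name}#{signal_number}"
    return f"{signal_part}_{series_name}#{series_number}"


raw_name = qualified_series_name("temperature", 1, "RAW", 1)
resampled_name = qualified_series_name("temperature", 1, "RESAMPLED", 1)

print(raw_name)        # temperature#1_RAW#1
print(resampled_name)  # temperature#1_RESAMPLED#1
```

These are the same fully qualified names that the Quick Start passes to `signal.process` and `signal.plot`.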
Core Concepts
Hierarchical Data Structure
metEAUdata organizes your data in a three-level hierarchy:
- Dataset - A collection of related signals for a project
- Signal - A collection of time series from the same measurement source
- TimeSeries - Individual time series with complete processing history
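To make the three levels concrete, here is a minimal sketch of the nesting using plain dataclasses. The class names ending in `Sketch` and all of their fields are assumptions made for illustration; they are not the meteaudata classes, which carry much richer metadata.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeriesSketch:
    """Lowest level: one series of values plus its processing history."""
    name: str
    values: list[float]
    processing_steps: list[str] = field(default_factory=list)


@dataclass
class SignalSketch:
    """Middle level: time series that share one measurement source."""
    name: str
    units: str
    time_series: dict[str, TimeSeriesSketch] = field(default_factory=dict)


@dataclass
class DatasetSketch:
    """Top level: related signals grouped for a project."""
    name: str
    signals: dict[str, SignalSketch] = field(default_factory=dict)


raw = TimeSeriesSketch(name="temperature#1_RAW#1", values=[20.1, 21.2, 22.3])
signal = SignalSketch(name="temperature#1", units="°C",
                      time_series={raw.name: raw})
dataset = DatasetSketch(name="my_project", signals={signal.name: signal})

# Navigate from the dataset down to an individual time series.
series = dataset.signals["temperature#1"].time_series["temperature#1_RAW#1"]
print(series.values)
```

Keying each level by name mirrors how the Quick Start refers to time series by their fully qualified names.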
Metadata-First Approach
Every piece of data includes comprehensive metadata:
- Data Provenance - Where did this data come from?
- Processing Steps - What transformations were applied?
- Function Information - Which functions were used and when?
- Parameters - What settings were used for each processing step?
Processing Pipeline Tracking
Build processing pipelines while automatically tracking:
- Input/output relationships between time series
- Function versions and parameters
- Processing timestamps
- Step-by-step transformation history
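The idea behind this kind of tracking can be sketched in a few lines of standard-library Python. Everything below (`StepRecord`, `TrackedSeries`, `apply_step`, `scale`) is hypothetical and written only to illustrate the concept; it is not how meteaudata is implemented, and it leaves out function versions for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class StepRecord:
    """What was run, with which settings, on which inputs, and when."""
    function_name: str
    parameters: dict[str, Any]
    input_names: list[str]
    output_name: str
    executed_at: datetime


@dataclass
class TrackedSeries:
    """Values together with the steps that produced them."""
    name: str
    values: list[float]
    history: list[StepRecord] = field(default_factory=list)


def apply_step(series: TrackedSeries, func: Callable[..., list[float]],
               output_name: str, **parameters: Any) -> TrackedSeries:
    """Run func on the series and append a record of the call to the history."""
    new_values = func(series.values, **parameters)
    record = StepRecord(
        function_name=func.__name__,
        parameters=dict(parameters),
        input_names=[series.name],
        output_name=output_name,
        executed_at=datetime.now(timezone.utc),
    )
    # The input series is left untouched; the output carries the full lineage.
    return TrackedSeries(output_name, new_values, [*series.history, record])


def scale(values: list[float], factor: float) -> list[float]:
    """Toy transformation used only for this example."""
    return [v * factor for v in values]


raw = TrackedSeries("temperature#1_RAW#1", [20.0, 21.0, 22.0])
scaled = apply_step(raw, scale, "temperature#1_SCALED#1", factor=2.0)

for step in scaled.history:
    print(step.function_name, step.parameters, step.input_names, "->", step.output_name)
```

Because each output carries the records of every step that led to it, the transformation history can be read back, or replayed, from the data itself.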
Documentation Sections
Getting Started
Installation instructions, quick start guide, and basic concepts.
User Guide
Comprehensive guides for working with signals, datasets, and processing pipelines.
Metadata Dictionary
Official definitions for all metadata attributes and data structures.
API Reference
Complete API documentation generated from source code.
Examples
Real-world examples and use cases.
Development
Contributing guidelines and architecture documentation.
Why metEAUdata?
Traditional data analysis often loses track of data lineage and processing steps. metEAUdata solves this by:
- Preserving Context - Never lose track of where your data came from
- Allowing Experimentation - Try different combinations of algorithms and parameters, then compare the outcomes visually
- Ensuring Reproducibility - Recreate any analysis with full parameter history
- Facilitating Collaboration - Share datasets with complete documentation
- Supporting Quality Assurance - Trace errors back to their source
Installation
Community
- GitHub: modelEAU/meteaudata
- Issues: Report bugs or request features
- Discussions: Community discussions
License
This project is licensed under the CC-BY-4.0 License - see the LICENSE file for details.