TelecomTS: A Multi-Modal Observability Dataset for Time Series and Language Analysis
To address the underrepresentation of large-scale observability data in public benchmarks, a new dataset named TelecomTS is introduced. Derived from a 5G telecommunications network, TelecomTS features heterogeneous, de-anonymized covariates with explicit absolute scale information. It supports a range of downstream tasks, including anomaly detection and root cause analysis, as well as multi-modal reasoning. The dataset targets a shortcoming of existing observability datasets, which are often anonymized and normalized and therefore lose the scale information those tasks depend on.
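The point about normalization removing scale can be illustrated with a minimal sketch. The two series below are hypothetical throughput readings invented for illustration, not values from TelecomTS, and z-score normalization is used as one common normalization choice; the source does not say which scheme existing datasets apply.

```python
from statistics import mean, pstdev


def z_normalize(series):
    """Standardize a series to zero mean and unit (population) variance."""
    mu = mean(series)
    sigma = pstdev(series)
    return [(x - mu) / sigma for x in series]


# Hypothetical readings (e.g. Mbps) from two cells: same shape, 100x apart in scale.
low_load = [1.0, 2.0, 4.0, 2.0, 1.0]
high_load = [100.0, 200.0, 400.0, 200.0, 100.0]

z_low = z_normalize(low_load)
z_high = z_normalize(high_load)

# After normalization the two series are indistinguishable (up to float error),
# so a model can no longer tell a lightly loaded cell from a heavily loaded one.
same_after_norm = all(abs(a - b) < 1e-9 for a, b in zip(z_low, z_high))
print(same_after_norm)
```

Once both series collapse to the same normalized values, any threshold expressed in absolute units (for example, a capacity limit) can no longer be applied, which is the kind of information an unnormalized dataset retains.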
Key facts
- TelecomTS is a large-scale observability dataset derived from a 5G telecommunications network.
- It features heterogeneous, de-anonymized covariates with explicit absolute scale information.
- The dataset provides downstream tasks including anomaly detection and root cause analysis.
- Existing observability datasets are often anonymized and normalized, removing scale information.
- Observability data are zero-inflated, highly stochastic, and exhibit minimal temporal structure.
- The dataset is introduced to address the underrepresentation of observability data in public benchmarks.
- Proprietary restrictions and privacy concerns have limited the availability of such datasets.
- TelecomTS enables multi-modal reasoning tasks.
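The statistical properties listed above (zero inflation, weak temporal structure) can be checked on any metric with two simple statistics. The following is a minimal sketch on a made-up toy series; the helper names `zero_fraction` and `lag1_autocorr` are hypothetical and the values are not drawn from TelecomTS.

```python
def zero_fraction(series):
    """Share of samples that are exactly zero (a simple zero-inflation measure)."""
    return sum(1 for x in series if x == 0) / len(series)


def lag1_autocorr(series):
    """Lag-1 sample autocorrelation; values near 0 suggest little temporal structure."""
    n = len(series)
    mu = sum(series) / n
    var = sum((x - mu) ** 2 for x in series)
    if var == 0:
        return 0.0  # constant series: define autocorrelation as 0
    cov = sum((series[i] - mu) * (series[i + 1] - mu) for i in range(n - 1))
    return cov / var


# Toy counter-like metric: mostly zeros with occasional isolated bursts.
toy = [0, 0, 3, 0, 0, 0, 7, 0, 1, 0]

print(zero_fraction(toy))
print(lag1_autocorr(toy))
```

A high zero fraction combined with a lag-1 autocorrelation that is not strongly positive is consistent with a bursty, weakly structured signal, though a real analysis would look at more lags and more metrics.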