@yuxin370

This PR adds five widely used real-world time series datasets from diverse domains, including IoT, smart grids, traffic systems, and infrastructure monitoring. They are well suited to benchmarking time series models for forecasting, anomaly detection, resource allocation, and streaming analytics.

Dataset Summary

| Name | # Attributes | Original Length | Domain | Citation |
| --- | ---: | ---: | --- | --- |
| Weather Forecast (WF) | 6 | 910,576 | Weather prediction | [1] |
| AMPds | 11 | 10,490,860 | Household power | [2] |
| Smart Grid (SG) | 5 | 100,000,000 | Grid load | [3] |
| Linear Road (LR) | 6 | 108,437,193 | Traffic simulation | [4] |
| Computer Monitor (CM) | 4 | 144,370,688 | System monitoring | [5] |

Note: Due to repository file size constraints, each dataset has been truncated to 65,535 rows to ensure fast loading and maintainability. Full datasets are available via the original sources below.
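For anyone who downloads a full dataset and wants a file of the same size as the ones in this PR, the truncation described in the note can be sketched roughly as below. This is only an illustration and not the script used to prepare the files: the function name `truncate_csv`, the CSV format, and the assumption that the 65,535-row limit counts data rows and excludes a header line are all hypothetical.

```python
import csv
import io
from itertools import islice

MAX_ROWS = 65_535  # row limit stated in the note above


def truncate_csv(src, dst, max_rows=MAX_ROWS, has_header=True):
    """Copy at most `max_rows` data rows from `src` to `dst`.

    `src` and `dst` are text-mode file objects. Returns the number of
    data rows written. Assumption: the header (if any) is copied as-is
    and does not count toward the limit.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    if has_header:
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)
    written = 0
    # islice stops reading after max_rows, so the full file is never loaded.
    for row in islice(reader, max_rows):
        writer.writerow(row)
        written += 1
    return written


# Small in-memory demonstration with a hypothetical two-column dataset.
src = io.StringIO("ts,value\n" + "".join(f"{i},{i * 2}\n" for i in range(10)))
dst = io.StringIO()
n = truncate_csv(src, dst, max_rows=3)
```

With real files, the same function could be called on handles opened with `open(path, newline="")`, which is the mode the `csv` module documentation recommends.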

References

[1] Yu Zheng, Xiuwen Yi, Ming Li, Ruiyuan Li, Zhang Shan, Eric Chang, and Tianrui Li. 2015. Forecasting Fine-Grained Air Quality Based on Big Data. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15).
[2] Stephen Makonin, Bradley Ellert, Ivan V. Bajić, and Fred Popowich. 2016. Electricity, Water, and Natural Gas Consumption of a Residential House in Canada from 2012 to 2014. Scientific Data 3, Article 160037.
[3] Zbigniew Jerzak and Holger Ziekow. 2014. The DEBS 2014 Grand Challenge. In Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems (DEBS).
[4] Arvind Arasu, Mitch Cherniack, Eduardo Galvez, David Maier, Anurag S. Maskey, Esther Ryvkina, Michael Stonebraker, and Richard Tibbetts. 2004. Linear Road: A Stream Data Management Benchmark. In Proceedings of the 30th VLDB Conference, pp. 480–491.
[5] Google. 2011. More Google Cluster Data. Google AI Blog. https://ai.googleblog.com/2011/11/more-google-cluster-data.html

@azimafroozeh
Collaborator

CI is failing on this PR. Could you take a look?

@yuxin370
Author

Hi! On macOS-15 (arm64), the job fails while installing pyarrow, so this looks like an environment/dependency issue.
