SEBASENet is the Spanish Network of Excellence in Search Based Software Engineering (SBSE), which brings together the Spanish research groups working in SBSE. SBSE arises from the synergies between software engineering and other areas, such as optimization and search. Its main objective is to apply these techniques to complex software engineering problems, offering the engineer better solutions to existing problems while reducing the effort and cost required.
In this GitHub organization you will find links and repositories for source code and datasets created by the members of SEBASENet. The official SEBASENet web page is available here.
- EXEMPLAR: a platform for publishing and tracking experimental materials. It also integrates tools and generates descriptions of experiments at the level of detail needed for other researchers to replicate them.
- GAmera: an open-source WS-BPEL mutation testing framework that uses genetic algorithms to reduce the number of mutants required.
- Rodan: a test case generator for WS-BPEL 2.0 compositions, based on mutation testing and genetic algorithms. Rodan generates test cases that kill mutants of the original WS-BPEL composition; the MuBPEL tool generates those mutants.
- STATService: a web tool that helps users apply statistical hypothesis testing easily, in accordance with the scientific methodology, the data distribution and the type of data to be analysed.
- The old repository of the PROMISE datasets contains many well-known datasets used in academic articles: http://promise.site.uottawa.ca/SERepository/datasets-page.html
There are many articles that use those classic datasets in different formats, e.g. https://data.mendeley.com/datasets/923xvkk5mm/1; https://www.openml.org/d/1050; https://github.com/klainfo/DefectData/
- The “Software” section of “Awesome Public Datasets”; some of these datasets need curation before they are usable: https://github.com/awesomedata/awesome-public-datasets#software
- Awesome Empirical Software Engineering: many datasets related to software engineering https://github.com/dspinellis/awesome-msr
- The recently released book “Evidence-based Software Engineering” (Derek M. Jones, 2020) includes a collection of publicly available datasets on which its figures and plots are based: http://www.knosof.co.uk/ESEUR/index.html and https://github.com/Derek-Jones/ESEUR-code-data
- The PROMISE conference series is devoted to the analysis of data in the software engineering field: PROMISE 2020 https://promiseconf.github.io/2020/index.html and PROMISE 2019 http://promisedata.org/2019/index.html
- Mining Software Repositories Conferences https://www.msrconf.org/
- Other sources of datasets, such as Zenodo (https://zenodo.org/):
- Efstathiou Vasiliki, Chatzilenas Christos, & Spinellis Diomidis. (2018). Word Embeddings for the Software Engineering Domain. Proceedings of the 15th International Conference on Mining Software Repositories. ACM. Zenodo. http://doi.org/10.5281/zenodo.1199620
- Antoine Pietri, Diomidis Spinellis, & Stefano Zacchiroli. (2019). The Software Heritage Graph Dataset: Public software development under one roof. In proceedings of MSR 2019: The 16th International Conference on Mining Software Repositories, May 2019, Montreal, QC, Canada: Zenodo. http://doi.org/10.5281/zenodo.2583978
- Vahid Garousi, Dietmar Pfahl, Michael Felderer, Mika Mäntylä, & João M Fernandes. (2017). Dataset for survey of industry-academia collaboration in software engineering (phase 1). http://doi.org/10.5281/zenodo.842239
- Francesco Osborne. (2019). Dataset of Reducing the Effort for Systematic Reviews in Software Engineering [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2653925
- Halin, A., Nuttinck, A., Acher, M., Devroey, X., Perrouin, G. and Baudry, B. (2019). Test them all, is it worth it? Assessing configuration sampling on the JHipster Web development stack. Empirical Software Engineering 24, 2(Apr. 2019), pp. 674–717. http://doi.org/10.5281/zenodo.3766691
- Massimiliano Di Penta, Gabriele Bavota, & Fiorella Zampetti. (2020). On the Relationship between Refactoring Actions and Bugs: A Differentiated Replication -- Replication Package. Zenodo. http://doi.org/10.5281/zenodo.4018691
- Zagalsky, Alexey, German, Daniel, Storey, Margaret-Anne, Gomez-Teshima, Carlos, & Poo-Caamaño, Germán. (2017). Replication package for How the R Community Creates and Curates Knowledge: An Extended Study of Stack Overflow and Mailing Lists. Journal Empirical Software Engineering. Zenodo. http://doi.org/10.5281/zenodo.831805
- Sebastian Baltes, & Paul Ralph. (2020). Sampling in Software Engineering Research Supplementary Material. Zenodo. http://doi.org/10.5281/zenodo.3977049
- Carlos Bernal Cardenas, Nathan Cooper, Kevin Moran, Oscar Chaparro, Andrian Marcus, & Denys Poshyvanyk. (2020). The V2S Dataset: A Set of Android Screen Recordings, Training Images, and Models (Version 1.0). 42nd International Conference on Software Engineering (ICSE'20), Virtual/Seoul, South Korea: Zenodo. http://doi.org/10.5281/zenodo.3934403
- Yalin Liu. (2020). Traceability Solutions for Supporting Intermingled Bilingual Artifacts. MSR, Zenodo. http://doi.org/10.5281/zenodo.3713256
This listing of external datasets was compiled and contributed by Javier Dolado and Daniel Rodríguez.