Paper title: | A Survival Analysis-Based Prioritization of Code Checker Warning: A Case Study using PMD |
Journal: | Springer: Studies in Computational Intelligence |
Status: | published as Roger Lee (ed.), Big Data, Cloud Computing, and Data Science Engineering. Studies in Computational Intelligence, vol. 844, pp. 69--83, Springer, Cham, Jan. 2020. [DOI 10.1007/978-3-030-24405-7_5] |
data | file | description |
List of investigated Java projects (CSV format) |
projects.csv (8KB) |
This CSV file contains 100 records, each giving the "Git repository URL," the "project's name" and the "clone date." |
Lists of investigated Java source files |
source_file_lists.zip (445KB) |
This ZIP file contains 100 directories whose names correspond to project names; each directory has a text file listing the source file paths. Note: The lists of the following three projects are empty because all of their source files appeared to be sample, example, test, documentation or demo programs: Android-CleanArchitecture, gradle-retrolambda and java8-tutorial. |
Lists of source file changes (tab-separated values (TSV) format) |
source_file_change_histories.zip (28MB) |
This ZIP file contains 100 directories whose names correspond to project names; each directory has one TSV file, source_file_change_history.tsv. Each line of the TSV file corresponds to one commit of one source file and has the following five columns:
|
Warnings made by PMD (tab-separated values (TSV) format) |
pmd_results.zip (718MB) |
This ZIP file contains 100 directories whose names correspond to project names; each directory has a TSV file that lists the warnings made by PMD for all versions of all source files. Its columns are "the commit hash," "the file path," "the line number corresponding to the warning," and "the warning's priority." Warning priorities are pre-defined in the PMD rule sets. |
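As a minimal sketch of how such a file might be read, the following Python snippet counts warnings per priority. It assumes the four columns appear in the order listed above and that the file has no header row; neither assumption is confirmed by this page. The function name `count_by_priority` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_by_priority(tsv_text):
    """Count warnings per priority in TSV text.

    Assumed column order (not confirmed by the data description):
    commit hash, file path, line number, priority. No header row is assumed.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    counts = Counter()
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed rows
        _commit, _path, _line_no, priority = row
        counts[priority] += 1
    return counts


# Hypothetical sample rows, for illustration only.
sample = (
    "abc123\tsrc/Foo.java\t10\t3\n"
    "abc123\tsrc/Foo.java\t42\t1\n"
    "def456\tsrc/Bar.java\t7\t3\n"
)
priority_counts = count_by_priority(sample)
print(priority_counts)
```

For a real file, the text of the extracted TSV can be passed in place of `sample`; given the archive size, reading it line by line from an open file handle may be preferable.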
Warning lifetime data (by project) (CSV format) |
survival_data_by_project.zip (16MB) |
This ZIP file contains 100 directories whose names correspond to project names; each directory has a CSV file whose records consist of "warning," "censor" and "lifetime (in days)." The "censor" column is 0 if the warning sample is censored and 1 otherwise. |
Warning lifetime data (by kind of warning) (CSV format) |
survival_data_by_warning.zip (14MB) |
This ZIP file contains 259 CSV files whose names correspond to warning names; each CSV file contains records consisting of "warning," "censor" and "lifetime (in days)." The "censor" column is 0 if the warning sample is censored and 1 otherwise. |
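To illustrate how "censor" and "lifetime" pairs of this kind are typically consumed, here is a small sketch of the standard Kaplan-Meier estimator in plain Python. It is not necessarily the procedure used in the paper; the function name `kaplan_meier` and the sample values are hypothetical. The encoding follows the description above: censor = 0 marks a censored sample, and censor = 1 marks a warning whose end of life was observed.

```python
from collections import Counter


def kaplan_meier(samples):
    """Kaplan-Meier survival curve.

    samples: iterable of (censor, lifetime_days) pairs, where censor == 0
    means the sample is censored and censor == 1 means the end of the
    warning's life was observed.
    Returns a list of (time, survival probability) at each observed event time.
    """
    samples = list(samples)
    events = Counter(t for c, t in samples if c == 1)  # observed ends per time
    exits = Counter(t for _c, t in samples)            # all samples leaving the risk set
    at_risk = len(samples)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical (censor, lifetime) pairs, for illustration only.
sample = [(1, 5), (0, 8), (1, 12), (1, 12), (0, 20)]
curve = kaplan_meier(sample)
for day, prob in curve:
    print(day, round(prob, 4))
```

Censored samples never lower the survival probability directly; they only shrink the number of warnings at risk at later times, which is why they must be kept in the data rather than dropped.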
List of the expected lifetimes of warnings (CSV format) |
survival_analysis_results.csv (11KB) |
This CSV file lists the 259 results of the survival analysis: each line corresponds to one kind of warning and gives the warning name and its expected lifetime (in days). |