Hi, I came across this wonderful integrated dataset and have run several analyses on it, including DEG and survival analysis. You have made a great contribution to PCa dataset collection and integration. However, I have also found what I believe may be some mistakes in the integrated dataset. For example, according to the raw cBioPortal metadata, the PSA variable in the tcga dataset was recorded at follow-up. In curatedPCaData, however, the psa field in the metadata is annotated as PSA at diagnosis, according to the user guide at https://bioconductor.org/packages/devel/data/experiment/vignettes/curatedPCaData/inst/doc/overview.html. If you check the PSA values, you will find that most of them in the tcga dataset are below 4, which seems more consistent with follow-up measurements than with values at diagnosis. Metadata are genuinely hard to integrate, because they are organized so heterogeneously across the different datasets. I therefore suggest re-checking the curated data in curatedPCaData, particularly the metadata and their annotations. Overall, though, curatedPCaData is a wonderful integrated resource for PCa bioinformatic analysis. Thank you for your great efforts.