User:Vincentr

From Challenge4Cancer

Vincent Rouilly, Data Scientist, Paris, France.

Interests:

  • Data Visualization
  • Semantic Web
  • Citizen Science
  • Open Data
  • Open Hardware

Useful R Packages for the Challenge:

  • CKAN interactions: ckanr, jsonlite
  • Data wrangling: readr, xlsx, tidyr, dplyr, mitools
  • Data Visualization: ggplot2, leaflet, corrplot, tabplot
  • Feature Selection: FactoMineR, randomForest
  • Predictive Modelling / Machine Learning: caret, mlr
  • Time series: CRAN task view on time series

Data Management Resources:

  • The Epidemiology Ontology: http://www.jbiomedsem.com/content/5/1/4#B13
  • Open Annotation Data Model: http://www.openannotation.org/spec/core/

To Do list:

  • R scripts to load core cancer data
  • ggplot2 body/organs maps based on geom_polygon
  • "Data density" measure (time, location, info type)
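
As a starting point for the geom_polygon item above, here is a minimal sketch of how a body/organs map might be drawn with ggplot2. The organ names, vertex coordinates, and per-organ values are all made up for illustration; real outlines would have to come from an actual anatomical shape source, which this page does not specify.

```r
library(ggplot2)

# Hypothetical outlines: one closed polygon per organ, listed vertex by vertex.
# These coordinates are placeholders, not anatomical data.
organs <- data.frame(
  organ = rep(c("lung_left", "lung_right", "liver"), each = 4),
  x = c(1, 2, 2, 1,   3, 4, 4, 3,   2, 3.5, 3.5, 2),
  y = c(4, 4, 6, 6,   4, 4, 6, 6,   2, 2, 3.5, 3.5)
)

# Hypothetical per-organ values to map onto the fill colour.
values <- data.frame(
  organ = c("lung_left", "lung_right", "liver"),
  value = c(10, 12, 7)
)

# Attach the value to every vertex with match(), which keeps the vertex
# order intact (polygon drawing depends on that order).
organs$value <- values$value[match(organs$organ, values$organ)]

# One polygon per organ (group), filled according to its value.
body_map <- ggplot(organs, aes(x = x, y = y, group = organ, fill = value)) +
  geom_polygon(colour = "grey30") +
  coord_equal() +
  theme_void()

print(body_map)
```

The same data layout (one row per vertex, one group per shape) should also work for the French regions map, given region outlines in place of the placeholder rectangles.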

Visual Exploration

ggplot2-based body map
French regions map
time series/distribution viz