Baseline v2

From Challenge4Cancer
Hello, and welcome to Cancer Baseline V2!
The Baseline mathematical concept: cancer risk Y is modeled as a function of many variables X, even though not all variables are available for all geographic zones.
An aside for those who would like to participate, but in French:
Really? Write to cancerbaseline at googlegroups dot com and we may form two groups, one in French and one in English.

From v1 to v2

Cancer Baseline v1, Nov 2015 - May 2016, conducted by 50 - 200 participants (depending on the degree of involvement), has led to:

  • 5000€ prize, mostly spent on biomedical research (*) and materials for a good server to continue Cancer Baseline (**)
  • Main conclusions:
    1. Overall: factors like eating bread or pasta aren't as important as these: long-term unemployment kills (indirectly), smoking kills, and aging kills even more!
    2. Aging: what if the link with cancer was quite fundamental?
    3. Other factors, and methodological aspects
      • Cancer Baseline v1 identified a factor that could be a good model for learning how to do epidemiology on aggregated data: populations in Africa and of African origin appear to have almost double the risk of prostate cancer compared to other populations (this is a correlation, not causality, and similarly high risks are found, for example, for colon and breast cancers in Europe). Cancer Baseline v2 could study this, and the association Longevity Nigeria, from Nigeria, is interested in that aspect as a good model to then extrapolate to other cancers (e.g. the high risks of colon and breast cancers in Europe) for which diagnosis campaigns make epidemiology more complex.
      • A logistic regression, with weights for wholes, highlighted risk factors that match well-known risks: obesity, alcohol, and other factors. This was understood on the day Cancer Baseline v1 was presented (!) and explored afterwards, so it should be a starting point for Cancer Baseline v2.
    4. Data
      • One difficulty is to gather enough data and to assemble it well enough. Cancer Baseline v1 did a great job, but going through the already identified data and collecting it in a scalable way should be a long and key component of v2. It should be made clear that gathering data correctly is, in the long run, an essential part of the project.
      • (**) The Nigerian team is setting up a server that will allow such collection, as well as analysis of the data with innovative methods (deep learning and GPUs) and work that mixes Cancer Baseline data with private data. The latter may not be part of Cancer Baseline v2, given the non-open nature of private data, but it may help build an ecosystem of organizations around Cancer Baseline and, with time, make the most of human data worldwide.
      • Last but not least, it would be useful to make institutions aware (how, however...) that open, aggregated data would be much more meaningful if quantiles were provided. Indeed, being overweight, drinking, smoking and not paying attention to one's health are known to be correlated, and having a measure for each of the two populations, rather than only the average of the population mix, would allow finer analyses.
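
For readers new to the method, here is a minimal sketch of a logistic regression fitted on aggregated data, of the kind mentioned in point 3. It is not the v1 analysis: the single risk factor, the zone populations, the coefficients and the function name fit_aggregated_logistic are all hypothetical, and weighting each zone by its population is an assumption about what a reasonable weighting could look like. The fit uses Newton's method on the binomial log-likelihood, in plain Python so that it runs without extra libraries.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_aggregated_logistic(zones, iterations=25):
    """Fit P(case) = sigmoid(b0 + b1 * x) on aggregated data.

    zones: list of (x, population, cases) tuples, one per geographic zone,
    where x is a single zone-level risk factor (hypothetical here).
    Each zone weighs on the fit in proportion to its population, because
    the binomial log-likelihood sums over all people in the zone.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # negative Hessian (2x2, symmetric)
        for x, n, k in zones:
            p = sigmoid(b0 + b1 * x)
            residual = k - n * p          # observed minus expected cases
            weight = n * p * (1.0 - p)    # binomial variance of the zone
            g0 += residual
            g1 += residual * x
            h00 += weight
            h01 += weight * x
            h11 += weight * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break  # degenerate data (e.g. all zones share the same x)
        # Newton step: solve the 2x2 linear system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


if __name__ == "__main__":
    # Synthetic zones generated from known, made-up coefficients,
    # only to check that the fit recovers them.
    true_b0, true_b1 = -3.0, 2.0
    zones = []
    for x, n in [(0.1, 50000.0), (0.2, 120000.0), (0.3, 80000.0), (0.5, 20000.0)]:
        zones.append((x, n, n * sigmoid(true_b0 + true_b1 * x)))
    print(fit_aggregated_logistic(zones))
```

With real data, a zone for which a variable is missing would have to be left out of that variable's fit or handled by some imputation; this sketch does not address that.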

How to participate in Cancer Baseline v2

  1. Write to cancerbaseline at googlegroups dot com. Indicate whether you think you can come to La Paillasse in Paris on Thursday evenings, twice per month for 6 months, or whether you can provide similar work. Indicate your skills and what you would like to do. It would be good to attach a picture so that we can put it here, as we did for Cancer Baseline V1.
  2. Sign in here (top of this page) and indicate just below how you think you would like to contribute. For now we are working on Slack (the one from v1), but this may change.
  3. Indeed, more to come: watch your emails. Challenge4Cancer V2 shall start soon, with specific ways to participate.


Edouard Debonneuil - limited time currently, but I was a major handler of Cancer Baseline v1 and I foresee the impressive human impact of continuing Cancer Baseline (and not only for cancer: an Alzheimer Baseline, or even a Mortality Baseline as a whole). As such I am ready to accompany Cancer Baseline v2. I can be at La Paillasse on Thursday evenings once or twice a month.
Kevin Perez - I can help with biomedical data analysis and think about ideas for v2. Possibly available on Thursday evenings, or remotely.
Agbolade omowole - I'm part of the Longevity Nigeria team and I am committed to the success of Cancer Baseline 2. I can't be at La Paillasse on Thursday evenings. I suggest a weekly text-based meeting via Skype where the Nigerian and French teams can discuss progress on the project.
Allen Akhaumere - data scientist and deep learning specialist, part of the Longevity Nigeria team.
Peter-Mikhaël Richard - can help with DB, software or web dev (SQL, Python, C, Java and web technologies). Currently in Montpellier, so remote work only.
Jibé - data analysis, preferably with Python and scikit-learn. Machine learning. Collection and aggregation of data.
Pierre Flores