TCC News



Participants follow along during the big data seminar series from the University of Utah. (Photo/Shirley Zhao)

Online seminar offers guide to fundamentals of data science

April 27th, 2017

Increasingly, “big data” is changing the world around us. According to the PBS special, “The Human Face of Big Data,” we’ve processed more data in the past two years than in the previous 3,000 years. In particular, considering the vast amount of biomedical data being produced and processed each day, it is imperative that the data science community offers training programs to assist researchers in gaining the technical skills and tools to properly use, interpret and transform big data into information and knowledge.

The National Institutes of Health (NIH), the BD2K Centers-Coordination Centers (BD2K CCC), and the BD2K Training Coordinating Center (BD2K TCC), based at USC Stevens Neuroimaging and Informatics Institute, have recognized this need and have collaborated to create the BD2K Guide to the Fundamentals of Data Science lecture series.

The weekly webinar is held 9 a.m. Fridays through May 19 at bigdatau.org/data-science-seminars.

The lecture series aims to change the way researchers think about data. The lectures occur live online and are also recorded for posting on the BD2K TCC Lecture series page along with a brief biography of the presenter, lecture abstract and presentation slides. Topics covered in this series include Data Indexing and Retrieval, Data Representation, Computing, Data Modeling and Inference, Open Science and more.

Providing essential training suitable for individuals at all levels, the live webinar is open to the public and offers the opportunity for curious minds to learn and submit questions to high-profile researchers.

“Scientific data presents unique challenges for 21st century biomedical research and scholarship,” said John Darrell Van Horn, PhD, associate professor of neurology at the Keck School of Medicine of USC and director of the BD2K TCC. “This data science webinar series is ideally suited for anyone, from student to established scientist, wishing to learn more about the fundamentals of how modern biomedical data is turned into new knowledge about health and disease.”

— Crystal Stewart

Original article in HSCNews (hscnews.usc.edu/online-seminar-offers-guide-to-fundamentals-of-data-science/)


Big data could be the new hope for the future of health

January 6th, 2017

If more people were willing to share their health data on mobile devices, scientists could organize the gaggle of information into shared databases and perhaps bring about the next era of medical breakthroughs, researchers said in a social change film.

The curation and analysis of terabytes of health data may enable David B. Agus, a professor of medicine at the Keck School of Medicine of USC, to tell his patients that hope exists.

“The new hope in the room is not in medicine, but it’s in big data,” Agus, director of the Lawrence J. Ellison Institute for Transformative Medicine of USC, said in the film. “The real revelation in data is we can start to categorize cancer. So instead of just lumping it by where it came from, we’re going to start to personalize how we understand, how we categorize and how we treat disease.

“Two or three times a week, I look someone in the eye and say I have no more drugs to treat your disease, and I don’t want to do that anymore.”

Raising public awareness

Big Data: Biomedicine is a 22-minute film that aims to raise public awareness about how big data is having an effect on the future of medicine. The film illustrates how data is fast becoming a crucial component in treating patients and saving lives.

The film, available on YouTube, was written and produced by Michael Taylor, a professor at the USC School of Cinematic Arts and executive director of the Media Institute for Social Change.

“Almost all of us today have computers in our pockets,” Taylor said. “If we can get into the habit of using them to collect our personal health data, it will get us much closer to having a more personalized relationship with our doctor and ultimately benefit all of us.”

The second half of the two-part film will be posted on YouTube in the spring, Taylor said.

The USC Mark and Mary Stevens Neuroimaging and Informatics Institute is the world’s largest brain data repository, currently holding 2,867 terabytes of information from every continent except Antarctica.

“We can’t modify one’s age; we can’t modify one’s genetics,” Judy Pa, an assistant professor of neurology at the USC Stevens Neuroimaging and Informatics Institute, said in the film. “But we know that there are other factors about the way one lives that can be changed. Let’s see if we can get these on a better trajectory for your future.”

Recommendations of healthy and unhealthy behaviors are made based on the available science at the time. Big data injects loads of new information into the system, potentially blowing up scientists’ previous misconceptions, said John Van Horn, an associate professor of neurology at the Keck School of Medicine.

“This is where new science gets done,” Van Horn said in the film, referring to the USC Stevens Neuroimaging and Informatics Institute. “This is where cures will be found.”

Original article in USCNews (news.usc.edu/114592/big-data-could-be-the-new-hope-for-the-future-of-health/)




The Data Science Rotations for Advancing Discovery (RoAD-Trip) program aims to match junior-level investigators with senior-level data scientists.

Young investigators sought for new program

August 23rd, 2016

The Big Data to Knowledge (BD2K) Training Coordinating Center (TCC) based at the USC Mark and Mary Stevens Neuroimaging and Informatics Institute has launched a new program called the Data Science Rotations for Advancing Discovery (RoAD-Trip) program, with the intention of encouraging new collaborations among junior biomedical researchers and more senior-level data scientists.

The program is seeking applications, due Sept. 2, from junior-level investigators with compelling biomedical data sets. These researchers will be matched with senior-level data scientists with mentoring skills, access to data science technology and computational resources to assist with novel data analyses.

With support provided by the TCC, the young biomedical investigators will “take to the road” for a minimum two-week scientific residency at leading U.S. research universities to work with senior data science faculty based there.

Only 10 fellows and 10 mentors will be chosen for this highly selective program.

“The RoAD-Trip will bring young biomedical and established data scientists together to share individual expertise and collaborate on new areas of biomedical research which may lead to a variety of exciting opportunities: a presentation at a national or international conference, a new grant proposal, or a new peer-reviewed publication,” said John Van Horn, PhD, BD2K TCC principal investigator and associate professor of neurology at the Keck School of Medicine of USC.

Through the RoAD-Trip program, junior investigators will blaze new trails on vital and important new research journeys while senior data scientists encounter novel species of data ripe for advanced computational modeling, analysis and exploration.

Example datasets from potential fellows might include genetic, protein structure, molecular, neuroimaging or any other “-omic” data type which might relate to the understanding of a biological system, process, or give insights into health or disease. These might be subjected to new and sophisticated computer science approaches for machine learning, time-series analysis, network modeling or 3D visualization.

RoAD-Trip fellows will be reimbursed up to $4,000 for airfare, hotel and per diem to help defray living expenses during the two-week-minimum data science research training experience. Mentors will receive a $1,000 honorarium for their participation after the research project has been successfully completed.

To apply for this opportunity, go to www.bigdatau.org/roadtrip.

— Crystal Stewart

Original article in HSCNews (hscnews.usc.edu/young-investigators-sought-for-new-program/)




From left, Katherine Kim of University of California, Davis; Yi Wang of Indiana University; Lucas Mentch of University of Pittsburgh; and Roummel Marcia of University of California, Merced, exchange ideas for new wearable or ambient mobile sensors at the Data Science Innovation Lab held June 15-19 in Lake Arrowhead.

Researchers innovate during annual Big Data workshop

July 21st, 2016

Nearly 30 investigators from across the country, with expertise in biomedical and data science, gathered at Lake Arrowhead beginning June 15 for the second annual Data Science Innovation Lab. Organized by the Big Data to Knowledge Training Coordination Center (BD2K TCC), based at the USC Mark and Mary Stevens Neuroimaging and Informatics Institute, the Innovation Lab is a five-day, facilitated, residential workshop where multidisciplinary investigators create new collaborations.

Progress in biomedical research depends greatly upon innovation. While many seek to apply the latest technologies and analytics in assessing their research questions, being truly innovative without the right team of researchers in place can be challenging.

The main goal of the Innovation Lab is to form new collaborations between early-career professionals in the fields of biomedical and quantitative science, which may lead to the invention of cutting-edge technologies, offer applications around novel biological research questions and provide profound insight regarding major public health concerns.


Senior faculty mentors and invited “provocateurs” provide insight and feedback on proposed projects from newly formed teams. This year’s theme involved addressing the data science needs arising from the use of wearable or ambient sensors to study health and disease.

Innovation Lab attendees formulated proposals for mobile sensor technology to monitor health conditions such as obesity, mild brain trauma, asthma, chronic pain, social-emotional agnosia, inflammatory bowel disease and cardiovascular diseases. The proposed sensors would be internet-connected devices that assist in physical health monitoring and promote research on disease treatment and prevention by collecting individual information (i.e., “big data”) about users’ physical activity, lifestyle and environment.

Prior experience and knowledge in the development of mobile sensors is valuable. For example, Nanshu Lu, PhD, an assistant professor in the Department of Aerospace Engineering and Engineering Mechanics at the University of Texas at Austin, included applications using her invention of an epidermal sensor “tattoo” which can be placed directly on the skin to measure heartbeat and cardiac electrical activity.

“Many researchers seek the most innovative research methodologies by which to address challenges in biomedicine,” said Jack Van Horn, PhD, BD2K TCC principal investigator and associate professor of neurology at the Keck School of Medicine of USC. “Bringing novelty to analysis methods is a critical element of any research project proposal. By holding this event, we seek to encourage the formation of new, multi-disciplinary teams who can bring their skill sets to vexing biomedical problems and maximize the innovation of their proposed approaches.”

The Data Science Innovation Lab is an annual event and will return in spring of 2017. For further details about the Data Science Innovation Lab and similar BD2K TCC programs, please visit: www.bigdatau.org.



— Crystal Stewart

Original article in HSCNews (hscnews.usc.edu/researchers-innovate-during-annual-big-data-workshop/)




USC President C. L. Max Nikias with Paul Thompson, left, and Arthur Toga, right.

USC’s Big Data U will teach researchers to analyze biomedical information

November 16th, 2015

Researchers need guidance as they navigate a jungle of biomedical data in their search for therapies, prevention techniques and cures to diseases.

To assist them, the National Institutes of Health has awarded USC a three-year, $6.3 million grant to build Big Data U, the nation’s first so-called Training Coordination Center aimed at teaching people with different backgrounds how to translate astronomical amounts of data into compatible and comparable statistics. The goal is to find trends, interesting relationships and clustering effects.

“A lot of the big data we are dealing with haven’t even been collected yet,” said the project’s lead investigator, John Van Horn, PhD, associate professor of neurology and education, and director of the new Master of Science program in neuroimaging and informatics at the Keck School of Medicine of USC. “It’s still off in the future. What we do now and how we train people to be able to deal with that will prepare us for the time when getting many terabytes worth of data is considered trivial — a relatively small or even ‘cute’ little study.”

Big data science has moved away from a traditional reductionist model, where a hypothesis is formed and tested by including a single variable in a controlled experiment.

Disorders such as Alzheimer’s disease involve intricate components. Isolating a single variable when it comes to conditions involving the brain may provide one answer, but not necessarily the complete one, said Arthur Toga, PhD, a provost professor with joint appointments at the Keck School of Medicine and the USC Viterbi School of Engineering.

“We’re letting the data lead us to the discovery. It’s kind of an upside down way of thinking about things,” he said. “Big data allows us to look at all these variables simultaneously and put together a comprehensive picture. Only in concert do they produce the function and structure that you’re trying to understand. If you study only one variable at a time, you may never fully understand how it works.”

Big Data U, tentatively set to launch in the spring of next year, will be a hybrid of massive open online courses (MOOCs) and YouTube video tutorials. It’s a free resource for anyone who wants a self-guided or semi-structured study of topics relevant to biomedical science. Social media tools will provide ratings for course content and guide the selection of relevant training media.

“We will promote opportunities for big data research rotations, host ‘innovation labs’ for new grant proposal development, develop hackathons and other training activities,” Van Horn said. “Some of these activities will be up to the user to complete, but others will have an expectation of required completion and will entail a report or tangible product.”

The Training Coordination Center is a part of the NIH’s Big Data to Knowledge (BD2K) initiative, launched in 2012 to transform how science is done. BD2K has 11 Centers of Excellence for Big Data Computing, two of which are at USC: the Big Data for Discovery Science Center with Toga as principal investigator and ENIGMA Consortium with Paul Thompson, PhD, as principal investigator. Stanford University, Harvard University Medical School and UCLA also host Centers of Excellence.

While each Center of Excellence has its own training responsibilities, Big Data U at USC is the only center tasked with harmonizing these efforts into a concerted action.

— Zen Vuong

Original article in HSCNews (hscnews.usc.edu/uscs-big-data-u-will-teach-researchers-to-analyze-biomedical-information/)




Big data science has moved away from a traditional model where a hypothesis is formed by including a single variable in a controlled experiment. (Photo/Mario Klingemann)

USC to teach researchers how to analyze biomedical information -- The nation’s first Training Coordination Center aims to spot trends among enormous amounts of data

October 19th, 2015

Researchers need guidance as they untangle a massive jungle of biomedical data in their search for therapies, prevention techniques and cures to some of today’s most enigmatic diseases.

To assist them, the National Institutes of Health has awarded USC a three-year, $6.3 million grant to build Big Data U, the nation’s first Training Coordination Center aimed at teaching people with different backgrounds how to assemble astronomical amounts of data into compatible and comparable statistics. The goal is to find trends, interesting relationships and clustering effects.

“A lot of the big data we are dealing with haven’t even been collected yet,” said John Van Horn, the project’s lead investigator, associate professor of neurology and education, and director of a new Master of Science in neuroimaging and informatics at the Keck School of Medicine of USC. “It’s still off in the future. What we do now and how we train people to be able to deal with that will prepare us for the time when getting many terabytes worth of data is considered trivial, a relatively small or even ‘cute’ little study.”


A scientific revolution

Big data science has moved away from a traditional reductionist model, where a hypothesis is formed and tested by including a single variable in a controlled experiment.

Disorders such as Alzheimer’s disease, estimated to be the third leading cause of death in the United States, according to the NIH, involve intricate components.

Isolating a single variable when it comes to conditions involving the brain may provide one answer, but not necessarily the complete one, said Arthur Toga, a provost professor with joint appointments at the Keck School of Medicine and the USC Viterbi School of Engineering.


“We’re letting the data lead us to the discovery. It’s kind of an upside down way of thinking about things,” he said. “Big data allows us to look at all these variables simultaneously and put together a comprehensive picture. Only in concert do they produce the function and structure that you’re trying to understand. If you study only one variable at a time, you may never fully understand how it works.”


What is a Training Coordination Center?

Big Data U, tentatively set to launch in the spring of next year, will be a hybrid of massive open online courses (MOOCs) and YouTube video tutorials. It’s a free resource for anyone who wants a self-guided or semi-structured study of topics relevant to biomedical science. Social media tools will provide ratings for course content and guide the selection of relevant training media.

“We will promote opportunities for big data research rotations, host ‘innovation labs’ for new grant proposal development, develop hackathons and other training activities,” Van Horn said. “Some of these activities will be up to the user to complete, but others will have an expectation of required completion and will entail a report or tangible product.”

Special tools need to be created because traditional ones such as Excel do not scale when astronomical collections of data points have to be crunched, Van Horn said.

The Training Coordination Center is a part of the NIH’s Big Data to Knowledge (BD2K) initiative, launched in 2012 to transform how science is done. The movement harvests biomedical big data to advance science’s understanding of human health and disease.

USC is at the forefront of biomedical big science and hopes to use it to address “wicked problems” — complex, 21st century dilemmas such as Alzheimer’s or traumatic brain injury.

“The purpose of the Training Coordination Center is to coordinate training activities both among the BD2K consortium members and with others engaging in similar efforts,” said Michelle Dunn, NIH senior adviser for data science training, diversity and outreach. “The outreach aspect of the TCC is important because BD2K awardees need to be aware of other efforts, whether funded by NIH or not, in order to make best use of limited funds. In addition to coordination, the TCC will develop resources to enable others to discover educational resources needed for biomedical data science.”

BD2K has 11 Centers of Excellence for Big Data Computing, two of which are at USC: the Big Data for Discovery Science Center with Toga as principal investigator and ENIGMA Consortium with Paul Thompson as principal investigator. Stanford University, Harvard University Medical School and UCLA also host Centers of Excellence.

While each Center of Excellence has its own training responsibilities, Big Data U at USC is the only center tasked with harmonizing these efforts into a concerted action.

Big Data U, which will have a major impact on all 11 of NIH’s Centers of Excellence, will include collaborators from USC’s Information Sciences Institute and the USC School of Cinematic Arts. Participating faculty include José Luis Ambite, Kristina Lerman and Michael Taylor.

“The DC Office of Research Advancement and Steve Moldin at USC were instrumental in obtaining this grant money, as the development of a proposal such as this is enormously complicated,” Toga said. “Their effort places USC in the coveted position of creating a free, online biomedical training center.”


How Big Data U will work

Part of the Training Coordination Center project includes harvesting the Web to automatically organize online resources into an Educational Resource Discovery Index (ERuDite).

Users will create free profiles on Big Data U, which will generate a personalized set of lessons to help scientists and other learners reach their learning goals. Senior investigators such as professors, junior investigators such as postdoctoral students, and graduate and undergraduate students alike will be able to hone their fluency in fields such as genomics (the mapping of genomes) and phenomics (the measuring of physical and biochemical traits in organisms).

The Training Coordination Center will have boot camps, MOOCs, videos and one-off lessons in math, statistics, informatics, computer science and biomedical science — many of which will be created with a spectrum of learners in mind.

An intuitive and intelligent website will identify the prerequisites users need before they can graduate to more complex topics and will suggest new topics they may be interested in based on the profiles of learners like them, Van Horn said.

Big Data U will also be a coordination center in that it will advertise training opportunities at any of the BD2K centers and live-stream events. It could even facilitate and partially finance big data-focused mini-projects that last for a few weeks, Van Horn said.


“Science is now a digital enterprise,” he added. “Big data sets are pretty much how science is being done. How you share and exchange that data to get as many eyes looking at these sets as possible will lead to new discoveries. It will lead to new insights into disease and hopefully help treat and cure them.”


History of commitment to big data research

In the past decade, USC has shown a commitment to informatics — the science of processing data for storage and retrieval.

In addition to having two BD2K Centers of Excellence, it also has recently advanced two new master’s programs relevant to big data biomedicine and founded the newly named USC Mark and Mary Stevens Neuroimaging and Informatics Institute, which includes the Laboratory of Neuro Imaging and the Imaging Genetics Center. Based on the Health Sciences Campus, the institute will be equipped with state-of-the-art computing systems as well as sophisticated MRI brain imaging systems.

USC’s comprehensive experience in biomedical big data, existing infrastructure and multidisciplinary teamwork mentality will aid in the success of Big Data U. The project is set to run through 2018.

Original article in USCNews (news.usc.edu/87595/usc-to-teach-worldwide-researchers-how-to-analyze-biomedical-information-at-big-data-u/)