Newswise — Computing is one of the least diverse science, technology, engineering, and mathematics (STEM) fields, with an underrepresentation of women and minorities, including African Americans and Hispanics. Leveraging this largely untapped talent pool will help address our nation’s growing demand for data scientists. Computational approaches for extracting insights from big data require the creativity, innovation, and collaboration of a diverse workforce.

As part of its efforts to train the next generation of computational and computer scientists, the Computational Science Initiative (CSI) at the U.S. Department of Energy’s (DOE) Brookhaven National Laboratory hosted a diverse group of high school, undergraduate, and graduate students this past summer. This group included students from Jackson State University and Lincoln University, both historically black colleges and universities. The Lincoln University students were supported through the National Science Foundation’s Louis Stokes Alliances for Minority Participation program, which provides research and other academic opportunities for minority students to advance in STEM. Two of the students hold prestigious fellowships: the Graduate Education for Minorities (GEM) Fellowship, through which qualified students from underrepresented minorities receive funding to pursue STEM graduate education; and the DOE Computational Science Graduate Fellowship (CSGF), which supports doctoral research using mathematics and computers to solve problems in many scientific fields of study, including astrophysics, environmental science, and nuclear engineering.

“To address challenges in science, we need to bring together the best minds available,” said CSI Director Kerstin Kleese van Dam. “Great talents are rare but can be found among all groups, so we reach out to the broadest talent pools in search of our top researchers at every education level and career stage. In return, we offer them the opportunity to work on some of the most exciting problems with experts who are pushing the state of the art in computer science and applied mathematics.”

Pursuing diverse research topics

The students’ research spanned many areas, including visualization and machine learning techniques for big data analysis, modeling and simulation applications, and automated approaches to data validation and verification.

Quentarius Moore, who graduated this past spring from Jackson State University with a master’s degree in chemistry, spent five weeks implementing an electron correlation model in a computational chemistry code called NWChem for an ongoing DOE Exascale Computing Project, NWChemEx: Tackling Chemical, Materials and Biomolecular Challenges in the Exascale Era. In the fall, he will begin his doctoral studies in chemistry at Texas A&M University through DOE’s CSGF. Unlike most other students, Moore did not come to Brookhaven through a formal internship program—he was connected with computational chemist Hubertus van Dam after reaching out to Robert Harrison and Barbara Chapman, both experts in high-performance computing who hold leadership positions at Brookhaven Lab and teach at nearby Stony Brook University.

“I was born and raised in Jackson, Mississippi, and opportunities like conducting world-class research are scarce among the people I know and underrepresented groups in general,” said Moore. “I had never heard about Brookhaven or the national lab system, but now I hope to help minority students seek similar learning experiences.”

Stony Brook University undergraduate student Raffaele Miceli, a Science Undergraduate Laboratory Internships (SULI) program intern sponsored by the DOE Office of Science’s Office of Workforce Development for Teachers and Scientists (WDTS), applied computer graphics to high-energy physics, including visualizing the potential energy of the Higgs field in models beyond the Standard Model of particle physics and in dark matter models. He was subsequently hired as a student assistant.

Four students joined a CSI team that is investigating methods and devices to perform computations on streaming data while they are in transit. Shilpi Bhattacharyya, a doctoral student in computer science at Stony Brook University, was hired as a student assistant to continue building a virtual environment for this “analysis on the wire” project.

“Having become quite fond of the novelty and challenges of analysis on the wire, Shilpi now wants to pursue her dissertation research on a related topic,” said mentor Dimitrios Katramatos, a technology architect who is part of the CSI team working on the project.

“I think CSI is an awesome place for computer scientists,” said Bhattacharyya, who will continue contributing to the project as a research assistant. “I am more confident, disciplined, focused, and motivated because I got the real feel of a research environment here. Talent and hard work is valued at Brookhaven Lab. I never felt any different as a woman pursuing computer science. Gender does not come into the picture at all.”

Undergraduate interns Alya Boumiza, a mathematics major at City University of New York Borough of Manhattan Community College; Cole Lewis, a computer engineering major at South Plains College; and Adam Martin, a computer science major at South Plains College had coordinated assignments to address the main challenge of analysis on the wire: efficiently plugging in and running a streaming algorithm. They collaborated to select and modify a suitable algorithm and examined ways to use hardware accelerators.

Joining the big data conversation

In addition to carrying out their research projects and presenting them during a closing ceremony at Brookhaven, all of the students had the opportunity to attend the CSI-led New York Scientific Data Summit (NYSDS), which was held at New York University from Aug. 7 through 9. This annual conference brings together data experts, scientists, application developers, and end users from national labs, universities, technology companies, utilities, and federal and state governments to share ideas for unlocking insights from scientific big data.

The students submitted papers to the conference and discussed their research with U.S. data science leaders during a poster session. Three students also presented their research in a talk: Ziqiao Guan, a doctoral student in computer science at Stony Brook University; Ronald Lashley, who graduated in May 2017 from Lincoln University with an undergraduate degree in computer science and a minor in visual arts; and Nicole Meister, a high-school student and participant in the Simons Summer Research Program at Stony Brook University. These students were part of a multi-organizational team involving Brookhaven Lab, Lincoln University, and the New Jersey Institute of Technology (NJIT) that designed deep learning–based image classification software for analyzing the x-ray scattering images produced by scientists at the National Synchrotron Light Source II (NSLS-II), a DOE Office of Science User Facility at Brookhaven. Each day at NSLS-II, up to four terabytes of images are generated. For scale, printing out one terabyte of data would require paper made from approximately 50,000 trees. Classifying the images through deep learning (a type of machine learning in which the features important to classification, say symmetry or orientation, are automatically extracted from raw data) helps scientists recognize patterns in their samples, infer materials’ physical properties, and make decisions for follow-on experiments.

“The students more than held their own at such an in-depth scientific event,” said Kleese van Dam.

Computer scientist Dantong Yu, who holds a guest appointment in CSI’s Computer Science and Mathematics Department at Brookhaven Lab and serves as an associate professor at NJIT’s Martin Tuchman School of Management, mentored the students.

“I was very impressed with Nicole—after an introductory workshop on deep learning, she completed assigned tasks with minimal supervision and guidance. Her presentation at the NYSDS prompted many questions from the attendees on how the work could be applied to new areas,” said Yu. “Similarly, after quickly learning high-performance parallel computing, Ron applied a parallel programming paradigm to resolve the large memory footprint of our initial algorithm. Because of his contribution, our machine learning pipeline can directly process large images—a capability that ultimately enhances the pipeline’s prediction accuracy.”

For the students, NYSDS not only provided the opportunity to present their research and network with the larger data science community but also exposed them to current research topics. This year’s conference focused on streaming data analysis, autonomous experimental design, interactive exploration of petascale data, and performance for big data.

“While it is challenging to pursue computer science in a male-dominated environment, I was extremely lucky to work with colleagues who were very responsive to my questions,” said Meister. “Machine learning was a fairly new concept to me, so I had to overcome a steep learning curve. Presenting my research at the NYSDS was a surreal experience, and it was fascinating to see what other people in the field were working on. This research opportunity has sparked my interest in machine learning and inspired me to continue working in this area of computer science.”

Recruiting the next generation of researchers

By 2018, 51 percent of all STEM jobs are expected to be in computer science–related fields. Filling these jobs with qualified graduates will require attracting women and underrepresented minorities early on and engaging them throughout their education so that they maintain interest.

“Concentrating on my research during the summer while educating my students to be the next generation of researchers is an experience I can’t receive from my home institution,” said Bo Sun, an associate professor of computer science at Rowan University (formerly of Lincoln University) who performed research at CSI this summer with her Lincoln students through the DOE’s Visiting Faculty Program. This program brings together faculty members and students at institutions historically underrepresented in the research community with DOE laboratory research staff to collaborate on projects of mutual interest.

Summer is not the only time such connections are made. For example, Barbara Chapman, chair of CSI’s Computer Science and Mathematics Department at Brookhaven Lab and professor of applied mathematics and statistics at Stony Brook University’s Institute for Advanced Computational Science, has an ongoing collaboration with Prairie View A&M University, another historically black university, through a grant. This relationship began several years ago when one of Chapman’s former graduate students joined the faculty there, whose members have expertise in big data and cloud computing.

Collaborative research is among the many ways to connect with women and members of underrepresented groups. Conferences and workshops, along with professional development opportunities, also play a critical role in the recruitment and retention of these groups by introducing them to cutting-edge research and role models in the field.

In addition to NYSDS, CSI participates in several events, including DOE’s CSGF Annual Program Review. Attended by program alumni, DOE staff, faculty and other members of the fellowship community, and congressional staff, this conference features the DOE Laboratory Poster Session—at which DOE labs showcase their computational science research and employment opportunities—and the Fellows’ Poster Session.

CSI has also participated in the Association for Computing Machinery–sponsored Tapia Conferences, which bring together undergraduate and graduate students, faculty, researchers, and professionals in computing from all backgrounds and ethnicities.

Last year, Kleese van Dam was a panelist at the 2016 Early- and Mid-Career Mentoring Workshops hosted by the Computing Research Association–Women, whose mission is to increase the number of women who advance to the top career tracks in education, research, industry, and government. In her panel talks, she described which research paths are available in industry and government labs and how to negotiate on the job and during salary discussions. 

In November, CSI will again be participating in the international SuperComputing Conference for high-performance computing, networking, storage, and analysis. Staff from CSI will have a table at the student and postdoc job fair, where they will discuss internships, fellowships, assistantships, and permanent employment opportunities. 

“Unfortunately, our applicant pool is not as diverse as we would like, and so we are always looking for ways to reach out to members of underrepresented groups and encourage them to consider a career with CSI,” said Kleese van Dam. “By raising our visibility among all talented groups through these various outreach efforts, we hope to increase diversity within CSI so that we are fully equipped to solve the scientific data challenges of today and tomorrow.”  

The Office of Educational Programs (OEP) manages Brookhaven Lab’s student and teacher programs, which are primarily funded by DOE’s Office of Science, other DOE offices, Brookhaven Science Associates, and other federal and nonfederal agencies. To learn more about these programs and to apply, please visit the OEP website. 

Brookhaven National Laboratory is supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.