GRIIDC Staff Learn Scientific Computing with Python!


GRIIDC software developer Son Nguyen and researcher Inia Soto Ramos attended the SciPy 2018 conference, held in Austin, Texas, from July 9-15, 2018. SciPy is a Python-based ecosystem of open-source software used mainly in mathematics, science, and engineering. The SciPy conference gathers a diverse group of people, from computer gurus to researchers, working in almost any field you can imagine, including business, medical services, climate research, astronomy, and oceanography. So, what attracts such a broad audience? The answer is data! Handling large and diverse datasets is becoming a priority for most science and engineering fields, as well as for large corporations. For GRIIDC, data is everything, so the SciPy conference is a perfect fit for our team!

The conference is divided into three sections: tutorials, presentations, and sprints. Son participated in all three and had the opportunity to learn about open-source projects; software engineering; Dask, which provides tools for parallel computing on extremely large datasets; and Pandas, a powerful package for working with tabular data. Son also worked with the team building the “All of PLOS” Python library, which is designed to manage the repository of all Public Library of Science (PLOS) XML (Extensible Markup Language) article files. Every dataset in GRIIDC has its own XML file with metadata, so this experience will allow Son to bring new ideas to our group about manipulating, storing, and querying XML.

Inia participated in the tutorials and learned a lot about working with data: NumPy for manipulating data matrices, matplotlib for displaying and interacting with data, and Pandas for tabular data. Inia’s daily job is to review large datasets, primarily matrices from model output or satellite data and tabular data from cruises and experiments. Python’s open-source packages and large, diverse online community make it a fantastic tool for working with datasets submitted to GRIIDC. This was Inia’s first computer programming conference, and she was fascinated by the diversity of the researchers who were present. She is very excited to put her new skills into practice and to show the rest of the team how to integrate Python into their daily data management tasks.