Artificial intelligence, a game-changer for studying stellar nurseries

November 19, 2020

Star formation is a complex riddle because molecular hydrogen, the material that forms stars, cannot be directly detected at the low temperatures that prevail in the molecular interstellar medium. Radio-astronomers and signal and image processing experts have recently joined forces to develop new techniques that shed light on this long-standing astrophysical problem. By applying a new multidisciplinary approach to a state-of-the-art mapping survey of 20 molecules in the famous Orion molecular cloud with the IRAM 30-meter telescope, the ORION-B team has mapped the star-forming material in the Orion B cloud with unprecedented precision, and revealed other hidden physical parameters that control the formation of stars. Their results have been published today in Astronomy & Astrophysics.

For more than half a century, astronomers have attempted to decipher the physical processes that control the formation of stars. Stars are born in molecular clouds that are primarily composed of molecular hydrogen gas. But at the very low temperatures typical of molecular clouds (-250°C), molecular hydrogen cannot be detected. Jérôme Pety, researcher at IRAM, comments: “It's a bit like the water in an aquarium. We know it is there in large quantities and that it is important for the fish. But we don't see it. It is transparent. When we want to study currents, we have to add a dye in very small quantities and mix it with the water. It is said that the dye traces the movements of the water.” While molecular hydrogen at low temperature cannot be detected, molecular clouds contain other tracers, such as dust (1% of the cloud mass) and other molecules present in minuscule amounts. The most commonly used tracer is carbon monoxide (CO). However, its concentration is only about one molecule per 10,000 hydrogen molecules, and its mixing with hydrogen is imperfect. CO therefore only provides a first estimate of the amount of gas present in molecular clouds.

Carbon monoxide emission in the Orion B giant molecular cloud as observed with the 30-meter telescope. Credits: J. Pety, the ORION-B Collaboration, IRAM.

The idea of combining CO with other observable tracer molecules to obtain a more accurate estimate of the amount of molecular hydrogen has been proposed before, but previous efforts were defeated by the complexity of the physical and chemical processes that link the different molecules together. The secret of success turns out to be a totally new approach, combining analysis methods from artificial intelligence with the capacity of the IRAM 30-meter telescope to map a few tens of molecules simultaneously over fields of view covering 25 times the area of the full Moon.

Laboratoire d’Astrophysique de Bordeaux researcher Pierre Gratier says: “We have shown that a machine learning algorithm called ‘random forests’ can reveal the relationship that binds the different observable molecules to the total quantity of gas. It is thus possible to build a reliable and precise estimator for the quantity of molecular hydrogen using the emission of a reduced set (between 5 and 10) of different tracer molecules.” Indeed, the precision increased by more than a factor of 10; in other words, the error on this quantity was reduced from 300% to 20%.
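To give a feel for how a random forest turns several tracer intensities into one estimate, here is a minimal pure-Python sketch. It is not the ORION-B team's code. The data are synthetic, and the three mock tracers (`line_a`, `line_b`, `line_c`), their response curves, the noise levels, and the tree settings are all assumptions chosen for illustration. Each tree is grown on a bootstrap resample of the training set with randomly chosen candidate splits, and the forest prediction is the average over trees.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def make_sample():
    """One synthetic sight line: three mock line intensities and the target."""
    x = rng.uniform(0.0, 2.0)  # stand-in for an offset log10 H2 column density
    line_a = 10.0 * (1.0 - math.exp(-2.0 * x)) + rng.gauss(0.0, 0.3)  # saturates
    line_b = 2.0 * x * x + rng.gauss(0.0, 0.3)  # bright only in dense gas
    line_c = rng.gauss(0.0, 1.0)  # pure noise, carries no information
    return [line_a, line_b, line_c], x

def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def build_tree(rows, targets, depth, n_try=2):
    """Grow one regression tree; a leaf is simply the mean target value."""
    if depth == 0 or len(rows) < 4:
        return sum(targets) / len(targets)
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):  # random feature subset
        for threshold in rng.sample([r[f] for r in rows], min(10, len(rows))):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if left and right:
                score = sse(left) + sse(right)  # lower means a cleaner split
                if best is None or score < best[0]:
                    best = (score, f, threshold)
    if best is None:
        return sum(targets) / len(targets)
    _, f, threshold = best
    lo = [(r, t) for r, t in zip(rows, targets) if r[f] <= threshold]
    hi = [(r, t) for r, t in zip(rows, targets) if r[f] > threshold]
    return (f, threshold,
            build_tree([r for r, _ in lo], [t for _, t in lo], depth - 1, n_try),
            build_tree([r for r, _ in hi], [t for _, t in hi], depth - 1, n_try))

def predict_tree(node, row):
    """Walk down the tree until a leaf (a plain number) is reached."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

train = [make_sample() for _ in range(300)]
test = [make_sample() for _ in range(100)]

forest = []
for _ in range(30):
    boot = rng.choices(train, k=len(train))  # bootstrap resample
    forest.append(build_tree([r for r, _ in boot], [t for _, t in boot], depth=5))

mean_target = sum(t for _, t in train) / len(train)
baseline_rmse = rmse([t - mean_target for _, t in test])
forest_rmse = rmse([t - predict_forest(forest, r) for r, t in test])
print(f"baseline RMSE: {baseline_rmse:.3f}, forest RMSE: {forest_rmse:.3f}")
```

In this toy setup the two informative tracers complement each other: one responds strongly in thin gas and saturates, the other only becomes bright in dense gas. That is the kind of combination a single tracer such as CO cannot offer.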

Accurately determining the amount of molecular hydrogen gas available to form stars is only the very first step for these studies. While molecular clouds are mostly neutral, the existence of free electrons in the gas is crucial to the molecular cloud’s evolution. Because of these free electrons, the gas becomes sensitive to the cloud's global magnetic field, which channels the gas motions during the gravitational collapse that ultimately leads to star formation. These electrons are extremely rare (there is only one electron per 10 million hydrogen molecules), and the electron fraction of a molecular cloud cannot be measured directly. To overcome this challenge, the team used complex astrophysical models to explore thousands of different possible scenarios, each scenario producing thousands of potentially observable quantities. Emeric Bron, a researcher at the Paris Observatory, applied a machine learning approach to the massive dataset of model results, allowing him to identify the observational tracers that encode precious information on the electron fraction. The new tracers revealed by this work will allow astronomers to estimate the electron fraction in more than 60% of a giant molecular cloud like Orion B. In comparison, the traditional tracers used so far could only be observed in around 2% of such a cloud, in its very densest regions.
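The idea of searching a grid of models for observables that encode a hidden parameter can be sketched with a toy example. The sketch below does not reproduce the machine learning approach described above. It swaps in a much simpler technique, ranking candidates by their Spearman rank correlation with the hidden parameter. The model grid, the nuisance parameter, and the three candidate observables (`ratio_1`, `ratio_2`, `ratio_3`) are invented for illustration.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

def ranks(values):
    """Rank of each value within its list (0 = smallest); no ties expected."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

# Toy model grid: every model has a hidden electron fraction and a nuisance
# parameter (for instance the gas density) that also affects the observables.
log_xe, nuisance = [], []
candidates = {"ratio_1": [], "ratio_2": [], "ratio_3": []}
for _ in range(2000):
    xe = rng.uniform(-8.0, -4.0)  # hypothetical log10 electron fraction
    nu = rng.uniform(2.0, 6.0)    # hypothetical nuisance parameter
    log_xe.append(xe)
    nuisance.append(nu)
    candidates["ratio_1"].append(1.0 * xe + 0.2 * nu)  # mostly tracks xe
    candidates["ratio_2"].append(0.1 * xe + 1.0 * nu)  # mostly tracks nuisance
    candidates["ratio_3"].append(0.5 * xe + 0.5 * nu + rng.gauss(0.0, 0.5))

# Rank candidates by how strongly they follow the hidden electron fraction.
ranking = sorted(
    ((name, abs(spearman(values, log_xe))) for name, values in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: |rho| = {score:.2f}")
```

A good tracer in this sense is one whose value follows the hidden parameter across the whole grid, whatever the other model parameters are doing.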

Once a theoretical relationship between an observation and a key physical parameter is revealed, astronomers face an additional challenge when applying it to actual data. The astronomical signals are quite faint. When trying to interpret these data using physical models, the noise often blurs the conclusions. In a third study, Antoine Roueff from Institut Fresnel uses information theory to precisely separate the physics from the artifacts caused by noise. This is essential because it allows researchers to know if they need to keep working to extract more information or if the data is too noisy to answer the question being asked.

The IRAM 30-meter telescope in the Spanish Sierra Nevada. Credits: IRAM

These three studies illustrate how closer links between astrophysicists and experts in signal processing and machine learning can lead to powerful new tools to study the birthplace of stars. Jérôme Pety comments: “As well as being essentially invisible, the places where stars are born are complex systems. Faced with a system of such complexity, subject to the vagaries of its environment and history, a complete causal and deterministic understanding is not possible. We must now seek to understand the evolution of molecular clouds towards the formation of new stars and their planets using statistical laws. Our project has precisely the enormous amount of data needed to identify these statistical laws, but it also requires going beyond classical data analysis methods in astronomy.”

For Pierre Chainais, a professor at CRIStAL laboratory, astrophysical problems are incredibly motivating because they push signal processing theories to their limits, in situations where no ground truth is available for validation. “They force us to create the tools of tomorrow to make predictions with the required level of confidence.” Jocelyn Chanussot, professor at GIPSA-LAB, confirms: “Machine learning techniques like deep neural networks have proved to be very efficient methods to predict quantities based on observations, such as the amount of molecular hydrogen from the emission of tracer molecules in this project. However, we now need to understand how they work in depth. Astrophysical applications are excellent for this purpose because they have both a high degree of complexity and a rigorous physical description. We will continue this successful joint venture to provide new insights in designing artificial intelligence algorithms that take into account the underlying physics.”

From an astrophysical viewpoint, these new methods not only deliver more accurate results about the amount of interstellar gas available to form stars but they also reveal new physical diagnostics, such as the electron fraction, to study the evolution of star-forming regions. While these first studies have been prototyped in nearby star-forming regions such as the famous Orion Cloud, the methods can eventually be applied to study star formation in nearby and distant galaxies, and even in the early universe.


Further information

An international team led by Jérôme Pety, Maryvonne Gerin, and Franck Le Petit obtained the most complete millimeter-wave observations of the Orion cloud. This IRAM large program, named ORION-B (Outstanding Radio-Imaging of OrioN B), produced 240,000 images of 1100 x 750 pixels (enough data to make a movie of about 2h45 at 24 frames per second). ORION-B project website

The interdisciplinary collaboration with Pierre Chainais, Jocelyn Chanussot, and Antoine Roueff has been built thanks to a joint CNRS research programme named astro-informatique.

Link to the CNRS press release (in French).



Jérôme Pety (IRAM)
Jocelyn Chanussot (GIPSA-LAB)
Pierre Chainais (CRIStAL Laboratory)
Pierre Gratier (Laboratoire d’Astrophysique de Bordeaux)

Karin Zacher (IRAM)


Quantitative inference of the H2 column densities from 3 mm molecular emission: A case study towards Orion B, Pierre Gratier, Jérôme Pety, Emeric Bron, Antoine Roueff, Jan H. Orkisz, Maryvonne Gerin, Victor de Souza Magalhaes, Mathilde Gaudel, Maxime Vono, Sébastien Bardeau, Jocelyn Chanussot, Pierre Chainais, Javier R. Goicoechea, Viviana V. Guzmán, Annie Hughes, Jouni Kainulainen, David Languignon, Jacques Le Bourlot, Franck Le Petit, François Levrier, Harvey Liszt, Nicolas Peretto, Evelyne Roueff, Albrecht Sievers. Accepted for publication in A&A, Nov. 2020.

Tracers of the ionization fraction in dense and translucent gas: I. Automated exploitation of massive astrochemical model grids, Bron, Emeric; Roueff, Evelyne; Gerin, Maryvonne; Pety, Jérôme; Gratier, Pierre; Le Petit, Franck; Guzman, Viviana; Orkisz, Jan H.; de Souza Magalhaes, Victor; Gaudel, Mathilde; Vono, Maxime; Bardeau, Sébastien; Chainais, Pierre; Goicoechea, Javier R.; Hughes, Annie; Kainulainen, Jouni; Languignon, David; Le Bourlot, Jacques; Levrier, François; Liszt, Harvey; Öberg, Karin; Peretto, Nicolas; Roueff, Antoine; Sievers, Albrecht. Accepted for publication in A&A, Nov. 2020.

C18O, 13CO, and 12CO abundances and excitation temperatures in the Orion B molecular cloud: An analysis of the precision achievable when modeling spectral line within the Local Thermodynamic Equilibrium approximation, Roueff, Antoine; Gerin, Maryvonne; Gratier, Pierre; Levrier, Francois; Pety, Jerome; Gaudel, Mathilde; Goicoechea, Javier R.; Orkisz, Jan H.; de Souza Magalhaes, Victor; Vono, Maxime; Bardeau, Sebastien; Bron, Emeric; Chanussot, Jocelyn; Chainais, Pierre; Guzman, Viviana V.; Hughes, Annie; Kainulainen, Jouni; Languignon, David; Le Bourlot, Jacques; Le Petit, Franck; Liszt, Harvey S.; Marchal, Antoine; Miville-Deschenes, Marc-Antoine; Peretto, Nicolas; Roueff, Evelyne; Sievers, Albrecht. Accepted for publication in A&A, Nov. 2020.

Additional material and related research

Discover the IRAM 30 meter telescope!
Beyond the appearances: The anatomy of the Orion Jedi revealed by radio-astronomy
Filaments around the Horsehead Nebula are still too young to form stars
Zooming into the skin of the Orion hunter