Description
Over the last decade, many European Photon and Neutron (PaN) facilities have adopted open data policies, making their data available for the benefit of the entire scientific community. This open data has huge potential for training machine learning (ML) models, but only if it is machine-accessible and FAIR (findable, accessible, interoperable and reusable).
To understand where the PaN community stands regarding the ML-readiness of its open data, we organised a workshop in October 2023 at the SOLEIL synchrotron. The workshop brought together both sides of the table: data producers and data consumers.
In this presentation, we will report on the status, challenges and opportunities identified during the workshop. We will also present the roadmap that emerged from these discussions: a practical plan to improve our data policies, data and metadata quality checks, and current acknowledgment systems so that they are better suited to ML applications.
Finally, we will highlight the potential impact of the roadmap on creating a more collaborative and efficient research environment where open data and ML work hand in hand.