Diamond’s software architecture has evolved over the last twenty years to facilitate world-leading science across the facility, balancing competing demands for flexibility and high-throughput automated experiments. That evolution has been organic, however, and the architecture now carries significant technical debt and limitations. It was not designed to cope with the demands of the flagship beamlines, modern data access requirements, user expectations for remote access or the new operational regimes of Diamond-II.
Diamond’s software and computing systems span the full experiment lifecycle, from the moment a “Diamond User” applies for beamtime, through pre-experiment planning and the conduct of the experiment, to post-experiment analysis. They extend from instrument control, experiment management and data analysis to the deposition of data and metadata into the data archive and catalogue.
Significant advancement of Diamond's software, controls and computing capability will be required to get the most from the Diamond-II upgrade, maximising the scientific opportunities it will afford and the knowledge that can be gained. Experiments will become more complex and will be conducted at higher spatial and temporal resolution. Developments will be needed to:
Handle ever faster detectors, and deliver rapid data processing and reduction.
Support greater automation of experiments and the automation of data reduction and analysis.
Introduce and develop new data processing techniques, including the exploitation of recent developments in Artificial Intelligence and Machine Learning toolkits.
Provide a more open software environment, facilitating greater collaboration between software engineers and scientists.
Address obsolescence and modernise the beamline software stack.
Adapt to the changing needs and expectations of Diamond's users.
The software, controls and computing project will take full advantage of modern technologies. These range from the scientific Python ecosystem, used at every stage of an experiment from data acquisition through to data analysis, to streamlined, coherent web-based user interfaces covering everything from component control to data visualisation and manipulation. Development will focus on a set of core areas, identified by Diamond's Scientific Software, Controls and Computation department in a series of internal workshops during the summer of 2021 and explored in greater depth since:
High Performance Sample Stages: Some beamlines will experience a very large increase in flux, together with very high detector frame rates. Sample stages need to accommodate these rates by moving fast enough to conduct the necessary scans, which will require faster servo loops in the motion controller. Next-generation motor controllers will play a key role in delivering this, and the new high-performance sample stages will require fast control of detector triggers.
Detector Readout, Data Compression and Reduction: Detector data rates continue to increase significantly. Developments will be needed to stream data to data consumers and to allow detectors to write directly to memory data processors, bypassing file systems, which are becoming increasingly difficult to integrate reliably with high-speed detectors.
Modernisation of Data Acquisition Software Framework: The primary interface for users of Diamond is the data acquisition framework. It gives users a science perspective on a given instrument, and it both orchestrates experiments and manages data collection. Diamond-II users will have more diverse requirements, and the framework must adapt to support complex, multimodal data acquisition. Real-time visualisation and reconstruction will be key, supporting new, inexperienced users. Developments will move away from the existing solution to a modern service architecture and web-based paradigm, addressing obsolescence and providing greater usability and integration. This will also enable better support for remote operations.
Science Specific Data Analysis Software Developments: To address limitations in the existing software for computationally intensive science, applications will be accelerated for the benefit of the Ptychography, Tomography and Macromolecular Crystallography (MX) science domains. Faster and more robust algorithms, coupled with usability enhancements, will benefit Ptychography and Tomography experiments, while a migration to an in-memory architecture and the use of GPU/FPGA acceleration will bring the needed step change in performance to MX. All domains will benefit from real-time analysis and automatic processing of data through analysis pipelines.
Data Archiving: Diamond's current data archiving solutions will not scale effectively to cope with the order-of-magnitude change in data volumes expected with Diamond-II. The ingest capabilities of the archive need to be enhanced, taking into account the amount of data being stored, how it is moved in and out of the archive and how it is accessed by consumers. This will be a key enabler of better data processing services and will help meet the increasing need for FAIR and Open Data.
Post-visit Data Analysis Services: Transferring data from Diamond to home institutions for processing presents existing Diamond users with significant challenges, notably computing resources and software that are inadequate for the size and complexity of their data. Increasingly, new users expect to see results and not the raw data alone. Post-visit support must be provided across all science domains, not just the computationally intensive domains that have historically benefited from this provision. This goes hand in hand with the expectation that users can process and evaluate their data at Diamond fully remotely.
User Administration and Information Management: Automated remote sessions will increasingly become the norm. The metadata from the proposal process will need to be integrated with session allocation, sample registration and logistics, processing pipelines and the resulting visualisation and analysis of experimental results. Users will then be able to perform data mining and data analysis, with access to enhanced search capabilities, to fully exploit the value stored within metadata repositories. These changes will in part be realised by greater integration of Laboratory Information Management Systems and User Administration into the Data Catalogue.
The software, controls and computing project will provide the underlying capability needed to support the machine and the beamlines. Whilst the ultimate ambition is to harness the brightness of the new Diamond-II machine and enable new and exciting flagship beamline capabilities, the project will also deliver a continuous stream of incremental benefits to Diamond before the Diamond-II era: reducing technical debt, addressing critical obsolescence and unlocking new capabilities that can be deployed with greater flexibility and extensibility.