As in all scientific disciplines, research data in mathematics has become vast; it is complex and multifaceted, and, through the successful application of mathematics in interdisciplinary research, it is widespread across the scientific landscape. It ranges from information bases such as standard reference data for special functions, tables, and similar mathematical objects to highly complex...
Within the collaborative research center "CRC 1456 - Mathematics of Experiment" of the German Research Foundation (DFG), several research groups from the natural sciences and mathematics jointly work on measurements and on extracting the most information from them. These measurement data come from different types of measurements, ranging from nanoscale imaging to observations of the Sun. As the...
Numerical algorithms and computational tools are essential for managing and analyzing complex data-processing tasks. With increasing metadata awareness and parameter-driven simulations, the demand for reliable, automated workflows that reproduce computational experiments across platforms has grown.
In general, computational workflows describe the complex multi-step methods that are used...
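The idea of such a multi-step, parameter-driven workflow can be illustrated with a minimal sketch. The step names and parameter values below are invented for this example; real workflow systems add provenance tracking, caching, and cross-platform execution on top of this basic pattern.

```python
"""Minimal sketch of a parameter-driven, multi-step workflow.
All step names and parameters are invented for illustration."""

def preprocess(data, scale):
    # Step 1: scale the raw measurements.
    return [x * scale for x in data]

def analyze(data, threshold):
    # Step 2: keep only values above a threshold.
    return [x for x in data if x > threshold]

def run_workflow(raw, params):
    # Execute the steps in order and record the parameters used,
    # so the run can be reproduced later from the log alone.
    log = {"params": params, "steps": []}
    data = preprocess(raw, params["scale"])
    log["steps"].append("preprocess")
    data = analyze(data, params["threshold"])
    log["steps"].append("analyze")
    return data, log

result, log = run_workflow([1, 2, 3, 4], {"scale": 2.0, "threshold": 4.0})
print(result)        # -> [6.0, 8.0]
print(log["steps"])  # provenance: which steps ran, with which parameters
```

Keeping the parameters in one explicit dictionary, rather than scattered across scripts, is what makes the run reproducible from its log.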
Dark data is data that is poorly managed [1, 2]. It is diametrically opposed to FAIR data because its epistemic status is unclear, and it is neither findable, accessible, interoperable, nor reusable. For example, research data may be uncurated, unavailable, unannotated, biased, or incomplete. Examples of dark data in scientific computing include the vast amounts of data that are held...
Computer experiments are becoming an essential part of pure mathematics, in fields such as combinatorics, commutative algebra, and algebraic geometry. We discuss the challenges that arise and the work of MaRDI's task area on computer algebra.
It is often difficult to reproduce computational experiments from papers due to a lack of detail in how such experiments are documented. Even when researchers publish their code alongside a paper, key information is often not well documented: *What version of an external software library was used? What value should be given to an undocumented model parameter? Which specific version of the...
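One way to capture the kind of information this abstract lists is a small machine-readable manifest published with the code. The sketch below is a hedged illustration only: the manifest layout, field names, and the pinned library version are invented for this example, not any standard format.

```python
"""Sketch of recording library versions, parameter values, and the
runtime environment alongside an experiment. The manifest layout is
an invented example, not a standard."""

import json
import platform
import sys

def experiment_manifest(parameters, libraries):
    # Collect enough context that someone else can re-run the experiment.
    return {
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "libraries": libraries,    # pinned versions of external dependencies
        "parameters": parameters,  # every model parameter, even "obvious" ones
    }

manifest = experiment_manifest(
    parameters={"seed": 42, "tolerance": 1e-8},
    libraries={"numpy": "1.26.4"},  # hypothetical pinned dependency
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Shipping such a file with the code answers the questions above without relying on the paper's prose.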
Ontologies store semantic knowledge in a machine-readable way and represent domain knowledge in a controlled vocabulary. Scientific results are often published in text form, which hinders research data FAIRness. Using natural language processing (NLP), concept names and relations can be extracted from text datasets.
A workflow for processing scientific text corpora is introduced...
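As a deliberately simple illustration of the extraction idea, the sketch below uses a regular expression as a stand-in for a real NLP model: runs of capitalized words are treated as candidate concept names. Both the pattern and the sample sentence are invented for this example.

```python
"""Toy illustration of extracting candidate concept names from text.
A regex heuristic stands in for a trained NLP model."""

import re

def extract_concepts(text):
    # Heuristic: two or more consecutive capitalized words form a candidate.
    pattern = r"\b(?:[A-Z][a-z]+)(?:\s+[A-Z][a-z]+)+\b"
    return re.findall(pattern, text)

sample = ("Ontologies such as the Mathematics Subject Classification "
          "organize domain knowledge; Natural Language Processing can "
          "suggest new concept entries from text.")
print(extract_concepts(sample))
# -> ['Mathematics Subject Classification', 'Natural Language Processing']
```

Real pipelines replace the regex with trained models, but the output shape (candidate concept names for an ontology) is the same.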
We consider graph modeling for a knowledge graph for vehicle development, with a focus on crash safety. We provide an organized schema that incorporates information from various structured and unstructured data sources and covers the relevant concepts of the domain. In particular, we propose semantics for crash computer-aided engineering (CAE) data, which enables searchability,...
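The schema-plus-searchability idea can be sketched with a minimal triple store. The concept and relation names below are invented illustrations, not the schema actually proposed for the crash-safety knowledge graph.

```python
"""Minimal sketch of a triple-based knowledge graph with pattern queries.
All entity and relation names are invented examples."""

triples = [
    ("CrashSimulation_01", "usesModel", "VehicleModel_A"),
    ("CrashSimulation_01", "hasLoadCase", "FrontalImpact"),
    ("VehicleModel_A", "hasComponent", "Bumper"),
]

def find(graph, subject=None, predicate=None):
    # Pattern query over the triples: None acts as a wildcard.
    # This is the mechanism that makes the graph searchable.
    return [(s, p, o) for (s, p, o) in graph
            if (subject is None or s == subject)
            and (predicate is None or p == predicate)]

# All facts recorded about one simulation run:
print(find(triples, subject="CrashSimulation_01"))
```

Graph databases and SPARQL endpoints generalize exactly this query pattern to large, heterogeneous data sources.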
With research software relied upon heavily both in industry and in scientific simulations, research software sustainability is increasingly becoming a major concern. A necessary but not sufficient aspect of software sustainability is Continuous Integration and Benchmarking (CI/CB/Cx). In addition, software flexibility to support newer HPC hardware as well as modern,...
Machine learning research should be easily accessible and reusable. OpenML is an open platform for sharing datasets, algorithms, and experiments. mlr3 is an open-source collection of R packages providing a unified interface for machine learning in the R language. One of the projects in MaRDI task area 3 (statistics and machine learning) was the interface package mlr3oml, which allows for...
Given the complexity of the algorithms and software packages involved, reproducibility of numerical simulations is often difficult to achieve. This makes it harder to collaborate on research projects, since there can be a considerable ramp-up time for new project members before they are able to contribute to a joint code base. Julia is a modern, dynamic programming language designed for...
In my talk, I will present the library deal.II, an open-source software project aimed at the rapid development of simulation codes for partial differential equations based on the finite element method. The guiding principle of deal.II is to provide functions for the main building blocks of a solver that a user code can then combine and extend in an application-specific way. I will then give insight...
Emerging extreme-scale architectures with ever-higher performance potential provide developers of application codes, including multiphysics models and coupled simulation and data analytics, with unprecedented resources for larger simulations that achieve more accurate solutions than ever before. Achieving high performance on these new heterogeneous architectures requires...
[preCICE][1] is an open-source coupling software for partitioned multi-physics and multi-scale simulations. Thanks to the software's library approach (the simulations call the coupling) and its high-level API, only minimally invasive changes are required to prepare an existing (legacy) simulation software for coupling. Moreover, ready-to-use adapters for many popular simulation software...
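The partitioned-coupling idea, in which two separate solvers exchange interface values until they agree, can be illustrated with a toy fixed-point iteration. The scalar "solvers" below are invented stand-ins for real simulation codes, and this sketch does not use the preCICE API.

```python
"""Toy fixed-point iteration illustrating partitioned coupling:
two solvers exchange an interface value until it converges.
The scalar solvers are invented stand-ins, not real simulations."""

def solver_a(interface_value):
    # Toy "fluid" solver: responds to the other side's interface value.
    return 0.5 * interface_value + 1.0

def solver_b(interface_value):
    # Toy "structure" solver: responds to solver_a's result.
    return 0.5 * interface_value

def couple(tol=1e-10, max_iters=100):
    # Iterate until both solvers agree on the interface value.
    u = 0.0
    for _ in range(max_iters):
        f = solver_a(u)      # A computes with B's last interface value
        u_new = solver_b(f)  # B computes with A's result
        if abs(u_new - u) < tol:
            return u_new
        u = u_new
    return u

print(couple())  # converges to the fixed point u = 2/3
```

A coupling library takes over exactly this exchange-and-iterate loop, plus data mapping and communication, so that the two codes never need to know about each other directly.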
Convex hull computations are an essential part of many scientific calculations. We present an experiment, written in Julia, involving convex hull computations with two different number types, floats and rationals. A comparison of the results shows that using floats loses the combinatorial structure that the exact rational computation preserves.
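The effect can be reproduced with a toy 2D convex hull. The sketch below uses Python's fractions module rather than Julia, and the point set is invented: three points lie exactly on a line, and with exact rationals the middle one is recognized as redundant, while a float rounding error makes it look like a genuine hull vertex, changing the combinatorics of the result.

```python
"""Float vs. rational convex hulls (Andrew's monotone chain).
The point set is constructed so that a float rounding error in the
orientation test changes the number of hull vertices."""

from fractions import Fraction

def cross(o, a, b):
    # Orientation test: > 0 left turn, < 0 right turn, == 0 collinear.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    # Andrew's monotone chain; cross <= 0 also discards collinear points.
    pts = sorted(points)
    hull = []
    for seq in (pts, pts[::-1]):   # build lower chain, then upper chain
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        hull += chain[:-1]         # chain endpoints are shared, drop one
    return hull

# (0, 0), (1, 0.3), (3, 0.9) are collinear; (1.5, 2) lies above the line.
float_pts = [(0.0, 0.0), (1.0, 0.3), (3.0, 0.9), (1.5, 2.0)]
exact_pts = [(Fraction(0), Fraction(0)),
             (Fraction(1), Fraction("0.3")),
             (Fraction(3), Fraction("0.9")),
             (Fraction("1.5"), Fraction(2))]

# In floats, 0.3 * 3 != 0.9, so the orientation test returns a tiny
# positive value and keeps the collinear point as a "vertex".
print(len(convex_hull(float_pts)))  # -> 4
print(len(convex_hull(exact_pts)))  # -> 3
```

The two hulls describe different combinatorial objects even though the input points are "the same", which is exactly the discrepancy the experiment exposes.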
Reproducible research results are vital for scientific quality assurance and for building a reliable foundation for sustainable research. The discussion of this issue accelerated when investigations of reproducibility showed that only few scientific publications across many research fields allow the published results to be reproduced. This reproducibility crisis is well known within the...