Description
4 talks, 20+5 each
Presentation materials
Numerical algorithms and computational tools are essential for managing and analyzing complex data processing tasks. With growing metadata awareness and the rise of parameter-driven simulations, the demand for reliable, automated workflows that reproduce computational experiments across platforms has grown.
In general, computational workflows describe the complex multi-step methods that are used...
Dark data is data that is poorly managed [1, 2]. It is diametrically opposed to FAIR data because its epistemic status is unclear, and it is neither findable, accessible, interoperable, nor reusable. For example, research data may be uncurated, unavailable, unannotated, biased, or incomplete. Examples of dark data in scientific computing include the vast amounts of data that are held...
Computer experiments are becoming an essential part of pure math fields, such as combinatorics, commutative algebra and algebraic geometry. We discuss the arising challenges and the work of the task area on computer algebra of MaRDI.
It is often difficult to reproduce computational experiments from papers due to a lack of detail in how such experiments are documented. Even when researchers publish their code alongside a paper, key information is often not well documented: *What version of an external software library was used? What value should be given to an undocumented model parameter? Which specific version of the...
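The questions raised in this abstract (library versions, parameter values) can be illustrated with a minimal sketch that writes such information next to the results. This is not the approach of the talk, whose abstract is truncated here; the function name `capture_provenance`, the parameter names, and the package names are hypothetical and serve only as an illustration.

```python
import json
import platform
from importlib import metadata


def capture_provenance(parameters, packages):
    """Collect interpreter, platform, package versions and run parameters.

    Hypothetical helper for illustration only: it records the kind of
    information that is often missing when code is published with a paper.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly instead of failing.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
        "parameters": dict(parameters),
    }


# Example run with made-up parameters and package names.
record = capture_provenance(
    {"tolerance": 1e-8, "seed": 42},
    ["pip", "hypothetical-missing-package-xyz"],
)
print(json.dumps(record, indent=2, sort_keys=True))
```

Storing such a record together with the output files lets a later reader see which versions and parameter values produced a given result.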
Ontologies store semantic knowledge in a machine-readable way and represent domain knowledge in a controlled vocabulary. Scientific results are often published in text form, which hinders research data FAIRness. Using natural language processing (NLP), concept names and relations can be extracted from text datasets.
A workflow to process scientific text corpora is introduced...
We consider graph modeling for a knowledge graph for vehicle development, with a focus on crash safety. An organized schema that incorporates information from various structured and unstructured data sources is provided, which includes relevant concepts within the domain. In particular, we propose semantics for crash computer-aided engineering (CAE) data, which enables searchability,...
Machine learning research should be easily accessible and reusable. OpenML is an open platform for sharing datasets, algorithms, and experiments. mlr3 is an open-source collection of R packages providing a unified interface for machine learning in the R language. One of the projects in MaRDI task area 3 (statistics and machine learning) was the interface package mlr3oml which allows for...
With the complexity of the involved algorithms and software packages, reproducibility of numerical simulations is often difficult to achieve. This makes it harder to collaborate on research projects, since there can be a considerable ramp-up time for new project members before they are able to contribute to a joint code base. Julia is a modern, dynamic programming language designed for...
Convex hull computations are an essential part of many scientific calculations. We present an experiment written in Julia involving convex hull computations done with two different types of data, floats and rationals. A comparison of the results shows that using floats leads to the loss of the combinatorics of the experiment.
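The effect described in this abstract can be sketched in a few lines. The talk's experiment is written in Julia; the following is an independent Python illustration, not the speakers' code. It assumes three hypothetical input points given as decimal strings and runs the same monotone-chain hull routine on them twice, once as exact rationals (`fractions.Fraction`) and once as binary floats.

```python
from fractions import Fraction


def cross(o, a, b):
    """Orientation test: > 0 for a left turn o -> a -> b, 0 if collinear."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone-chain convex hull; returns vertices in counterclockwise order.

    Works for any number type that supports +, -, * and comparison.
    Collinear points are not reported as vertices.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half_hull(seq):
        chain = []
        for p in seq:
            # Drop the last point while it does not make a strict left turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]


# Hypothetical input: three points on the line y = x + 1/10.
raw = [("0.1", "0.2"), ("0.2", "0.3"), ("0.3", "0.4")]

exact_points = [(Fraction(x), Fraction(y)) for x, y in raw]
float_points = [(float(x), float(y)) for x, y in raw]

exact_hull = convex_hull(exact_points)
float_hull = convex_hull(float_points)

print("vertices with rationals:", len(exact_hull))
print("vertices with floats:   ", len(float_hull))
```

With rationals the three points are recognized as collinear, so the hull degenerates to a segment with two vertices. After rounding the decimal coordinates to binary floats the points are no longer exactly collinear, and the hull becomes a thin triangle with three vertices, so the combinatorial type of the result changes.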
Reproducible research results are vital for scientific quality assurance and for building a reliable foundation for sustainable research. The discussion of this issue intensified when investigations of reproducibility showed that few scientific publications, across many research fields, allow the published results to be reproduced. This reproducibility crisis is well known within the...