The model is a causal language model: given an input sequence, it is trained to predict the next token. It was trained on approximately 50 billion German tokens, starting from the multilingual BLOOM model and using the CLP-Transfer method (Ostendorff and Rehm, 2023). The model checkpoint can be downloaded freely from Hugging Face.
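To illustrate what "predicting the next token" means in practice, here is a minimal sketch of greedy autoregressive decoding. It does not use the actual model: the score table `TOY_LOGITS`, the tokens in it, and the helper names `softmax` and `generate` are hypothetical and chosen only for illustration. A real causal language model conditions on the entire preceding sequence and computes its scores with a neural network.

```python
import math

# Hypothetical toy "model": maps the last token to scores (logits) for the
# next token. A real causal LM conditions on the whole prefix instead.
TOY_LOGITS = {
    "<s>": {"Das": 2.0, "Ein": 1.0},
    "Das": {"Modell": 2.5, "Haus": 0.5},
    "Modell": {"schreibt": 1.5, "</s>": 0.2},
    "schreibt": {"Text": 2.0, "</s>": 1.0},
    "Text": {"</s>": 3.0},
}


def softmax(logits):
    """Turn raw scores into a probability distribution over next tokens."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def generate(prompt, max_new_tokens=10):
    """Greedy decoding: repeatedly append the most probable next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(TOY_LOGITS[tokens[-1]])
        next_token = max(probs, key=probs.get)
        if next_token == "</s>":  # end-of-sequence marker stops generation
            break
        tokens.append(next_token)
    return tokens


# Drop the start marker before printing.
print(" ".join(generate(["<s>", "Das"])[1:]))  # prints "Das Modell schreibt Text"
```

Greedy selection is only one possible decoding strategy; sampling from the probabilities returned by `softmax` would follow the same loop with a different choice of `next_token`.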
OpenGPT-X is part of the Gaia-X infrastructure and funded by the German Federal Ministry for Economic Affairs and Climate Action.