Tether has unveiled QVAC Genesis II, expanding the pool of publicly available datasets for artificial intelligence training with the introduction of 10 new educational domains.
The new launch builds on Tether’s earlier release, QVAC Genesis I, which introduced education-focused synthetic datasets for STEM disciplines across nine educational domains.
QVAC Genesis II extends the dataset into additional fields, including chemistry, computer science, statistics, machine learning, astronomy, geography, econometrics, and electrical engineering.
The latest release adds 107 billion new tokens, bringing the combined QVAC Genesis I and II datasets to a total of 148 billion tokens across 19 educational domains.
Together, the Genesis I and II releases constitute one of the most comprehensive publicly available synthetic educational datasets.
More Than a Scale Release
Tether presents QVAC Genesis II as more than a large-scale data release: its datasets are “designed to teach models how to think, reason, and explain, grounding intelligence in understanding rather than imitation.”
Paolo Ardoino, CEO of Tether, emphasized the importance of training AI systems on datasets that enable explainability and reasoning.
“With this release, we’re pushing beyond volume toward structure, reasoning, and clarity. Intelligence should be built on understanding why something is true, not just predicting what sounds right. By making this dataset open, we’re giving researchers and builders the tools to develop AI that is more reliable, more explainable, and ultimately more useful to society.”

Option-Level Reasoning
At the core of QVAC Genesis II’s dataset design is a new data-generation approach known as Option-Level Reasoning (OLR).
The approach reasons explicitly about every answer option in a multiple-choice question.
Rather than focusing only on the outcome, whether correct or wrong, OLR extracts structured insights from all available options and explicitly explains the common misconceptions behind the incorrect ones.
By doing so, the option-level reasoning approach enables deeper understanding, clarity, and logical consistency, not just surface-level correctness.
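To make the idea concrete, here is a minimal Python sketch of what an option-level record could look like. It is an illustration only: the `Option` class, its field names, and the `render_olr_explanation` helper are hypothetical and are not taken from the QVAC Genesis II dataset schema or from Tether's pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Option:
    """One answer choice (hypothetical schema, not the dataset's real format)."""
    label: str        # e.g. "A"
    text: str         # the answer text shown to the student
    is_correct: bool
    rationale: str    # why it is right, or which misconception it reflects


def render_olr_explanation(question: str, options: List[Option]) -> str:
    """Build one explanation that addresses every option, not only the correct one."""
    lines = [f"Question: {question}"]
    for opt in options:
        verdict = "Correct" if opt.is_correct else "Incorrect"
        lines.append(f"{opt.label}) {opt.text} -> {verdict}: {opt.rationale}")
    # State the final answer once, after all options have been discussed.
    answer = next(o.label for o in options if o.is_correct)
    lines.append(f"Answer: {answer}")
    return "\n".join(lines)


example = render_olr_explanation(
    "What is the time complexity of binary search on a sorted array of n items?",
    [
        Option("A", "O(n)", False, "confuses binary search with a linear scan"),
        Option("B", "O(log n)", True, "the search range halves at every step"),
        Option("C", "O(n log n)", False, "this is the cost of sorting, not searching"),
    ],
)
print(example)
```

The point of the sketch is that the wrong options carry as much explanatory text as the right one, which is the property the article attributes to OLR.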
Dual-Method Generation Pipeline
The QVAC Genesis II training datasets combine Option-Level Reasoning with Failure Analysis, the method originally adopted in QVAC Genesis I.
Failure Analysis is a systematic approach that generates educational content explaining why incorrect answers fail and how correct solutions can be derived. This method was designed to ensure that AI models trained on Genesis I consistently produce outputs with educational value.
However, integrating option-level reasoning helps reduce the bias that failure-focused approaches may introduce, while increasing dataset diversity and reasoning depth.
Improved Accuracy and Valid Answer Rate
According to independent evaluations, models trained on QVAC Genesis II demonstrated higher reasoning accuracy and produced clearer, less ambiguous responses than those trained on earlier synthetic datasets.
Models trained exclusively using Option-Level Reasoning achieved an average accuracy score of 29.91, compared with 21.76 for models trained using Failure Analysis alone. When Failure Analysis and Option-Level Reasoning were combined, accuracy improved further, reaching an average of 30.40.
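For readers who want the gaps in relative terms, the short Python calculation below works them out from the three reported averages. The scores come from the article; the `relative_gain` helper is simply an illustrative name, and the article does not state the unit or scale of the scores, so the differences are labeled as points.

```python
# Average accuracy scores as reported in the article.
failure_analysis = 21.76
option_level = 29.91
combined = 30.40


def relative_gain(new: float, baseline: float) -> float:
    """Percentage improvement of `new` over `baseline`."""
    return (new - baseline) / baseline * 100


olr_gain = relative_gain(option_level, failure_analysis)
combined_gain = relative_gain(combined, failure_analysis)

print(f"OLR vs failure analysis: +{option_level - failure_analysis:.2f} points ({olr_gain:.1f}%)")
print(f"Combined vs failure analysis: +{combined - failure_analysis:.2f} points ({combined_gain:.1f}%)")
```

On these figures, Option-Level Reasoning alone is 8.15 points (about 37.5%) above the Failure Analysis baseline, and the combined approach is 8.64 points (about 39.7%) above it, so most of the reported improvement comes from OLR itself.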
In a Valid Answer Rate evaluation, models trained with Option-Level Reasoning achieved a near-perfect 98.44%, indicating strong structural and semantic consistency.
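The article does not spell out how Valid Answer Rate is computed. The Python sketch below shows one plausible reading, under the assumption that a response counts as valid when exactly one answer label can be extracted from it. The `Answer: X` response format, the `extract_answer` and `score` functions, and the sample responses are all hypothetical and not drawn from Tether's evaluation harness.

```python
import re
from typing import List, Optional, Tuple


def extract_answer(response: str) -> Optional[str]:
    """Return the single option label in a response, or None if missing or ambiguous."""
    found = set(re.findall(r"\bAnswer:\s*([A-D])\b", response))
    return found.pop() if len(found) == 1 else None


def score(responses: List[str], gold: List[str]) -> Tuple[float, float]:
    """Return (valid_answer_rate, accuracy) as percentages of all responses."""
    extracted = [extract_answer(r) for r in responses]
    valid = sum(a is not None for a in extracted)
    correct = sum(a == g for a, g in zip(extracted, gold))
    n = len(responses)
    return 100 * valid / n, 100 * correct / n


responses = [
    "The range halves each step. Answer: B",      # valid and correct
    "Answer: A",                                  # valid but wrong
    "It could be B or C depending on context.",   # no extractable answer
    "Answer: A ... actually, Answer: C",          # ambiguous: two labels
]
valid_rate, accuracy = score(responses, gold=["B", "C", "B", "C"])
print(f"Valid Answer Rate: {valid_rate:.2f}%  Accuracy: {accuracy:.2f}%")
```

Under this reading the two metrics measure different things: a model can return a well-formed answer almost every time and still choose the wrong option, which is why the article reports Valid Answer Rate separately from accuracy.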
Towards Empowering the Global AI Community
As adoption of AI models grows, access to structured, high-quality training datasets is essential to ensuring reliable model behavior and accurate results.
The QVAC Genesis II dataset, like its predecessor, is publicly available to support researchers, academic institutions, and independent developers engaged in the research and development of AI models.
By releasing structured and publicly accessible datasets, Tether aims to reduce barriers to innovation and promote access to high-quality AI training resources for the global artificial intelligence community.