AFTA AI Somali Speech & Text Dataset License (SASR-L)
This license governs research-only access to AFTA AI’s Somali speech and text datasets, including audio, transcripts, lexicon data, and derived annotations. By requesting and using the Dataset, you agree to the terms below.
1. Definitions & scope
“AFTA AI” refers to the owner of the Somali Speech & Text Corpus and related resources. “Dataset” refers to all Somali audio, transcripts, text corpora, lexicon entries, morphology tables, and annotations provided under this license. “You” refers to the institution or individual granted access.
This license grants a limited, non-exclusive, non-transferable right to use the Dataset strictly for non-commercial research and evaluation, subject to the restrictions below.
2. Permitted research use
- Academic and institutional research on Somali ASR, TTS, NLP, and related fields.
- Model evaluation, benchmarking, and error analysis using the Dataset.
- Internal experiments and prototypes that are not deployed to the public.
- Publication of aggregate results and metrics with proper attribution to AFTA AI.
3. Prohibited uses
You agree not to:
- Redistribute, resell, or sub-license the Dataset or any substantial portion of it.
- Upload the Dataset to public platforms (e.g., Hugging Face, GitHub, public cloud buckets).
- Use the Dataset to train or improve competing Somali foundation models without a separate commercial license.
- Use the Dataset in products, commercial APIs, or customer-facing services.
- Attempt to remove or obfuscate any embedded metadata, watermarks, or data-protection measures.
4. Models, derivatives & ownership
- AFTA AI retains ownership of the Dataset and all intellectual property in it.
- You may train research models on the Dataset, but such models and checkpoints must be used only for non-commercial research unless a separate commercial agreement is signed.
- You may publish research papers and results based on experiments with the Dataset, provided that the Dataset itself is not shared and AFTA AI is clearly credited.
5. Storage, security & compliance
- Store the Dataset on secure systems with access limited to the approved research team.
- Implement reasonable technical and organizational measures to prevent unauthorized access.
- Delete or archive the Dataset if required by AFTA AI, applicable law, or your own institutional policies.
- Comply with all applicable data-protection, privacy, and export-control regulations in your jurisdiction.
6. Audit and reporting
AFTA AI may reasonably request a description of how the Dataset is stored and used, and may request logs or summaries to verify compliance with this license. You agree to cooperate in good faith with such requests.
7. Term & termination
- This license remains in effect until terminated by either party. AFTA AI may terminate the license if you breach any of its terms or if legal or policy changes require revocation.
- Upon termination, you must delete all copies of the Dataset and any direct derivatives that would allow reconstruction of the original data.
8. Disclaimer & limitation of liability
The Dataset is provided “as is,” without warranty of any kind, express or implied. AFTA AI does not guarantee accuracy, completeness, or fitness for a particular purpose. To the maximum extent permitted by law, AFTA AI is not liable for any damages arising from your use of the Dataset.
9. Contact & commercial licensing
For research questions or clarifications about this license, contact data@aftaai.com.
For commercial use, model training rights, or large-scale deployments based on the Dataset, contact enterprise@aftaai.com.
In any conflict between this web summary and a signed written agreement with AFTA AI, the signed agreement will prevail.