High quality, real world speech datasets

African Speech Collection Datasets

We have spent the last 2.5 years creating proprietary speech collection datasets. During that time we perfected our process and decided to create a collection of our own, now available under a Licence Agreement.

For our first off-the-shelf collection, we decided to focus on languages that are underrepresented in the AI space. With strong foundations in Africa, we are invested in its future and incredible potential.

This speech data collection was planned, collected, annotated and curated with natural language processing best practice in mind.

Speech Collection for AI training

Off-the-shelf

African speech datasets available for purchase

For samples and for more information about our collections, select the language of your choice from the options below.

Afrikaans Call Recording
Scottish Accented English Speech Collection

Why Use Way With Words

99%+ accurate transcripts

99%+ Accurate

We produce highly accurate transcripts.

We deliver on time

On Time

We complete your transcripts on time.

We are Data Compliant

Data Compliant

We are fully GDPR and DPA 2018 Compliant.

Client Support

Priority Support

We answer all your questions as a priority.

CONTACT SALES

Frequently Asked Questions about our Speech Collection Services

How are your dataset recordings structured?

Our off-the-shelf dataset collections comprise unscripted, natural conversations conducted by call recorders who are recruited, trained, and approved to simulate real-world conversations in common domains. This means recordings and transcripts include routine security verifications such as ID, email, and phone number validation.

How do you recruit for Speech Collection datasets?

Our priority is to create datasets that are unbiased and cover as wide a range of demographics as possible. This is the first consideration when we begin the planning and recruitment process of any Speech Collection dataset project. 

What kind of agreement is in place for the purchase of this Speech Collection dataset?

A Licence Agreement governs the sale and usage of this Speech Collection dataset. Our off-the-shelf options let clients test and benchmark before considering larger, custom commitments that are better suited to their requirements and conventions.

Why consider Way With Words for Speech Collection datasets?

Way With Words has produced thousands of hours of bespoke Speech Collection datasets, which are unfortunately not available under Licence Agreement. We created this off-the-shelf dataset to demonstrate our abilities, as we believe we can offer tremendous value on custom collections delivered exactly to client specification.