High-quality, real-world speech datasets
Speech Collection Datasets
We create speech datasets, including transcripts, for machine learning purposes. Our service is used by technology teams looking to create new automatic speech recognition (ASR) models, or improve existing ones, using natural language processing (NLP) for select languages and various domains.
Each dataset can be created according to dialect, demographics, domain or any other required conditions.
Speech datasets for select languages and industries are available, and bespoke speech collection projects can be arranged on request.
Why Use Way With Words
99%+ Accurate
We produce highly accurate transcripts.
On Time
We complete your transcripts on time.
Data Compliant
We are fully compliant with GDPR and DPA 2018.
Priority Support
We answer all your questions as a priority.
Speech Collection Process
How it works
Our speech collection process can be customised to suit your needs.
STEP 1
Request a speech dataset
Submit your speech dataset requirements using our form below. We review your request and send a proposed job and price plan for approval.
STEP 2
We create a speech dataset
On acceptance, we proceed with your job, which involves recording the required speech and transcribing it.
STEP 3
Receive speech dataset
On completion, or at agreed intervals, we transfer your speech datasets (recordings with completed transcripts) to you.
Frequently Asked Questions about our Speech Collection Services
Who uses your Speech Collection service?
Our Speech Collection service is available to clients that want to create new automatic speech recognition models or improve existing ones. Off-the-shelf datasets are available for these purposes. They comprise unscripted, natural conversations conducted by participants who are recruited, trained, and approved to simulate real-world conversations in common domains. For custom datasets that require specific dialects, languages, domains or conventions, please get in touch to learn more.
Do you specialise in any languages or dialects?
Way With Words has completed Speech Collection projects across a range of English dialects, including Australian, Irish, Scottish, South African and Welsh. With a strong presence in Africa, we have also completed Speech Collection projects in languages such as Afrikaans, isiZulu and seSotho.
Which domains have your Speech Collection services included?
Way With Words has created datasets across many domains, including healthcare, insurance, telecom, finance, retail, fast food, travel, airline, and many more. Custom domains can be commissioned to exact client requirements.
Do you sign Service Level Agreements?
For ongoing work, we prefer to work with an SLA. The SLA sets out a clear timetable that includes an initialisation period to set up the required team and logistics for client work. The SLA also covers terms and conditions related to the work and data privacy. If a client requires ongoing work over an agreed period, Way With Words usually also provides a dedicated MTP team with management oversight, recruitment, selection, assessment and training processes, and any other logistical assistance needed to support the bespoke requirement.
Datasets Available for Purchase
Our speech data collection was planned, collected, annotated and curated with natural language processing best practice in mind.
Bespoke Speech Collection Projects Completed
Our Speech Collection service is used by clients to improve speech recognition and voice recognition technologies, services or platforms. Speech datasets are required to support and enable acoustic modelling and automated speech recognition.