Blog

Explore the blog

Field Notes is the Way With Words blog: long-form guidance on human transcription, broadcast and corporate captioning, interview and research audio, and speech dataset design for ASR and conversational AI. We write for programme managers, researchers, legal and compliance teams, and product leaders who care about accuracy, turnaround, and defensible data handling.

Newest posts appear first. Browse by topic to follow a theme across many articles, or use search on this page to filter titles, descriptions, authors, and tags. When you are ready to price work or talk through scope, the service pages and contact form are linked from the site header and footer.

Showing 12 posts on this page (527 total)

Building Secure, Inclusive, and Effective Speaker Verification Systems

By Way With Words Team

How Does Speaker Verification Rely on Speech Corpora? The sound of a voice is becomi...

Why Is Paralinguistic Speech Data Crucial in Emotion Detection?

By Way With Words Team

As research continues and multilingual, real-world datasets expand, the potential of paralinguistic speech data will only grow.

Clinical Speech Data: The Voice of the Future in Medicine

By Way With Words Team

From voice biomarkers to automated transcription systems that free clinicians from paperwork, clinical speech data is unlocking new frontiers in diagnosis, monitoring, and patient care.

Challenge of Training Language Identification Speech Systems

By Way With Words Team

This article explores what speech data is used for language identification, the challenges of training such systems, and the industries that depend on them.

Use of Contextual Speech Corpora to Benefit Virtual Assistants

By Way With Words Team

How Do Virtual Assistants Benefit from Contextual Speech Corpora? How to Create Virtual Assistants That Feel Truly Intelligent. Voice assistants have moved...

Training Chatbots: The Critical Role of Speech Data

By Way With Words Team

Chatbots and voice assistants are woven into the fabric of daily life, from guiding us through customer service queries to helping us control smart devices with simple spoken commands.

Importance of Labelling Non-Verbal Events in Speech Data

By Way With Words Team

Non-verbal audio events carry layers of meaning, so labelling them properly is a foundational task in modern speech data annotation.

How Do You Prevent Overfitting in Speech Dataset Design?

By Way With Words Team

One of the most persistent challenges for speech model developers and data scientists is preventing models from overfitting to their speech data.

Audio Recording in the Field: Follow Proven Best Practices

By Way With Words Team

This article explores the key areas of field audio recording, from pre-recording planning and equipment selection to managing conditions, ensuring data safety, and respecting ethics.

Can Open-Source Tools Reliably Collect Quality Audio?

By Way With Words Team

This article explores the strengths and weaknesses of open-source tools, and evaluates their performance across different requirements.

Designing an Effective Semi-supervised Speech Data Pipeline

By Way With Words Team

In a semi-supervised speech data setup, a portion of the dataset is labelled by humans, while a much larger portion remains unlabelled.

How Do You Anonymise Voice Data Samples?

By Way With Words Team

To anonymise voice data properly, several categories must be considered, including speaker identity, spoken content, contextual audio clues, and vocal biometrics.
