DATA COLLECTION
Speech datasets creation, evaluation, annotation and linguistic services
Linguistic Services
Dialect evaluation, phonology definition, lexicon and phoneme inventory creation, phonemic transcription, script proofing, audio QC: our teams of native linguists do it all.
Audio Data Collection
Whether we are casting, directing, and recording for virtual assistants, or gathering crowd-sourced speech from non-professional speakers, we deliver high-quality data that matches your requirements.
Annotation
Our teams of native speakers and linguists evaluate, annotate, label, and classify audio datasets to train your AI models to the highest standard, and help create cutting-edge voice applications.
With 10 years' experience supporting the world’s biggest tech companies, PTW offers a full range of dataset creation services for virtual assistants, TTS, and ASR. We handle everything from linguistics and collection to annotation and evaluation, in any language.
With over 17 offices worldwide and a large network of experts, we are the ideal partner for the tech industry.
For large-scale, crowd-sourced voice data collections across any demographic and locale, we have developed a fully managed, scalable remote production model built around our proprietary platform. Speakers record themselves on their personal devices and submit the audio to the cloud, while our internal teams manage the whole process to ensure we reach the right speakers and gather quality audio data quickly, efficiently, reliably, and securely.
40+ Languages and Locales
250+ Unique TTS Voices Cast and Recorded
30M+ Words Recorded
10 Years of Experience