Speech to text api

The long content API allows running longer and larger jobs. To be fault-tolerant, it does not require you to keep a single HTTPS connection open for a long period of time. Instead, you make several short requests. This integration is more complicated than the short content API, but it allows for better resilience and longer processing.

To create an audio file using the long content API, execute the following steps:

Request an audio build, which will provide you with a status URL.

Poll the status URL periodically until the build finishes.

For error responses, the response type will be application/json, and the body of the response will be a JSON object containing more information about the error.
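The request-then-poll flow of the long content API can be sketched in Python as follows. This is a minimal illustration, not the official client: the status document is assumed to be JSON with a boolean `finished` field, and the function names, polling interval and attempt limit are hypothetical choices.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_build(
    status_url: str,
    fetch: Callable[[str], dict] = fetch_json,
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll status_url until the build reports that it has finished.

    Assumption: the status document contains a boolean "finished" field.
    Adjust the check to match the real response format.
    """
    for _ in range(max_attempts):
        status = fetch(status_url)
        if status.get("finished"):
            return status
        # Each poll is a separate short request, so no connection stays open.
        sleep(interval_seconds)
    raise TimeoutError(f"build did not finish after {max_attempts} polls")
```

Passing `fetch` and `sleep` as parameters keeps the loop easy to exercise without network access, and the attempt limit prevents a stuck build from polling forever.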

Speech to text api code#

If there is an error during audio conversion, the short content API will report the error in the immediate response. The response will have status code 400 (for user errors) or 500 (for server errors).
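A caller might turn such an error response into a readable message along these lines. This is only a sketch: the function name is hypothetical, and because the field names inside the JSON error object are not listed on this page, the whole object is printed rather than any specific field.

```python
import json


def describe_error(status_code: int, body: bytes) -> str:
    """Summarise an error response from its status code and JSON body."""
    if status_code == 400:
        kind = "user error"
    elif status_code == 500:
        kind = "server error"
    else:
        kind = "unexpected status"
    try:
        details = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON.
        details = {}
    return f"{kind} ({status_code}): {json.dumps(details, sort_keys=True)}"
```

Distinguishing 400 from 500 matters for retries: a user error usually needs a changed request, while a server error may succeed if the same request is sent again later.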

Speech to text api how to#

To use the API, you will need a Narakeet API key. You should provide the API key as a header to all requests to the public REST endpoints, using the x-api-key header.

The short content API requires just one request, and returns the audio as a binary stream. To request an audio file build, use one of the endpoints, and:

Provide your API key in the x-api-key header.

Specify an accept header with the value application/octet-stream.

In the request body, provide a UTF-8 encoded script text.

The snippet below will generate an M4A file using the text “Hi there, this is your API speaking”, and save it to result.m4a. The endpoint URL is not shown here, so $ENDPOINT_URL stands in for the address of the M4A endpoint.

curl -d "Hi there, this is your API speaking" -H 'Content-Type: text/plain' -H "x-api-key: $APIKEY" -H "accept: application/octet-stream" --output result.m4a "$ENDPOINT_URL"

See Configuring Audio Tasks for information on selecting the voice and adjusting the reading speed.

NodeJS/JavaScript example

For a simple example of how to access the short content (streaming) API from JavaScript/NodeJS, check out.
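The same short content request can be sketched in Python using only the standard library. This is an illustration under assumptions rather than an official client: `ENDPOINT_URL` is a hypothetical placeholder that must be replaced with the real endpoint address, and the function names are made up for this example.

```python
import urllib.request

# Hypothetical placeholder: substitute the real M4A endpoint URL.
ENDPOINT_URL = "https://example.invalid/text-to-speech/m4a"


def build_streaming_request(
    api_key: str, script: str, url: str = ENDPOINT_URL
) -> urllib.request.Request:
    """Prepare (but do not send) a short content API request."""
    return urllib.request.Request(
        url,
        data=script.encode("utf-8"),  # UTF-8 encoded script text in the body
        headers={
            "x-api-key": api_key,
            "accept": "application/octet-stream",  # selects the streaming API
            "Content-Type": "text/plain",
        },
        method="POST",
    )


def save_audio(request: urllib.request.Request, path: str) -> None:
    """Send the request and write the binary audio stream to a file."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

With a real endpoint URL and key, `save_audio(build_streaming_request(key, "Hi there, this is your API speaking"), "result.m4a")` would mirror the curl call. Building the request separately from sending it makes the headers easy to inspect.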

Here is a quick summary of the limitations and differences between the APIs. When executing the requests, you select the API with the accept header. If you provide application/octet-stream as the accept header, the short content (streaming) API will be used, and you will get the result back as a binary stream. If you do not provide the accept header, the long content (polling) API will be used, and you will get back a status URL that you can poll for results.

There are three endpoints for audio project build requests, which produce different output formats:

The WAV endpoint creates uncompressed 16-bit PCM wav files (highest quality, largest file).

The MP3 endpoint creates compressed MP3 files (smaller file, good quality).

The M4A endpoint creates compressed MPEG-4 files (best combination of file size and quality).

Note that the WAV endpoint only works for the long content (polling) API. M4A and MP3 endpoints support both short content API (streaming) and long content API (JSON polling).
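The rules for selecting an API and output format can be captured in a small helper. The sketch below is hypothetical (the function and constant names are not part of the API); it only encodes what is stated here: the accept header selects streaming, omitting it selects polling, and WAV is available for polling only.

```python
STREAMING_FORMATS = {"m4a", "mp3"}
POLLING_FORMATS = {"m4a", "mp3", "wav"}


def request_headers(api_key: str, output_format: str, streaming: bool) -> dict:
    """Return the headers for a build request, rejecting unsupported combinations."""
    fmt = output_format.lower()
    if fmt not in POLLING_FORMATS:
        raise ValueError(f"unknown output format: {output_format}")
    if streaming and fmt not in STREAMING_FORMATS:
        raise ValueError("WAV output is only available from the polling API")
    headers = {"x-api-key": api_key}
    if streaming:
        # Present: short content (streaming) API. Absent: long content (polling) API.
        headers["accept"] = "application/octet-stream"
    return headers
```

Checking the combination before sending avoids a round trip that the service would reject anyway.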

Speech to text api professional#

Our Text to Speech API allows you to automatically generate audio in 70+ languages, with 400+ voices. You can batch-produce audio files from external content, integrate our realistic text to speech voices into your software, and a lot more. This page explains how to use our text to speech API to create audio files.

NOTE: The easiest way to batch-convert text is to use our command-line utility. This page contains information for people who want to build their own integration.

Choose between Streaming or Polling API.

Configuring other options (voice speed/volume…).

Narakeet has two ways of integrating with the Text to speech API:

Short content (streaming) API is simpler, faster, but restricted to relatively short content.

Long content (JSON polling) API is more complex but allows significantly larger and longer conversions.

If you want to build audio on the fly for short sentences, such as synthesising individual paragraphs or labels for user interface elements, use the short content (streaming) API. To convert large documents, build audiobooks, or produce uncompressed output for professional videos, use the long content (polling) API.