5000 character limit exceeded when using SSML vs. text input: Google Text-to-Speech (TTS)

Problem

Following the documentation for Creating Voice Audio Files with Google Cloud Platform's Text-to-Speech API, the following error occurs when using Speech Synthesis Markup Language (SSML), whereas no error occurs when the same content is sent as standard text.

Here is the error when using SSML. It seems inaccurate, because the SSML character count is 2979, well below the 5000 limit:

Error: 3 INVALID_ARGUMENT: 5000 character limit exceeded.

Node.js setup

const Speech = require('ssml-builder');
const textToSpeech = require('@google-cloud/text-to-speech');

...

const client = new textToSpeech.TextToSpeechClient();
const speech = new Speech();

...

Standard text input

console.log('Convert Article ' + data.id + ': ' + data.text);

return client.synthesizeSpeech({
        input: { text: data.text},
        voice: {
          languageCode: '[language-code]',
          name: '[language-option]',
        },
        audioConfig: {
          audioEncoding: '[encoding-type]',
          pitch: "[pitch]",
          speakingRate: "[speaking-rate]"
        },
      })

SSML input

Using the ssml-builder package.

console.log('Convert Article ' + data.id + ': ' + speech.say(data.text).ssml());

return client.synthesizeSpeech({
        input: { ssml: speech.say(data.text).ssml()},
        voice: {
          languageCode: '[language-code]',
          name: '[language-option]',
        },
        audioConfig: {
          audioEncoding: '[encoding-type]',
          pitch: "[pitch]",
          speakingRate: "[speaking-rate]"
        },
      })

Input

Article: Reports of Bitcoin's demise "greatly exaggerated"

Standard text - Works as expected

Character count: 2904

The current bitcoin bear market, labeled crypto winter for its debilitating effect on the broader market and industry, has seen more than $700 billion wiped from the total value of all cryptocurrencies so far this year, some 80% of its value since its all-time high.

Bitcoin has seen similar price percentage declines before, however, and has managed to recover from them. Now, researchers from the University of Cambridge Judge Business School have found the bitcoin industry will "likely" bounce back again.

"Statements proclaiming the death of the crypto-asset industry have been made after every global ecosystem bubble," researchers wrote in the second Global Cryptoasset Benchmarking Study. "While it is true that the 2017 bubble was the largest in bitcoin's history, the market capitalization of both bitcoin and the crypto-asset ecosystem still exceeds its January 2017 levels-prior to the start of the bubble.

"The speculation of the death of the market and ecosystem has been greatly exaggerated, and so it seems likely that the future expansion plans of industry participants will, at most, be delayed."

While the bitcoin industry still has many supporters despite the price collapse, others have been quick to brand bitcoin as dead, something that's happened more than 300 times according to the loosely-updated tracking website 99bitcoins.

Elsewhere, bitcoin bulls, such former Goldman Sachs partner and founder of cryptocurrency merchant bank Galaxy Digital Holdings Mike Novogratz, have sobered up since the giddy highs of late 2017.

Researchers also found that millions of new users have entered the ecosystem over the last 12 months, though most are passive -- buying bitcoin or other cryptocurrencies with newly created wallets and then not moving or using them.

Total user accounts at service providers now exceed 139 million with at least 35 million identity-verified users, the latter growing nearly four-fold in 2017 and doubling again in the first three quarters of 2018, according to the report.

Only 38% of all users can be considered active, although definitions and criteria of activity levels vary significantly across service providers.

Meanwhile, the study found that the top six proof-of-work cryptocurrencies (including bitcoin and ethereum) collectively consume between 52 TWh and 111 TWh of electricity per year: the mid-point of the estimate (82 TWh) is the equivalent of the total energy consumed by the entire country of Belgium -- but also constitutes less than 0.01% of the world's global energy production per year.

A "notable" share of the energy consumed by these facilities is supplied by renewable energy sources in regions with excess capacity, the researchers revealed.

The report also found that cryptocurrency mining appears to be less concentrated geographically, in hashing power ownership, and in manufacturer options, than is widely thought.

SSML - Error

Character count: 2979

<speak>The current bitcoin bear market, labeled crypto winter for its debilitating effect on the broader market and industry, has seen more than $700 billion wiped from the total value of all cryptocurrencies so far this year, some 80% of its value since its all-time high.

Bitcoin has seen similar price percentage declines before, however, and has managed to recover from them. Now, researchers from the University of Cambridge Judge Business School have found the bitcoin industry will &quot;likely&quot; bounce back again.

&quot;Statements proclaiming the death of the crypto-asset industry have been made after every global ecosystem bubble,&quot; researchers wrote in the second Global Cryptoasset Benchmarking Study. &quot;While it is true that the 2017 bubble was the largest in bitcoin&apos;s history, the market capitalization of both bitcoin and the crypto-asset ecosystem still exceeds its January 2017 levels-prior to the start of the bubble.

&quot;The speculation of the death of the market and ecosystem has been greatly exaggerated, and so it seems likely that the future expansion plans of industry participants will, at most, be delayed.&quot;

While the bitcoin industry still has many supporters despite the price collapse, others have been quick to brand bitcoin as dead, something that&apos;s happened more than 300 times according to the loosely-updated tracking website 99bitcoins.

Elsewhere, bitcoin bulls, such former Goldman Sachs partner and founder of cryptocurrency merchant bank Galaxy Digital Holdings Mike Novogratz, have sobered up since the giddy highs of late 2017.

Researchers also found that millions of new users have entered the ecosystem over the last 12 months, though most are passive -- buying bitcoin or other cryptocurrencies with newly created wallets and then not moving or using them.

Total user accounts at service providers now exceed 139 million with at least 35 million identity-verified users, the latter growing nearly four-fold in 2017 and doubling again in the first three quarters of 2018, according to the report.

Only 38% of all users can be considered active, although definitions and criteria of activity levels vary significantly across service providers.

Meanwhile, the study found that the top six proof-of-work cryptocurrencies (including bitcoin and ethereum) collectively consume between 52 TWh and 111 TWh of electricity per year: the mid-point of the estimate (82 TWh) is the equivalent of the total energy consumed by the entire country of Belgium -- but also constitutes less than 0.01% of the world&apos;s global energy production per year.

A &quot;notable&quot; share of the energy consumed by these facilities is supplied by renewable energy sources in regions with excess capacity, the researchers revealed.

The report also found that cryptocurrency mining appears to be less concentrated geographically, in hashing power ownership, and in manufacturer options, than is widely thought.</speak>
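
The 75-character difference between the two counts appears to be explained by the markup alone: the <speak> wrapper adds 15 characters, and each of the 9 double quotes and 3 apostrophes in the article becomes a 6-character entity (5 extra characters each). A minimal sketch of that arithmetic follows. The helper names `escapeForSsml` and `ssmlOverhead` are hypothetical and only imitate the escaping visible above; they are not the ssml-builder implementation.

```javascript
// Hypothetical helper: imitates the entity escaping visible in the SSML above.
// Illustration only, not the ssml-builder source.
function escapeForSsml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Number of characters that the <speak> wrapper and the escaping add to plain text.
function ssmlOverhead(text) {
  const ssml = '<speak>' + escapeForSsml(text) + '</speak>';
  return ssml.length - text.length;
}

// Two double quotes and one apostrophe: 3 * 5 extra characters, plus 15 for the tags.
const sample = 'He said "it\'s fine"';
console.log(ssmlOverhead(sample));
```

For the article, the same calculation gives 15 + 12 * 5 = 75, which matches 2979 - 2904, so the SSML itself is well under the limit when it is built only once.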

comment
console.cloud.google.com/apis/api/translate.googleapis.com/ is where the quota comes from (or may come from); possibly the daily limit has already been reached and a bogus error message is being shown. Also consider that numbers and characters such as % may add to the output character count because of SSML.   - Martin Zeitler, 15.12.2018
comment
@MartinZeitler, 1) The general quota is a fair point; however, in the example above the same content is converted without error when passed as plain text, so this cannot be a daily quota issue, since the same account is used in both cases. 2) I pasted the SSML content into a text document with the % characters included, and the character count is 2979, well below the 5,000 threshold.   - Adam Hurwitz, 15.12.2018
comment
Not sure whether input or output characters are what gets counted, because 139 is not necessarily 3 characters but could be one-hundred thirty-nine, up to 23 characters (also not sure whether spaces count toward the quota)... Have you tried a shorter SSML? The actual limit for SSML may be lower than 5000 while a bogus error message is still emitted. I can only imagine that it is either counted differently or that a different limit applies than the one reported (the message being a default limit-exceeded error).   - Martin Zeitler, 15.12.2018


Answers (1)


Create a new Speech object for each use instead of reusing a single instance.

i.e.:

console.log('Convert Article ' + data.id + ': ' + (new Speech()).say(data.text).ssml())
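
Another way to avoid the double call is to build the SSML string once, store it in a variable, and pass that same string to both the log statement and the request. The sketch below adds a length check before sending. `prepareSsmlInput` and `CHARACTER_LIMIT` are hypothetical names, the value 5000 is taken from the error message, and the check assumes the service counts roughly what `String.length` reports, which the documentation should confirm.

```javascript
// Limit taken from the error message in the question; treated here as an assumption.
const CHARACTER_LIMIT = 5000;

// Hypothetical helper: validates an already-built SSML string and returns
// the `input` object for the synthesis request.
function prepareSsmlInput(ssml, limit = CHARACTER_LIMIT) {
  if (ssml.length > limit) {
    throw new RangeError(
      'SSML is ' + ssml.length + ' characters; the limit is ' + limit
    );
  }
  return { ssml: ssml };
}

// Build the string once, then reuse it for logging and for the request.
const ssml = '<speak>Hello</speak>';
console.log('Convert Article: ' + ssml);
const input = prepareSsmlInput(ssml);
console.log(Object.keys(input));
```

A local check like this reports the real string length, which makes an accidentally duplicated payload visible before the API rejects it.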

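Source of the problem

The Speech object was created once but used twice. After being created, it was used both to log the text content for debugging (ironically) and again to build the input for the audio file. Because each say() call appends to the same object, the second call produced SSML containing the article twice, about double the 2979 characters counted, which is what exceeded the limit.

Speech object defined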

const speech = new Speech();

1st use

console.log('Convert Article ' + data.id + ': ' + speech.say(data.text).ssml())

2nd use

  input: { ssml: speech.say(data.text).ssml()}
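
The effect can be reproduced without the real package. The sketch below uses a hypothetical stand-in class, `AccumulatingSpeech`, which assumes the behavior described in this answer (each `say()` call appends to the object's internal state). It is not the ssml-builder implementation, and the placeholder text simply has the length of the escaped article (2979 minus the 15 characters of the <speak> tags).

```javascript
// Hypothetical stand-in for a stateful SSML builder (assumed behavior only).
class AccumulatingSpeech {
  constructor() {
    this.parts = [];
  }

  // Appends text to the internal buffer and returns the builder for chaining.
  say(text) {
    this.parts.push(text);
    return this;
  }

  // Renders everything said so far inside a single <speak> element.
  ssml() {
    return '<speak>' + this.parts.join('') + '</speak>';
  }
}

// Placeholder with the same length as the escaped article body.
const text = 'x'.repeat(2964);

// One shared instance, used once for logging and once for the request.
const shared = new AccumulatingSpeech();
const logged = shared.say(text).ssml();
const sent = shared.say(text).ssml();

// A fresh instance per use keeps the output at its expected size.
const fresh = new AccumulatingSpeech().say(text).ssml();

console.log(logged.length, sent.length, fresh.length);
```

Under these assumptions the logged string stays at 2979 characters, while the string actually sent grows past 5000 because it contains the text twice.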
- Adam Hurwitz, 15.12.2018