Shulammiteya/Tone-Helper-Backend

Tone Helper (backend)

Back end of the Tone Helper speech synthesis application.
Table of Contents
  1. About
  2. Getting Started
  3. API Description
  4. Contributing
  5. Contact

About

The back-end server for Tone Helper, a speech synthesis application.

The API provided by the server supports:

  • speech recognition
  • pitch detection
  • file conversion
  • speech synthesis

Of course, since your needs may be different, you are very welcome to suggest changes by forking this repo and creating a pull request or opening an issue!

Getting Started

Prerequisites

  • Python 3.7
  • HTTPS Server

Installation

  1. Clone the repo

     git clone https://github.com/Shulammiteya/Tone-Helper-Backend.git

  2. Install libraries

     pip3 install pydub SoundFile pyworld google-cloud-core

  3. Create your own credential file in the root directory

     touch ./sample.json

  4. Run the server

     python3 backend.py

API Description

| API    | stt  | tune | getURL |
|--------|------|------|--------|
| Method | POST | POST | GET    |
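As a rough illustration of the endpoint list above, the sketch below builds (but does not send) a request with the documented HTTP method for each endpoint. The base URL `https://localhost:8443` and the helper name `build_request` are hypothetical and not part of this project; how the request body should be encoded is not specified here, so the body is passed through unchanged.

```python
from typing import Optional
from urllib.request import Request

# Endpoint names and HTTP methods exactly as listed above.
ENDPOINTS = {"stt": "POST", "tune": "POST", "getURL": "GET"}


def build_request(base_url: str, endpoint: str, body: Optional[bytes] = None) -> Request:
    """Build (without sending) a request for one documented endpoint.

    base_url is a placeholder for wherever the server is hosted.
    Raises KeyError if the endpoint is not one of the documented three.
    """
    method = ENDPOINTS[endpoint]
    url = f"{base_url.rstrip('/')}/{endpoint}"
    return Request(url, data=body, method=method)


# Hypothetical host; replace with the address of your own HTTPS server.
req = build_request("https://localhost:8443", "getURL")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs without a live server.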


/stt

  input:

| Key   | Type      | Value | Description |
|-------|-----------|-------|-------------|
| audio | audio/m4a |       | Audio file for speech recognition, pitch detection, and transcoding |

  output:

| Key      | Type   | Value             | Description |
|----------|--------|-------------------|-------------|
| audio    | base64 | ...BwbGVhc3VyZS4= | WAV file encoded in base64 |
| wordInfo | array  | (see below)       | One entry per recognized word |

  Example wordInfo value:

    [
       {
          "word": "一",
          "start": 27002,
          "end": 98000,
          "f0": 440.15,
          "f0Start": 54,
          "f0End": 196
       },
       {
          ...
       },
       ...
    ]

  Fields of each wordInfo entry:

  • "word": the recognized text
  • "start": the starting position in the audio array
  • "end": the end position in the audio array
  • "f0": the average F0
  • "f0Start": the starting position in the f0 array
  • "f0End": the end position in the f0 array
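A client might unpack the /stt output along the following lines. This is a minimal sketch, assuming the response arrives as a JSON object with exactly the `audio` and `wordInfo` keys described above; the `WordInfo` class and `parse_stt_response` function are hypothetical names used only for illustration, and the sample payload is made up (its `audio` value is not a real WAV file).

```python
import base64
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordInfo:
    word: str      # recognized text
    start: int     # starting position in the audio array
    end: int       # end position in the audio array
    f0: float      # average F0
    f0Start: int   # starting position in the f0 array
    f0End: int     # end position in the f0 array


def parse_stt_response(body: dict) -> Tuple[bytes, List[WordInfo]]:
    """Decode the base64 WAV data and wrap each wordInfo entry."""
    wav_bytes = base64.b64decode(body["audio"])
    words = [WordInfo(**entry) for entry in body["wordInfo"]]
    return wav_bytes, words


# Made-up response for illustration only.
sample = {
    "audio": base64.b64encode(b"RIFF").decode("ascii"),
    "wordInfo": [
        {"word": "一", "start": 27002, "end": 98000,
         "f0": 440.15, "f0Start": 54, "f0End": 196},
    ],
}
wav_bytes, words = parse_stt_response(sample)
# Length of the first word, measured in audio-array positions.
print(words[0].word, words[0].end - words[0].start)
```

Keeping the field names identical to the JSON keys lets each entry be unpacked directly with `**entry`, at the cost of non-idiomatic attribute names such as `f0Start`.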

/tune

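  input:

| Key   | Type      | Value       | Description |
|-------|-----------|-------------|-------------|
| audio | audio/wav |             | Audio file to be tuned |
| f0    | JSON      | (see below) | Designated F0 |

  Example f0 value:

    {
       659.26,
       659.26,
       440.00,
       ...
    }

  output:

| Key  | Type   | Value             | Description |
|------|--------|-------------------|-------------|
| data | base64 | ...BwbGVhc3VyZS4= | Synthesized audio encoded in base64 |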
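One way to prepare the designated F0 for /tune is to start from the `f0Start`/`f0End` ranges returned by /stt and assign one target pitch per word. The sketch below is an assumption-laden illustration, not the server's own logic: `build_f0_contour` is a hypothetical helper, `f0End` is treated as exclusive, and frames outside every word are set to 0.0 (the value WORLD-style F0 tracks use for unvoiced frames). Whether the server expects any of these conventions is not stated here.

```python
import json
from typing import Dict, List, Sequence


def build_f0_contour(n_frames: int,
                     word_infos: Sequence[Dict[str, int]],
                     targets_hz: Sequence[float]) -> List[float]:
    """Fill each word's f0 range with its target pitch; 0.0 elsewhere.

    Assumes f0End is exclusive and that targets_hz has one value per word.
    """
    if len(word_infos) != len(targets_hz):
        raise ValueError("need exactly one target pitch per word")
    contour = [0.0] * n_frames
    for info, hz in zip(word_infos, targets_hz):
        first = max(0, info["f0Start"])
        last = min(n_frames, info["f0End"])  # clip to the contour length
        for i in range(first, last):
            contour[i] = hz
    return contour


# Two made-up words covering frames 1-2 and 3-4 of a 6-frame contour.
words = [{"f0Start": 1, "f0End": 3}, {"f0Start": 3, "f0End": 5}]
contour = build_f0_contour(6, words, [659.26, 440.0])
payload = json.dumps(contour)  # serialized as a JSON array of numbers
print(payload)
```

A flat pitch per word is the simplest possible contour; a real client would likely smooth the transitions between words.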

/getURL

  output:

| Key  | Type   | Value                | Description |
|------|--------|----------------------|-------------|
| data | string | https://youtu.be/... | Introductory video URL |

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contact

About me: Hsin-Hsin, Chen - shulammite302332@gmail.com

Project Link: https://github.com/Shulammiteya/Tone-Helper-Backend
