Audio Book Generator with Python
Do you want to read more but find reading slow and boring? How would you like someone to read the book to you? This Python project, with source code, will read aloud any book you want, provided you have an e-book version of it as a PDF.
NOTE: This project works only with PDF files.
In this article, I summarize how I built this project in Python and walk through the essential steps to get it running. The project is available in my GitHub repository, where it is one of my more popular Python projects.
Let's see the power of Python!
Here’s a demonstration of Audio Book Generator with Python.
Workflow of the Project
- Audio Book Generator first converts each page of the PDF file to an image.
- It then writes the text extracted from these images to a .txt file.
- Finally, it reads the content of the .txt file and produces an mp3 file.
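The three stages above can be sketched as a small driver function. This is only an illustration of the flow, not the repository's code: `build_audiobook` and its three callable parameters are hypothetical names, and each stage is passed in as a function so the flow can be read and tried without any OCR or speech library installed.

```python
from pathlib import Path


def build_audiobook(pdf_path, render_pages, ocr_page, synthesize, out_dir="."):
    """Run the PDF -> images -> text -> mp3 flow with caller-supplied stages.

    render_pages(pdf_path) -> list of page images (any objects)
    ocr_page(image)        -> text recognised on that page
    synthesize(text, path) -> writes the spoken text to `path`
    All three are hypothetical hooks, not functions from the project.
    """
    out_dir = Path(out_dir)
    stem = Path(pdf_path).stem
    text_path = out_dir / f"{stem}.txt"
    mp3_path = out_dir / f"{stem}.mp3"

    # Stage 1: one image per PDF page.
    pages = render_pages(pdf_path)

    # Stage 2: extract text page by page and store it in a .txt file.
    with open(text_path, "w", encoding="utf-8") as handle:
        for page in pages:
            handle.write(ocr_page(page).strip() + "\n")

    # Stage 3: read the text back and turn it into audio.
    text = text_path.read_text(encoding="utf-8")
    synthesize(text, mp3_path)
    return text_path, mp3_path
```

In the real project the three hooks would be backed by the PDF-to-image, OCR, and text-to-speech libraries described later in this article.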
Steps to run this Python project with source code
Step 1: Install and configure Tesseract.
For Windows Users
- Install Tesseract from this GitHub account.
- Add Tesseract to your PATH environment variable (add the path C:\Program Files (x86)\Tesseract-OCR).
- Type ‘tesseract’ in cmd; you should see a long list of options. You can refer to this image from GitHub.
- Optional: do the next step only if you get the following error:
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
- Add a new environment variable named TESSDATA_PREFIX and set its value to C:\Program Files (x86)\Tesseract-OCR\tessdata.
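If you are unsure whether the variable is set correctly, a short check like the following may help. It is a minimal sketch: `tessdata_status` is a hypothetical helper, and it assumes, as in the step above, that TESSDATA_PREFIX points at the tessdata folder itself and that the folder holds `.traineddata` language files.

```python
import os
from pathlib import Path


def tessdata_status(env=None):
    """Report whether TESSDATA_PREFIX is set and where it points.

    Returns one of: "unset", "missing-dir", "no-language-files", "ok".
    """
    env = os.environ if env is None else env
    value = env.get("TESSDATA_PREFIX")
    if not value:
        return "unset"
    folder = Path(value)
    if not folder.is_dir():
        return "missing-dir"
    # Tesseract language data is shipped as <lang>.traineddata files.
    if not any(folder.glob("*.traineddata")):
        return "no-language-files"
    return "ok"
```

Calling `tessdata_status()` with no arguments inspects the environment of the current process, so open a new cmd window after changing the variable before you run it.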
For Linux Users
- Install Tesseract with this tutorial. After installation, skip ahead to Step 2.
Step 2: Run the following commands in a terminal (Linux users) or cmd (Windows users).
git clone https://github.com/globefire/Audio-book-generator.git
cd speech-recognition-Ebook-Reader
pip install -r requirements.txt
python speechtotext.py
python finalAudioBookGenerator.py
You’re good to go. Let’s look at the libraries used in the project.
I have used several Python libraries to implement this project. Trust me, you will be astonished at how few lines of code it takes to build it.
The following libraries are used:
1) Tesseract (Pytesseract)
This is the most important part of the project. Tesseract is an OCR (Optical Character Recognition) engine that converts the written content of an image to text, and Pytesseract is the Python wrapper around it. The engine works best on clean printed text; results on handwritten notes are usually much less reliable.
Fun Fact: Tesseract is a free tool and its development has been sponsored by Google (source: Wikipedia).
2) pdf2image
This is the library responsible for converting the PDF to images.
3) tempfile
As the name suggests, it creates temporary files and folders. In this project, the temporary folder holds the .ppm image version of each PDF page.
4) PIL (Python Imaging Library)
Python Imaging Library is a free library for the Python programming language that adds support for opening, manipulating, and saving many different image file formats.
5) gTTS (Google Text-to-Speech)
A Python library and CLI tool to interface with Google Translate’s text-to-speech API. It writes spoken mp3 data to a file or to a file-like object (byte string) for further audio manipulation.
We’re done with the libraries; now I will walk you through the code.
Explanation of the code
1) Importing Python Libraries and setting the file name
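The listing for this step is not reproduced here. As a hedged sketch, the libraries described above are usually imported under the module names `pytesseract`, `pdf2image`, `PIL`, `gtts` and the standard-library `tempfile`. These import names, the `PDF_FILE` value, and the `missing_modules` helper are assumptions for illustration, not the repository's code. Checking availability first tends to give a clearer message than a bare ImportError.

```python
import importlib.util

# Assumed import names for the libraries listed above; PIL is provided by Pillow.
REQUIRED_MODULES = ("pytesseract", "pdf2image", "PIL", "gtts", "tempfile")

# Hypothetical file name; replace it with the PDF you want to hear.
PDF_FILE = "book.pdf"


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

If `missing_modules()` returns an empty list, the usual imports (`import pytesseract`, `from pdf2image import convert_from_path`, `from PIL import Image`, `from gtts import gTTS`) should succeed.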
2) Setting up the variables
Here we define a variable for the auxiliary folder where the PDF page images will be stored, and a variable that holds our file name.
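A minimal sketch of this setup, assuming the auxiliary folder is a temporary directory and the output files are named after the PDF. The function name `make_work_paths` and the `audiobook_` prefix are hypothetical.

```python
import tempfile
from pathlib import Path


def make_work_paths(pdf_file):
    """Create the auxiliary image folder and derive the output file names."""
    pdf_path = Path(pdf_file)
    # Temporary folder that will hold one image per PDF page.
    save_dir = Path(tempfile.mkdtemp(prefix="audiobook_"))
    # Name the outputs after the PDF, e.g. book.pdf -> book.txt / book.mp3.
    text_file = pdf_path.with_suffix(".txt")
    mp3_file = pdf_path.with_suffix(".mp3")
    return save_dir, text_file, mp3_file
```

Note that `tempfile.mkdtemp` does not delete the folder for you; remove it with `shutil.rmtree(save_dir)` once the text has been extracted.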
3) Generate temporary images
This code generates a temporary image in .ppm format for each page and saves the images in save_dir.
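One way this step could look, assuming the conversion is done with pdf2image's `convert_from_path` (an assumption; the original listing is not shown here). The `convert` parameter is a hypothetical hook added only so the function can be exercised without pdf2image and Poppler installed.

```python
def pdf_to_images(pdf_file, save_dir, convert=None):
    """Render every PDF page to a .ppm image inside save_dir.

    `convert` exists only so the function can be exercised without
    pdf2image installed; by default the real converter is used.
    """
    if convert is None:
        # Imported lazily; pdf2image also needs the Poppler tools installed.
        from pdf2image import convert_from_path as convert
    # With output_folder set, the page images are written to disk;
    # fmt="ppm" is pdf2image's default and is spelled out for clarity.
    pages = convert(pdf_file, output_folder=str(save_dir), fmt="ppm")
    return list(pages)
```

The returned list is in page order, so it can be handed straight to the text-extraction step.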
4) Setting up PyTesseract
You have to specify the path where tesseract.exe is installed.
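A hedged sketch of this step follows. Setting `pytesseract.pytesseract.tesseract_cmd` is the documented way to point pytesseract at the executable. The helper names and the candidate path (the Windows install folder mentioned earlier, with `tesseract.exe` appended) are assumptions for illustration.

```python
import os
import shutil

# Candidate install location, assumed from the Windows steps in this article.
DEFAULT_CANDIDATES = (r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",)


def locate_tesseract(candidates=DEFAULT_CANDIDATES):
    """Return the first existing candidate path, else whatever is on PATH."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return shutil.which("tesseract")  # None if Tesseract is not on PATH


def configure_pytesseract(exe_path):
    """Point pytesseract at the Tesseract executable."""
    import pytesseract  # imported lazily so the lookup works without it

    pytesseract.pytesseract.tesseract_cmd = exe_path
```

A typical call would be `configure_pytesseract(locate_tesseract())`, after checking that `locate_tesseract()` did not return None.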
5) Extracting Text
In this code, we convert the .ppm images to .jpg images. We do this because PyTesseract does not support .ppm images for extracting text. We then save the extracted text to a .txt file.
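The conversion and extraction could be sketched as below. `ocr_page` and `write_text_file` are hypothetical helper names; the calls they rely on (`Image.open`, `Image.convert`, `Image.save`, and `pytesseract.image_to_string`) are standard Pillow and pytesseract functions.

```python
import os


def ocr_page(ppm_path):
    """Convert one .ppm page image to JPEG and return its recognised text."""
    # Imported lazily so the helper below can be used on its own.
    from PIL import Image
    import pytesseract

    jpg_path = os.path.splitext(ppm_path)[0] + ".jpg"
    with Image.open(ppm_path) as page:
        page.convert("RGB").save(jpg_path, "JPEG")
    with Image.open(jpg_path) as jpg:
        return pytesseract.image_to_string(jpg)


def write_text_file(page_texts, text_file):
    """Write the text of all pages, in order, to a single .txt file."""
    with open(text_file, "w", encoding="utf-8") as handle:
        for text in page_texts:
            handle.write(text.strip() + "\n\n")
    return text_file
```

For example, `write_text_file((ocr_page(p) for p in ppm_paths), "book.txt")` would process a list of page image paths in order, where `ppm_paths` is whatever list of .ppm files the previous step produced.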
6) Generating the mp3 File
We read the text file where the content was saved and convert it to an mp3 file using the gTTS library.
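A minimal sketch of this last step, assuming English text and an internet connection (gTTS calls an online service). The helper names are hypothetical, and collapsing whitespace before synthesis is my own choice so that line breaks left over from OCR are read as ordinary spaces.

```python
def load_text(text_file):
    """Read the extracted text and collapse runs of whitespace."""
    with open(text_file, encoding="utf-8") as handle:
        text = " ".join(handle.read().split())
    if not text:
        raise ValueError(f"{text_file} contains no text to speak")
    return text


def text_to_mp3(text_file, mp3_file, lang="en"):
    """Synthesise the text file to an mp3 with gTTS (needs internet access)."""
    from gtts import gTTS  # imported lazily; load_text works without it

    speech = gTTS(text=load_text(text_file), lang=lang)
    speech.save(str(mp3_file))
    return mp3_file
```

Calling `text_to_mp3("book.txt", "book.mp3")` would then produce the audio book; long books can take a while because the whole text is sent for synthesis.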
Are you bored with taking notes in your lectures? Do you feel drowsy in class while taking notes? Well, I have a script that comes to your rescue. The speechtotext.py script in my GitHub repository converts audio to text. It writes everything it can hear and recognize to a .txt file. I have many similar Python projects on GitHub.
So the next time you attend a class, make sure to run the speechtotext.py script and relax through the whole lecture like a boss XD.
I also have another script that extracts the text from a single image. I have named this script SingleImageReader.py.
Being a Python developer is pretty cool. Apart from getting things done, it also gives you the feeling of having built something great. This boosts your confidence to the next level!
In conclusion, always remember that whenever you have an idea, Python is a sturdy tool that will help you implement it. I have had many ideas, and Python has helped me a lot in bringing them to life. I have many Python projects on GitHub that I developed to solve everyday problems.
Thank you for reading this article. If you liked the idea behind it and found the article useful, do share it on Facebook, Instagram, or LinkedIn. I would also appreciate it if you left a star on the GitHub repository; it motivates people like me to contribute more to the open-source world.
Do read my previous article on how to make a live color detector with Python.