Building an open source video creation program for Reddit posts
Sunday, 20 June, 2021
Backstory
Occasionally, when I am on YouTube late at night, the algorithm recommends text-to-speech videos that essentially read out a Reddit post and some of its comments. For reference, they usually look something like this. These videos intrigued me because I thought that if they were not already automated, they should be. That prompted me to see whether I could write a program to automate making them. This post outlines the design process behind my code, but you can find the source code here.
Design
I brainstormed ways to completely automate this process, but I concluded it would be better for the user to choose the posts they want videos of than for an algorithm to find posts on its own. I opted for the user to create a .txt file with one line per Reddit post. The .txt file looks like this:
(link to reddit post) (number of comments) (title of video)
(link to reddit post) (number of comments) (title of video)
...
Each line holds a link to the post, the number of comments to pull and convert to speech, and the title to give the finished video.
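As a rough illustration of reading that file, here is a minimal sketch. It assumes the three fields are separated by whitespace and that everything after the comment count is the title, so titles may contain spaces. The names VideoJob, parse_job, and parse_jobs are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class VideoJob:
    """One line of the input file (hypothetical structure)."""
    url: str
    n_comments: int
    title: str


def parse_job(line):
    # Assumption: whitespace-separated fields; the title is the rest of the line.
    url, n_comments, title = line.split(maxsplit=2)
    return VideoJob(url, int(n_comments), title.strip())


def parse_jobs(text):
    # Blank lines are skipped so a trailing newline does not cause an error.
    return [parse_job(line) for line in text.splitlines() if line.strip()]
```

Calling parse_jobs on the contents of the .txt file would give one VideoJob per post, which the later steps could loop over.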
Connecting to Reddit's API with PRAW to scrape a post
To efficiently scrape content from Reddit posts, I used the PRAW library. I wrapped the logic in a scraping class that takes the URL of the post and the number of comments to scrape. Once connected to the post, I scrape the title and the top n comments, where n is the number of comments requested. The scraper takes the text of the post and comments and adds it to a list of strings, which is used later to create the video.
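The scraping step might look roughly like the sketch below. It is written as a plain function rather than the project's actual class, and scrape_post is a hypothetical name. It assumes the caller passes in an already configured praw.Reddit instance, and that "top n comments" means the first n top-level comments when sorted by "top".

```python
def scrape_post(reddit, url, n_comments):
    """Return [title, comment_1, ..., comment_n] for the post at `url`.

    `reddit` is assumed to be a configured praw.Reddit instance.
    """
    submission = reddit.submission(url=url)
    # The sort order has to be set before the comments are first fetched.
    submission.comment_sort = "top"
    # Remove the "load more comments" placeholders so only real comments remain.
    submission.comments.replace_more(limit=0)
    top_level = list(submission.comments)[:n_comments]
    return [submission.title] + [comment.body for comment in top_level]


# Usage (needs `pip install praw` and your own Reddit API credentials):
#   import praw
#   reddit = praw.Reddit(client_id="...", client_secret="...", user_agent="...")
#   texts = scrape_post(reddit, post_url, 5)
```

Keeping the title as the first element means the list is already in the order the video will read it.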
Next step: text-to-speech
After scraping all the strings from the post, I run each one through gTTS, a free text-to-speech library that interfaces with Google's text-to-speech. It produces an .mp3 file that I can save to a specific folder. My program checks whether an audio directory exists; if it doesn't, it creates one and saves the audio there. The files are named as follows: "title.mp3", "comment0.mp3", "comment1.mp3", etc. With this naming scheme, and knowing how many comments were scraped, I can easily iterate through the audio files in their original order.
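A minimal sketch of this step follows, assuming the list of strings starts with the post title and continues with the comments. The helper names audio_names and save_speech are hypothetical, and the language is assumed to be English. gTTS needs a network connection, so it is imported inside the function that uses it.

```python
from pathlib import Path


def audio_names(n_comments):
    """File names in playback order: the title first, then each comment."""
    return ["title.mp3"] + [f"comment{i}.mp3" for i in range(n_comments)]


def save_speech(texts, out_dir="audio"):
    """Convert each string to speech; texts[0] is assumed to be the title."""
    # Third-party import kept local so audio_names() works without gTTS installed.
    from gtts import gTTS

    folder = Path(out_dir)
    folder.mkdir(exist_ok=True)  # create the audio directory if it is missing
    paths = []
    for name, text in zip(audio_names(len(texts) - 1), texts):
        path = folder / name
        gTTS(text=text, lang="en").save(str(path))
        paths.append(path)
    return paths
```

Because the names are generated from the comment count alone, later steps can rebuild the same ordered list without listing the directory.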
Creating the image background for each post
In some of the YouTube text-to-speech videos of Reddit posts, it appears that a screenshot of the actual post is used as the background image while the audio plays.
I opted to create my own image and overlay the text on it. To do this, I used the Pillow library. I created a utils.py file that holds the image-building logic.
The file pulls a default image from the images/ directory and uses it as the background. It then takes the saved text from the post and renders it onto the image.
Each image is saved in the images/ directory with the same naming scheme as the audio: "title.jpg", "comment0.jpg", etc.
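Overlaying text with Pillow could be sketched as below. This is not the project's utils.py: render_text_image is a hypothetical name, and the margin, wrap width, line height, and white text colour are guesses chosen to suit Pillow's built-in default font.

```python
import textwrap

from PIL import Image, ImageDraw, ImageFont


def render_text_image(background, text, margin=40, chars_per_line=60, line_height=16):
    """Return a copy of `background` with `text` wrapped and drawn onto it."""
    # convert() returns a new image, so the background is left untouched;
    # RGB mode is also what JPEG output needs.
    image = background.convert("RGB")
    draw = ImageDraw.Draw(image)
    font = ImageFont.load_default()
    y = margin
    # Wrap by character count, a rough stand-in for measuring pixel widths.
    for line in textwrap.wrap(text, width=chars_per_line):
        draw.text((margin, y), line, fill="white", font=font)
        y += line_height
    return image
```

With a background loaded through Image.open from the images/ directory, the result could be written out with something like render_text_image(background, title).save("images/title.jpg").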
Creating an mp4 from the images and audio
I decided to use the moviepy library to create the overall mp4 video.
I built a VideoEdit.py file that loops through the images and audio in their respective directories, creates a clip from each image and audio pair, and adds it to a list.
Lastly, I loop over that list and concatenate all of the clips into a single file, saving it to an edited_videos/ directory under the name the user gave in the .txt file shown above.
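The pairing and concatenation could be sketched as follows. This assumes the moviepy 1.x API (the moviepy.editor module with set_duration and set_audio), which was current when this post was written; moviepy 2.x renamed these. The names clip_sources and build_video are hypothetical, and the frame rate of 24 is an arbitrary choice.

```python
from pathlib import Path


def clip_sources(n_comments, image_dir="images", audio_dir="audio"):
    """Pair each image with its audio file, title first, then the comments."""
    stems = ["title"] + [f"comment{i}" for i in range(n_comments)]
    return [
        (Path(image_dir) / f"{stem}.jpg", Path(audio_dir) / f"{stem}.mp3")
        for stem in stems
    ]


def build_video(n_comments, title, out_dir="edited_videos"):
    # Assumes moviepy 1.x; imported here so clip_sources() works without it.
    from moviepy.editor import AudioFileClip, ImageClip, concatenate_videoclips

    clips = []
    for image_path, audio_path in clip_sources(n_comments):
        audio = AudioFileClip(str(audio_path))
        # Show the still image for exactly as long as its narration lasts.
        clip = ImageClip(str(image_path)).set_duration(audio.duration).set_audio(audio)
        clips.append(clip)

    final = concatenate_videoclips(clips)
    Path(out_dir).mkdir(exist_ok=True)
    out_path = Path(out_dir) / f"{title}.mp4"
    # Still images carry no frame rate of their own, so one is given explicitly.
    final.write_videofile(str(out_path), fps=24)
    return out_path
```

Deriving each clip's duration from its audio keeps the picture and the narration in step without any manual timing.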
The result
I created a demo YouTube video to show the basics of what this program can do at this point. I am content with how the project turned out: I created it to see whether these videos can be automated, and I have proven to myself that they can. This video was created by my program.
The future of this project
I recently made a post on Reddit's r/Python displaying my program and got some good feedback. The project has around 30 stars, a fork, and some people who want to contribute.
There are definitely improvements that can be made to this program, and I am excited about its future.
Some ideas I have had are:
- Option to add intro/outro mp4 to video
- Automated thumbnail generation
- Automated YouTube upload
- Cleaning / refactoring code - obviously :)