How to jump-cut silent parts of your videos automatically with Python?

Kıvanç Yüksel
4 min read · Apr 26, 2020


My journey with recording and editing videos started a year ago, when I decided to open my own YouTube channel (the main content is in Turkish, but there will also be some English content in the future. If you are interested, check it out!). I was thinking: “Okay, I work as a Data Scientist and I know some stuff about it. There are many resources in English where you can learn more about it, but there aren’t that many in my own language. So, wouldn’t it be nice if someone from my country who doesn’t speak English could also find some resources to learn more about it?” Since I am not a very good speaker, I made a lot of mistakes while talking, and there were a lot of “silent” gaps between my sentences. As a result, editing a video usually took me much longer than recording it. After some time I couldn’t handle it anymore and stopped recording videos. Well, at least until I got the idea of automating part of my video editing process.

In this post, I would like to share a project that I made recently to automatically jump-cut the silent parts of a video using the Python programming language. It is an open-source project that you can freely download and use (if you find it helpful, please leave a ⭐ 🤓). I hope it will help you reduce the time it takes to edit your videos.

How does it work?

The idea of the program is actually very simple: “Read the audio signal from a video, and if its value is below some threshold, cut it out”.

An audio signal with a demonstration of threshold-cutting
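To make the threshold idea concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the function name `silence_mask` and the toy sample values are made up for illustration, and the audio is assumed to be a single channel given as a list of floats. The threshold is taken as a ratio of the largest absolute sample value, in the spirit of the 0.02 default described later in this post.

```python
def silence_mask(samples, magnitude_threshold_ratio=0.02):
    """Return a list of booleans, True where a sample counts as silent.

    Hypothetical helper: a sample is treated as silent when its absolute
    value is below `magnitude_threshold_ratio` times the peak magnitude.
    """
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    threshold = magnitude_threshold_ratio * peak
    return [abs(s) < threshold for s in samples]


# Toy mono signal: the peak is 1.0, so the threshold becomes 0.02.
toy_samples = [0.0, 0.01, -0.5, 1.0, -0.005, 0.015]
print(silence_mask(toy_samples))
# [True, True, False, False, True, True]
```

A real audio track has tens of thousands of such values per second, so a single quiet sample means little on its own. That is why the parameters below also look at how long the silence lasts.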

I used a library called “moviepy” both to read the audio signal and to jump-cut the silent parts automatically. I won’t take too much of your time explaining exactly how the script works; however, I would like to go through some of the parameters you can play with while using it. Before we delve into these parameters, let’s see a short demonstration of what the program is capable of:

Okay, let’s start… :) There are 8 command line arguments you can run the program with. Before explaining them (and scaring you away), I would like to say that most of these parameters have a default value that “just works”. So, if you don’t want to, you don’t need to specify (or even know) almost any of them. You will be just fine with the default values.

  1. --input, -i: Path to the video that you want to jump-cut.
  2. --output, -o: Path to where you want to save the output video.
  3. --magnitude-threshold-ratio, -m: The fraction of the maximum value of your audio signal below which the signal is considered silent (default: 0.02, i.e. 2% of the maximum).
  4. --duration-threshold, -d: The minimum number of seconds a silence has to last in order to be cut out. For example, if this parameter is 0.5, the silent parts have to last at least 0.5 seconds, otherwise they won’t be jump-cut (default: 0.5).
  5. --failure-tolerance-ratio, -f: Most of the time, there are 44100 audio signal values in 1 second of a video. Let’s say “--duration-threshold” was set to 0.5. This means that we need to check at least 22050 signal values to see whether there is a silent part or not. What happens if we find 22049 values that we consider silent, but there is 1 value that is above our threshold? Should we just consider this part of the video a loud signal? I think we shouldn’t. This parameter leaves some room for failure: it tolerates high signal values up to a point. Let’s say it is set to 0.1. This means that 10% of the signal currently being investigated can have values higher than our threshold, and it is still going to be considered a silent signal (default: 0.1).
  6. --spaces-on-edges, -s: Leaves some space on the edges of a silence cut. E.g. if silence is found between the 10th and 20th seconds of the video, then instead of cutting out that whole range, we cut out only the part between the (10+space_on_edges)th and (20-space_on_edges)th seconds of the clip (default: 0.1).
  7. --silence-part-speed, -x: If this parameter is given, instead of cutting the silent parts out, the script will speed them up “x” times.
  8. --min-loud-part-duration, -l: If this parameter is given, loud parts of the video that are shorter than this parameter will also be cut.
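The interplay of the duration threshold, the failure tolerance, and the spaces on the edges may be easier to see in code. The following is only a sketch of one way to implement those three ideas, not the script's actual logic: the function names, the sliding-window strategy, and the edge trimming are all my assumptions. It expects a list of booleans (True means the sample is below the magnitude threshold) and returns the silent stretches in seconds.

```python
def find_silent_intervals(mask, sample_rate, duration_threshold=0.5,
                          failure_tolerance_ratio=0.1):
    """Return (start_s, end_s) pairs for silent stretches of a boolean mask.

    Hypothetical helper: a window of `duration_threshold` seconds counts as
    silent when at most `failure_tolerance_ratio` of its samples are loud.
    """
    n = len(mask)
    window = max(1, int(duration_threshold * sample_rate))
    allowed_loud = int(failure_tolerance_ratio * window)

    # loud_before[k] = number of loud samples in mask[:k] (prefix sums).
    loud_before = [0]
    for is_silent in mask:
        loud_before.append(loud_before[-1] + (0 if is_silent else 1))

    def window_is_silent(i):
        return loud_before[i + window] - loud_before[i] <= allowed_loud

    intervals = []
    i = 0
    while i + window <= n:
        if not window_is_silent(i):
            i += 1
            continue
        start = i
        # Keep sliding while the next window is still within tolerance.
        while i + 1 + window <= n and window_is_silent(i + 1):
            i += 1
        end = i + window
        # Trim the tolerated loud samples from both edges of the stretch.
        while start < end and not mask[start]:
            start += 1
        while end > start and not mask[end - 1]:
            end -= 1
        if end > start:
            intervals.append((start / sample_rate, end / sample_rate))
        i += window  # continue scanning after this stretch
    return intervals


def shrink_intervals(intervals, space_on_edges=0.1):
    """Leave `space_on_edges` seconds uncut on both sides of each silence."""
    shrunk = []
    for start, end in intervals:
        new_start, new_end = start + space_on_edges, end - space_on_edges
        if new_end > new_start:
            shrunk.append((new_start, new_end))
    return shrunk


# Toy example at 100 samples per second: 1 s loud, 2 s silent, 1 s loud,
# with a single loud spike in the middle of the silence.
toy_mask = [False] * 100 + [True] * 200 + [False] * 100
toy_mask[150] = False
silent = find_silent_intervals(toy_mask, sample_rate=100)
print(silent)
print(shrink_intervals(silent, space_on_edges=0.1))
```

With the default tolerance of 0.1, the lone spike does not break the silence, so the toy example yields one stretch from second 1 to second 3. Calling the same function with `failure_tolerance_ratio=0.0` splits it into two stretches around the spike, which is exactly the situation the tolerance parameter is meant to avoid.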

Alright! So, how do you run the program? As I said before, you don’t need to specify any value for the parameters that already have a default value. Nonetheless, let me give a few examples:

# The simplest way you can run the program
python main.py -i input_video.mp4 -o output_video.mp4
# If you want, you can also set the other parameters that were mentioned
python main.py -i input_video.mp4 -o output_video.mp4 -m 0.05 -d 1.0 -f 0.2 -s 0.2 -x 2000 -l 1.0
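To get a feeling for what -x and -l do to the final video, here is a small sketch that estimates the output length from a list of silent intervals. It is an illustration under my own assumptions, not the script's real behaviour: the helper names are hypothetical, and I assume that loud parts shorter than the minimum are dropped in both modes and that sped-up silences simply take their original length divided by “x”.

```python
def loud_intervals(silent_intervals, total_duration,
                   min_loud_part_duration=0.0):
    """Return the loud (start_s, end_s) parts left between the silences.

    Hypothetical helper: loud parts shorter than `min_loud_part_duration`
    are dropped, mimicking the idea behind the -l option.
    """
    loud = []
    cursor = 0.0
    for start, end in sorted(silent_intervals):
        if start > cursor:
            loud.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < total_duration:
        loud.append((cursor, total_duration))
    return [(s, e) for s, e in loud if e - s >= min_loud_part_duration]


def estimate_output_duration(silent_intervals, total_duration,
                             silence_part_speed=None,
                             min_loud_part_duration=0.0):
    """Estimate the length of the edited video in seconds."""
    kept = loud_intervals(silent_intervals, total_duration,
                          min_loud_part_duration)
    loud_time = sum(end - start for start, end in kept)
    if silence_part_speed is None:
        # Silent parts are cut out entirely.
        return loud_time
    # Silent parts are kept but played `silence_part_speed` times faster.
    silent_time = sum(end - start for start, end in silent_intervals)
    return loud_time + silent_time / silence_part_speed


# Toy example: a 10 s video with two silent stretches.
toy_silences = [(2.0, 5.0), (5.5, 8.0)]
print(loud_intervals(toy_silences, 10.0))
print(loud_intervals(toy_silences, 10.0, min_loud_part_duration=1.0))
print(estimate_output_duration(toy_silences, 10.0))
print(estimate_output_duration(toy_silences, 10.0, silence_part_speed=2))
```

In the toy example, the half-second loud part between the two silences disappears once the minimum loud part duration is set to 1.0. A very large speed such as the 2000 used in the command above makes the silent parts so short that the result is close to cutting them out.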

That’s it! I hope this program will be useful to you and lighten your workload a bit.

Don’t forget to 👏🏻 if you liked this post, and please leave a comment below if you have any feedback, criticism, or something that you would like to discuss. I can also be reached on social media: LinkedIn, Twitter, Instagram.
