Welcome folks! In this blog post we will split a PDF document into separate single-page PDFs in the browser, using the PyPDF2 library and Flask. The full source code of the application is shown below.
Get Started
To get started, install the libraries below using the pip command:
pip install flask
pip install pypdf2
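PyPDF2 3.0 removed the older PdfFileReader and PdfFileWriter names, so it can help to confirm which versions pip actually installed before running the app. The following is only a small sketch using the standard library; installed_version is a hypothetical helper name that is not part of this tutorial's app.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the versions of the two libraries this tutorial relies on.
    for name in ("flask", "PyPDF2"):
        print(name, installed_version(name) or "not installed")
```

If PyPDF2 reports a 3.x version, the reader and writer classes are called PdfReader and PdfWriter.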
After that, create an app.py file and copy and paste the following code:
app.py
from flask import Flask, render_template, request, send_file
from werkzeug.utils import secure_filename
import os
import PyPDF2

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

if __name__ == '__main__':
    app.run()
As you can see, we import the flask library along with the pypdf2 library, which we will use to split the PDF document into separate pages. Calling app.run() starts the Flask development server, which listens on port 5000 by default. We have also defined the / route, which renders the index.html template. For this to work, create a templates folder, add an index.html file inside it, and copy and paste the following code:
templates/index.html
<form method="post" action="/split" enctype="multipart/form-data">
  <input type="file" name="file">
  <input type="submit" value="Split">
</form>
As you can see, we have a simple file input field where the user selects the PDF file to split, followed by a submit button that posts the file to the /split route. Now start the Flask app using the command below and open http://127.0.0.1:5000/ in the browser to see the form:
python app.py
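The form accepts any file type, so the server may want to check that an upload at least looks like a PDF before trying to split it. Below is a minimal sketch of such a check, assuming that a well-formed PDF starts with the bytes %PDF- and that the file name ends in .pdf. The helper name looks_like_pdf is hypothetical and is not part of the app in this post.

```python
def looks_like_pdf(filename: str, head: bytes) -> bool:
    """Cheap sanity check on an upload: .pdf extension plus the %PDF- header.

    `head` is the first few bytes of the uploaded file. This does not prove
    the file is a valid PDF; it only filters out obviously wrong uploads.
    """
    has_pdf_extension = filename.lower().endswith(".pdf")
    has_pdf_header = head.startswith(b"%PDF-")
    return has_pdf_extension and has_pdf_header


if __name__ == "__main__":
    print(looks_like_pdf("report.pdf", b"%PDF-1.7\n"))  # extension and header match
    print(looks_like_pdf("photo.png", b"\x89PNG\r\n"))  # neither matches
```

In a Flask route, one way to use it would be to read the first five bytes from the uploaded file's stream, rewind the stream with seek(0), and reject the request when the check fails.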
Now we need to write the POST route that actually splits the PDF file selected by the user in the input field, as shown below:
@app.route('/split', methods=['POST'])
def split():
    # Get the uploaded file and sanitize its name
    file = request.files['file']
    filename = secure_filename(file.filename)

    # Save the file to a temporary directory
    upload_path = os.path.join('/tmp', filename)
    file.save(upload_path)

    # Open the PDF and write each page to its own PDF.
    # PdfReader/PdfWriter replace the PdfFileReader/PdfFileWriter
    # names that were removed in PyPDF2 3.0.
    with open(upload_path, 'rb') as src:
        reader = PyPDF2.PdfReader(src)
        num_pages = len(reader.pages)
        for i in range(num_pages):
            writer = PyPDF2.PdfWriter()
            writer.add_page(reader.pages[i])
            with open(os.path.join('/tmp', f'split_{i}.pdf'), 'wb') as dst:
                writer.write(dst)

    # Return links to download the split PDFs
    links = []
    for i in range(num_pages):
        links.append(f'<a href="/download/{i}">Split PDF {i}</a>')
    return '<br>'.join(links)
As you can see, the route only accepts POST requests. Inside the function we first use the request object to access the uploaded file, sanitize its name with secure_filename(), and save it under /tmp. We then open it with a PyPDF2 reader object and use a for loop to write each page into its own single-page PDF. Finally, we return one hyperlink per page so that the user can download each split page as a separate PDF document. Note that the pages are numbered from 0 and the output files use fixed names (split_0.pdf, split_1.pdf, and so on), so each new upload overwrites the earlier results.
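The splitting step can also be tried on its own, outside Flask. The sketch below assumes PyPDF2 2.x or 3.x (the PdfReader and PdfWriter names); split_pdf, src_path and out_dir are hypothetical names chosen for illustration. PyPDF2 is imported inside the function only so that the module can be loaded on a machine where the library is not installed yet.

```python
from pathlib import Path
from typing import List


def split_pdf(src_path: str, out_dir: str) -> List[Path]:
    """Write every page of the PDF at src_path to its own file in out_dir.

    Returns the paths of the single-page PDFs, in page order.
    """
    # Assumes PyPDF2 >= 2.0, where these class names are available.
    from PyPDF2 import PdfReader, PdfWriter

    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    written = []
    with open(src_path, "rb") as src:
        reader = PdfReader(src)
        for i, page in enumerate(reader.pages):
            # One writer per page, so each output file holds exactly one page.
            writer = PdfWriter()
            writer.add_page(page)
            target = target_dir / f"split_{i}.pdf"
            with open(target, "wb") as dst:
                writer.write(dst)
            written.append(target)
    return written
```

Calling split_pdf("input.pdf", "out") on a three-page document would be expected to create out/split_0.pdf, out/split_1.pdf and out/split_2.pdf. Passing a separate output directory per upload is one way to keep different users' results from overwriting each other.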
Now we need to add the download route to app.py so that the user can download each split page as a separate PDF document, as shown below:
@app.route('/download/<int:index>')
def download(index):
    # Send the split PDF as a response
    return send_file(f'/tmp/split_{index}.pdf', as_attachment=True)
As you can see, the route receives the page index as a path parameter in the URL (for example /download/0) and, depending on which page the user selected, sends the matching split PDF back to the browser as an attachment.
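If a user requests an index that was never written, send_file has no file to send. A small lookup like the one below could be used to check first; it is only a sketch, and split_page_path and tmp_dir are hypothetical names that do not appear in the app above.

```python
import os
from typing import Optional


def split_page_path(tmp_dir: str, index: int) -> Optional[str]:
    """Return the path of split page `index` in tmp_dir, or None if it is missing."""
    if index < 0:
        return None
    path = os.path.join(tmp_dir, f"split_{index}.pdf")
    # Only hand back paths that point at an existing file.
    return path if os.path.isfile(path) else None
```

Inside the download route, a None result could be turned into a 404 response with Flask's abort(404), and any other result passed to send_file.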
Full Source Code
app.py
from flask import Flask, render_template, request, send_file
from werkzeug.utils import secure_filename
import os
import PyPDF2

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/split', methods=['POST'])
def split():
    # Get the uploaded file and sanitize its name
    file = request.files['file']
    filename = secure_filename(file.filename)

    # Save the file to a temporary directory
    upload_path = os.path.join('/tmp', filename)
    file.save(upload_path)

    # Open the PDF and write each page to its own PDF.
    # PdfReader/PdfWriter replace the PdfFileReader/PdfFileWriter
    # names that were removed in PyPDF2 3.0.
    with open(upload_path, 'rb') as src:
        reader = PyPDF2.PdfReader(src)
        num_pages = len(reader.pages)
        for i in range(num_pages):
            writer = PyPDF2.PdfWriter()
            writer.add_page(reader.pages[i])
            with open(os.path.join('/tmp', f'split_{i}.pdf'), 'wb') as dst:
                writer.write(dst)

    # Return links to download the split PDFs
    links = []
    for i in range(num_pages):
        links.append(f'<a href="/download/{i}">Split PDF {i}</a>')
    return '<br>'.join(links)

@app.route('/download/<int:index>')
def download(index):
    # Send the split PDF as a response
    return send_file(f'/tmp/split_{index}.pdf', as_attachment=True)

if __name__ == '__main__':
    app.run()
templates/index.html
<form method="post" action="/split" enctype="multipart/form-data">
  <input type="file" name="file">
  <input type="submit" value="Split">
</form>