Welcome folks! In this blog post we will use the PyPDF2 library to extract text from a PDF document and display it in a Tkinter desktop app written in Python. The full source code of the example is shown at the end of the post.
Get Started
To get started you need Tkinter, the GUI framework that ships with Python's standard library, so there is usually nothing to install. You also need to install the PyPDF2 library with the command below.
pip install pypdf2
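Which PyPDF2 version pip installs matters here: PyPDF2 3.0 removed the older camelCase names such as PdfFileReader, getPage() and extractText() in favor of PdfReader, reader.pages and extract_text(). As a quick check, here is a minimal sketch that uses the standard library's importlib.metadata. The helper names installed_pypdf2_version and major_version are made up for this example and are not part of PyPDF2.

```python
from importlib import metadata


def installed_pypdf2_version():
    """Return the installed PyPDF2 version string, or None if it is not installed."""
    try:
        return metadata.version("PyPDF2")
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Return the leading number of a version string such as '3.0.1'."""
    return int(version.split(".")[0])


version = installed_pypdf2_version()
if version is None:
    print("PyPDF2 is not installed")
else:
    print("PyPDF2", version, "major version:", major_version(version))
```

If the major version is 3 or higher, only the newer snake_case names are available.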
After installing it, create an app.py file in the root directory of your project and initialize a simple Tkinter window, as shown below.
app.py
    # Import the required libraries
    import PyPDF2
    from tkinter import *
    from tkinter import filedialog

    # Create the Tk root window
    win = Tk()

    # Set the window size
    win.geometry("750x450")

    win.mainloop()
At the very top we import the required libraries, PyPDF2 and tkinter. We also import the filedialog module from tkinter, which shows a popup window where the user can select a file from the local file system. Then we create an instance of the Tk class, which is the main window, and give it an initial size of 750x450 pixels (width x height) using the geometry() method. Lastly we call mainloop(), which starts the event loop and keeps the window on screen until it is closed.
Running this script opens an empty window with nothing in it. Now we have to add some widgets. In this case we will add a simple menu with items to open a file, clear the text and quit.
    # Import the required libraries
    import PyPDF2
    from tkinter import *
    from tkinter import filedialog

    # Create the Tk root window
    win = Tk()

    # Set the window size
    win.geometry("750x450")

    # Create a menu bar and attach it to the window
    my_menu = Menu(win)
    win.config(menu=my_menu)

    # Add a File dropdown to the menu bar
    # (open_pdf, clear_text and quit_app are written later in this post)
    file_menu = Menu(my_menu, tearoff=False)
    my_menu.add_cascade(label="File", menu=file_menu)
    file_menu.add_command(label="Open", command=open_pdf)
    file_menu.add_command(label="Clear", command=clear_text)
    file_menu.add_command(label="Quit", command=quit_app)

    win.mainloop()
In the above block of code we add a simple menu bar with a File menu that contains three items: Open, Clear and Quit. Each item is added with the add_command() method, which takes the label of the item and a command, the function that runs when the item is clicked. We have only referenced the three functions open_pdf, clear_text and quit_app here; we still need to write them. Until they are defined above the menu code, running the script fails with a NameError. Once they are in place, the window shows a File menu with the three items.
Now we need to define all three functions for the menu items: Open, Clear and Quit. But before that we will create a Text widget, which will hold the text extracted from the PDF document.
    text = Text(win, width=80, height=30)
    text.pack(pady=20)
Here we create a new Text widget with a custom width and height. For a Text widget the width is measured in characters and the height in lines, not in pixels. To place the widget in the window we use the pack() method, adding 20 pixels of vertical padding with pady.
Now we need to define those three functions. First we will define the one that runs when we click the Open menu item. Here we need to show a file dialog so the user can select a PDF file.
    def open_pdf():
        # Ask the user to pick a PDF file
        file = filedialog.askopenfilename(
            title="Select a PDF",
            filetypes=(("PDF Files", "*.pdf"), ("All Files", "*.*")),
        )
        if file:
            # Open the PDF file
            pdf_file = PyPDF2.PdfReader(file)
            # Select the first page
            page = pdf_file.pages[0]
            # Get the text content of the page
            content = page.extract_text()
            # Insert the content at the start of the text box
            text.insert("1.0", content)
In this function we first call askopenfilename() from tkinter's filedialog module to show a file dialog. The file type filter we pass shows PDF files by default, with an "All Files" option as a fallback. If the user cancels the dialog, the function returns an empty value and we do nothing. Otherwise we open the selected PDF, read its first page, extract the text from it and insert that text at the start of the Text widget.
Now if you run the app and open a PDF from the File menu, the text of its first page appears inside the Text widget. Note that only the text is extracted; the layout and images of the page are not rendered.
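The open_pdf function reads only the first page. If you want the text of the whole document, one option is to loop over every page and join the results. This is a minimal sketch and not part of the original app: join_page_text is a hypothetical helper, and it only assumes that each page object has an extract_text() method, as the pages of a PyPDF2 PdfReader do. FakePage is a stand-in so the example runs without a real PDF.

```python
def join_page_text(pages, separator="\n\n"):
    """Join the extracted text of every page into one string."""
    parts = []
    for page in pages:
        # Treat a missing result as empty text so join() only sees strings
        page_text = page.extract_text() or ""
        parts.append(page_text)
    return separator.join(parts)


class FakePage:
    """Stand-in for a PDF page, used only to demonstrate the helper."""

    def __init__(self, content):
        self.content = content

    def extract_text(self):
        return self.content


pages = [FakePage("First page"), FakePage(""), FakePage("Third page")]
print(join_page_text(pages))
```

Inside open_pdf you could then build the content with something like join_page_text(PyPDF2.PdfReader(file).pages) instead of reading a single page.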
When we click the Clear item in the menu, we need to remove the contents of the Text widget. The function is as follows.
    def clear_text():
        # Delete everything from the first character to the end
        text.delete("1.0", END)
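Tkinter text indexes are strings of the form "line.column", where lines start at 1 and columns start at 0, so "1.0" is the very first character. Passing the float 1.0 happens to work because it turns into the string "1.0", but floats break for other positions. Here is a small sketch of the pitfall; text_index is a hypothetical helper name used only for illustration.

```python
def text_index(line, column):
    """Build a Tk text index: lines are 1-based, columns are 0-based."""
    return f"{line}.{column}"


# Line 1, column 10 as a proper string index
print(text_index(1, 10))   # 1.10

# The float 1.10 is the same number as 1.1, so the column is lost
print(str(1.10))           # 1.1
```

Using string indexes everywhere avoids this kind of surprise.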
And when we click the Quit item in the menu, we need to exit the app, as shown below.
    def quit_app():
        win.destroy()
Here we use the destroy() method, which closes the main window. That ends mainloop() and the app exits.
FULL SOURCE CODE
To wrap up, here is the complete source code of the application.
app.py
    # Import the required libraries
    import PyPDF2
    from tkinter import *
    from tkinter import filedialog

    # Create the Tk root window
    win = Tk()

    # Set the window size
    win.geometry("750x450")

    # Create a text box
    text = Text(win, width=80, height=30)
    text.pack(pady=20)

    # Define a function to clear the text
    def clear_text():
        text.delete("1.0", END)

    # Define a function to open the PDF file
    def open_pdf():
        file = filedialog.askopenfilename(
            title="Select a PDF",
            filetypes=(("PDF Files", "*.pdf"), ("All Files", "*.*")),
        )
        if file:
            # Open the PDF file
            pdf_file = PyPDF2.PdfReader(file)
            # Select the first page
            page = pdf_file.pages[0]
            # Get the text content of the page
            content = page.extract_text()
            # Insert the content at the start of the text box
            text.insert("1.0", content)

    # Define a function to quit the app
    def quit_app():
        win.destroy()

    # Create a menu bar and attach it to the window
    my_menu = Menu(win)
    win.config(menu=my_menu)

    # Add a File dropdown to the menu bar
    file_menu = Menu(my_menu, tearoff=False)
    my_menu.add_cascade(label="File", menu=file_menu)
    file_menu.add_command(label="Open", command=open_pdf)
    file_menu.add_command(label="Clear", command=clear_text)
    file_menu.add_command(label="Quit", command=quit_app)

    win.mainloop()