<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Yuvan Shankar</title>
    <description>The latest articles on Forem by Yuvan Shankar (@yuvan_shankar_20d7cf9302c).</description>
    <link>https://forem.com/yuvan_shankar_20d7cf9302c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3761654%2F3424a9f8-fe9c-41c6-bcbc-8991b46c409b.jpg</url>
      <title>Forem: Yuvan Shankar</title>
      <link>https://forem.com/yuvan_shankar_20d7cf9302c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yuvan_shankar_20d7cf9302c"/>
    <language>en</language>
    <item>
      <title>Implementing Tamil OCR Using Python and Tesseract</title>
      <dc:creator>Yuvan Shankar</dc:creator>
      <pubDate>Thu, 12 Feb 2026 03:47:16 +0000</pubDate>
      <link>https://forem.com/yuvan_shankar_20d7cf9302c/implementing-tamil-ocr-using-python-and-tesseract-3fkn</link>
      <guid>https://forem.com/yuvan_shankar_20d7cf9302c/implementing-tamil-ocr-using-python-and-tesseract-3fkn</guid>
      <description>&lt;p&gt;INTRODUCTION:&lt;/p&gt;

&lt;p&gt;Optical Character Recognition (OCR) is a technology that converts images containing text into machine-readable digital text. In this project, I implemented a Tamil OCR system using Python and the Tesseract OCR engine. The goal was to test how accurately the system detects text from two different sources:&lt;br&gt;
Handwritten text on white paper&lt;br&gt;
Printed text from a newspaper&lt;br&gt;
This blog explains the complete setup process and how the system works.&lt;/p&gt;

&lt;p&gt;Part 1: Installing Python&lt;br&gt;
Step 1: Download Python&lt;/p&gt;

&lt;p&gt;First, download Python from the official website:&lt;br&gt;
&lt;a href="https://www.python.org/downloads/" rel="noopener noreferrer"&gt;https://www.python.org/downloads/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While installing, it is very important to check the box:&lt;br&gt;
“Add Python to PATH”&lt;br&gt;
This allows Python to be accessed from the Command Prompt.&lt;br&gt;
After installation, verify it by opening Command Prompt and typing:&lt;/p&gt;

&lt;p&gt;python --version&lt;/p&gt;

&lt;p&gt;If Python is installed correctly, it will display the installed version number.&lt;/p&gt;

&lt;p&gt;Part 2: Installing Tesseract OCR&lt;br&gt;
Python alone cannot perform OCR. We need an OCR engine, which is Tesseract.&lt;/p&gt;

&lt;p&gt;Step 2: Download Tesseract for Windows&lt;br&gt;
Download the Windows installer from:&lt;br&gt;
&lt;a href="https://github.com/UB-Mannheim/tesseract/wiki" rel="noopener noreferrer"&gt;https://github.com/UB-Mannheim/tesseract/wiki&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Install it in the default location:&lt;/p&gt;

&lt;p&gt;C:\Program Files\Tesseract-OCR&lt;br&gt;
After installation, verify it by typing in Command Prompt:&lt;/p&gt;

&lt;p&gt;tesseract --version&lt;/p&gt;

&lt;p&gt;If the version details are displayed, it means Tesseract is installed correctly.&lt;/p&gt;

&lt;p&gt;Part 3: Adding Tamil Language Support&lt;/p&gt;

&lt;p&gt;To detect Tamil text, we must ensure that the Tamil trained data file is available.&lt;/p&gt;

&lt;p&gt;Go to:&lt;/p&gt;

&lt;p&gt;C:\Program Files\Tesseract-OCR\tessdata&lt;/p&gt;

&lt;p&gt;Check if the file:&lt;/p&gt;

&lt;p&gt;tam.traineddata&lt;br&gt;
exists.&lt;/p&gt;

&lt;p&gt;If not, download it from:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/tesseract-ocr/tessdata" rel="noopener noreferrer"&gt;https://github.com/tesseract-ocr/tessdata&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;and place it inside the tessdata folder.&lt;/p&gt;
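
&lt;p&gt;The checks from Part 2 and Part 3 can also be done from Python. This is only a minimal sketch using the standard library: the helper names has_language_data and tesseract_on_path are my own for illustration, and the folder is assumed to be the default install location mentioned above.&lt;/p&gt;

```python
import shutil
from pathlib import Path

# Assumed default install location from the steps above; adjust if yours differs.
DEFAULT_TESSDATA = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def has_language_data(tessdata_dir, lang="tam"):
    """Return True if lang.traineddata exists in the given tessdata folder."""
    return (Path(tessdata_dir) / f"{lang}.traineddata").is_file()


def tesseract_on_path():
    """Return True if the tesseract executable can be found on PATH."""
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    print("tesseract on PATH:", tesseract_on_path())
    print("Tamil data present:", has_language_data(DEFAULT_TESSDATA))
```

&lt;p&gt;If the second line prints False, the tam.traineddata file is probably missing or in a different folder. Running tesseract --list-langs in Command Prompt should also show tam once the file is in place.&lt;/p&gt;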

&lt;p&gt;Part 4: Installing Required Python Libraries&lt;/p&gt;

&lt;p&gt;Open Command Prompt and install the required libraries:&lt;/p&gt;

&lt;p&gt;pip install pytesseract opencv-python pillow&lt;/p&gt;

&lt;p&gt;These libraries are used for:&lt;br&gt;
pytesseract → Connecting Python with Tesseract&lt;/p&gt;

&lt;p&gt;opencv-python → Image processing&lt;/p&gt;

&lt;p&gt;pillow → Image handling&lt;/p&gt;

&lt;p&gt;Part 5: Project Setup&lt;br&gt;
Create a project folder named:&lt;/p&gt;

&lt;p&gt;OCR_Project&lt;br&gt;
Inside the folder, create:&lt;br&gt;
ocr_test.py (Python file)&lt;br&gt;
test.jpg (Input image)&lt;/p&gt;

&lt;p&gt;Part 6: Python OCR Code&lt;br&gt;
Below is the Python code used for Tamil text detection:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import cv2
import pytesseract

# Specify Tesseract path
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load image
img = cv2.imread("test.jpg")

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Apply thresholding
_, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)

# Perform OCR in Tamil
text = pytesseract.image_to_string(thresh, lang='tam')

print("Detected Text:")
print(text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Part 7: Running the Program&lt;br&gt;
Navigate to the project folder in Command Prompt:&lt;/p&gt;

&lt;p&gt;cd Desktop\OCR_Project&lt;br&gt;
Run the program:&lt;/p&gt;

&lt;p&gt;python ocr_test.py&lt;br&gt;
The detected Tamil text will be printed in the console.&lt;/p&gt;

&lt;p&gt;HOW THE OCR SYSTEM WORKS INTERNALLY:&lt;/p&gt;

&lt;p&gt;The system follows these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The image is loaded.&lt;/li&gt;
&lt;li&gt;It is converted to grayscale to simplify processing.&lt;/li&gt;
&lt;li&gt;Thresholding is applied to separate text from background.&lt;/li&gt;
&lt;li&gt;Tesseract detects text regions.&lt;/li&gt;
&lt;li&gt;The Tamil language model recognizes characters.&lt;/li&gt;
&lt;li&gt;The final detected text is returned as output.&lt;/li&gt;
&lt;/ol&gt;
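
&lt;p&gt;To make the grayscale and thresholding steps above more concrete, here is a small pure-Python sketch of what they do to each pixel. It is not how OpenCV is implemented internally: the functions to_gray and binary_threshold and the 2x2 sample image are hypothetical, and the weights are the standard luminance weights that cv2.COLOR_BGR2GRAY approximates.&lt;/p&gt;

```python
def to_gray(pixel_bgr):
    """Weighted grayscale of one (blue, green, red) pixel, rounded to an int."""
    b, g, r = pixel_bgr
    return int(round(0.114 * b + 0.587 * g + 0.299 * r))


def binary_threshold(gray_rows, thresh=150, maxval=255):
    """Pixels brighter than thresh become maxval; all others become 0."""
    return [[maxval if value > thresh else 0 for value in row] for row in gray_rows]


# A tiny hypothetical 2x2 "image": dark ink pixels next to light paper pixels.
image = [
    [(20, 20, 20), (230, 230, 230)],
    [(200, 210, 220), (60, 50, 40)],
]

gray = [[to_gray(pixel) for pixel in row] for row in image]
binary = binary_threshold(gray)

print(gray)
print(binary)
```

&lt;p&gt;With the threshold of 150 used in the OCR code, the dark ink pixels come out as 0 and the light paper pixels as 255, which is the clean black-and-white image that is passed to Tesseract.&lt;/p&gt;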

&lt;p&gt;Accuracy Testing: White Paper vs Newspaper&lt;/p&gt;

&lt;p&gt;WHITE PAPER TEST:&lt;/p&gt;

&lt;p&gt;Clean background&lt;br&gt;
Clear handwriting&lt;br&gt;
Good contrast&lt;br&gt;
Result:&lt;br&gt;
Accuracy is usually high (around 80–95%) because the text is clearly separated from the background.&lt;/p&gt;

&lt;p&gt;NEWSPAPER TEST:&lt;/p&gt;

&lt;p&gt;Small font size&lt;br&gt;
Multiple columns&lt;br&gt;
Images and advertisements&lt;br&gt;
Background noise&lt;br&gt;
Result:&lt;br&gt;
Accuracy decreases (around 60–80%) because of complex layout and noise.&lt;/p&gt;
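
&lt;p&gt;One possible way to put a number on accuracy is to compare the detected text with a manually typed reference and count character edits. The sketch below is only an assumption about how such a score could be computed, not the exact method behind the ranges above. The function names and sample strings are hypothetical, and it counts Unicode code points, so a Tamil letter with a vowel sign counts as more than one character.&lt;/p&gt;

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def char_accuracy(expected, detected):
    """1 - (edit distance / length of expected text), never below 0."""
    if not expected:
        return 1.0 if not detected else 0.0
    return max(0.0, 1.0 - edit_distance(expected, detected) / len(expected))


if __name__ == "__main__":
    reference = "தமிழ் மொழி"   # hypothetical ground truth typed by hand
    ocr_output = "தமிழ் மொழ"   # hypothetical OCR result with one sign missing
    print(f"Character accuracy: {char_accuracy(reference, ocr_output):.0%}")
```

&lt;p&gt;Running the same function on the white paper output and the newspaper output, each against its own reference text, would make the two results directly comparable.&lt;/p&gt;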

</description>
      <category>programming</category>
      <category>beginners</category>
      <category>python</category>
      <category>career</category>
    </item>
    <item>
      <title>EXPLORING OCR MODEL AND BACKEND SUPPORT IN PYTHON</title>
      <dc:creator>Yuvan Shankar</dc:creator>
      <pubDate>Wed, 11 Feb 2026 03:48:10 +0000</pubDate>
      <link>https://forem.com/yuvan_shankar_20d7cf9302c/exploring-ocr-model-and-backend-support-in-python-dbj</link>
      <guid>https://forem.com/yuvan_shankar_20d7cf9302c/exploring-ocr-model-and-backend-support-in-python-dbj</guid>
      <description>&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Optical Character Recognition (OCR) is a technology that converts images, scanned documents, or PDFs into machine-readable text. In Python, there are many powerful OCR libraries and models available that support different backends and use cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In this blog, I explore the most popular OCR modules available in Python and how they are used in real-world applications.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;1.TESSERACT OCR (pytesseract):&lt;/p&gt;

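&lt;p&gt;Backend: Tesseract engine (C++ based)&lt;/p&gt;

&lt;p&gt;Python Wrapper: pytesseract&lt;br&gt;
Tesseract is one of the most widely used open-source OCR engines. It was originally developed at Hewlett-Packard, was later sponsored by Google, and supports multiple languages. pytesseract is only a wrapper, so the Tesseract engine itself must be installed separately.&lt;/p&gt;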

&lt;p&gt;HOW TO INSTALL:&lt;/p&gt;

&lt;p&gt;pip install pytesseract&lt;/p&gt;

&lt;p&gt;SAMPLE CODE:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pytesseract
from PIL import Image

img = Image.open("sample.png")
text = pytesseract.image_to_string(img, lang="eng")
print(text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;2.EASYOCR:&lt;/p&gt;

&lt;p&gt;Backend: PyTorch (Deep Learning based)&lt;/p&gt;

&lt;p&gt;EasyOCR is a deep learning-based OCR library. It works well with complex images and multiple languages.&lt;/p&gt;

&lt;p&gt;HOW TO INSTALL:&lt;br&gt;
    pip install easyocr&lt;/p&gt;

&lt;p&gt;SAMPLE CODE:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import easyocr

reader = easyocr.Reader(['en','ta'])
result = reader.readtext('sample.png')

for r in result:
    print(r[1])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
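
&lt;p&gt;Each entry returned by readtext holds a bounding box, the text, and a confidence score, which is why r[1] is the text. As a minimal sketch, the results can be filtered by confidence and saved as JSON. The helper name results_to_json, the min_confidence parameter, and the sample data are hypothetical and only shaped like real EasyOCR output.&lt;/p&gt;

```python
import json


def results_to_json(results, min_confidence=0.0):
    """Convert EasyOCR-style (box, text, confidence) entries to a JSON string."""
    items = []
    for box, text, confidence in results:
        if float(confidence) >= min_confidence:
            items.append({
                "text": text,
                "confidence": round(float(confidence), 3),
                # Cast coordinates to plain floats so NumPy numbers serialize too.
                "box": [[float(x), float(y)] for x, y in box],
            })
    return json.dumps(items, ensure_ascii=False, indent=2)


# Hypothetical sample shaped like the list returned by reader.readtext(...).
sample = [
    ([[10, 12], [120, 12], [120, 40], [10, 40]], "Hello", 0.98),
    ([[10, 50], [90, 50], [90, 78], [10, 78]], "wor1d", 0.41),
]

print(results_to_json(sample, min_confidence=0.5))
```

&lt;p&gt;Passing ensure_ascii=False keeps Tamil text readable in the JSON instead of turning it into escape sequences.&lt;/p&gt;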

&lt;p&gt;3.PADDLE OCR:&lt;/p&gt;

&lt;p&gt;Backend: PaddlePaddle (Deep Learning Framework)&lt;/p&gt;

&lt;p&gt;PaddleOCR is a powerful industrial-level OCR toolkit developed by Baidu.&lt;/p&gt;

&lt;p&gt;HOW TO INSTALL:&lt;br&gt;
     pip install paddleocr paddlepaddle&lt;/p&gt;

&lt;p&gt;SAMPLE CODE:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='en')
result = ocr.ocr('sample.png')

for line in result[0]:
    print(line[1][0])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;4.KERAS OCR:&lt;/p&gt;

&lt;p&gt;Backend: TensorFlow / Keras&lt;br&gt;
Keras-OCR is built using deep learning models and provides both text detection and recognition.&lt;/p&gt;

&lt;p&gt;HOW TO INSTALL:&lt;br&gt;
    pip install keras-ocr&lt;/p&gt;

&lt;p&gt;SAMPLE CODE:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import keras_ocr

pipeline = keras_ocr.pipeline.Pipeline()
images = [keras_ocr.tools.read('sample.png')]
prediction = pipeline.recognize(images)

print(prediction)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;HOW IT WORKS IN BACKEND:&lt;/p&gt;

&lt;p&gt;Image / PDF&lt;br&gt;
↓&lt;br&gt;
Preprocessing (OpenCV)&lt;br&gt;
↓&lt;br&gt;
Text Detection (DL/CNN)&lt;br&gt;
↓&lt;br&gt;
Text Recognition (CRNN / Transformer)&lt;br&gt;
↓&lt;br&gt;
Output Text (JSON / String)&lt;/p&gt;

&lt;p&gt;AVAILABLE MODULES :&lt;/p&gt;

&lt;p&gt;1.pytesseract&lt;/p&gt;

&lt;p&gt;2.easyocr&lt;/p&gt;

&lt;p&gt;3.paddleocr&lt;/p&gt;

&lt;p&gt;4.opencv-python&lt;/p&gt;

&lt;p&gt;5.pillow&lt;/p&gt;

&lt;p&gt;6.pdf2image&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>learning</category>
      <category>performance</category>
    </item>
    <item>
      <title>OCR SOFTWARE AND TOOLS</title>
      <dc:creator>Yuvan Shankar</dc:creator>
      <pubDate>Tue, 10 Feb 2026 12:28:47 +0000</pubDate>
      <link>https://forem.com/yuvan_shankar_20d7cf9302c/ocr-software-and-tools-387l</link>
      <guid>https://forem.com/yuvan_shankar_20d7cf9302c/ocr-software-and-tools-387l</guid>
      <description>&lt;p&gt;What is OCR?&lt;br&gt;
OCR (Optical Character Recognition) is a technology that reads text from images or scanned documents and converts it into editable and searchable text.&lt;br&gt;
It helps us change paper documents or image files into digital text that we can copy, edit, and store easily.&lt;/p&gt;

&lt;p&gt;PAID OCR SOFTWARE:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;ABBYY FineReader PDF&lt;br&gt;
ABBYY FineReader is a professional OCR software mainly used for document digitization. It can convert scanned PDFs and images into editable formats while keeping the original layout, tables, and formatting intact.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Adobe Acrobat Pro DC (OCR-Enabled)&lt;br&gt;
Adobe Acrobat Pro DC includes OCR functionality that allows users to recognize text in scanned PDFs and export them into formats like Word, Excel, or PowerPoint. It is widely used in offices and enterprises.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Nanonets OCR&lt;br&gt;
Nanonets OCR is an AI-based OCR software used for automated document processing. It is commonly used in business workflows such as invoice processing, form extraction, and data automation.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;COMMON DRAWBACKS OF PAID OCR SOFTWARE:&lt;/p&gt;

&lt;p&gt;1.High cost: Paid OCR software usually requires monthly or yearly subscriptions, which can be expensive.&lt;/p&gt;

&lt;p&gt;2.Complex to use: Advanced features make the software powerful, but also increase the learning curve for new users.&lt;/p&gt;

&lt;p&gt;3.Resource heavy: Some paid OCR tools need powerful systems or cloud credits to work efficiently.&lt;/p&gt;

&lt;p&gt;FREE OCR SOFTWARE:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Tesseract OCR&lt;br&gt;
Tesseract OCR is a free and open-source OCR engine. It is used to extract text from images and PDFs and is commonly integrated into custom applications and student projects.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;OCRFeeder&lt;br&gt;
OCRFeeder is a free OCR software that provides a graphical interface and works as a front-end for OCR engines like Tesseract. It helps users manage and process scanned documents more easily.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GOCR&lt;br&gt;
GOCR is a free GNU OCR tool mainly used for basic text extraction from simple images. It is an older OCR technology but still useful for simple OCR tasks.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;COMMON DRAWBACKS OF FREE OCR TOOLS:&lt;/p&gt;

&lt;p&gt;1.Lower accuracy: Free OCR tools usually struggle with complex layouts and handwritten text compared to paid software.&lt;br&gt;
2.Limited support: Free tools have less technical support and fewer updates than commercial products.&lt;br&gt;
3.Fewer advanced features: Features like batch processing and table extraction are limited or missing.&lt;br&gt;
4.Technical setup required: Some free OCR tools do not have a proper graphical interface and require technical knowledge to use.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>beginners</category>
      <category>python</category>
      <category>career</category>
    </item>
    <item>
      <title>A journey through a code</title>
      <dc:creator>Yuvan Shankar</dc:creator>
      <pubDate>Mon, 09 Feb 2026 10:19:27 +0000</pubDate>
      <link>https://forem.com/yuvan_shankar_20d7cf9302c/a-journey-through-a-code-g66</link>
      <guid>https://forem.com/yuvan_shankar_20d7cf9302c/a-journey-through-a-code-g66</guid>
      <description>&lt;p&gt;This blog is a reflection of my learning journey in the tech field. I write about the projects I work on, the concepts I’m improving, and practical lessons gained through daily coding. Each post is driven by real experience, as I continue to learn step by step and grow through consistent practice.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>javascript</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
