DEV Community

Vikas Awasthi


🔍 ContentLens | AI-Powered Document Insights

Building ContentLens: My Journey Creating an AI-Powered Document Processing App

Introduction

Over the past weekend, I built ContentLens, a web application that uses AI to analyze and transform documents. In this post, I'll share the technologies I used, the challenges I faced, and what I learned along the way.

What is ContentLens?

ContentLens is a simple yet powerful application that:

  • Accepts various document formats (text, markdown, JSON, DOCX, and images)
  • Processes them using Google's Gemini AI
  • Returns the results in markdown format that you can download

Whether you need to summarize a long document, extract key points, translate content, or transform it into a different format, ContentLens can help. The application is designed with simplicity and privacy in mind: uploaded files are deleted immediately after processing.

The Technology Stack

For this project, I chose to work with:

  • FastHTML and MonsterUI: These frameworks provided a clean way to build server-rendered interfaces with minimal JavaScript
  • Python: As the backend language, handling file processing and API integration
  • Google Gemini API: For the AI capabilities that power the document analysis
  • Railway: For deployment and hosting

Building the Application: Step by Step

1. Planning the Architecture

I began by planning a clean object-oriented architecture with these main components:

  • Document class for handling different file types
  • Processor class for interacting with the Gemini API
  • Web routes for handling user requests
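To make the three components concrete, here is a minimal sketch of how they might fit together. It is not the actual ContentLens code: the method names, the `handle_request` function, and the `fake_model` stand-in are all hypothetical, and the Gemini call is replaced by a plain function so the example runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Document:
    """Holds the text extracted from an uploaded file."""
    filename: str
    text: str


class Processor:
    """Combines the user's instruction with the document and calls a model."""

    def __init__(self, generate: Callable[[str], str]):
        # `generate` stands in for the Gemini call (assumption):
        # any function that maps a prompt string to a response string works.
        self.generate = generate

    def build_prompt(self, doc: Document, instruction: str) -> str:
        return f"{instruction.strip()}\n\n--- {doc.filename} ---\n{doc.text}"

    def run(self, doc: Document, instruction: str) -> str:
        return self.generate(self.build_prompt(doc, instruction))


def handle_request(filename: str, raw: bytes, instruction: str,
                   processor: Processor) -> str:
    """Roughly what a web route would do: build a Document, run the Processor."""
    doc = Document(filename, raw.decode("utf-8"))
    return processor.run(doc, instruction)


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for the AI: echoes the first line of the prompt.
    return "# Summary\n\n" + prompt.splitlines()[0]


result = handle_request("notes.txt", b"hello world", "Summarize this",
                        Processor(fake_model))
print(result)
```

Passing the model call in as a function keeps the web layer and the document handling testable without touching the network.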

2. File Processing Challenges

One of the more challenging aspects was handling different file types. Each format required a different approach:

  • Text and markdown files needed simple reading
  • DOCX files required parsing with python-docx
  • Images needed special handling for the AI

I implemented a strategy pattern where the Document class would handle extraction differently based on file type.
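One way to express that kind of per-type dispatch is a lookup table from file extension to extractor function. The sketch below is an illustration under assumptions, not the project's real code: the function names and the `EXTRACTORS` table are made up, and images are left out because they would go to the model as binary data rather than as extracted text. The DOCX branch uses python-docx and imports it lazily, so the rest works without that package installed.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def extract_plain(path: Path) -> str:
    """Text and markdown files only need to be read."""
    return path.read_text(encoding="utf-8")


def extract_json(path: Path) -> str:
    """Parse JSON to validate it, then pretty-print it for the prompt."""
    data = json.loads(path.read_text(encoding="utf-8"))
    return json.dumps(data, indent=2, ensure_ascii=False)


def extract_docx(path: Path) -> str:
    """Join the paragraph text of a DOCX file (requires python-docx)."""
    from docx import Document as DocxDocument  # imported only when needed
    return "\n".join(p.text for p in DocxDocument(str(path)).paragraphs)


# Hypothetical strategy table: one extractor per supported extension.
EXTRACTORS: Dict[str, Callable[[Path], str]] = {
    ".txt": extract_plain,
    ".md": extract_plain,
    ".json": extract_json,
    ".docx": extract_docx,
}


def extract_text(path: Path) -> str:
    """Pick the extractor for this file type, or fail with a clear error."""
    try:
        extractor = EXTRACTORS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"Unsupported file type: {path.suffix or '(none)'}") from None
    return extractor(path)
```

Adding a new format then means writing one function and registering it in the table, without touching the dispatch logic.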

3. Privacy and Security Considerations

From the beginning, I wanted to ensure user privacy. I implemented:

  • Immediate deletion of uploaded files after processing
  • Removal of processed results after download
  • Environment variables for API keys
  • Input validation and error handling
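The points above could look something like the following sketch. Everything specific here is an assumption for illustration: the `GEMINI_API_KEY` variable name, the allowed extensions, the 5 MB limit, and the function names are not taken from the ContentLens source. The key idea is the `try`/`finally`, which deletes the upload even when processing fails.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Mapping

# Hypothetical limits; the real application may use different values.
ALLOWED_SUFFIXES = {".txt", ".md", ".json", ".docx", ".png", ".jpg", ".jpeg"}
MAX_BYTES = 5 * 1024 * 1024


def get_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment instead of hard-coding it."""
    key = env.get("GEMINI_API_KEY", "")  # variable name is an assumption
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key


def validate_upload(filename: str, data: bytes) -> str:
    """Reject unsupported, empty, or oversized uploads; return the suffix."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    if not data:
        raise ValueError("Uploaded file is empty")
    if len(data) > MAX_BYTES:
        raise ValueError("Uploaded file is too large")
    return suffix


def process_upload(filename: str, data: bytes,
                   handler: Callable[[Path], str]) -> str:
    """Write the upload to a temp file, process it, and always delete it."""
    suffix = validate_upload(filename, data)
    fd, tmp_name = tempfile.mkstemp(suffix=suffix)
    tmp_path = Path(tmp_name)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        return handler(tmp_path)
    finally:
        # Runs on success and on error, so no upload is left on disk.
        tmp_path.unlink(missing_ok=True)
```

The same `finally` approach would apply to removing a processed result once it has been downloaded.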

4. User Experience Enhancements

Based on feedback from early testers, I added:

  • File upload indicators
  • Processing status feedback
  • Dark mode compatibility
  • Helpful example instructions

Lessons Learned

This project taught me several valuable lessons:

  1. The power of separation of concerns: By keeping document handling, AI processing, and web interfaces separate, the code remained clean and maintainable.

  2. The importance of user feedback: Adding visual indicators for uploads and processing made the application much more user-friendly.

  3. Deployment considerations: I had to make sure environment variables were set up properly in Railway and that file paths worked correctly in the deployed environment.

  4. The value of iterative development: Starting with a minimum viable product and adding features based on feedback proved effective.
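For the deployment lesson (point 3), a small sketch of how paths and settings can be resolved so the same code runs locally and on a host like Railway. The `UPLOAD_DIR` and `PORT` variable names, the fallback directory name, and the default port are assumptions for illustration, not values from the ContentLens code.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def resolve_upload_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Use UPLOAD_DIR if set, otherwise a folder under the system temp dir."""
    base = env.get("UPLOAD_DIR") or os.path.join(
        tempfile.gettempdir(), "contentlens_uploads"
    )
    path = Path(base).resolve()  # absolute path, independent of the working dir
    path.mkdir(parents=True, exist_ok=True)
    return path


def resolve_port(env: Mapping[str, str] = os.environ, default: int = 5001) -> int:
    """Read the port from the environment, with a local fallback (arbitrary)."""
    return int(env.get("PORT", default))
```

Resolving to an absolute path up front avoids surprises when the deployed process starts from a different working directory than the one used in development.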

Future Enhancements

While ContentLens is functional, there are several enhancements I'm considering:

  • Support for more file formats (PDF, EPUB)
  • Batch processing of multiple files
  • Custom AI model selection
  • User accounts for saving processing history
  • Additional output formats beyond markdown

Try It Out!

You can try ContentLens yourself at contentlens-production.up.railway.app or check out the code on GitHub.

I welcome any feedback or suggestions for improvement!

Conclusion

What I Learned & What's Next

Building ContentLens taught me a lot about integrating AI APIs and creating clean architecture. I'm planning to add PDF support and batch processing next.

Questions for you:

  • What other file formats would you find useful in a tool like this?
  • Have you worked with the Gemini API? How does it compare to other LLMs?
  • What challenges have you faced when deploying Python web apps?

I'd love to hear your thoughts and suggestions in the comments!
