Vivek V.

for AWS Community Builders

Posted on May 3 • Edited on May 27

Crushing the Command Line with Amazon Q: Building a Medium to DEV.to Converter

#devchallenge #awschallenge #ai #webdev

What I Built

I built medium2dev, a command-line tool that automatically converts Medium blog posts to DEV.to-compatible markdown and can optionally publish them directly to DEV.to as draft posts. It solves a common problem for technical writers who publish on multiple platforms: keeping formatting consistent from one platform to the next.

As someone with over 75 Medium posts, I needed a reliable, repeatable way to move them to DEV.to to boost visibility in the AWS developer community. Manual copy-pasting was error-prone and time-consuming. DEV.to supports markdown natively and allows programmatic publishing through its API. Compared with platforms like Hashnode or Substack, I found that DEV.to has a cleaner integration flow for automation workflows.

The tool preserves:

Article title and headings
Text formatting (bold, italic, etc.)
Code blocks with proper syntax highlighting
Lists and other structural elements

The tool also generates proper DEV.to frontmatter and a canonical link to your original Medium article, so you can republish the same content on DEV without any negative impact on SEO. A true win-win!
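To make the frontmatter idea concrete, here is a minimal sketch of what such a block could look like. The function name `build_frontmatter` and its parameters are illustrative assumptions, not the tool's actual code, and the title in the usage line is a placeholder; `title`, `published`, `tags`, and `canonical_url` are standard DEV.to front matter keys.

```python
def build_frontmatter(title, canonical_url, tags=(), published=False):
    """Return a DEV.to front matter block as a string (hypothetical helper)."""
    # Escape backslashes and double quotes so the title stays a valid quoted YAML string.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "---",
        f'title: "{safe_title}"',
        f"published: {'true' if published else 'false'}",
        f"tags: {', '.join(tags)}",
        f"canonical_url: {canonical_url}",
        "---",
    ]
    # A blank line separates the front matter from the article body.
    return "\n".join(lines) + "\n\n"


print(build_frontmatter(
    "Example Post Title",  # placeholder title
    "https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11",
    tags=["aws", "automation"],
))
```

Keeping `published` false by default mirrors the draft-first workflow described above, so nothing goes live until it has been reviewed on DEV.to.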

Demo

Here's a demonstration of converting a Medium article to DEV.to format:

# Basic conversion
python3 medium2dev.py https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11

# Publish directly to DEV.to as a draft
python3 medium2dev.py https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11 --publish --api-key YOUR_DEVTO_API_KEY

# Or use environment variable for API key
export DEVTO_API_KEY=your_api_key
python3 medium2dev.py https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11 --publish

Output:

2025-05-03 15:07:49,696 - medium2dev - INFO - Fetching article from https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11
2025-05-03 15:07:51,100 - medium2dev - INFO - Original Medium content word count: 463
2025-05-03 15:07:51,100 - medium2dev - INFO - Downloaded 0 content images
2025-05-03 15:07:51,116 - medium2dev - INFO - Conversion complete! Output saved to /Users/vivekvelso/Documents/opensource/medium2dev/new/aws-resource-tag-compliance-with-automation-64ae16e42a11.md

Conversion successful! Output saved to: /Users/vivekvelso/Documents/opensource/medium2dev/aws-resource-tag-compliance-with-automation-64ae16e42a11.md
Images saved to: /Users/vivekvelso/Documents/opensource/medium2dev/images

When publishing to DEV.to, it also displays a word count comparison:

Successfully published as draft to DEV.to!

Word Count Comparison:
| Platform | Word Count |
|----------|------------|
| Medium   | 463        |
| DEV.to   | 458        |

The small difference (463 words on Medium versus 458 on DEV.to) suggests that the conversion preserves nearly all of the original text while turning HTML into markdown. The lower count on DEV.to is typically due to removed metadata and UI clutter, which makes the post leaner without losing meaning.

The tool successfully:

Fetches the Medium article
Removes author metadata and Medium-specific UI elements like "Listen", "Share", etc.
Converts the content to markdown
Generates a properly formatted DEV.to markdown file
(Optionally) Publishes the article as a draft to DEV.to
Provides a word count comparison between platforms

Use Cases:

Migrating blog archives from Medium to DEV.to
Maintaining consistent branding across platforms
Preparing drafts offline before publishing
Archiving Medium content in Markdown format

Code Repository

The code is available on GitHub: medium2dev

Key files:

medium2dev.py: The main Python script
requirements.txt: Dependencies
README.md: Documentation

How I Used Amazon Q Developer

I used Amazon Q Developer CLI to create this entire project from scratch. Here's how the conversation went:

Initial Prompt

I started by explaining my idea to Amazon Q:

I am submitting an idea for this challenge. Crushing the Command Line
Build an automation with Amazon Q Developer CLI that makes your work easier, faster, or better.

My idea is to create a tool to convert medium posts to dev.to markdown posts preserving the title heading, formatting and inline images the same way. Please give exact steps and prompts by referring to collection from promptz tool if you find anything to enhance the solution, and also what prompts were given to automate the whole command line solution to q and how q responded, log all that in this blog post for submission, along with the code that is generated, then test it on the url https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11 and verify the markdown generated, and confirm that all text, image, formatting is same.

Amazon Q's Approach

One of the most impressive parts of using Amazon Q was its ability to translate abstract requirements into actionable code suggestions. It provided me with an initial project scaffold and even suggested pip packages like html2text, requests, and beautifulsoup4 based on the problem statement.

Amazon Q first checked for relevant prompts in the promptz.dev collection that might help with this task:



["CLI", "Markdown"]

After finding no specific prompts for Markdown conversion, Amazon Q proceeded to design a solution from scratch. It created a Python script that would:

Fetch Medium articles using web scraping
Extract content while preserving structure
Download and reference images
Convert HTML to Markdown format
Generate DEV.to compatible frontmatter
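The HTML-to-Markdown step is the heart of that list. The real script relies on html2text for it; the sketch below swaps in a tiny standard-library converter purely to illustrate the idea. `MiniMarkdown` and `html_to_markdown` are hypothetical names, and only headings, paragraphs, bold, and italic text are handled.

```python
import re
from html.parser import HTMLParser


class MiniMarkdown(HTMLParser):
    """Convert a very small subset of HTML to Markdown (illustration only)."""

    HEADINGS = {"h2": "## ", "h3": "### ", "h4": "#### "}
    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Headings get a '#' prefix, inline tags get an opening marker.
        marker = self.HEADINGS.get(tag) or self.INLINE.get(tag)
        if marker:
            self.parts.append(marker)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS or tag == "p":
            # Block elements end with a blank line.
            self.parts.append("\n\n")
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])

    def handle_data(self, data):
        # Skip whitespace-only text between blocks; collapse runs of whitespace elsewhere.
        if data.strip():
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html):
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() + "\n"


sample = "<h2>Tag compliance</h2>\n<p>Use <strong>AWS Config</strong> to <em>audit</em> tags.</p>"
print(html_to_markdown(sample))
```

A production converter also has to deal with code blocks, lists, links, and images, which is why a dedicated library is the better choice in the actual tool.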

Solution Implementation

Amazon Q generated the complete medium2dev.py script with these key components:

class Medium2Dev:
    def __init__(self, url, output_dir=None, image_dir=None, api_key=None):
        """Initialize the converter with the Medium post URL."""
        self.url = url
        self.output_dir = output_dir or os.getcwd()
        self.image_dir = image_dir or os.path.join(self.output_dir, 'images')
        self.api_key = api_key
        # ...

    def fetch_article(self):
        """Fetch the Medium article content."""
        # ...

    def extract_content(self, html_content):
        """Extract the article content from the HTML."""
        # ...

    def download_images(self, content):
        """Download images and update their references in the content."""
        # ...

    def convert_to_markdown(self, content):
        """Convert HTML content to Markdown format suitable for DEV.to."""
        # ...

    def generate_frontmatter(self, title, date):
        """Generate DEV.to frontmatter."""
        # ...

    def convert(self):
        """Convert the Medium post to DEV.to markdown format."""
        # ...

    def publish_to_devto(self, title, markdown_content):
        """Publish the converted markdown as a draft post to DEV.to."""
        # ...

Amazon Q also created the necessary supporting files:

requirements.txt with the required dependencies
README.md with usage instructions

Improving the Solution

After initial testing, I noticed some issues with the generated markdown:

Author metadata was still being included
Medium-specific UI elements like "Listen", "Share", etc. were appearing in the output
Images in the post were not being extracted

After I pasted the test results and markdown discrepancies directly into the Q CLI, Amazon Q refined the scraping logic, eliminating unnecessary HTML tags and improving heading preservation. It also streamlined the CLI flag parsing for better UX.

I then prompted Amazon Q to improve the solution further, and it made several key enhancements:

Better Content Extraction: Improved the HTML parsing to focus only on the actual article content

   # Create a new div to hold only the content we want
   content_div = soup.new_tag('div')

   # Find all the content sections (paragraphs, headings, code blocks, images)
   content_elements = article_tag.find_all(['p', 'h2', 'h3', 'h4', 'pre', 'figure', 'img', 'blockquote', 'ul', 'ol'])

Metadata Removal: Added more robust filtering to remove author bylines, claps, and other Medium-specific UI elements

   # Skip elements with author info, claps, etc.
   if element.find(string=re.compile(r'clap|follow|min read|sign up|bookmark|Listen|Share')):
       continue

   # Skip elements that just contain "--" or numbers at the beginning
   if element.name == 'p' and re.match(r'^\s*--\s*$|^\s*\d+\s*$', element.text.strip()):
       continue
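To see how those two filters behave, here is a small standalone sketch that applies the same patterns to plain strings instead of BeautifulSoup elements. The helper name `is_ui_noise` and the sample strings are my own assumptions for illustration.

```python
import re

# Same patterns as in the snippet above, compiled once.
UI_NOISE = re.compile(r'clap|follow|min read|sign up|bookmark|Listen|Share')
SEPARATOR_OR_COUNT = re.compile(r'^\s*--\s*$|^\s*\d+\s*$')


def is_ui_noise(text):
    """Return True when a text fragment looks like Medium UI chrome (hypothetical helper)."""
    return bool(UI_NOISE.search(text)) or bool(SEPARATOR_OR_COUNT.match(text.strip()))


samples = [
    "5 min read",
    "--",
    "12",
    "AWS Config rules can enforce tags.",
    "Shared responsibility model",  # legitimate text, but it contains "Share"
]
for sample in samples:
    print(f"{sample!r}: {'skip' if is_ui_noise(sample) else 'keep'}")
```

Because the keyword pattern is an unanchored, case-sensitive substring match, a legitimate sentence containing "Share" or "follow" is also skipped. Anchoring the pattern or limiting it to short elements would be one way to reduce such false positives.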

Enhanced Image Handling: Improved the image extraction and downloading process

   # For Medium images, try to get the full-size version
   if 'miro.medium.com' in img_url:
       # Remove size constraints from URL to get original image
       img_url = re.sub(r'/resize:[^/]+/', '/', img_url)

Better Markdown Cleanup: Added post-processing to clean up Medium-specific elements

   # Remove Medium-specific footer text and links
   markdown = re.sub(r'\n\s*\[.*?\]\(https?://medium\.com/.*?\)\s*\n', '\n\n', markdown)

   # Remove clap indicators and other Medium UI elements
   markdown = re.sub(r'\d+\s*claps?', '', markdown)
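As a quick check of what these substitutions do, the sketch below wraps them in a function and runs them on a made-up markdown string. The name `clean_markdown`, the sample text, and the final blank-line collapsing step are assumptions added for illustration.

```python
import re


def clean_markdown(markdown):
    """Apply the Medium-specific cleanup steps to a markdown string (hypothetical helper)."""
    # Remove standalone lines that only contain a link back to medium.com.
    markdown = re.sub(r'\n\s*\[.*?\]\(https?://medium\.com/.*?\)\s*\n', '\n\n', markdown)
    # Remove clap indicators such as "42 claps".
    markdown = re.sub(r'\d+\s*claps?', '', markdown)
    # Assumption, not from the original script: collapse the blank lines left behind.
    markdown = re.sub(r'\n{3,}', '\n\n', markdown)
    return markdown.strip() + '\n'


raw = (
    "Intro paragraph.\n\n"
    "[More from the author](https://medium.com/@someone)\n\n"
    "42 claps\n\n"
    "Closing paragraph.\n"
)
print(clean_markdown(raw))
```

Only the footer link and the clap counter are dropped here; the two real paragraphs survive, separated by a single blank line.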

CLI Enhancements: Added command-line options for publishing directly to DEV.to

   parser.add_argument('-p', '--publish', action='store_true', help='Publish to DEV.to as draft')
   parser.add_argument('-k', '--api-key', help='DEV.to API key (if not set via DEVTO_API_KEY environment variable)')
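For context, here is a minimal, self-contained sketch of how those two flags could fit into a parser, together with the environment-variable fallback for the API key. The positional `url` argument is inferred from the demo commands, and `build_parser` and `resolve_api_key` are hypothetical names.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(
        prog='medium2dev.py',
        description='Convert a Medium post to DEV.to markdown',
    )
    # Assumed positional argument, based on the usage shown in the demo.
    parser.add_argument('url', help='Medium article URL')
    parser.add_argument('-p', '--publish', action='store_true', help='Publish to DEV.to as draft')
    parser.add_argument('-k', '--api-key', help='DEV.to API key (if not set via DEVTO_API_KEY environment variable)')
    return parser


def resolve_api_key(args, environ=os.environ):
    """Prefer the command-line key, then fall back to DEVTO_API_KEY."""
    return args.api_key or environ.get('DEVTO_API_KEY')


args = build_parser().parse_args(['https://medium.com/example/post', '--publish'])
print(args.publish, resolve_api_key(args, environ={'DEVTO_API_KEY': 'from-env'}))
```

Passing the environment as a parameter keeps the lookup easy to test, and using the environment variable keeps the key out of the shell history.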

DEV.to Integration: Added functionality to publish directly to DEV.to as a draft post

   def publish_to_devto(self, title, markdown_content):
       """Publish the converted markdown as a draft post to DEV.to."""
       if not self.api_key:
           logger.error("No DEV.to API key provided. Skipping publish.")
           return False

       logger.info("Publishing to DEV.to as draft...")

       api_url = "https://dev.to/api/articles"
       headers = {
           "api-key": self.api_key,
           "Content-Type": "application/json"
       }

       # Prepare the article data
       article_data = {
           "article": {
               "title": title,
               "body_markdown": markdown_content,
               "published": False  # Set as draft
           }
       }

       try:
           # A timeout keeps the CLI from hanging if the API does not respond
           response = requests.post(api_url, headers=headers, json=article_data, timeout=30)
           response.raise_for_status()
           logger.info("Successfully published draft to DEV.to!")
           return True
       except requests.RequestException as e:
           logger.error(f"Error publishing to DEV.to: {e}")
           return False
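The request body is where a draft can pick up extra metadata. The DEV.to API also accepts `canonical_url` and `tags` on the article object, so a payload builder along these lines is one option. This is a sketch under my own assumptions: `build_article_payload` is a hypothetical helper, and as far as I understand the API docs, front matter inside `body_markdown` takes precedence over these fields.

```python
import json


def build_article_payload(title, markdown_content, canonical_url=None, tags=None):
    """Build the JSON body for POST https://dev.to/api/articles (hypothetical helper)."""
    article = {
        "title": title,
        "body_markdown": markdown_content,
        "published": False,  # always create a draft
    }
    # Optional fields are only included when a value was provided.
    if canonical_url:
        article["canonical_url"] = canonical_url
    if tags:
        article["tags"] = list(tags)
    return {"article": article}


payload = build_article_payload(
    "Example Post Title",  # placeholder title
    "Hello from Medium.",
    canonical_url="https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11",
    tags=["aws", "automation"],
)
print(json.dumps(payload, indent=2))
```

Building the payload separately from the HTTP call makes it possible to inspect or test the request body without touching the network.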

Word Count Comparison: Added functionality to compare word counts between Medium and DEV.to versions

   # Calculate the word count of the original content
   content_text = ' '.join([element.get_text() for element in content_div.contents])
   self.medium_word_count = len(content_text.split())
   logger.info(f"Original Medium content word count: {self.medium_word_count}")

   # Calculate DEV.to word count
   devto_word_count = len(re.sub(r'---.*?---\n', '', markdown_content, flags=re.DOTALL).split())

   # Display comparison when publishing
   print("\nWord Count Comparison:")
   print("| Platform | Word Count |")
   print("|----------|------------|")
   print(f"| Medium   | {converter.medium_word_count} |")
   print(f"| DEV.to   | {devto_word_count} |")
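One edge case is worth a sketch. The pattern `---.*?---\n` is not anchored, so a post that uses `---` horizontal rules in its body could lose the text between two rules from the count. A hedged variant that strips only the leading front matter block might look like this; `strip_frontmatter`, `word_count`, and `comparison_table` are hypothetical helpers.

```python
import re


def strip_frontmatter(markdown):
    """Remove only a leading front matter block, leaving later '---' rules alone."""
    return re.sub(r'\A---\n.*?\n---\n', '', markdown, count=1, flags=re.DOTALL)


def word_count(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def comparison_table(medium_count, devto_count):
    """Format the two counts as the markdown table shown in the demo output."""
    rows = [
        "| Platform | Word Count |",
        "|----------|------------|",
        f"| Medium   | {medium_count:<10} |",
        f"| DEV.to   | {devto_count:<10} |",
    ]
    return "\n".join(rows)


sample = "---\ntitle: Example\n---\n\nFirst part.\n\n---\n\nSecond part.\n"
body = strip_frontmatter(sample)
print(body)
print(comparison_table(463, 458))  # counts taken from the demo run above
```

With the anchored pattern, both "First part." and "Second part." remain in the body, and only the front matter is excluded from the count.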

Testing the Solution

I tested the improved solution with the provided Medium article URL:

python3 medium2dev.py https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11

The script successfully:

Downloaded the article content
Removed author metadata and Medium-specific UI elements
Converted the HTML content to clean markdown
Generated a properly formatted DEV.to markdown file with valid tags

I also tested with a different URL format to ensure the solution is robust:

python3 medium2dev.py https://medium.com/@vivek-aws/aws-resource-tag-compliance-with-automation-64ae16e42a11

The tool worked perfectly with both URL formats, demonstrating its flexibility.
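Both URL shapes end in the same article slug, which matches the output filename seen in the logs earlier. Here is a small sketch of how a filename could be derived from either format; `output_filename` is a hypothetical helper, not necessarily how the script does it.

```python
from urllib.parse import urlparse


def output_filename(medium_url):
    """Derive a markdown filename from the last path segment of a Medium URL."""
    path = urlparse(medium_url).path.rstrip("/")
    slug = path.rsplit("/", 1)[-1]
    if not slug:
        raise ValueError(f"No article slug found in {medium_url!r}")
    return f"{slug}.md"


urls = [
    "https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11",
    "https://medium.com/@vivek-aws/aws-resource-tag-compliance-with-automation-64ae16e42a11",
]
for url in urls:
    print(output_filename(url))
```

Since only the final path segment is used, the publication-style URL and the author-style URL map to the same file.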

This solution can significantly benefit technical bloggers or developer advocates who frequently cross-post. Instead of manually copying and adjusting formatting, they now have a plug-and-play CLI that ensures consistency and correctness.

Challenges and Solutions

During development, Amazon Q addressed several challenges:

Medium's Dynamic Content: Sent browser-like request headers so that Medium returns the complete article HTML

   headers = {
       'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
       # Additional headers...
   }
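Here is a standalone sketch of a fetch that sends these headers. It uses the standard library's `urllib.request` instead of `requests` only to stay dependency-free; `build_request` and `fetch_html` are hypothetical names, and the 30-second timeout is an assumption.

```python
import urllib.request

BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
}


def build_request(url):
    """Attach the browser-like headers to a GET request."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def fetch_html(url, timeout=30):
    """Download the page and decode it using the charset the server declares."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Building the request does not touch the network; fetch_html() does.
request = build_request(
    "https://medium.com/aws-in-plain-english/aws-resource-tag-compliance-with-automation-64ae16e42a11"
)
print(request.get_header("User-agent"))
```

Note that headers only influence what the server sends back. They do not execute JavaScript, so any content that is rendered purely client-side would still need a different approach.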

Content Extraction: Developed a robust approach to extract only the relevant content

   # Try multiple approaches to find the content
   article_tag = soup.find('article')
   if not article_tag:
       article_tag = soup.select_one('div.section-content')
   if not article_tag:
       article_tag = soup.find('div', class_='postArticle-content')

Metadata Removal: Created comprehensive filters to remove Medium-specific UI elements

   # Remove Medium-specific UI elements and metadata
   for element in content.select('.postMetaLockup, .graf--pullquote, .section-divider'):
       if element:
           element.decompose()

DEV.to Integration: Added support for publishing directly to DEV.to using their API

   # Get API key from environment or command line
   api_key = args.api_key or os.environ.get('DEVTO_API_KEY')

DEV.to Tag Format: Fixed tag format to comply with DEV.to requirements

   # Convert "aws in plain english" to "awsinplainenglish"
   potential_tag = re.sub(r'[^a-zA-Z0-9]', '', potential_tag)
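A slightly fuller sketch of tag cleanup follows. Lowercasing, de-duplication, and the cap of four tags are my assumptions about what DEV.to expects (it limits posts to four tags, as far as I know), and `normalize_tags` is a hypothetical helper.

```python
import re

MAX_DEVTO_TAGS = 4  # assumed DEV.to limit on tags per post


def normalize_tags(raw_tags, limit=MAX_DEVTO_TAGS):
    """Turn free-form Medium tags into DEV.to-style tags."""
    tags = []
    for raw in raw_tags:
        # Same substitution as above: drop everything except letters and digits.
        tag = re.sub(r'[^a-zA-Z0-9]', '', raw).lower()
        # Skip empty results and duplicates while keeping the original order.
        if tag and tag not in tags:
            tags.append(tag)
    return tags[:limit]


print(normalize_tags(["aws in plain english", "AWS", "Tag-Compliance", "", "aws"]))
```

Stripping spaces produces long tags such as `awsinplainenglish`, so a mapping from Medium topics to existing DEV.to tags would likely give better discoverability.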

Word Count Comparison: Added functionality to compare word counts between platforms without modifying the output file

   # Calculate word counts and display comparison
   medium_word_count = converter.medium_word_count
   devto_word_count = len(re.sub(r'---.*?---\n', '', markdown_content, flags=re.DOTALL).split())

Future enhancements may include converting image captions, supporting other platforms like Hashnode, and adding a preview mode before publishing.

Planned Features:

Bulk conversion from a list of Medium URLs
Tag mapping between Medium and DEV.to
Image hosting alternatives (e.g., S3 or Imgur)
GitHub Action for scheduled sync

Conclusion

Using Amazon Q Developer CLI, I was able to quickly create a functional tool that solves a real problem for technical writers. The entire process from idea to implementation took just minutes, demonstrating how Amazon Q can accelerate development workflows.

This tool saves significant time for writers who publish on multiple platforms, eliminating the need for manual reformatting and ensuring consistent presentation across platforms. The addition of direct publishing to DEV.to makes the workflow even more streamlined, allowing writers to go from Medium article to DEV.to draft with a single command.
