Web Developer Hyper

🧠🥷How to make Image generation and editing MCP (Gemini API + Cline and Cursor)

Update (2025/05/17): Wrote about Ver 3.0 of Image generation and editing MCP
🧠🥷MCP Transports (Image generation and editing MCP Ver 3.0 (WebSocket + Next.js + Gemini API + Cline and Cursor))

Update (2025/05/12): I made Ver 2.0 to solve the problems described below. I will write a post about it soon.

Update (2025/05/10): Wrote about Meme Generating MCP
🧠🥷How to make Meme Generating MCP (Cline and Cursor)

Intro

Hello! I'm a Ninja Web Developer. Hi-Yah!🥷

I have been playing around with and studying MCP lately.↓
🧠🥷How to make AI controled Avatar 2 (Vroid MCP + Cline and Cursor + Unity)
🧠🥷How to make cool Ninja game (Unity MCP + Blender MCP (Cline and Cursor))
🧠🥷How to make cool Ninja (Blender MCP (Cline and Cursor))

I made an Image generation and editing Web App last time.↓
🧠🥷Gemini API 2 (Image generation and editing (free and fast))
So this time, I will turn that Web App into an MCP server.
OK! Let's begin!🚀

Outline of system

This system has a simple three-layer structure.
Cline or Cursor → MCP → Next.js Web App
1️⃣ Cline or Cursor sends the instruction to the MCP server.
2️⃣ The MCP server relays the instruction to the Next.js Web App.
3️⃣ The Next.js Web App calls the Gemini API, generates or edits an image, and displays it in the browser. (A sketch of the data flow is shown below.)
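Here is a minimal sketch of the data that flows through the three layers. The type names are just illustrative; the real code appears later in this post.

// 1️⃣ Cline or Cursor → MCP: a tool call carrying a prompt
type ToolCallArguments = { prompt: string };

// 2️⃣ MCP → Next.js: the prompt is relayed as a JSON POST body
// POST http://localhost:3000/api/generate-image   { "prompt": "..." }

// 3️⃣ Next.js → browser: the API route saves the image under /public and
// returns its URL, which the frontend polls and displays
type ApiResponse = { imageUrl: string }; // e.g. "/generated-image.png"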

How to set Next.js Web App

The code differs from last time, because this time MCP is used.
1️⃣ Create an API key for the Gemini API
https://aistudio.google.com/app/apikey

2️⃣ Make a Next.js project

npx create-next-app@latest

https://nextjs.org/docs/app/getting-started/installation

3️⃣ Install the Gemini API library

npm install @google/genai

4️⃣ Add the code files
Code of frontend (app/page.tsx)

"use client";
import React, { useState, useEffect } from "react";

export default function HomePage() {
  const [generateImage, setGenerateImage] = useState("");
  const [editImage, setEditImage] = useState("");

  useEffect(() => {
    // Fetch the latest generated image URL from the API route.
    async function fetchGeneratedImage() {
      try {
        const response = await fetch("/api/generate-image");
        const data = await response.json();
        if (data.imageUrl) {
          setGenerateImage(data.imageUrl);
        }
      } catch (error) {
        console.error("Error fetching generated image:", error);
      }
    }

    // Fetch the latest edited image URL from the API route.
    async function fetchEditedImage() {
      try {
        const response = await fetch("/api/edit-image");
        const data = await response.json();
        if (data.imageUrl) {
          setEditImage(data.imageUrl);
        }
      } catch (error) {
        console.error("Error fetching edited image:", error);
      }
    }

    fetchGeneratedImage();
    fetchEditedImage();

    // Poll every 4 seconds and reload the page so newly saved images show up
    // (see the "Problem of this junk system" section below).
    const intervalId = setInterval(() => {
      fetchGeneratedImage();
      fetchEditedImage();
      window.location.reload();
    }, 4000);

    return () => clearInterval(intervalId);
  }, []);

  return (
    <div>
      <h1>Generate and Edit Image</h1>
      {generateImage && (
        <img
          src={generateImage}
          alt="Generated Image"
          style={{ maxWidth: "500px" }}
        />
      )}
      {editImage && (
        <img src={editImage} alt="Edit Image" style={{ maxWidth: "500px" }} />
      )}
    </div>
  );
}

Code of image generation (app/api/generate-image/route.ts)

import { NextRequest, NextResponse } from "next/server";
import { GoogleGenAI, Modality } from "@google/genai";
import fs from "fs/promises";
import path from "path";

let cachedImage = "";

export async function POST(req: NextRequest) {
  try {
    const { prompt } = await req.json();

    const ai = new GoogleGenAI({
      apiKey: process.env.GEMINI_API_KEY!,
    });

    const response = await ai.models.generateContent({
      model: "gemini-2.0-flash-exp-image-generation",
      contents: prompt,
      config: {
        responseModalities: [Modality.TEXT, Modality.IMAGE],
      },
    });

    let imageData = "";
    if (
      response.candidates &&
      response.candidates[0] &&
      response.candidates[0].content &&
      response.candidates[0].content.parts
    ) {
      for (const part of response.candidates[0].content.parts) {
        if (part.inlineData) {
          imageData = part.inlineData.data || "";
          break;
        }
      }
    }

    if (imageData) {
      let base64Data = imageData;
      if (imageData.startsWith("data:image")) {
        base64Data = imageData.split(",")[1];
      }
      if (base64Data) {
        const imageBuffer = Buffer.from(base64Data, "base64");
        const imagePath = path.join(
          process.cwd(),
          "public",
          "generated-image.png"
        );
        await fs.writeFile(imagePath, imageBuffer);

        const imageUrl = "/generated-image.png";
        cachedImage = imageUrl;
      } else {
        console.error("No base64 data found in image data");
        return NextResponse.json(
          { error: "No base64 data found in image data" },
          { status: 500 }
        );
      }
    } else {
      console.error("No image data found in response");
      return NextResponse.json(
        { error: "No image data found in response" },
        { status: 500 }
      );
    }
    return NextResponse.json({ imageUrl: cachedImage });
  } catch (error: unknown) {
    console.error("Error:", error);
    return NextResponse.json(
      { error: (error as Error).message },
      { status: 500 }
    );
  }
}

export async function GET() {
  return NextResponse.json({ imageUrl: cachedImage });
}

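As a quick check, you can call this route directly before wiring up MCP. Here is a rough sketch; the file quick-test.ts is just a hypothetical helper, run with a TypeScript runner such as npx tsx quick-test.ts while the dev server is running (Node 18+ provides the global fetch).

// quick-test.ts — send a prompt to the generate-image route and print the result
async function main() {
  const res = await fetch("http://localhost:3000/api/generate-image", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: "A ninja cat jumping over a rooftop" }),
  });
  console.log(await res.json()); // expected: { imageUrl: "/generated-image.png" }
}

main().catch(console.error);

The same pattern works for /api/edit-image once a generated image exists under public/.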

Code of image editing (app/api/edit-image/route.ts)

import { NextRequest, NextResponse } from "next/server";
import { GoogleGenAI, Modality } from "@google/genai";
import fs from "fs/promises";
import path from "path";

let cachedImage = "";

export async function POST(req: NextRequest) {
  try {
    const { prompt } = await req.json();
    const imagePath = path.join(process.cwd(), "public", "generated-image.png");
    const imageBuffer = await fs.readFile(imagePath);

    const ai = new GoogleGenAI({
      apiKey: process.env.GEMINI_API_KEY!,
    });

    const contents = [
      { text: prompt },
      {
        inlineData: {
          mimeType: "image/png",
          data: imageBuffer.toString("base64"),
        },
      },
    ];

    const response = await ai.models.generateContent({
      model: "gemini-2.0-flash-exp-image-generation",
      contents,
      config: {
        responseModalities: [Modality.TEXT, Modality.IMAGE],
      },
    });

    let imageData = "";
    if (
      response.candidates &&
      response.candidates[0] &&
      response.candidates[0].content &&
      response.candidates[0].content.parts
    ) {
      for (const part of response.candidates[0].content.parts) {
        if (part.inlineData) {
          imageData = part.inlineData.data || "";
          break;
        }
      }
    }

    if (imageData) {
      let base64Data = imageData;
      if (imageData.startsWith("data:image")) {
        base64Data = imageData.split(",")[1];
      }
      if (base64Data) {
        const imageBuffer = Buffer.from(base64Data, "base64");
        const imagePath = path.join(
          process.cwd(),
          "public",
          "edited-image.png"
        );
        await fs.writeFile(imagePath, imageBuffer);

        const imageUrl = "/edited-image.png";
        cachedImage = imageUrl;
      } else {
        console.error("No base64 data found in image data");
        return NextResponse.json(
          { error: "No base64 data found in image data" },
          { status: 500 }
        );
      }
    } else {
      console.error("No image data found in response");
      return NextResponse.json(
        { error: "No image data found in response" },
        { status: 500 }
      );
    }

    return NextResponse.json({ imageUrl: cachedImage });
  } catch (error: unknown) {
    console.error("Error:", error);
    return NextResponse.json(
      { error: (error as Error).message },
      { status: 500 }
    );
  }
}

export async function GET() {
  return NextResponse.json({ imageUrl: cachedImage });
}

Environment Variable (.env.local)

GEMINI_API_KEY=your_real_api_key_here

How to set Image generation and editing MCP

1️⃣ Make a folder for the Image MCP Server and open it in your editor.

2️⃣ Make package.json.↓

npm init

3️⃣ Install the MCP SDK and axios (the server code below uses axios to call the Next.js API).↓

npm install @modelcontextprotocol/sdk axios

4️⃣ Make tsconfig.json. If TypeScript is not installed yet, install it first with npm install -D typescript @types/node. A sketch of the settings is shown below.↓

npx tsc --init
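Depending on your Node and TypeScript setup, the generated tsconfig.json may need a few tweaks for the ESM-style imports used in the server code below. This is only a minimal sketch; adjust it to your environment. With "module": "Node16" you will also want "type": "module" in package.json (shown in the next step), and "outDir" decides where the built index.js lands.

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "build",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true
  },
  "include": ["index.ts"]
}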

5️⃣ Add "build": "tsc", to scripts of package.json.

6️⃣ Add index.ts (TypeScript) for the Image MCP Server.↓

#!/usr/bin/env node
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ErrorCode,
  ListToolsRequestSchema,
  McpError,
} from "@modelcontextprotocol/sdk/types.js";
import axios from "axios";

class ImageServer {
  private server: Server;

  constructor() {
    this.server = new Server(
      {
        name: "image-mcp-server",
        version: "0.1.0",
      },
      {
        capabilities: {
          resources: {},
          tools: {},
        },
      }
    );

    this.setupToolHandlers();

    this.server.onerror = (error) => console.error("[MCP Error]", error);
    process.on("SIGINT", async () => {
      await this.server.close();
      process.exit(0);
    });
  }

  private setupToolHandlers() {
    this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: "generate_image",
          description: "Generates an image using Gemini API",
          inputSchema: {
            type: "object",
            properties: {
              prompt: {
                type: "string",
                description: "The prompt to use for image generation",
              },
            },
            required: ["prompt"],
          },
        },
        {
          name: "edit_image",
          description: "Edits an image using Gemini API",
          inputSchema: {
            type: "object",
            properties: {
              prompt: {
                type: "string",
                description: "The prompt to use for image editing",
              },
            },
            required: ["prompt"],
          },
        },
      ],
    }));

    this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name === "generate_image") {
        const { prompt } = request.params.arguments as { prompt: string };

        try {
          await axios.post("http://localhost:3000/api/generate-image", {
            prompt: prompt,
          });

          return {
            content: [
              {
                type: "text",
                text: "Image generated successfully",
              },
            ],
          };
        } catch (error: any) {
          console.error(error);
          return {
            content: [
              {
                type: "text",
                text: `Error generating image: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
      } else if (request.params.name === "edit_image") {
        const { prompt } = request.params.arguments as { prompt: string };

        try {
          await axios.post("http://localhost:3000/api/edit-image", {
            prompt: prompt,
          });

          return {
            content: [
              {
                type: "text",
                text: "Image edited successfully",
              },
            ],
          };
        } catch (error: any) {
          console.error(error);
          return {
            content: [
              {
                type: "text",
                text: `Error editing image: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
      }
      throw new McpError(
        ErrorCode.MethodNotFound,
        `Unknown tool: ${request.params.name}`
      );
    });
  }

  async run() {
    const transport = new StdioServerTransport();
    await this.server.connect(transport);
    console.error("Image MCP server running on stdio");
    console.log("mcp ok!");
  }
}

const server = new ImageServer();
server.run().catch(console.error);


7️⃣ Build index.ts to index.js.↓

npm run build

8️⃣ Set cline_mcp_settings.json for Cline and mcp.json for Cursor.↓

{
  "mcpServers": {
    "image-mcp-server": {
      "command": "node",
      "args": ["path to index.js"]
    }
  }
}

How to use them

1️⃣ Run npm run dev to start the Next.js App, and access http://localhost:3000.

2️⃣ Ask your Cline or Cursor to generate an image.
For example,

Use "generate_image" tool of "image-mcp-server",
 and send "Create a plumber wearing a red overall
 and a red hat with a mustache. 
It shouldn't look too much like Mario."


3️⃣ Ask your Cline or Cursor to edit an image.
For example,

Use "edit_image" tool of "image-mcp-server",
 and send "Change the overall and hat to green.".


4️⃣ Yippee! We can generate and edit images from Cline and Cursor.🎉

Problem of this junk system🙅

Last time, the system was just a simple Next.js Web App, so it was easy to make.
Input the prompt from Frontend → Gemini API makes an Image at Backend → Return the Image to Frontend

However, this time I wanted to use MCP, so the structure changed.
Input the prompt from MCP → Gemini API makes an Image at Backend → Huh? How can Frontend display the image?

I wanted to keep the system simple, so I used setInterval and fetched the image at a 4-second interval.
I think it is not good...😟

Also, the first time the MCP is used, the image is displayed OK, but after that the image will not update unless I reload the page.
So I also added a reload at the same 4-second interval.
I think this is bad...😟

I tried to fix these problems, but my attempts did not work well.
And unfortunately I ran out of time, so maybe I will fix these problems next time.🙇

Update (2025/05/17): Wrote about Ver 3.0 of Image generation and editing MCP
🧠🥷MCP Transports (Image generation and editing MCP Ver 3.0 (WebSocket + Next.js + Gemini API + Cline and Cursor))

Update (2025/05/12): I made Ver 2.0 to solve these problems. I will write a post about it soon.

Outro

By changing it from a Web App to an MCP server, we can use it from Cline and Cursor easily.
Now it will be easier to make memes than before.
Oh! Maybe next time, I will change this MCP into a Meme Generating MCP.

I hope you learned something from this post, or at least enjoyed it a little.
Thank you for reading.
Happy AI coding!🤖 Hi-Yah!🥷

Top comments (2)

Nevo David

honestly this hit home for me, cuz i always end up duct-taping stuff last minute too and things break in dumb ways - you ever feel like the quick fixes just bite back harder later?

Web Developer Hyper

Thank you for checking my post.
I wish I could solve the problems faster...
I will keep learning to improve my skills, and create a good post.
