Web Developer Hyper

🧠🥷How to make Image generation and editing MCP (Gemini API + Cline and Cursor)

Update (2025/05/17): Wrote about Ver 3.0 of Image generation and editing MCP
🧠🥷MCP Transports (Image generation and editing MCP Ver 3.0 (WebSocket + Next.js + Gemini API + Cline and Cursor))

Update (2025/05/12): I made Ver 2.0 to solve the problems described below. I will write a post about it soon.

Update (2025/05/10): Wrote about Meme Generating MCP
🧠🥷How to make Meme Generating MCP (Cline and Cursor)

Intro

Hello! I'm a Ninja Web Developer. Hi-Yah!🥷

I have been playing around with and studying MCP lately.↓
🧠🥷How to make AI controled Avatar 2 (Vroid MCP + Cline and Cursor + Unity)
🧠🥷How to make cool Ninja game (Unity MCP + Blender MCP (Cline and Cursor))
🧠🥷How to make cool Ninja (Blender MCP (Cline and Cursor))

I made an Image generation and editing Web App last time.↓
🧠🥷Gemini API 2 (Image generation and editing (free and fast))
So this time, I will turn that Web App into an MCP server.
OK! Let's begin!🚀

Outline of system

This system has a simple three-layer structure.
Cline or Cursor → MCP → Next.js Web App
1️⃣ Cline or Cursor sends the instruction to the MCP server.
2️⃣ The MCP server relays the instruction to the Next.js Web App.
3️⃣ The Next.js Web App calls the Gemini API, generates or edits an image, and displays it in the browser. (A sketch of the data flow is shown below.)
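Here is a minimal sketch of the data that flows through the three layers. The type names are just illustrative; the real code appears later in this post.

// 1️⃣ Cline or Cursor → MCP: a tool call carrying a prompt
type ToolCallArguments = { prompt: string };

// 2️⃣ MCP → Next.js: the prompt is relayed as a JSON POST body
// POST http://localhost:3000/api/generate-image   { "prompt": "..." }

// 3️⃣ Next.js → browser: the API route saves the image under /public and
// returns its URL, which the frontend polls and displays
type ApiResponse = { imageUrl: string }; // e.g. "/generated-image.png"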

How to set Next.js Web App

The code differs from last time, because this time MCP is used.
1️⃣ Create an API key for the Gemini API
https://aistudio.google.com/app/apikey

2️⃣ Make a Next.js project

npx create-next-app@latest

https://nextjs.org/docs/app/getting-started/installation

3️⃣ Install the Gemini API library

npm install @google/genai

4️⃣ Add the code files
Code of frontend (app/page.tsx)

"use client";
import React, { useState, useEffect } from "react";

export default function HomePage() {
  const [generateImage, setGenerateImage] = useState("");
  const [editImage, setEditImage] = useState("");

  useEffect(() => {
    // Fetch the latest generated image URL from the API route.
    async function fetchGeneratedImage() {
      try {
        const response = await fetch("/api/generate-image");
        const data = await response.json();
        if (data.imageUrl) {
          setGenerateImage(data.imageUrl);
        }
      } catch (error) {
        console.error("Error fetching generated image:", error);
      }
    }

    // Fetch the latest edited image URL from the API route.
    async function fetchEditedImage() {
      try {
        const response = await fetch("/api/edit-image");
        const data = await response.json();
        if (data.imageUrl) {
          setEditImage(data.imageUrl);
        }
      } catch (error) {
        console.error("Error fetching edited image:", error);
      }
    }

    fetchGeneratedImage();
    fetchEditedImage();

    // Poll every 4 seconds and reload the page so newly saved images show up
    // (see the "Problem of this junk system" section below).
    const intervalId = setInterval(() => {
      fetchGeneratedImage();
      fetchEditedImage();
      window.location.reload();
    }, 4000);

    return () => clearInterval(intervalId);
  }, []);

  return (
    <div>
      <h1>Generate and Edit Image</h1>
      {generateImage && (
        <img
          src={generateImage}
          alt="Generated Image"
          style={{ maxWidth: "500px" }}
        />
      )}
      {editImage && (
        <img src={editImage} alt="Edit Image" style={{ maxWidth: "500px" }} />
      )}
    </div>
  );
}

Code of image generation (app/api/generate-image/route.ts)

import { NextRequest, NextResponse } from "next/server";
import { GoogleGenAI, Modality } from "@google/genai";
import fs from "fs/promises";
import path from "path";

let cachedImage = "";

export async function POST(req: NextRequest) {
  try {
    const { prompt } = await req.json();

    const ai = new GoogleGenAI({
      apiKey: process.env.GEMINI_API_KEY!,
    });

    const response = await ai.models.generateContent({
      model: "gemini-2.0-flash-exp-image-generation",
      contents: prompt,
      config: {
        responseModalities: [Modality.TEXT, Modality.IMAGE],
      },
    });

    let imageData = "";
    if (
      response.candidates &&
      response.candidates[0] &&
      response.candidates[0].content &&
      response.candidates[0].content.parts
    ) {
      for (const part of response.candidates[0].content.parts) {
        if (part.inlineData) {
          imageData = part.inlineData.data || "";
          break;
        }
      }
    }

    if (imageData) {
      let base64Data = imageData;
      if (imageData.startsWith("data:image")) {
        base64Data = imageData.split(",")[1];
      }
      if (base64Data) {
        const imageBuffer = Buffer.from(base64Data, "base64");
        const imagePath = path.join(
          process.cwd(),
          "public",
          "generated-image.png"
        );
        await fs.writeFile(imagePath, imageBuffer);

        const imageUrl = "/generated-image.png";
        cachedImage = imageUrl;
      } else {
        console.error("No base64 data found in image data");
        return NextResponse.json(
          { error: "No base64 data found in image data" },
          { status: 500 }
        );
      }
    } else {
      console.error("No image data found in response");
      return NextResponse.json(
        { error: "No image data found in response" },
        { status: 500 }
      );
    }
    return NextResponse.json({ imageUrl: cachedImage });
  } catch (error: unknown) {
    console.error("Error:", error);
    return NextResponse.json(
      { error: (error as Error).message },
      { status: 500 }
    );
  }
}

export async function GET() {
  return NextResponse.json({ imageUrl: cachedImage });
}

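As a quick check, you can call this route directly before wiring up MCP. Here is a rough sketch; the file quick-test.ts is just a hypothetical helper, run with a TypeScript runner such as npx tsx quick-test.ts while the dev server is running (Node 18+ provides the global fetch).

// quick-test.ts — send a prompt to the generate-image route and print the result
async function main() {
  const res = await fetch("http://localhost:3000/api/generate-image", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: "A ninja cat jumping over a rooftop" }),
  });
  console.log(await res.json()); // expected: { imageUrl: "/generated-image.png" }
}

main().catch(console.error);

The same pattern works for /api/edit-image once a generated image exists under public/.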

Code of image editing (app/api/edit-image/route.ts)

import { NextRequest, NextResponse } from "next/server";
import { GoogleGenAI, Modality } from "@google/genai";
import fs from "fs/promises";
import path from "path";

let cachedImage = "";

export async function POST(req: NextRequest) {
  try {
    const { prompt } = await req.json();
    const imagePath = path.join(process.cwd(), "public", "generated-image.png");
    const imageBuffer = await fs.readFile(imagePath);

    const ai = new GoogleGenAI({
      apiKey: process.env.GEMINI_API_KEY!,
    });

    const contents = [
      { text: prompt },
      {
        inlineData: {
          mimeType: "image/png",
          data: imageBuffer.toString("base64"),
        },
      },
    ];

    const response = await ai.models.generateContent({
      model: "gemini-2.0-flash-exp-image-generation",
      contents,
      config: {
        responseModalities: [Modality.TEXT, Modality.IMAGE],
      },
    });

    let imageData = "";
    if (
      response.candidates &&
      response.candidates[0] &&
      response.candidates[0].content &&
      response.candidates[0].content.parts
    ) {
      for (const part of response.candidates[0].content.parts) {
        if (part.inlineData) {
          imageData = part.inlineData.data || "";
          break;
        }
      }
    }

    if (imageData) {
      let base64Data = imageData;
      if (imageData.startsWith("data:image")) {
        base64Data = imageData.split(",")[1];
      }
      if (base64Data) {
        const imageBuffer = Buffer.from(base64Data, "base64");
        const imagePath = path.join(
          process.cwd(),
          "public",
          "edited-image.png"
        );
        await fs.writeFile(imagePath, imageBuffer);

        const imageUrl = "/edited-image.png";
        cachedImage = imageUrl;
      } else {
        console.error("No base64 data found in image data");
        return NextResponse.json(
          { error: "No base64 data found in image data" },
          { status: 500 }
        );
      }
    } else {
      console.error("No image data found in response");
      return NextResponse.json(
        { error: "No image data found in response" },
        { status: 500 }
      );
    }

    return NextResponse.json({ imageUrl: cachedImage });
  } catch (error: unknown) {
    console.error("Error:", error);
    return NextResponse.json(
      { error: (error as Error).message },
      { status: 500 }
    );
  }
}

export async function GET() {
  return NextResponse.json({ imageUrl: cachedImage });
}

Environment Variable (.env.local)

GEMINI_API_KEY=your_real_api_key_here

How to set Image generation and editing MCP

1️⃣ Make a folder for the Image MCP Server and open it in your editor.

2️⃣ Make package.json.↓

npm init

3️⃣ Install the MCP SDK and axios (the server code below uses axios to call the Next.js API).↓

npm install @modelcontextprotocol/sdk axios

4️⃣ Make tsconfig.json. If TypeScript is not installed yet, install it first with npm install -D typescript @types/node. A sketch of the settings is shown below.↓

npx tsc --init
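Depending on your Node and TypeScript setup, the generated tsconfig.json may need a few tweaks for the ESM-style imports used in the server code below. This is only a minimal sketch; adjust it to your environment. With "module": "Node16" you will also want "type": "module" in package.json (shown in the next step), and "outDir" decides where the built index.js lands.

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "build",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true
  },
  "include": ["index.ts"]
}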

5️⃣ Add "build": "tsc", to scripts of package.json.

6️⃣ Add index.ts (TypeScript) for the Image MCP Server.↓

#!/usr/bin/env node
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ErrorCode,
  ListToolsRequestSchema,
  McpError,
} from "@modelcontextprotocol/sdk/types.js";
import axios from "axios";

class ImageServer {
  private server: Server;

  constructor() {
    this.server = new Server(
      {
        name: "image-mcp-server",
        version: "0.1.0",
      },
      {
        capabilities: {
          resources: {},
          tools: {},
        },
      }
    );

    this.setupToolHandlers();

    this.server.onerror = (error) => console.error("[MCP Error]", error);
    process.on("SIGINT", async () => {
      await this.server.close();
      process.exit(0);
    });
  }

  private setupToolHandlers() {
    this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: "generate_image",
          description: "Generates an image using Gemini API",
          inputSchema: {
            type: "object",
            properties: {
              prompt: {
                type: "string",
                description: "The prompt to use for image generation",
              },
            },
            required: ["prompt"],
          },
        },
        {
          name: "edit_image",
          description: "Edits an image using Gemini API",
          inputSchema: {
            type: "object",
            properties: {
              prompt: {
                type: "string",
                description: "The prompt to use for image editing",
              },
            },
            required: ["prompt"],
          },
        },
      ],
    }));

    this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name === "generate_image") {
        const { prompt } = request.params.arguments as { prompt: string };

        try {
          await axios.post("http://localhost:3000/api/generate-image", {
            prompt: prompt,
          });

          return {
            content: [
              {
                type: "text",
                text: "Image generated successfully",
              },
            ],
          };
        } catch (error: any) {
          console.error(error);
          return {
            content: [
              {
                type: "text",
                text: `Error generating image: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
      } else if (request.params.name === "edit_image") {
        const { prompt } = request.params.arguments as { prompt: string };

        try {
          await axios.post("http://localhost:3000/api/edit-image", {
            prompt: prompt,
          });

          return {
            content: [
              {
                type: "text",
                text: "Image edited successfully",
              },
            ],
          };
        } catch (error: any) {
          console.error(error);
          return {
            content: [
              {
                type: "text",
                text: `Error editing image: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
      }
      throw new McpError(
        ErrorCode.MethodNotFound,
        `Unknown tool: ${request.params.name}`
      );
    });
  }

  async run() {
    const transport = new StdioServerTransport();
    await this.server.connect(transport);
    console.error("Image MCP server running on stdio");
    console.log("mcp ok!");
  }
}

const server = new ImageServer();
server.run().catch(console.error);


7️⃣ Build index.ts to index.js.↓

npm run build

8️⃣ Set cline_mcp_settings.json for Cline and mcp.json for Cursor.↓

{
  "mcpServers": {
    "image-mcp-server": {
      "command": "node",
      "args": ["path to index.js"]
    }
  }
}

How to use them

1️⃣ Run npm run dev to start the Next.js App, and access http://localhost:3000.

2️⃣ Ask your Cline or Cursor to generate an image.
For example,

Use "generate_image" tool of "image-mcp-server",
 and send "Create a plumber wearing a red overall
 and a red hat with a mustache. 
It shouldn't look too much like Mario."


3️⃣ Ask your Cline or Cursor to edit an image.
For example,

Use "edit_image" tool of "image-mcp-server",
 and send "Change the overall and hat to green.".


4️⃣ Yippee! We can generate and edit images from Cline and Cursor.🎉

Problem of this junk system🙅

Last time, the system was just a simple Next.js Web App, so it was easy to make.
Input the prompt from Frontend → Gemini API makes an Image at Backend → Return the Image to Frontend

However, this time I wanted to use MCP, so the structure changed.
Input the prompt from MCP → Gemini API makes an Image at Backend → Huh? How can Frontend display the image?

I wanted to keep the system simple, so I used setInterval and fetched the image at a 4-second interval.
I think it is not good...😟

Also, the first time the MCP is used, the image is displayed OK, but after that the image will not update unless I reload the page.
So I also added a reload at the same 4-second interval.
I think this is bad...😟

I tried to fix these problems, but my attempts did not work well.
And unfortunately I ran out of time, so maybe I will fix these problems next time.🙇

Update (2025/05/17): Wrote about Ver 3.0 of Image generation and editing MCP
🧠🥷MCP Transports (Image generation and editing MCP Ver 3.0 (WebSocket + Next.js + Gemini API + Cline and Cursor))

Update (2025/05/12): I made Ver 2.0 to solve these problems. I will write a post about it soon.

Outro

By changing it from a Web App to an MCP server, we can use it from Cline and Cursor easily.
Now it will be easier to make memes than before.
Oh! Maybe next time, I will change this MCP into a Meme Generating MCP.

I hope you learned something from this post, or at least enjoyed it a little.
Thank you for reading.
Happy AI coding!🤖 Hi-Yah!🥷

Top comments (2)

Nevo David

honestly this hit home for me, cuz i always end up duct-taping stuff last minute too and things break in dumb ways - you ever feel like the quick fixes just bite back harder later?

Web Developer Hyper

Thank you for checking my post.
I wish I could solve the problems faster...
I will keep learning to improve my skills, and create a good post.
