Forem: Ahirton Lopes

🏈 TensorCraft Playbook: From Classroom CNNs to Cloud TPUs with Keras

Ahirton Lopes — Fri, 01 May 2026 00:15:30 +0000

📖 Chapter 1: The Offensive Formation (CNN Architecture)

Every winning strategy requires solid fundamentals. In Deep Learning, this means designing a well-balanced Convolutional Neural Network (CNN) architecture. Our starting point is a classic structure, designed for hierarchical extraction of spatial patterns from 32x32 pixel images.

Tactical Architecture Analysis

The baseline model follows a sequence of operations that mirrors the hierarchy of the visual cortex:

Convolution (Feature Extraction): Conv2D layers apply filters to detect both low-level patterns (edges) and high-level features (objects such as cars or birds).

Downsampling (Spatial Reduction): MaxPooling2D layers reduce spatial dimensions while preserving the most relevant features, making the model invariant to small translations.

Regularization: The use of Dropout is critical to prevent overfitting, forcing the network to avoid reliance on specific neurons during training.

Decision Zone (Classification): Fully connected layers combine all extracted features to deliver the “touchdown”: the final classification through a softmax activation function across 10 classes.

# Arquitetura Estratégica - Referência: Demo_CNN_CIFAR10.ipynb
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def build_attack_formation():
    model = Sequential()
    # Primeiro bloco: 32 filtros para capturar texturas básicas
    model.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu', input_shape=(32, 32, 3)))
    model.add(MaxPooling2D(pool_size=2))

    # Segundo bloco: aumentando a profundidade para 64 filtros
    model.add(Conv2D(filters=64, kernel_size=2, padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=2))

    # Terceiro bloco: 128 filtros para abstrações complexas
    model.add(Conv2D(filters=128, kernel_size=2, padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=2))

    # Camada de transição e regularização
    model.add(Dropout(0.3))
    model.add(Flatten())

    # Fully Connected: A "Red Zone" antes do touchdown
    model.add(Dense(100, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(10, activation='softmax')) # O Touchdown Final

    return model

📖 Chapter 2: Knowing the Opponent (CPU vs. GPU vs. TPU)

Before stepping onto the field, a strategist must understand the strengths and weaknesses of each processing unit. During CIFAR-10 training, switching across devices reveals the classic compute-bound vs. memory-bound trade-off.

1.CPU: The Master Builder (Low Latency)

The Central Processing Unit (CPU) is designed for complex logic and sequential execution. It has fewer cores, but they are highly optimized, fast, and versatile.

In the Playbook: The CPU is ideal for data preprocessing and loading (ETL). However, when it comes to large-scale matrix multiplications, it becomes a bottleneck due to its inherently sequential nature.

2. GPU: The Workforce Army (Massive Parallelism)

The Graphics Processing Unit (GPU) has been adapted for AI workloads due to its thousands of cores operating in parallel.

In the Playbook: GPUs excel at CNN workloads because each filter can be processed as an independent parallel task. The main challenge here is the Memory Wall: moving data between VRAM and compute cores can become slower than the computation itself.

3. TPU: The Linear Algebra Specialist (Systolic Array)

The Tensor Processing Unit (TPU) is a purpose-built AI accelerator (ASIC). Unlike GPUs, it leverages a systolic array architecture.

In the Playbook: Think of data flowing like waves through a grid of multipliers, without needing to return to main memory at every step. This eliminates latency bottlenecks and allows throughput to scale almost linearly with increasing batch sizes.

📖 Chapter 3: The Scaling Strategy (TPUStrategy & XLA)

If Chapter 1 was about building the team and Chapter 2 about understanding the field, Chapter 3 is about tactical communication. Integrating TPUs into a Keras workflow requires only minimal code changes, but it triggers deep transformations in how execution happens under the hood.

1.TPUStrategy: The Team Captain

In a Cloud TPU setup (such as a v3-8), you are not working with a single accelerator, but with 8 processing cores operating in unison. The tf.distribute.TPUStrategy API enables synchronous distributed training through model replication.

How it works: The model is replicated across all 8 cores. Each core receives a different shard of the input data (the global batch size), computes gradients independently, and then all cores synchronize these gradients before updating the model weights. This orchestration ensures that all “players” remain perfectly aligned during training.

2.XLA: Runtime Optimization

XLA (Accelerated Linear Algebra) is the domain-specific compiler responsible for optimizing TensorFlow operations.

Operation Fusion: Instead of executing a Conv2D operation followed by a ReLU activation as separate steps—each requiring memory access—XLA fuses them into a single hardware-level kernel. This drastically reduces memory bandwidth usage and significantly improves raw computational performance.

3.Code Evolution: From Classroom to Cluster

Here is how a simple educational model evolves into a high-performance distributed training setup:

import tensorflow as tf

# 1. Localizar o recurso de computação (TPU)
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# 2. Definir o modelo dentro do escopo da estratégia
with strategy.scope():
    # Aqui entra sua arquitetura autoral do CIFAR-10
    model = build_attack_formation() 

    # O otimizador e a perda também são distribuídos automaticamente
    model.compile(
        loss='categorical_crossentropy', 
        optimizer='rmsprop', 
        metrics=['accuracy']
    )

📖 Chapter 4: Field Logistics (Data Pipeline & I/O)

In football, having the best quarterback means nothing if the ball does not reach their hands at the right time. In high-performance computing, the main adversary of TPUs is data starvation. If your input pipeline is slow, the TPU will sit idle waiting for the CPU to read and process data, wasting expensive computational resources.

Continuous Flow Strategy

To ensure the “engine” never stalls, the TensorCraft Playbook implements three key logistics tactics:

TFRecords and Protocol Buffers:

Instead of reading thousands of individual JPG files, the CIFAR-10 dataset is serialized into TFRecord binary files. This enables high-throughput sequential reads and significantly reduces filesystem overhead.

Parallel ETL with tf.data:

We leverage map operations with num_parallel_calls=tf.data.AUTOTUNE, delegating image preprocessing tasks such as resizing and normalization across multiple CPU cores simultaneously.

Software Pipelining (Prefetch):
The key component is .prefetch(). While the TPU is processing the current batch (Step N), the CPU is already preparing and asynchronously sending the next batch (Step N+1) to TPU memory.

# Logística de Alta Performance
def prepare_pipeline(dataset, batch_size):
    ds = dataset.cache() # Cache em memória para datasets pequenos como CIFAR-10
    ds = ds.shuffle(buffer_size=1000)
    ds = ds.batch(batch_size, drop_remainder=True)
    # O segredo da logística: desacoplar a produção do consumo
    ds = ds.prefetch(buffer_size=tf.data.AUTOTUNE)
    return ds

📖 Chapter 5: The Real Scoreboard (Performance, Constraints, and Field Lessons)

The true validation of any playbook lies in the scoreboard. However, rather than presenting an idealized benchmark, this chapter reflects the real-world behavior of the system under practical execution conditions.

Experimental Methodology

All experiments were conducted using Google Colab Pro, ensuring access to higher-performance infrastructure compared to the standard Colab environment.

To maintain experimental consistency, the following variables were strictly controlled:

Same notebook and execution pipeline
Same CNN architecture
Same CIFAR-10 dataset
Same preprocessing and tf.data pipeline
Same batch size and number of epochs

The only variable changed across experiments was the execution environment:

CPU runtime
GPU runtime
TPU runtime

This setup isolates infrastructure as the independent variable, enabling a direct comparison of performance across hardware configurations.

Scenario	Effective Hardware	Time (s)	Throughput (img/s)	Validation Accuracy
CPU Runtime	CPU	167.84	1,489	72.42%
GPU Runtime	GPU	34.66	7,213	73.36%
TPU Runtime	CPU (TPU host)	30.42	8,218	72.27%

Key Observations

1.GPU Acceleration Works as Expected

The GPU execution delivered a significant performance improvement:

~4.8× faster than CPU
Highest validation accuracy
Efficient parallelization of convolutional operations

This confirms the expected advantage of GPUs in CNN workloads.

2.TPU Runtime Did Not Execute on TPU

Despite selecting a TPU runtime, the system reported:

effective_hardware = CPU
verified_tpu = false
failure in TPUClusterResolver initialization

This indicates that the workload did not run on TPU hardware, but instead on the host CPU of the TPU environment.

3.Unexpected Result: TPU Runtime Outperformed GPU

Even without TPU acceleration, the TPU runtime delivered:

Highest throughput (8,218 img/s)
Fastest execution time (30.42s)

This performance exceeded the GPU run.

Technical Interpretation

This result highlights a non-obvious but critical aspect of modern ML systems:

Execution environment matters as much as hardware type.

When selecting a TPU runtime in Colab Pro, the execution environment changes in two ways:

Access to a TPU accelerator (not successfully initialized in this case)
Allocation of a different host machine, typically a more powerful CPU infrastructure

As a result, the model ran on:

High-performance TPU host CPU

rather than:

Standard Colab CPU

This explains the ~5.5× improvement over the baseline CPU and the slight advantage over GPU.

Why TPU Failed to Initialize

The inability to use TPU was not related to the model, but to environment constraints:

Missing or incompatible TensorFlow runtime
Dependency conflicts (ml_dtypes, JAX, TensorFlow)
Runtime inconsistencies in Colab TPU environments
Failure to resolve TPU via TPUClusterResolver

This reinforces the operational complexity of TPU-based workflows.

Engineering Lessons

This experiment surfaces several important insights:

Infrastructure is part of the model;
Performance depends on the execution environment, not only on architecture;
Hardware labels can be misleading;
Selecting TPU does not guarantee TPU execution;
TPUs require mature environments;
They are not yet fully plug-and-play in notebook-based workflows;
Reproducibility is fragile in managed environments
Even controlled experiments can yield unexpected behaviors;
Data pipeline efficiency remains critical;
High-performance hardware still depends on efficient data delivery;

Conclusion

The scoreboard is no longer incomplete, it is revealing.

This experiment demonstrates that performance in deep learning is not solely determined by model design or nominal hardware selection. Instead, it emerges from the interaction between model, infrastructure, and execution environment.

Perhaps the most important takeaway is this:

In real-world machine learning systems, infrastructure decisions can outweigh hardware assumptions. In this case, a TPU runtime without TPU acceleration still outperformed a GPU execution, not because of the TPU itself, but because of the underlying system it provided.

You've Got Mail📨 (and Recommendations!): Delivering Recs with Keras, JAX & KerasRS

Ahirton Lopes — Tue, 01 Jul 2025 09:19:58 +0000

Recommendations are everywhere — from your email inbox to the shopping carts you abandon (but never escape 😅). Whether it’s suggesting a movie, a product, or a next best action, recommender systems have become fundamental to today’s digital experience.

Until recently, building robust recommender pipelines meant stitching together lots of custom layers, custom losses, and custom evaluation metrics by hand. That’s why the introduction of KerasRS is such a game-changer.

What is KerasRS?

KerasRS (Keras Recommenders) is an open-source extension for Keras 3 that delivers building blocks specifically designed for recommender systems, including:

✅ retrieval layers
✅ ranking layers
✅ specialized recommender losses
✅ ranking metrics

Best of all, KerasRS is multi-backend, working seamlessly with TensorFlow, PyTorch, and JAX. That means you can combine the familiar Keras API with the high-performance JAX compiler and TPU acceleration for your recsys workflows.

☝️ Did you know? The Google Play feed uses KerasRS behind the scenes! (source)

Why JAX + KerasRS?

If you want fast training, JAX is your secret weapon:

JIT compilation for speed
XLA acceleration
Automatic vectorization
TPU support

Pairing JAX with KerasRS means you get production-grade recommender building blocks with superior performance. It’s like having your cake and eating it too. 🍰

Installing KerasRS

pip install keras-rs
# or, for the nightly:
pip install --upgrade keras-rs-nightly

Then set your backend:

import os
os.environ["KERAS_BACKEND"] = "jax"

A Quick Retrieval Example

Let’s build a minimal retrieval recommender in just a few lines.

import keras
import keras_rs
import numpy as np
import tensorflow as tf

# Dummy user-item pairs
user_ids = np.array([0, 1, 2, 3])
item_ids = np.array([10, 20, 30, 40])

user_embedding = keras.layers.Embedding(input_dim=4, output_dim=32)
item_embedding = keras.layers.Embedding(input_dim=50, output_dim=32)

retrieval = keras_rs.layers.BruteForceRetrieval(k=5)

query_model = keras.Sequential([
    keras.Input(shape=(), dtype=tf.int32, name="user_id"),
    user_embedding,
    keras.layers.Flatten()
])

candidate_model = keras.Sequential([
    keras.Input(shape=(), dtype=tf.int32, name="item_id"),
    item_embedding,
    keras.layers.Flatten()
])

retrieval.index_from_dataset(
    tf.data.Dataset.from_tensor_slices(item_ids).map(candidate_model)
)

model = keras_rs.RetrievalModel(
    query_model=query_model,
    candidate_model=candidate_model,
    retrieval=retrieval
)

model.compile(
    optimizer=keras.optimizers.Adagrad(3e-4),
    loss=keras_rs.losses.PairwiseHingeLoss(),
    metrics=[keras_rs.metrics.NDCG(k=5)],
)

features = {
    "user_id": user_ids,
    "item_id": item_ids,
}

model.fit(features, item_ids, epochs=5, verbose=2)

That’s a minimal retrieval system — you can expand this with categorical features, embeddings, or even sequence models.

Going Further: Transformers & Two-Tower

KerasRS supports more advanced recommender architectures too:

Deep & Cross Networks (DCN)
Two-Tower models
Sequence-based recommenders with transformers
SASRec-style sequence recommenders

If you want to see a transformer-based recommender in action, check out this Movielens demo.

What’s Next for KerasRS?

🚀 KerasRS is on a fast-moving roadmap, with upcoming features such as:

DistributedEmbedding for large-scale TPU sharded embedding tables
SparseCore support
Ultra-scalable retrieval across billions of items

It’s a great time to build recsys pipelines with these tools.

Try it Yourself

If you’re excited to build your own recommender with KerasRS, check out:

KerasRS makes scalable, production-grade recommendation models delightfully easy. Give it a try, and share your experiments!

📨 So next time you see that “You might like…” email, remember: there’s probably a KerasRS model working hard behind the scenes. 😉

If you liked this article, feel free to leave a comment or share! 🚀

Dynamic Knowledge Retrieval: Creating Real-Time RAG Solutions with Gemini and Vector Search

Ahirton Lopes — Sun, 02 Mar 2025 23:53:22 +0000

Introduction

During my time at TIVIT, I was involved in the early stages of a mergers and acquisitions (M&A) process, where I assessed several companies in the Brazilian data and AI ecosystem.

One of the biggest challenges in such processes is the need to quickly analyze vast amounts of financial, legal, and technical documents, while also keeping up with real-time market trends and regulatory changes.

This experience inspired me to explore how RAG (Retrieval-Augmented Generation) combined with Gemini and Vector Search can revolutionize knowledge retrieval for high-stakes decision-making.

The Challenge: M&A Knowledge Bottleneck

M&A transactions involve multiple stakeholders—financial analysts, legal teams, and industry experts—all working with fragmented knowledge sources. Key pain points include:

Scattered Information: Documents are spread across data rooms, legal databases, and regulatory reports.

Time-Sensitive Decisions: Delays in retrieving relevant insights can impact negotiations and risk assessments.

Contextual Complexity: Extracting meaningful patterns from different formats, such as contracts, financial statements, and news reports, requires advanced AI-driven contextual understanding.

The Solution: A RAG-Powered M&A Intelligence Assistant

By integrating Gemini with RAG and Vertex AI’s Vector Search, we can build an M&A Intelligence Assistant that dynamically retrieves and synthesizes knowledge from structured and unstructured data sources.

How It Works

Ingest and Index Data:

Upload contracts, financial reports, and regulatory filings into a vector database.

Convert text into embeddings using Gemini to enable semantic search.

Integrate real-time news and market data feeds.

Dynamic Query Processing:

Users ask high-level questions (e.g., “What are the biggest compliance risks for acquiring this company?”).

The RAG system retrieves relevant documents using Vector Search.

Gemini generates an AI-powered response, citing the sources dynamically.

Actionable Insights for Decision-Makers:

Risk assessment: Identify contractual red flags and financial anomalies.

Regulatory compliance tracking: Monitor global and local regulations in real time.

Competitor intelligence: Extract market positioning insights from industry reports.

Implementation: Building the System on Vertex AI

Step 1: Setting Up Vector Search on Vertex AI

from google.cloud import aiplatform

# Initialize Vertex AI
project_id = "my-gcp-project"
aiplatform.init(project=project_id, location="us-central1")

# Create a Vector Search Index
vector_index = aiplatform.MatchingEngineIndex.create(
    display_name="mna-docs-index",
    contents_delta_uri="gs://my-bucket/embedding-data/"")

Step 2: Generating Embeddings with Gemini

from vertexai.language_models import TextEmbeddingModel

gemini = TextEmbeddingModel.from_pretrained("gemini-1")

def generate_embedding(text):
    return gemini.get_embeddings([text])[0]

Step 3: Querying Documents with RAG

def retrieve_knowledge(query):
    query_embedding = generate_embedding(query)
    results = vector_index.find_neighbors(query_embedding, top_k=5)
    return results

Step 4: Answering with Gemini

def generate_response(query):
    relevant_docs = retrieve_knowledge(query)
    context = "\n".join([doc["content"] for doc in relevant_docs])

    prompt = f"Based on the following documents, answer the query: {query}\n\n{context}"
    response = gemini.predict(prompt)
    return response

Conclusion

By leveraging Gemini, RAG, and Vector Search, we can transform M&A processes by eliminating knowledge retrieval bottlenecks and providing real-time, AI-powered insights. This approach not only enhances decision-making but also accelerates due diligence, making transactions more efficient and data-driven.

What other applications of RAG-powered AI do you see in your industry? Let’s discuss! 🚀

AI-Powered Automation: Fine-tuning Gemma for Function Calling

Ahirton Lopes — Sun, 02 Mar 2025 23:33:40 +0000

Starting a new job always comes with a mix of excitement and challenges. As I joined Accenture as a Data & AI Senior Manager, I experienced firsthand the complexity of onboarding processes—countless forms, account setups, and access provisioning across multiple systems. This got me thinking:

"How can AI-powered automation streamline employee onboarding and make the experience seamless, for both, new hires and HR teams?"

This project explores how fine-tuning Gemma on Vertex AI can enhance its function-calling capabilities, enabling seamless API integration and workflow automation. By leveraging Gemma's ability to interpret function descriptions, we can dynamically trigger external actions such as fetching data, interacting with databases, and automating enterprise tasks.

This project fine-tunes Gemma to enhance its function-calling capabilities, enabling seamless API integration and workflow automation. Using Vertex AI, we optimize Gemma to interpret function descriptions and dynamically trigger external actions such as fetching data, interacting with databases, and automating enterprise tasks.

What is Function Calling?

Function calling allows language models to identify and execute functions based on user inputs. This is useful for automating workflows and integrating AI into real-world applications such as chatbots, virtual assistants, and enterprise process automation.

Automating the Employee Onboarding Process with AI

A practical use case for function calling is automating the onboarding of new employees. With a fine-tuned Gemma model, we can create an automated workflow that performs essential tasks such as:

Registering the new employee in the CRM
Creating accounts and provisioning access
Sending a welcome email
Scheduling initial training sessions

Setting Up the Environment on Vertex AI

Before we start, we need to configure our environment on Vertex AI. Make sure you have the Google Cloud SDK installed and authenticated:

pip install google-cloud-aiplatform

Then, initialize the Vertex AI client:

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

Creating a Dataset for Fine-Tuning

To train the model on executing onboarding functions, we create a JSONL dataset:

{"input": "Register new employee John Doe with email john@email.com and role Software Engineer", "function": "create_employee_record", "arguments": {"name": "John Doe", "email": "john@email.com", "role": "Software Engineer"}}

{"input": "Provision access for Mary Johnson to the ERP system", "function": "provision_access", "arguments": {"employee_name": "Mary Johnson", "system": "ERP"}}

{"input": "Send welcome email to Carla Smith", "function": "send_welcome_email", "arguments": {"recipient": "Carla Smith"}}

{"input": "Schedule security training for Mark Lewis", "function": "schedule_training", "arguments": {"employee": "Mark Lewis", "training": "Security"}}

This dataset is uploaded to Cloud Storage for training:

bucket_name = "my-bucket"
dataset_path = f"gs://{bucket_name}/dataset.jsonl"

# Upload the dataset
aiplatform.gcs_upload_file("dataset.jsonl", dataset_path)

Fine-Tuning Gemma on Vertex AI

Now, we initiate the fine-tuning of the Gemma model using Vertex AI's API:

tuning_job = aiplatform.CustomJob(
    display_name="fine-tuning-gemma",
    script_path="train.py",  # Training script
    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest",
    args=["--dataset", dataset_path]
)

tuning_job.run()

The train.py script contains the logic for adjusting the model weights, utilizing LoRA (Low-Rank Adaptation) for efficient optimization.

Testing the Fine-Tuned Model

After training, we can deploy the model and test it:

endpoint = aiplatform.Endpoint.create(
    display_name="gemma-function-calling",
    model_name=tuning_job.model_name
)

response = endpoint.predict(instances=[{"input": "Register new employee John Doe with email john@email.com and role Software Engineer"}])
print(response)

The model now automatically returns the correct function to be called with the appropriate arguments.

Conclusion

Fine-tuning Gemma on Vertex AI is an efficient way to enable function calling, allowing intelligent automation of tasks in real applications. In this example, we created an automated workflow for employee onboarding, demonstrating how AI can optimize enterprise processes.

Want to learn more? Drop your questions in the comments.

Let's go! 🚀

Fashion Forward: Leveraging Keras 3.0 for Beginner-Friendly Deep Learning

Ahirton Lopes — Mon, 30 Sep 2024 22:50:32 +0000

Exploring the New Features of Keras 3.0 with CNNs on Fashion MNIST

In this post, we will explore the new features introduced in Keras 3.0, which was showcased at Google I/O 2024. We will also dive into a practical example of using Convolutional Neural Networks (CNNs) on the Fashion MNIST dataset, illustrating how to implement these updates effectively.

Ref. https://blog.gopenai.com/tensorflow-vs-pytorch-vs-keras-9161988c19b9

What’s New in Keras 3.0?

Keras 3.0 brings several improvements and updates that enhance the usability and performance of the library. Here are some of the key features:

Pros of Keras 3.0 Updates:

Integration with TensorFlow: Keras is now fully integrated with TensorFlow, allowing for better performance and streamlined workflows. This makes it easier to access the latest TensorFlow features while using Keras.
Improved API: The updated API provides more clarity and consistency, making it easier for developers to implement models without extensive boilerplate code.
Enhanced Performance: Keras 3.0 includes optimizations that improve training speed and resource management, allowing for more efficient model training.
New Features: New functionalities, such as improved model serialization and advanced preprocessing utilities, make it easier to manage data and models.

Cons of Keras 3.0 Updates:

Learning Curve: For users accustomed to the previous versions, transitioning to Keras 3.0 may require some adjustment, especially with the new API and integrated functionalities.
Potential Compatibility Issues: Existing codebases may face compatibility issues when migrating to the new version, necessitating code revisions and testing.
Documentation Updates: While the documentation is continuously improving, some users may find that certain sections lag behind in terms of clarity or examples for the new features.

Before we dive into the code, let’s take a look at the images from the Fashion MNIST dataset. Below are some samples that illustrate the different categories of clothing we will be classifying.

Practical Example: CNN on Fashion MNIST

Let's implement a simple CNN using Keras 3.0 to classify images from the Fashion MNIST dataset. Below is the complete code, including data preprocessing and model creation.

Importing Libraries

import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D
from tensorflow.keras.utils import to_categorical
from matplotlib import pyplot as plt
from random import randint

Loading and Preprocessing the Data

# Load the Fashion MNIST dataset
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

# Reshape the data to fit the model's input shape
img_rows, img_cols = 28, 28
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# Convert class vectors to binary class matrices (one-hot encoding)
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

Visualizing Sample Images

# Display a random sample image from the training set
plt.imshow(x_train[randint(0, x_train.shape[0])], cmap='Blues_r')
plt.show()

Building the CNN Model

# Define the CNN model
model = Sequential()

# First convolutional layer
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))

# First pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))

# Second convolutional layer
model.add(Conv2D(64, (3, 3), activation='relu'))

# Second pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten the output
model.add(Flatten())

# Fully connected layer
model.add(Dense(128, activation='relu'))

# Output layer with 10 neurons for 10 classes
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', 
              loss='categorical_crossentropy', 
              metrics=['accuracy'])

# Model summary
model.summary()

Conclusion

Keras 3.0 introduces exciting advancements that enhance the deep learning workflow, making it more efficient and user-friendly. While there may be some challenges during the transition, the benefits of improved performance, better integration, and a clearer API make it worthwhile. By leveraging the new features, we can build powerful models with greater ease.

Feel free to try the code above, and let me know your thoughts on Keras 3.0!

"For those interested in exploring the code further, you can find the complete example on Google Colab: Fashion MNIST CNN Example. Feel free to run the notebook and experiment with the model!"