Taiab's Blog

Testing Microservices: Lessons from the beShera EdTech Platform

Md. Taiab


2024-11-08 · 3 mins read


beShera is a social EdTech platform built on a microservices architecture. User onboarding, course management, social interactions, examinations, and AI-powered learning features all run as separate services communicating over APIs.

This was the most complex system I've tested. Here's what I learned.

Why Microservices Testing is Different

In a monolith, data flows through one codebase. You break something, you see it. In microservices, a bug in the notification service might only appear when the exam service fires an event, which only happens under specific user state conditions.

The real bugs are at the integration points.

My Testing Strategy

1. Map the Service Boundaries First

Before writing a single test case, I spent two days mapping:
  • Which service owns which data
  • How services communicate (REST, events, both?)
  • What happens when one service is slow or down

This map became the source of truth for all my edge-case tests.
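
A map like this can also be kept as data, so the "slow or down" cases fall out of it mechanically. Below is a minimal sketch in Python; the service names, the call graph, and the failure_scenarios helper are all hypothetical and only illustrate the idea, not the real beShera topology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    caller: str
    callee: str
    transport: str  # "rest" or "event"


# Hypothetical data ownership and call graph (illustrative names only).
OWNS = {
    "user": ["profiles"],
    "exam": ["exam_sessions"],
    "notification": ["notifications"],
}
CALLS = [
    Call("exam", "user", "rest"),
    Call("exam", "notification", "event"),
]


def failure_scenarios(calls):
    """Return one 'slow' and one 'down' scenario per service dependency."""
    scenarios = []
    for call in calls:
        for mode in ("slow", "down"):
            scenarios.append(
                f"{call.caller} -> {call.callee} ({call.transport}): "
                f"{call.callee} is {mode}"
            )
    return scenarios


for line in failure_scenarios(CALLS):
    print(line)
```

Each printed line is a candidate edge-case test. The list grows with the call graph, so a newly added dependency gets its failure cases without anyone having to remember them.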

2. Contract Testing for APIs

When Service A calls Service B, I tested:
  • Does Service B return what Service A expects?
  • What happens when Service B returns an unexpected field?
  • What happens when Service B adds a new required field?

I found a real bug here: the exam service expected user_id as an integer, but the user service started returning it as a string after a migration. Tests passed in isolation. The integration broke.
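
A consumer-side contract check for this kind of drift might look like the sketch below. It is an assumption-laden illustration: the check_contract helper, the EXAM_EXPECTS_USER contract, and the sample payloads are hypothetical, and a real project would more likely use a dedicated contract-testing tool or schema validator.

```python
def check_contract(payload, contract):
    """Return a list of violations of `contract` (field name -> expected type).

    Extra fields in the payload are tolerated; missing fields and
    wrong types are reported.
    """
    violations = []
    for name, expected in contract.items():
        if name not in payload:
            violations.append(f"missing field: {name}")
        elif type(payload[name]) is not expected:
            actual = type(payload[name]).__name__
            violations.append(f"{name}: expected {expected.__name__}, got {actual}")
    return violations


# Hypothetical contract: what the exam service expects from the user service.
EXAM_EXPECTS_USER = {"user_id": int, "email": str}

# Hypothetical responses before and after the migration described above.
before_migration = {"user_id": 42, "email": "a@example.com"}
after_migration = {"user_id": "42", "email": "a@example.com", "nickname": "t"}

print(check_contract(before_migration, EXAM_EXPECTS_USER))
print(check_contract(after_migration, EXAM_EXPECTS_USER))
```

The point of the design is that the contract lives with the consumer (the exam service) and is run against the provider's real responses, which is what catches a change that each service's own tests would accept.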

3. End-to-End Critical Paths

I defined 5 critical user journeys and tested them end-to-end:

  • New user onboarding: register → verify email → complete profile → enroll in first course
  • Course completion: watch all modules → pass exam → receive certificate
  • Social interaction: post → comment → like → notification delivery
  • AI-assisted learning: start AI session → complete → progress saved
  • Account recovery: forgot password → reset → re-login → session intact

Each journey touched 3 to 7 microservices. A failure in any one of those services could break the entire flow.

4. Chaos Testing (Informal)

I didn't have access to formal chaos engineering tools, but I simulated failure manually:
  • What happens to an active exam session if the network drops mid-test?
  • What if the user closes the browser during payment?
  • What if the same user logs in from two devices simultaneously?

The simultaneous session test found a real bug: both sessions stayed active, and actions from one session sometimes overwrote the other's progress.
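
One common guard against this kind of lost update is optimistic concurrency: every write must state which version of the data it was based on. The sketch below is not how beShera fixed the bug (the post does not say); ProgressStore and StaleWriteError are hypothetical names used to show the shape of a test that reproduces the two-device scenario.

```python
class StaleWriteError(Exception):
    """Raised when a write is based on an outdated version."""


class ProgressStore:
    """In-memory stand-in for a progress service (hypothetical)."""

    def __init__(self):
        self._progress = {}  # user_id -> (version, value)

    def read(self, user_id):
        return self._progress.get(user_id, (0, None))

    def write(self, user_id, expected_version, value):
        current_version, _ = self.read(user_id)
        if expected_version != current_version:
            raise StaleWriteError(
                f"expected version {expected_version}, store is at {current_version}"
            )
        self._progress[user_id] = (current_version + 1, value)
        return current_version + 1


# Two devices load the same state, then both try to save.
store = ProgressStore()
version_a, _ = store.read("u1")
version_b, _ = store.read("u1")

store.write("u1", version_a, "module-3")      # first device wins
try:
    store.write("u1", version_b, "module-1")  # second device is stale
except StaleWriteError as err:
    print("rejected:", err)
```

With a check like this, the second session gets an explicit error it can handle (for example by reloading), instead of silently overwriting the other session's progress.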

The Bug I'm Most Proud of Finding

During social interaction testing, I found that unliking a post you'd never liked still decremented the like count, so the counter could drop to -1, -2, and so on.

The root cause: the frontend sent the unlike API call optimistically before confirming whether the user had actually liked the post. The backend didn't validate current like state before decrementing.

Impact: High. A public-facing counter showing negative numbers is both a UX bug and a data integrity issue.
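
A minimal sketch of the missing backend validation, assuming likes are stored as a set of user IDs per post and the count is derived from that set. LikeStore is a hypothetical in-memory stand-in; the actual beShera data model is not described in this post.

```python
class LikeStore:
    """Tracks who liked what; the count is derived, never decremented blindly."""

    def __init__(self):
        self._likes = {}  # post_id -> set of user_ids

    def like(self, post_id, user_id):
        self._likes.setdefault(post_id, set()).add(user_id)
        return self.count(post_id)

    def unlike(self, post_id, user_id):
        # discard() is a no-op if the user never liked the post.
        self._likes.get(post_id, set()).discard(user_id)
        return self.count(post_id)

    def count(self, post_id):
        return len(self._likes.get(post_id, set()))


likes = LikeStore()
print(likes.unlike("post-1", "alice"))  # unlike without a prior like
print(likes.like("post-1", "alice"))
print(likes.like("post-1", "alice"))    # duplicate like
print(likes.unlike("post-1", "alice"))
```

Because the count is the size of a set, it cannot go below zero, and repeated like or unlike calls from an optimistic frontend leave the state unchanged.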

Key Takeaways

  • Test services in isolation first, then together: find the unit bugs before the integration bugs
  • Idempotency matters: every write operation should be safe to call twice
  • Event-driven bugs are time-delayed: not everything shows up immediately
  • State management across services is the hardest problem: test every state transition explicitly
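
To make the idempotency takeaway concrete, here is a minimal sketch of a write guarded by an idempotency key, together with the "call it twice" check. EnrollmentService and its fields are hypothetical and stand in for any write endpoint.

```python
class EnrollmentService:
    """In-memory stand-in for a write endpoint (hypothetical)."""

    def __init__(self):
        self._responses = {}   # idempotency key -> stored response
        self.enrollments = []  # (user_id, course_id) pairs

    def enroll(self, idempotency_key, user_id, course_id):
        # A retried request with the same key replays the stored response
        # instead of performing the write a second time.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        self.enrollments.append((user_id, course_id))
        response = {"enrollment_number": len(self.enrollments)}
        self._responses[idempotency_key] = response
        return response


service = EnrollmentService()
first = service.enroll("key-123", "u1", "course-9")
retry = service.enroll("key-123", "u1", "course-9")  # e.g. a client retry
print(first == retry, len(service.enrollments))
```

The test pattern is the same for any write: send the identical request twice, then assert that the response and the stored state match what a single call would have produced.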

Md. Taiab is a Software QA Engineer and security enthusiast based in Dhaka, Bangladesh. He interned as a QA Engineer at Battery Low Interactive Ltd. and competes in CTFs and programming contests. He is ranked in the top 3% globally on TryHackMe and was Champion of GUB Junior IDPC 2023.
