Testing Retrofit & OkHttp with MockWebServer — From Setup to Network-Condition Simulation
The first time I had a production bug in a Retrofit API client — a header missing on a specific endpoint, only on Android — I went looking at our test suite and realized we were testing the repository by mocking the API interface. Of course the tests passed. We weren’t testing the actual HTTP serialization, we were testing that our Kotlin code called a method we ourselves had stubbed.
That bug is what got me to take MockWebServer seriously. Mocking the Retrofit interface tests almost nothing useful. What you actually want to verify is that your API client constructs the right HTTP request, parses the actual JSON correctly, handles the real error codes, and survives the gross network conditions that mobile users live in. MockWebServer lets you do all of that without hitting a real server.
This post: how to set up MockWebServer, the patterns for testing requests/responses/errors/auth/timeouts, and how to integrate it with coroutines and Turbine for Flow-based APIs. By the end you’ll have a testing setup that actually catches network bugs.
Why Not Just Mock the Retrofit Interface?
Imagine a weather app with this API:
interface WeatherApi {
    // WeatherApi is a RETROFIT INTERFACE
    @GET("v2/forecast")
    suspend fun getForecast(
        @Query("lat") lat: Double,
        @Query("lng") lng: Double,
        @Header("X-API-Key") apiKey: String
    ): Response<ForecastDto>
}
The naive test mocks WeatherApi:
// ❌ Tests almost nothing
@Test fun `forecast loads successfully`() = runTest {
    val mockApi = mockk<WeatherApi>()
    coEvery { mockApi.getForecast(any(), any(), any()) } returns
        Response.success(ForecastDto(...))
    val repo = ForecastRepository(mockApi)
    val result = repo.loadForecast(40.7, -74.0)
    assertTrue(result.isSuccess)
}
What this verifies: that ForecastRepository calls a method, gets a result, and returns success. What it doesn’t verify: that the right URL gets built, that the API key header is sent, that the JSON for ForecastDto actually deserializes from real server response shapes, that 4xx responses become errors, that timeouts behave correctly. Every one of those is a real bug class that’s shipped to production from passing test suites.
MockWebServer fixes this by being a real HTTP server. Your Retrofit client makes real HTTP calls to localhost:randomPort, you queue real responses, and you assert on real intercepted requests. Same code path as production, no mocks of your own classes.
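To make the "real HTTP server" point concrete, here is a minimal sketch of one round trip with plain OkHttp and no Retrofit. The path `/v2/ping` and the function name `roundTrip` are made up for illustration; only the MockWebServer and OkHttp calls are real.

```kotlin
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.mockwebserver.MockResponse
import okhttp3.mockwebserver.MockWebServer

// Runs one request/response round trip against a local MockWebServer and
// returns (status code, response body, X-API-Key header the server recorded).
fun roundTrip(): Triple<Int, String, String?> {
    val server = MockWebServer()
    server.enqueue(MockResponse().setResponseCode(200).setBody("""{"ok":true}"""))
    server.start() // binds to a free local port

    try {
        val request = Request.Builder()
            .url(server.url("/v2/ping")) // hypothetical path, illustration only
            .header("X-API-Key", "test_key")
            .build()

        // A real socket connection to localhost, same code path as production.
        val (code, body) = OkHttpClient().newCall(request).execute().use { response ->
            response.code to response.body?.string().orEmpty()
        }

        // The server keeps a record of what actually arrived over the wire.
        val recorded = server.takeRequest()
        return Triple(code, body, recorded.getHeader("X-API-Key"))
    } finally {
        server.shutdown()
    }
}
```

Everything the rest of this post does with Retrofit is a variation on these three steps: enqueue, call, inspect the recorded request.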
Setup — Less Than You’d Expect
// build.gradle.kts (use androidTestImplementation instead for instrumented tests)
testImplementation("com.squareup.okhttp3:mockwebserver:4.12.0")
testImplementation("com.squareup.retrofit2:retrofit:2.11.0")
testImplementation("com.squareup.retrofit2:converter-moshi:2.11.0")
testImplementation("app.cash.turbine:turbine:1.1.0")
testImplementation("org.jetbrains.kotlinx:kotlinx-coroutines-test:1.8.1")
The test base. Every test that uses MockWebServer follows roughly this shape:
class WeatherApiTest {
    private lateinit var server: MockWebServer
    // MockWebServer is a CLASS from okhttp3.mockwebserver
    private lateinit var api: WeatherApi

    @Before
    fun setup() {
        server = MockWebServer()
        server.start()
        // Picks a random free port
        api = Retrofit.Builder()
            .baseUrl(server.url("/"))
            // server.url("/") returns the localhost URL with the actual port
            .addConverterFactory(MoshiConverterFactory.create())
            .build()
            .create(WeatherApi::class.java)
    }

    @After
    fun teardown() {
        server.shutdown()
        // CRITICAL: without shutdown the port leaks across tests
    }
}
The server is declared once per test class, but JUnit runs @Before and @After around every test method, so each test gets a fresh server on a fresh port. Each test queues its own responses. The Retrofit client is built pointing at the test server: the same code that points at https://api.weather.com in production, just with a different base URL.
Layer 1: The Happy Path — Response Body Round-Trip
@Test
fun `forecast deserializes correctly`() = runTest {
    val responseJson = """
        {
          "temp": 72.4,
          "condition": "CLEAR",
          "hourly": [
            {"hour": 14, "temp": 73.1, "chance_of_rain": 0.05},
            {"hour": 15, "temp": 74.0, "chance_of_rain": 0.10}
          ]
        }
    """.trimIndent()
    server.enqueue(
        // enqueue is a FUNCTION on MockWebServer; it queues a response for the next request
        MockResponse()
            .setResponseCode(200)
            .setBody(responseJson)
            .addHeader("Content-Type", "application/json")
    )
    val response = api.getForecast(lat = 40.7, lng = -74.0, apiKey = "test_key")
    assertTrue(response.isSuccessful)
    val forecast = response.body()!!
    assertEquals(72.4, forecast.temp, 0.01)
    assertEquals("CLEAR", forecast.condition)
    assertEquals(2, forecast.hourly.size)
    assertEquals(0.05, forecast.hourly[0].chanceOfRain, 0.001)
}
Two things this is actually verifying that the mocked-interface version doesn’t: that your ForecastDto matches the real server shape (field names, nesting, snake_case vs camelCase via Moshi’s @Json), and that nested objects deserialize correctly. The number of times I’ve caught “chance_of_rain returned 0.0 because we didn’t add the @Json annotation on a camelCase field” bugs with this setup — impossible to count.
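The DTO itself never appears in this post, so here is a rough sketch of what it might look like for the JSON above. The class shapes are my assumption, inferred from the assertions in the test. This version uses Moshi's reflection-based `KotlinJsonAdapterFactory` (from the `moshi-kotlin` artifact); a codegen setup with `@JsonClass(generateAdapter = true)` would work equally well and is presumably what a bare `MoshiConverterFactory.create()` relies on.

```kotlin
import com.squareup.moshi.Json
import com.squareup.moshi.Moshi
import com.squareup.moshi.kotlin.reflect.KotlinJsonAdapterFactory

// Hypothetical DTOs matching the JSON used in the happy-path test.
data class HourlyDto(
    val hour: Int,
    val temp: Double,
    // Maps the snake_case JSON key onto the camelCase Kotlin property.
    @Json(name = "chance_of_rain") val chanceOfRain: Double
)

data class ForecastDto(
    val temp: Double,
    val condition: String,
    val hourly: List<HourlyDto>
)

// Parses a forecast payload with reflection-based Kotlin support.
fun parseForecast(json: String): ForecastDto? {
    val moshi = Moshi.Builder().add(KotlinJsonAdapterFactory()).build()
    return moshi.adapter(ForecastDto::class.java).fromJson(json)
}
```

Drop the `@Json` annotation and the mapping between `chance_of_rain` and `chanceOfRain` is gone, which is exactly the kind of mismatch the round-trip test is there to surface.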
For real production tests, factor the JSON out into resource files (src/test/resources/forecast_success.json) so they’re editable and reusable across tests:
private fun fixture(filename: String): String =
    javaClass.classLoader!!.getResourceAsStream(filename)!!
        .bufferedReader().use { it.readText() }
        // use { } closes the stream once the text has been read

// Usage:
server.enqueue(MockResponse().setResponseCode(200).setBody(fixture("forecast_success.json")))
Layer 2: Verifying the Outgoing Request
Many API bugs have nothing to do with parsing responses; they come from constructing requests wrong. Wrong path, wrong query params, missing headers, wrong HTTP method.
@Test
fun `forecast request includes correct params and api key header`() = runTest {
    server.enqueue(MockResponse().setResponseCode(200).setBody("{}"))
    api.getForecast(lat = 40.7, lng = -74.0, apiKey = "secret_123")
    val recorded = server.takeRequest()
    // takeRequest() is a FUNCTION on MockWebServer
    // Returns the next request the server received, or blocks waiting for one
    assertEquals("GET", recorded.method)
    assertEquals("/v2/forecast?lat=40.7&lng=-74.0", recorded.path)
    assertEquals("secret_123", recorded.getHeader("X-API-Key"))
}
Now you’re actually verifying that your Retrofit interface generates the right HTTP. If someone refactors and accidentally changes @Header("X-API-Key") to @Query("api_key"), this test fails immediately. Without it, you find out in production.
Pattern: assert on the request, not just the response. For any endpoint with non-trivial input (POST bodies, headers, query params), have at least one test that calls takeRequest() and asserts on it.
For POST requests with bodies, you can read the body too:
@Test
fun `report inaccurate forecast sends correct json body`() = runTest {
    server.enqueue(MockResponse().setResponseCode(204))
    api.reportInaccuracy(
        ReportDto(forecastId = "f_123", actualTemp = 65.0, reportedAt = 1700000000)
    )
    val recorded = server.takeRequest()
    assertEquals("POST", recorded.method)
    assertEquals("/v2/reports", recorded.path)
    val sentBody = recorded.body.readUtf8()
    // body is a Buffer; readUtf8 drains it as a String
    val parsed = Moshi.Builder().build().adapter(ReportDto::class.java).fromJson(sentBody)
    assertEquals("f_123", parsed?.forecastId)
    assertEquals(65.0, parsed?.actualTemp)
}
Layer 3: Error Responses
Real APIs return 4xx and 5xx. Your client code needs to handle them, and if it doesn’t, this is where the bugs hide. Test each error class your code differentiates.
// `repository` is assumed to be a ForecastRepository built on top of `api` in setup()
@Test
fun `404 returns NotFound result`() = runTest {
    server.enqueue(
        MockResponse()
            .setResponseCode(404)
            .setBody("""{"error": "Location not in service area"}""")
    )
    val result = repository.loadForecast(0.0, 0.0)
    assertTrue(result is ForecastResult.NotFound)
    assertEquals("Location not in service area", (result as ForecastResult.NotFound).reason)
}

@Test
fun `429 rate limit triggers backoff strategy`() = runTest {
    server.enqueue(
        MockResponse()
            .setResponseCode(429)
            .addHeader("Retry-After", "30")
    )
    val result = repository.loadForecast(40.7, -74.0)
    assertTrue(result is ForecastResult.RateLimited)
    assertEquals(30, (result as ForecastResult.RateLimited).retryAfterSeconds)
}

@Test
fun `5xx is treated as transient error`() = runTest {
    server.enqueue(MockResponse().setResponseCode(503))
    val result = repository.loadForecast(40.7, -74.0)
    assertTrue(result is ForecastResult.TransientError)
    // ✅ The repository should distinguish "server is broken, retry later"
    // from "your request was wrong, don't retry"
}
The pattern: test that your client maps HTTP status codes to the domain error categories your app distinguishes. If your sealed class has NotFound, RateLimited, TransientError, NetworkError, you should have a test that produces each one. This is also the test that catches when someone accidentally changes the error mapping.
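For reference, the mapping those tests exercise could look something like the sketch below. The repository internals aren't shown in this post, so the `ForecastResult` shape, the `ClientError` case, and the `mapHttpResult` function are all assumptions; the point is only that each branch of the `when` deserves one MockWebServer test.

```kotlin
// Hypothetical domain result type; the real one may carry different fields.
sealed interface ForecastResult {
    data class Success(val body: String) : ForecastResult
    data class NotFound(val reason: String?) : ForecastResult
    data class RateLimited(val retryAfterSeconds: Int?) : ForecastResult
    object TransientError : ForecastResult
    data class ClientError(val code: Int) : ForecastResult
}

// Maps a raw HTTP outcome to a domain result.
// `reason` is the already-extracted error message, `retryAfter` the raw header value.
fun mapHttpResult(code: Int, body: String?, reason: String?, retryAfter: String?): ForecastResult =
    when (code) {
        in 200..299 -> ForecastResult.Success(body.orEmpty())
        404 -> ForecastResult.NotFound(reason)
        // Only the delta-seconds form of Retry-After is handled here;
        // an HTTP-date value would come back as null.
        429 -> ForecastResult.RateLimited(retryAfter?.trim()?.toIntOrNull())
        in 500..599 -> ForecastResult.TransientError
        else -> ForecastResult.ClientError(code)
    }
```

If someone later reorders or removes a branch, the matching status-code test is the one that goes red.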
Layer 4: Authentication and Interceptors
If you have an OkHttp interceptor that adds auth headers, refreshes tokens, or retries, you want to test that integration end-to-end. The interceptor sits between Retrofit and OkHttp’s actual HTTP call, so MockWebServer sees the post-interceptor request.
class AuthInterceptorTest {
    private lateinit var server: MockWebServer
    private lateinit var api: WeatherApi
    private val tokenStore = FakeTokenStore()
    // FakeTokenStore is a hand-written test double, not a mock library

    @Before
    fun setup() {
        server = MockWebServer()
        server.start()
        val client = OkHttpClient.Builder()
            .addInterceptor(AuthInterceptor(tokenStore))
            // The actual production interceptor, not mocked
            .build()
        api = Retrofit.Builder()
            .baseUrl(server.url("/"))
            .client(client)
            .addConverterFactory(MoshiConverterFactory.create())
            .build()
            .create(WeatherApi::class.java)
    }

    @Test
    fun `interceptor adds auth header from token store`() = runTest {
        tokenStore.setToken("Bearer abc123")
        server.enqueue(MockResponse().setResponseCode(200).setBody("{}"))
        api.getForecast(40.7, -74.0, "ignored")
        val recorded = server.takeRequest()
        assertEquals("Bearer abc123", recorded.getHeader("Authorization"))
    }

    @Test
    fun `401 triggers token refresh and retry`() = runTest {
        tokenStore.setToken("Bearer expired")
        tokenStore.setRefreshedToken("Bearer fresh")
        server.enqueue(MockResponse().setResponseCode(401)) // First call: expired token
        server.enqueue(MockResponse().setResponseCode(200).setBody("{}")) // Retry: fresh token
        api.getForecast(40.7, -74.0, "ignored")
        val first = server.takeRequest()
        assertEquals("Bearer expired", first.getHeader("Authorization"))
        val second = server.takeRequest()
        assertEquals("Bearer fresh", second.getHeader("Authorization"))
    }
}
Two responses queued, two requests asserted. The second test verifies the entire refresh-and-retry flow that production users hit dozens of times a day — with a real interceptor, real Retrofit, real HTTP semantics. The mocked-interface version of this test would be unable to verify the retry actually happens, since there’s no “HTTP call” for the interceptor to intercept.
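The production `AuthInterceptor` isn't shown here, so the following is only a sketch of the kind of interceptor those two tests would exercise. The `TokenStore` interface and its method names are hypothetical (the fake in the test exposes `setToken`/`setRefreshedToken`, which a real store would sit behind). The OkHttp parts are standard: an application interceptor may call `chain.proceed()` more than once, provided it closes the earlier response first.

```kotlin
import okhttp3.Interceptor
import okhttp3.Response

// Hypothetical abstraction over wherever tokens live.
interface TokenStore {
    fun currentToken(): String
    fun refreshToken(): String // obtains, stores, and returns a fresh token
}

class AuthInterceptor(private val tokens: TokenStore) : Interceptor {
    override fun intercept(chain: Interceptor.Chain): Response {
        val original = chain.request()

        // First attempt with whatever token we currently hold.
        val first = chain.proceed(
            original.newBuilder().header("Authorization", tokens.currentToken()).build()
        )
        if (first.code != 401) return first

        // The 401 response must be closed before the chain is used again.
        first.close()

        // Single retry with a refreshed token; a second 401 is returned as-is.
        return chain.proceed(
            original.newBuilder().header("Authorization", tokens.refreshToken()).build()
        )
    }
}
```

OkHttp's `Authenticator` is another common home for this logic; either way, the two-response MockWebServer test stays the same.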
Layer 5: Network Conditions — Slow, Failing, Disconnected
This is where MockWebServer goes beyond what mocking can do. Real network conditions on Android — slow connections, dropped packets, connection resets — produce specific exceptions and behaviors that your error handling needs to cope with. MockWebServer can simulate them.
@Test
fun `slow response is cancelled by timeout`() = runTest {
    server.enqueue(
        MockResponse()
            .setResponseCode(200)
            .setBody("{}")
            .setBodyDelay(5, TimeUnit.SECONDS)
            // setBodyDelay is a FUNCTION on MockResponse
            // Server waits N seconds before sending the response body
    )
    val client = OkHttpClient.Builder()
        .readTimeout(2, TimeUnit.SECONDS)
        // Shorter than the body delay, so the read times out (in real time, not virtual time)
        .build()
    val timeoutApi = Retrofit.Builder()
        .baseUrl(server.url("/"))
        .client(client)
        .addConverterFactory(MoshiConverterFactory.create())
        .build()
        .create(WeatherApi::class.java)
    val exception = assertFailsWith<SocketTimeoutException> {
        timeoutApi.getForecast(40.7, -74.0, "k")
    }
    // ✅ Verifies that timeout actually fires and produces SocketTimeoutException
}

@Test
fun `connection reset becomes IOException`() = runTest {
    server.enqueue(
        MockResponse().setSocketPolicy(SocketPolicy.DISCONNECT_AT_START)
        // SocketPolicy is an ENUM; values include:
        // DISCONNECT_AT_START, DISCONNECT_AT_END, DISCONNECT_DURING_RESPONSE_BODY,
        // NO_RESPONSE, FAIL_HANDSHAKE
    )
    val exception = assertFailsWith<IOException> {
        api.getForecast(40.7, -74.0, "k")
    }
}
The repository code that handles these — turning SocketTimeoutException into ForecastResult.NetworkError, deciding whether to retry on connection reset — can now be tested end-to-end. This is the difference between “works on my machine” and “works on a moving train in a tunnel.”
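As a rough illustration of that repository-side handling, here is one way to turn the exceptions into values. The `NetworkOutcome` type and `catchingNetworkErrors` helper are names I'm inventing for the sketch; your own code would map into whatever error type it already has, such as `ForecastResult.NetworkError`.

```kotlin
import java.io.IOException
import java.net.SocketTimeoutException

// Hypothetical wrapper separating timeouts from other connection failures.
sealed interface NetworkOutcome<out T> {
    data class Ok<T>(val value: T) : NetworkOutcome<T>
    object Timeout : NetworkOutcome<Nothing>
    data class ConnectionFailed(val cause: IOException) : NetworkOutcome<Nothing>
}

// Inline so that `block` can contain suspend calls when used inside a coroutine.
inline fun <T> catchingNetworkErrors(block: () -> T): NetworkOutcome<T> =
    try {
        NetworkOutcome.Ok(block())
    } catch (e: SocketTimeoutException) {
        // Must come first: SocketTimeoutException is itself an IOException.
        NetworkOutcome.Timeout
    } catch (e: IOException) {
        NetworkOutcome.ConnectionFailed(e)
    }
```

A repository could wrap the Retrofit call as `catchingNetworkErrors { api.getForecast(lat, lng, key) }`, and the two tests above would then assert on `Timeout` and `ConnectionFailed` rather than on raw exceptions.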
Layer 6: Dispatching Different Responses to Different Paths
Up to now, we’ve been calling enqueue(), which serves responses in FIFO order regardless of which endpoint was called. That’s fine for single-endpoint tests. For tests that exercise multiple endpoints in one flow (e.g., a dashboard that hits forecast, alerts, and historical), use a Dispatcher.
@Test
fun `dashboard loads all three endpoints in parallel`() = runTest {
    server.dispatcher = object : Dispatcher() {
        // Dispatcher is an ABSTRACT CLASS from okhttp3.mockwebserver
        override fun dispatch(request: RecordedRequest): MockResponse {
            return when {
                request.path?.startsWith("/v2/forecast") == true ->
                    MockResponse().setBody(fixture("forecast_success.json"))
                request.path?.startsWith("/v2/alerts") == true ->
                    MockResponse().setBody(fixture("alerts_success.json"))
                request.path?.startsWith("/v2/historical") == true ->
                    MockResponse().setBody(fixture("historical_success.json"))
                else -> MockResponse().setResponseCode(404)
            }
        }
    }
    val dashboard = repository.loadDashboard(40.7, -74.0)
    assertNotNull(dashboard.forecast)
    assertNotNull(dashboard.alerts)
    assertNotNull(dashboard.historical)
}
A custom dispatcher is also where you simulate "forecast endpoint is up but historical is down" scenarios: the kind of partial failure that mocking the API interface can't represent at all.
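A partial-outage dispatcher might be sketched like this. The class name and the idea of passing in a set of "down" paths are mine, and the empty `{}` bodies are placeholders; in a real test you'd serve fixtures for the healthy endpoints.

```kotlin
import okhttp3.mockwebserver.Dispatcher
import okhttp3.mockwebserver.MockResponse
import okhttp3.mockwebserver.RecordedRequest

// Answers 503 for any path listed in `down`, 200 for other /v2/ paths,
// and 404 for everything else.
class PartialOutageDispatcher(private val down: Set<String>) : Dispatcher() {
    override fun dispatch(request: RecordedRequest): MockResponse {
        // Ignore the query string when matching.
        val path = request.path.orEmpty().substringBefore('?')
        return when {
            path in down -> MockResponse().setResponseCode(503)
            path.startsWith("/v2/") -> MockResponse().setResponseCode(200).setBody("{}")
            else -> MockResponse().setResponseCode(404)
        }
    }
}
```

Usage would be along the lines of `server.dispatcher = PartialOutageDispatcher(setOf("/v2/historical"))`, followed by asserting that the dashboard still renders forecast and alerts while reporting historical data as unavailable.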
Integrating with Coroutines and Turbine
For repositories that expose Flow<Result> rather than suspending functions, the test uses Turbine to assert on emissions, but the MockWebServer setup is identical.
@Test
fun `forecast flow emits Loading then Success`() = runTest {
    server.enqueue(
        MockResponse().setResponseCode(200).setBody(fixture("forecast_success.json"))
    )
    repository.observeForecast(40.7, -74.0).test {
        // .test is an EXTENSION FUNCTION on Flow from Turbine
        assertEquals(ForecastState.Loading, awaitItem())
        val success = awaitItem() as ForecastState.Success
        assertEquals(72.4, success.forecast.temp, 0.01)
        awaitComplete()
        // Or cancelAndIgnoreRemainingEvents() if the Flow is hot
    }
}

@Test
fun `flow retries on transient error then succeeds`() = runTest {
    server.enqueue(MockResponse().setResponseCode(503)) // First attempt fails
    server.enqueue(
        MockResponse().setResponseCode(200).setBody(fixture("forecast_success.json"))
    )
    repository.observeForecast(40.7, -74.0).test {
        assertEquals(ForecastState.Loading, awaitItem())
        assertEquals(ForecastState.Retrying, awaitItem())
        // ✅ The repository's retry logic is exercised end-to-end
        val success = awaitItem() as ForecastState.Success
        assertNotNull(success.forecast)
        awaitComplete()
    }
}
This pattern — enqueue N responses for the N attempts your retry logic should make, then assert on the Flow emissions — is one of the most useful integration test patterns I know. It catches retry bugs (off-by-one retries, retrying on non-retryable errors), state-machine bugs (skipped Loading state), and serialization bugs all in one test.
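To show what "N responses for N attempts" lines up against, here is a small sketch of a retrying Flow. It is not the repository from this series: the `fetch` lambda stands in for the Retrofit call, the `Failed` state and the `maxAttempts` default are assumptions, and `Success` carries a raw body instead of a parsed forecast to keep the example short.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow

// Hypothetical state type, simplified for the sketch.
sealed interface ForecastState {
    object Loading : ForecastState
    object Retrying : ForecastState
    data class Success(val body: String) : ForecastState
    data class Failed(val code: Int) : ForecastState
}

// `fetch` returns (HTTP status code, body). Only 5xx responses are retried.
fun observeForecast(
    maxAttempts: Int = 2,
    fetch: suspend () -> Pair<Int, String?>
): Flow<ForecastState> = flow {
    emit(ForecastState.Loading)
    for (attempt in 1..maxAttempts) {
        val (code, body) = fetch()
        when {
            code in 200..299 -> {
                emit(ForecastState.Success(body.orEmpty()))
                return@flow
            }
            // Transient failure with attempts left: announce the retry and loop.
            code in 500..599 && attempt < maxAttempts -> emit(ForecastState.Retrying)
            else -> {
                emit(ForecastState.Failed(code))
                return@flow
            }
        }
    }
}
```

With `maxAttempts = 2`, a queued 503 followed by a 200 yields Loading, Retrying, Success, which is the sequence the Turbine test asserts. Queue a 404 first and the flow should go straight to Failed with no retry, which makes a good second test.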
Pitfalls to Avoid
Forgetting to shutdown(). Without it, ports leak and you eventually run out, especially in CI with parallel test workers. Always pair start() with shutdown() in @Before/@After.
Sharing one MockWebServer across tests. Don't. Test isolation matters: one test queueing a response that the next test consumes is a flaky test waiting to happen. Creating the server in @Before and shutting it down in @After already gives every test method its own instance, so resist the urge to hoist it into @BeforeClass or a shared singleton.
Asserting on response.code() instead of the domain result. The point of these tests is to verify your code maps HTTP correctly to your domain types. assertEquals(404, response.code()) tests that MockWebServer works, not your code. assertTrue(result is NotFound) tests your code.
Hard-coding the test port. server.start(8080) works locally and breaks in CI when something else is on 8080. Always let it pick: server.start() with no argument.
Testing only the happy path. Five happy-path tests catch the bug where you broke the URL. Five error-path tests catch the bug where 401 silently became NoOp. The error-path tests are usually the ones that find production bugs.
Forgetting that takeRequest() blocks. If your code didn’t actually make a request, takeRequest() blocks forever. Pair with a timeout: server.takeRequest(5, TimeUnit.SECONDS) if you’re asserting “a request was made” and want a clean failure if not.
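For that last pitfall, a tiny extension function keeps the timeout from being forgotten. `requireRequest` is a helper name I'm making up; the underlying `takeRequest(timeout, unit)` overload is part of MockWebServer and returns null when nothing arrives in time.

```kotlin
import java.util.concurrent.TimeUnit
import okhttp3.mockwebserver.MockWebServer
import okhttp3.mockwebserver.RecordedRequest

// Returns the next recorded request, or fails with a readable message
// instead of hanging when the client never called the server.
fun MockWebServer.requireRequest(timeoutSeconds: Long = 5): RecordedRequest =
    takeRequest(timeoutSeconds, TimeUnit.SECONDS)
        ?: throw AssertionError("Expected a request within ${timeoutSeconds}s, but none arrived")
```

Swapping `server.takeRequest()` for `server.requireRequest()` in the earlier tests turns a hung CI job into an ordinary assertion failure.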
When Not to Use MockWebServer
For pure ViewModel tests where you want to assert on UI state transitions in response to repository results, you don’t need MockWebServer — mock the repository at that layer, since the repository’s contract with the ViewModel is what matters there. MockWebServer is for the network layer specifically: API client, interceptors, error mapping, retry logic.
Likewise, for unit tests of pure mappers (DTO → domain model), you don’t need network at all — just construct the DTO and test the mapper directly. MockWebServer earns its place when the test actually exercises HTTP behavior.
The right test pyramid: many fast unit tests of mappers and pure logic; a focused suite of MockWebServer tests for the API/repository layer; a small set of true integration tests against staging when needed. MockWebServer is the middle layer — fast enough to run on every commit, real enough to catch the bugs unit tests miss.
Closing
That production bug from the opening — missing header on a specific endpoint — would have been caught instantly by a 4-line MockWebServer test asserting on recorded.getHeader("X-Trace-Id"). We’d have known about it before the build hit the Play Store. We didn’t, because our tests were mocking the wrong layer.
The shift from “mock the interface” to “run a real HTTP server” is one of the highest-leverage testing changes a team can make. The setup is 10 lines. The payoff is catching the network bugs that ship to users, before they ship.
That’s the testing series wrapped: unit testing, Compose UI testing, coroutines/Flows, Room, and now Retrofit. Five posts that should cover most of what an Android codebase actually needs to test. Whatever you’re building — weather app, news feed, banking, music, chat — the pattern is the same: test at the layer where bugs actually live.
Happy coding!