The Repository pattern is the backbone of Android’s data layer. It sits between your ViewModel and your data sources (API, database, preferences) and exposes a clean API that hides where the data comes from. The ViewModel doesn’t know whether data came from the network, a local cache, or in-memory storage; it just asks the Repository. This guide covers the pattern in depth: single source of truth, caching strategies, offline-first, error handling, and how to structure repositories in real production apps.
What is the Repository Pattern?
// The Repository is a CLASS that COORDINATES multiple data sources
// and provides a single, clean API to the ViewModel
// WITHOUT Repository:
// ViewModel talks to API, Database, and Preferences DIRECTLY
//
// ViewModel ──→ Retrofit API
//           ──→ Room Database
//           ──→ DataStore
//
// Problems:
// - ViewModel has too many responsibilities
// - No caching strategy — ViewModel manages cache logic
// - Can't swap data sources without changing ViewModel
// - Hard to test — need to mock API, database, AND preferences
// WITH Repository:
//
// ViewModel ──→ Repository ──→ Remote Data Source (API)
//                          ──→ Local Data Source (Room)
//                          ──→ Preferences (DataStore)
//
// Benefits:
// ✅ ViewModel only talks to Repository — simple API
// ✅ Repository decides WHERE to get data (network, cache, both)
// ✅ Data sources are swappable (mock for tests, different API for staging)
// ✅ Single source of truth — one place manages data consistency
// ✅ Testable — test Repository with fake data sources
Single Source of Truth
The most important concept in the Repository pattern: one source owns the data, everything else reads from it.
// ═══ SINGLE SOURCE OF TRUTH PATTERN ═════════════════════════════════
//
//  Network API            Repository                Room Database
//  (remote)               (coordinator)             (single source of truth)
//
//  GET /articles ←─────── refresh() ──────────────→ INSERT articles
//                                                        │
//                                                        │ Flow emits
//                                                        ↓
//  ViewModel ←─────────── getArticlesFlow() ←────── SELECT * FROM articles
//                                                        │
//  UI ←────── collectAsStateWithLifecycle ←──────────────┘
//
// The DATABASE is the single source of truth — NOT the network
// Network response is written TO the database
// UI reads FROM the database
// When database changes → Flow emits → UI updates automatically
//
// Why the database, not the network?
// - Works offline (user sees cached data even without internet)
// - Consistent (one source, no conflicting states)
// - Reactive (Room Flow emits automatically when data changes)
// - Survives process death (database is persistent)
// Implementation:
class ArticleRepository @Inject constructor(
// @Inject is an ANNOTATION from javax.inject — marks constructor for Hilt
private val remoteDataSource: ArticleRemoteDataSource,
private val localDataSource: ArticleLocalDataSource
) {
// OBSERVE — always reads from local (single source of truth)
fun getArticlesFlow(): Flow<List<Article>> {
// Returns Flow — a cold reactive stream from Room
// Room's @Query that returns Flow automatically emits when data changes
return localDataSource.getArticlesFlow()
.map { entities -> entities.map { it.toDomain() } }
// map is an EXTENSION FUNCTION on Flow — transforms each emission
// toDomain() is an EXTENSION FUNCTION on ArticleEntity (mapper)
}
// REFRESH — fetches from network, writes to local
suspend fun refreshArticles() {
// suspend MODIFIER: this function can only be called from a coroutine or another suspend function
val remoteArticles = remoteDataSource.getArticles()
// Network call → returns List<ArticleDto>
val entities = remoteArticles.map { it.toEntity() }
// toEntity() is an EXTENSION FUNCTION on ArticleDto (mapper)
localDataSource.insertArticles(entities)
// Write to database → Room Flow AUTOMATICALLY emits updated list
// ViewModel's collector receives the new data → UI updates
}
}
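To see the reactive loop without Room or Retrofit, here is a minimal sketch that swaps the database for an in-memory `MutableStateFlow`. The names `InMemoryArticleStore`, `SketchRepository`, `fetchRemote`, and `demoSingleSourceOfTruth` are hypothetical stand-ins rather than part of the example above, and the sketch assumes `kotlinx-coroutines-core` is on the classpath.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.flow.map
import kotlinx.coroutines.runBlocking

data class Article(val id: String, val title: String)

// Hypothetical stand-in for the Room table: the single source of truth.
class InMemoryArticleStore {
    private val rows = MutableStateFlow<List<Article>>(emptyList())
    fun observe(): Flow<List<Article>> = rows
    fun replaceAll(articles: List<Article>) { rows.value = articles }
}

class SketchRepository(
    private val fetchRemote: suspend () -> List<Article>, // stand-in for the API call
    private val store: InMemoryArticleStore
) {
    // Reads always come from the store, never from the network directly.
    fun getTitlesFlow(): Flow<List<String>> =
        store.observe().map { articles -> articles.map { it.title } }

    // The network result is only written to the store; observers pick it up from there.
    suspend fun refreshArticles() {
        store.replaceAll(fetchRemote())
    }
}

fun demoSingleSourceOfTruth(): Pair<List<String>, List<String>> = runBlocking {
    val repository = SketchRepository(
        fetchRemote = { listOf(Article("1", "Hello")) },
        store = InMemoryArticleStore()
    )
    val before = repository.getTitlesFlow().first() // nothing cached yet
    repository.refreshArticles()
    val after = repository.getTitlesFlow().first()  // reflects the refresh
    before to after
}
```

The observer never touches `fetchRemote`; it only sees what was written to the store, which is the same property the Room-backed version relies on.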
Data Sources — The Building Blocks
Remote Data Source
// Wraps a single remote source (Retrofit API)
// Handles: network calls, DTO mapping, network-specific error handling
class ArticleRemoteDataSource @Inject constructor(
private val api: ArticleApi,
// ArticleApi is an INTERFACE — Retrofit generates the implementation
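private val ioDispatcher: CoroutineDispatcher
// CoroutineDispatcher is an ABSTRACT CLASS from kotlinx.coroutines
// Hilt cannot construct it on its own: supply it from a @Provides method (typically behind a qualifier annotation)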
) {
suspend fun getArticles(): List<ArticleDto> {
return withContext(ioDispatcher) {
// withContext is a TOP-LEVEL SUSPEND FUNCTION — switches dispatcher
api.getArticles()
}
}
suspend fun getArticle(id: String): ArticleDto {
return withContext(ioDispatcher) {
api.getArticle(id)
}
}
suspend fun searchArticles(query: String): List<ArticleDto> {
return withContext(ioDispatcher) {
api.search(query)
}
}
}
// Note: Retrofit suspend functions are main-safe by default
// (they switch to a background thread internally)
// But wrapping in withContext(IO) is a common defensive pattern
// Some teams skip it and let Retrofit handle threading
Local Data Source
// Wraps a single local source (Room Database)
// Handles: database operations, entity mapping
class ArticleLocalDataSource @Inject constructor(
private val dao: ArticleDao
// ArticleDao is an INTERFACE annotated @Dao — Room generates implementation
) {
fun getArticlesFlow(): Flow<List<ArticleEntity>> {
// Returns Flow — Room emits whenever the table changes
return dao.getAllArticles()
}
fun getArticleFlow(id: String): Flow<ArticleEntity?> {
return dao.getArticleById(id)
}
suspend fun insertArticles(articles: List<ArticleEntity>) {
dao.insertAll(articles)
}
suspend fun deleteArticle(id: String) {
dao.deleteById(id)
}
suspend fun clearAll() {
dao.deleteAll()
}
}
// Room DAO
@Dao
// @Dao is an ANNOTATION from Room
interface ArticleDao {
@Query("SELECT * FROM articles ORDER BY published_at DESC")
fun getAllArticles(): Flow<List<ArticleEntity>>
// Returning Flow makes this REACTIVE — emits on every database change
@Query("SELECT * FROM articles WHERE id = :id")
fun getArticleById(id: String): Flow<ArticleEntity?>
@Insert(onConflict = OnConflictStrategy.REPLACE)
// @Insert is an ANNOTATION from Room
// OnConflictStrategy holds Int constants (REPLACE, IGNORE, ABORT, ...); it is an annotation class, not an enum
suspend fun insertAll(articles: List<ArticleEntity>)
@Query("DELETE FROM articles WHERE id = :id")
suspend fun deleteById(id: String)
@Query("DELETE FROM articles")
suspend fun deleteAll()
}
Preferences Data Source
// Wraps DataStore for user preferences
class PreferencesDataSource @Inject constructor(
private val dataStore: DataStore<Preferences>
// DataStore is an INTERFACE from androidx.datastore
// Preferences is a CLASS that holds key-value pairs
) {
val sortOrderFlow: Flow<SortOrder> = dataStore.data
// data is a PROPERTY on DataStore — returns Flow<Preferences>
.map { preferences ->
val value = preferences[SORT_ORDER_KEY] ?: SortOrder.NEWEST.name
SortOrder.valueOf(value)
}
.catch { emit(SortOrder.NEWEST) }
// catch is an EXTENSION FUNCTION on Flow — handles read errors
suspend fun setSortOrder(sortOrder: SortOrder) {
dataStore.edit { preferences ->
// edit is a SUSPEND EXTENSION FUNCTION on DataStore
preferences[SORT_ORDER_KEY] = sortOrder.name
}
}
companion object {
private val SORT_ORDER_KEY = stringPreferencesKey("sort_order")
// stringPreferencesKey() is a TOP-LEVEL FUNCTION — creates a typed key
}
}
enum class SortOrder { NEWEST, OLDEST, POPULAR }
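One detail worth a sketch: `SortOrder.valueOf(value)` throws `IllegalArgumentException` when the stored string no longer matches a constant (for example after an enum rename). The `catch` operator does recover by emitting `NEWEST`, but a Flow completes after `catch` runs, so that collector would stop seeing later preference changes. A tolerant parser avoids the exception in the first place. `parseSortOrder` is a hypothetical helper, not part of the code above.

```kotlin
enum class SortOrder { NEWEST, OLDEST, POPULAR }

// Hypothetical helper: falls back to a default instead of throwing
// when the stored name is missing or unknown.
fun parseSortOrder(stored: String?, default: SortOrder = SortOrder.NEWEST): SortOrder =
    SortOrder.values().firstOrNull { it.name == stored } ?: default
```

Inside the `map` block this would replace the `valueOf` call, for example `parseSortOrder(preferences[SORT_ORDER_KEY])`.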
Layer Models and Mappers
// Each layer has its OWN model — prevents coupling between layers
//
// DTO (Data Transfer Object) — network model
// Matches the API response JSON
// Has Gson/Moshi annotations
// Lives in: data/remote/dto/
//
// Entity — database model
// Matches the Room table schema
// Has Room annotations (@Entity, @PrimaryKey)
// Lives in: data/local/entity/
//
// Domain Model — business logic model
// Clean, no framework annotations
// What the rest of the app works with
// Lives in: domain/model/
// ═══ WHY separate models? ═══════════════════════════════════════════
//
// API returns (JSON):
//   { "article_id": "123", "article_title": "Hi", "pub_date": 170000, "img": "url" }
//
// Database stores (@Entity):
//   id: String, title: String, publishedAt: Long, imageUrl: String?
//
// App uses (domain model):
//   data class Article(id: String, title: String, publishedAt: Long,
//       imageUrl: String?, formattedDate: String, isBookmarked: Boolean)
//
// If the API renames "article_title" to "title":
// → Only DTO changes, Entity and Domain are unaffected
// If you add a computed property (formattedDate):
// → Only Domain model has it, not DTO or Entity
// If you switch from Room to SQLDelight:
// → Only Entity changes, Domain and DTO are unaffected
// Mappers — extension functions that convert between layers
// DTO → Entity (network response → database)
fun ArticleDto.toEntity(): ArticleEntity = ArticleEntity(
id = articleId, // DTO property articleId (JSON "article_id") → Entity id
title = articleTitle, // DTO property articleTitle (JSON "article_title") → Entity title
author = authorName,
publishedAt = pubDate,
imageUrl = img
)
// toEntity() is an EXTENSION FUNCTION on ArticleDto
// Entity → Domain (database → app logic)
fun ArticleEntity.toDomain(): Article = Article(
id = id,
title = title,
author = author,
publishedAt = publishedAt,
imageUrl = imageUrl,
isBookmarked = false // enriched later from bookmarks table
)
// toDomain() is an EXTENSION FUNCTION on ArticleEntity
// Domain → Entity (app logic → database, for local-only data)
fun Article.toEntity(): ArticleEntity = ArticleEntity(
id = id,
title = title,
author = author,
publishedAt = publishedAt,
imageUrl = imageUrl
)
// Why extension functions for mappers?
// ✅ Clean — articleDto.toEntity() reads naturally
// ✅ Discoverable — IDE suggests .toEntity() on ArticleDto
// ✅ Testable — pure functions, easy to unit test
// ✅ No mapper class needed — just functions
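The mappers above reference three data classes that the text never spells out. The following is a minimal, dependency-free sketch of what they might look like. The field lists are assumptions inferred from the mapper bodies, and the Room and JSON annotations are left out so the snippet compiles on its own.

```kotlin
// Network model: property names mirror the JSON payload (serialization annotations omitted).
data class ArticleDto(
    val articleId: String,
    val articleTitle: String,
    val authorName: String,
    val pubDate: Long,
    val img: String?
)

// Database model: a Room @Entity in a real app (annotation omitted here).
data class ArticleEntity(
    val id: String,
    val title: String,
    val author: String,
    val publishedAt: Long,
    val imageUrl: String?
)

// Domain model: what the rest of the app works with.
data class Article(
    val id: String,
    val title: String,
    val author: String,
    val publishedAt: Long,
    val imageUrl: String?,
    val isBookmarked: Boolean = false
)

// DTO → Entity: renames the API-specific fields.
fun ArticleDto.toEntity() = ArticleEntity(articleId, articleTitle, authorName, pubDate, img)

// Entity → Domain: isBookmarked keeps its default until enriched elsewhere.
fun ArticleEntity.toDomain() = Article(id, title, author, publishedAt, imageUrl)
```

Because the mappers are pure functions, a round trip such as `dto.toEntity().toDomain()` can be unit tested without any Android dependency.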
Caching Strategies
// Strategy 1: NETWORK FIRST — always fetch, cache as backup
// Best for: frequently changing data (social feed, news, stock prices)
suspend fun refreshArticles() {
try {
val remote = remoteDataSource.getArticles()
localDataSource.insertArticles(remote.map { it.toEntity() })
} catch (e: CancellationException) {
throw e // never swallow cancellation
} catch (e: Exception) {
// Network failed — cached data is already in the database
// Room Flow still emits the last cached data → UI shows stale but usable data
}
}
// Strategy 2: CACHE FIRST — read local, refresh in background
// Best for: data that doesn't change often (user profile, settings, categories)
fun getArticlesFlow(): Flow<List<Article>> = flow {
// Emit cached data immediately
val cached = localDataSource.getArticlesOnce()
if (cached.isNotEmpty()) {
emit(cached.map { it.toDomain() })
}
// Then refresh from the network (runs sequentially inside this flow, not as a separate background job)
try {
val remote = remoteDataSource.getArticles()
localDataSource.insertArticles(remote.map { it.toEntity() })
val updated = localDataSource.getArticlesOnce()
emit(updated.map { it.toDomain() })
} catch (e: CancellationException) {
throw e // never swallow cancellation
} catch (e: Exception) {
if (cached.isEmpty()) throw e // no cache AND no network → error
}
}
// Strategy 3: CACHE WITH EXPIRY — use cache if fresh, fetch if stale
// Best for: data that changes periodically (weather, exchange rates)
suspend fun getArticles(): List<Article> {
val lastFetch = preferencesDataSource.getLastFetchTimestamp()
val cacheExpiry = 30 * 60 * 1000L // 30 minutes
return if (System.currentTimeMillis() - lastFetch < cacheExpiry) {
// Cache is fresh — use it
localDataSource.getArticlesOnce().map { it.toDomain() }
} else {
// Cache is stale — refresh
val remote = remoteDataSource.getArticles()
localDataSource.insertArticles(remote.map { it.toEntity() })
preferencesDataSource.setLastFetchTimestamp(System.currentTimeMillis())
localDataSource.getArticlesOnce().map { it.toDomain() }
}
}
// Strategy 4: OFFLINE-FIRST with Room Flow (RECOMMENDED)
// Best for: most Android apps — works offline, reactive updates
fun getArticlesFlow(): Flow<List<Article>> {
return localDataSource.getArticlesFlow()
.map { entities -> entities.map { it.toDomain() } }
// Room Flow emits cached data IMMEDIATELY
// Then when refresh() writes new data → Room Flow emits AGAIN automatically
}
suspend fun refreshArticles() {
val remote = remoteDataSource.getArticles()
localDataSource.insertArticles(remote.map { it.toEntity() })
// Room Flow picks up the change and emits to all collectors
}
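The freshness check in Strategy 3 calls `System.currentTimeMillis()` directly, which makes it awkward to test. One possible refinement, sketched here under the assumption that a small helper class is acceptable, is to inject the clock. `CachePolicy` and its parameter names are hypothetical.

```kotlin
// Hypothetical helper: decides whether cached data is still fresh.
// The clock is injected so tests do not depend on the real system time.
class CachePolicy(
    private val maxAgeMillis: Long,
    private val now: () -> Long = System::currentTimeMillis
) {
    // A lastFetchMillis of 0 (never fetched) is always treated as stale
    // as long as the clock is past maxAgeMillis.
    fun isFresh(lastFetchMillis: Long): Boolean =
        now() - lastFetchMillis < maxAgeMillis
}
```

In the repository, `CachePolicy(30 * 60 * 1000L).isFresh(lastFetch)` would stand in for the inline comparison, while a test can pass `now = { fixedTime }`.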
Error Handling in Repository
// Option 1: Result wrapper — explicit success/failure
sealed interface Result<out T> {
// sealed interface: restricted hierarchy for exhaustive when
// Note: inside this file the name shadows the standard library's kotlin.Result
data class Success<T>(val data: T) : Result<T>
data class Error(val exception: Throwable) : Result<Nothing>
}
class ArticleRepository @Inject constructor(
private val remoteDataSource: ArticleRemoteDataSource,
private val localDataSource: ArticleLocalDataSource
) {
suspend fun refreshArticles(): Result<Unit> {
return try {
val remote = remoteDataSource.getArticles()
localDataSource.insertArticles(remote.map { it.toEntity() })
Result.Success(Unit)
} catch (e: CancellationException) {
throw e // always re-throw CancellationException!
} catch (e: Exception) {
Result.Error(e)
}
}
}
// ViewModel handles the result:
fun refresh() {
viewModelScope.launch {
when (val result = repository.refreshArticles()) {
is Result.Success -> { /* Room Flow handles UI update */ }
is Result.Error -> _events.emit(UiEvent.ShowSnackbar(result.exception.message ?: "Error"))
}
}
}
// Option 2: Let exceptions propagate — ViewModel catches them
class ArticleRepository @Inject constructor(...) {
suspend fun refreshArticles() {
// Throws on failure — let caller handle it
val remote = remoteDataSource.getArticles()
localDataSource.insertArticles(remote.map { it.toEntity() })
}
}
// ViewModel:
fun refresh() {
viewModelScope.launch {
try {
repository.refreshArticles()
} catch (e: CancellationException) { throw e }
catch (e: Exception) {
_events.emit(UiEvent.ShowSnackbar("Refresh failed"))
}
}
}
// Which approach?
// Result wrapper → when Repository methods are called directly and caller needs to decide
// Exception propagation → when using Flow pipelines (catch handles it)
// Both are valid — pick one and be consistent in your project
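If the Result-wrapper route is chosen, the try/catch boilerplate can be factored into one helper so the cancellation rule is written only once. This is a sketch, not an established API: `AppResult` and `safeCall` are hypothetical names (`AppResult` is used instead of `Result` to avoid clashing with `kotlin.Result`).

```kotlin
import kotlin.coroutines.cancellation.CancellationException

// Named AppResult (hypothetical) to avoid shadowing kotlin.Result.
sealed interface AppResult<out T> {
    data class Success<T>(val data: T) : AppResult<T>
    data class Error(val exception: Throwable) : AppResult<Nothing>
}

// inline, so the lambda may call suspend functions when safeCall
// itself is invoked from a suspend function.
inline fun <T> safeCall(block: () -> T): AppResult<T> =
    try {
        AppResult.Success(block())
    } catch (e: CancellationException) {
        throw e // cancellation must keep propagating
    } catch (e: Exception) {
        AppResult.Error(e)
    }
```

A repository method could then read `suspend fun refreshArticles() = safeCall { /* fetch and insert */ }`, keeping the re-throw in a single place.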
Complete Repository — Production Example
class ArticleRepository @Inject constructor(
private val remoteDataSource: ArticleRemoteDataSource,
private val localDataSource: ArticleLocalDataSource,
private val preferencesDataSource: PreferencesDataSource
) {
// ═══ OBSERVE (reactive) ═════════════════════════════════════════
// All articles — sorted by user preference
fun getArticlesFlow(): Flow<List<Article>> {
return combine(
// combine is a TOP-LEVEL FUNCTION from kotlinx.coroutines.flow
// Merges multiple Flows: once every source has emitted at least one value, it re-emits whenever ANY source emits
localDataSource.getArticlesFlow(),
preferencesDataSource.sortOrderFlow
) { entities, sortOrder ->
val articles = entities.map { it.toDomain() }
when (sortOrder) {
SortOrder.NEWEST -> articles.sortedByDescending { it.publishedAt }
SortOrder.OLDEST -> articles.sortedBy { it.publishedAt }
SortOrder.POPULAR -> articles.sortedByDescending { it.viewCount }
// assumes Article exposes a viewCount property; the mappers shown earlier do not populate one
}
}
}
// Single article by ID
fun getArticleFlow(id: String): Flow<Article?> {
return localDataSource.getArticleFlow(id)
.map { entity -> entity?.toDomain() }
}
// Search articles
fun searchArticlesFlow(query: String): Flow<List<Article>> {
// assumes ArticleLocalDataSource also exposes searchFlow(query), which is not shown earlier
return localDataSource.searchFlow(query)
.map { entities -> entities.map { it.toDomain() } }
}
// ═══ WRITE (imperative) ═════════════════════════════════════════
// Refresh from network
suspend fun refreshArticles() {
val remote = remoteDataSource.getArticles()
localDataSource.insertArticles(remote.map { it.toEntity() })
}
// Toggle bookmark
suspend fun toggleBookmark(articleId: String) {
// assumes ArticleLocalDataSource also exposes toggleBookmark(id), which is not shown earlier
localDataSource.toggleBookmark(articleId)
}
// Delete article
suspend fun deleteArticle(articleId: String) {
localDataSource.deleteArticle(articleId)
// Optionally: remoteDataSource.deleteArticle(articleId)
}
// Change sort order
suspend fun setSortOrder(sortOrder: SortOrder) {
preferencesDataSource.setSortOrder(sortOrder)
// preferencesDataSource.sortOrderFlow emits → combine re-evaluates
// → getArticlesFlow() emits re-sorted list → UI updates
}
}
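The re-sorting behaviour of `combine` can be checked in isolation. The sketch below replaces Room and DataStore with two `MutableStateFlow`s. `sortedArticles` and `demoResort` are hypothetical names, the model is trimmed to two fields, and `kotlinx-coroutines-core` is assumed to be available.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.combine
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.runBlocking

enum class SortOrder { NEWEST, OLDEST }
data class Article(val id: String, val publishedAt: Long)

// Re-evaluates the sort whenever either the list or the sort order changes.
fun sortedArticles(
    articles: Flow<List<Article>>,
    sortOrder: Flow<SortOrder>
): Flow<List<Article>> =
    combine(articles, sortOrder) { list, order ->
        when (order) {
            SortOrder.NEWEST -> list.sortedByDescending { it.publishedAt }
            SortOrder.OLDEST -> list.sortedBy { it.publishedAt }
        }
    }

fun demoResort(): Pair<List<String>, List<String>> = runBlocking {
    val articles = MutableStateFlow(listOf(Article("a", 1L), Article("b", 2L)))
    val order = MutableStateFlow(SortOrder.NEWEST)
    val sorted = sortedArticles(articles, order)

    val newestFirst = sorted.first().map { it.id }
    order.value = SortOrder.OLDEST // simulates setSortOrder()
    val oldestFirst = sorted.first().map { it.id }
    newestFirst to oldestFirst
}
```

No article is re-fetched between the two reads; only the preference changed, and the combined Flow produced the new ordering.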
Repository in ViewModel
@HiltViewModel
class ArticleViewModel @Inject constructor(
private val repository: ArticleRepository
) : ViewModel() {
// Reactive pipeline: Repository Flow → StateFlow for UI
val articles: StateFlow<List<Article>> = repository.getArticlesFlow()
.stateIn(
scope = viewModelScope,
started = SharingStarted.WhileSubscribed(5000),
initialValue = emptyList()
)
// stateIn is an EXTENSION FUNCTION on Flow → converts cold to hot StateFlow
private val _isRefreshing = MutableStateFlow(false)
val isRefreshing: StateFlow<Boolean> = _isRefreshing.asStateFlow()
private val _events = MutableSharedFlow<UiEvent>()
val events: SharedFlow<UiEvent> = _events.asSharedFlow()
init {
refresh() // initial data load
}
fun refresh() {
viewModelScope.launch {
_isRefreshing.value = true
try {
repository.refreshArticles()
// Room Flow automatically emits updated data
// stateIn picks it up → articles StateFlow updates → UI recomposes
} catch (e: CancellationException) { throw e }
catch (e: Exception) {
_events.emit(UiEvent.ShowSnackbar("Refresh failed: ${e.message}"))
} finally {
_isRefreshing.value = false
}
}
}
fun onSortOrderChanged(sortOrder: SortOrder) {
viewModelScope.launch {
repository.setSortOrder(sortOrder)
// Preferences Flow emits → combine in Repository re-sorts
// → articles StateFlow updates → UI shows re-sorted list
// All automatic — no manual re-fetch needed!
}
}
fun onBookmarkToggled(articleId: String) {
viewModelScope.launch {
repository.toggleBookmark(articleId)
}
}
}
Testing Repositories
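// Repositories are easy to test with FAKE data sources
// Note: the fakes below implement ArticleRemoteDataSource and ArticleLocalDataSource,
// so both must be declared as INTERFACES (with the Retrofit- and Room-backed classes
// as implementations), and each fake must override every method of its interface
// Fake remote data source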
class FakeArticleRemoteDataSource : ArticleRemoteDataSource {
var articlesToReturn = listOf<ArticleDto>()
var shouldThrow = false
override suspend fun getArticles(): List<ArticleDto> {
if (shouldThrow) throw IOException("Network error")
return articlesToReturn
}
}
// Fake local data source
class FakeArticleLocalDataSource : ArticleLocalDataSource {
private val articles = MutableStateFlow<List<ArticleEntity>>(emptyList())
override fun getArticlesFlow(): Flow<List<ArticleEntity>> = articles
override suspend fun insertArticles(entities: List<ArticleEntity>) {
articles.value = entities
}
}
// Test
class ArticleRepositoryTest {
private val remoteDataSource = FakeArticleRemoteDataSource()
private val localDataSource = FakeArticleLocalDataSource()
private val repository = ArticleRepository(remoteDataSource, localDataSource)
@Test
fun `refresh writes remote data to local`() = runTest {
// runTest is a TOP-LEVEL FUNCTION from kotlinx-coroutines-test
remoteDataSource.articlesToReturn = listOf(sampleDto)
repository.refreshArticles()
val local = localDataSource.getArticlesFlow().first()
// first() is an EXTENSION FUNCTION on Flow — collects first emission
assertEquals(1, local.size)
assertEquals(sampleDto.articleId, local[0].id)
}
@Test
fun `refresh failure doesn't clear cache`() = runTest {
localDataSource.insertArticles(listOf(sampleEntity))
remoteDataSource.shouldThrow = true
try { repository.refreshArticles() }
catch (e: Exception) { /* expected */ }
val local = localDataSource.getArticlesFlow().first()
assertEquals(1, local.size) // cache is still intact!
}
}
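Because the data sources earlier in the article are concrete classes, the fakes only compile once interfaces are extracted. Here is a compact, self-contained sketch of that arrangement. The model is reduced to `String` titles to keep it short, the interface shapes are assumptions, and `kotlinx-coroutines-core` is assumed to be on the classpath.

```kotlin
import java.io.IOException
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.runBlocking

// Assumed interfaces; production classes and fakes both implement them.
interface ArticleRemoteDataSource {
    suspend fun getArticles(): List<String>
}

interface ArticleLocalDataSource {
    fun getArticlesFlow(): Flow<List<String>>
    suspend fun insertArticles(articles: List<String>)
}

class FakeArticleRemoteDataSource : ArticleRemoteDataSource {
    var articlesToReturn: List<String> = emptyList()
    var shouldThrow = false
    override suspend fun getArticles(): List<String> {
        if (shouldThrow) throw IOException("Network error")
        return articlesToReturn
    }
}

class FakeArticleLocalDataSource : ArticleLocalDataSource {
    private val articles = MutableStateFlow<List<String>>(emptyList())
    override fun getArticlesFlow(): Flow<List<String>> = articles
    override suspend fun insertArticles(articles: List<String>) {
        this.articles.value = articles
    }
}

class ArticleRepository(
    private val remote: ArticleRemoteDataSource,
    private val local: ArticleLocalDataSource
) {
    suspend fun refreshArticles() {
        // If the fetch throws, insertArticles is never reached, so the cache is untouched.
        local.insertArticles(remote.getArticles())
    }
}

fun demoFailedRefreshKeepsCache(): List<String> = runBlocking {
    val remote = FakeArticleRemoteDataSource().apply { articlesToReturn = listOf("Hello") }
    val local = FakeArticleLocalDataSource()
    val repository = ArticleRepository(remote, local)

    repository.refreshArticles()          // first refresh succeeds
    remote.shouldThrow = true
    try { repository.refreshArticles() } catch (e: IOException) { /* expected */ }
    local.getArticlesFlow().first()       // still the data from the first refresh
}
```

With Hilt, the production classes would be bound to these interfaces in a module, while tests construct the repository directly with the fakes.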
Common Mistakes to Avoid
Mistake 1: Repository holding state
// ❌ Repository caches data in a variable — becomes stale, hard to test
class ArticleRepository {
private var cachedArticles: List<Article>? = null // ❌ state in Repository
suspend fun getArticles(): List<Article> {
if (cachedArticles != null) return cachedArticles!!
val remote = api.getArticles()
cachedArticles = remote
return remote
}
}
// ✅ Database IS the cache — Repository is stateless
class ArticleRepository {
fun getArticlesFlow(): Flow<List<Article>> = dao.getArticles().map { /* ... */ }
suspend fun refresh() { dao.insertAll(api.getArticles().map { it.toEntity() }) }
}
Mistake 2: Returning DTO or Entity from Repository
// ❌ ViewModel receives network/database model — coupled to data layer
class ArticleRepository {
fun getArticles(): Flow<List<ArticleEntity>> = dao.getArticles()
// ViewModel now depends on Room's ArticleEntity!
}
// ✅ Return domain model — clean separation
class ArticleRepository {
fun getArticles(): Flow<List<Article>> = dao.getArticles()
.map { entities -> entities.map { it.toDomain() } }
}
Mistake 3: Multiple sources of truth
// ❌ Sometimes reads from API, sometimes from database — inconsistent
class ArticleRepository {
suspend fun getArticles(): List<Article> {
return if (isOnline()) api.getArticles() else dao.getArticles()
// Two sources — can show different data!
}
}
// ✅ Single source: always read from database, network refreshes it
class ArticleRepository {
fun getArticlesFlow(): Flow<List<Article>> = dao.getArticlesFlow().map { ... }
suspend fun refresh() { dao.insertAll(api.getArticles().map { it.toEntity() }) }
}
Mistake 4: Swallowing CancellationException
// ❌ Catches ALL exceptions including CancellationException
suspend fun refreshArticles(): Result<Unit> = try {
val data = api.getArticles()
dao.insertAll(data)
Result.Success(Unit)
} catch (e: Exception) { // catches CancellationException too!
Result.Error(e) // breaks structured concurrency!
}
// ✅ Always re-throw CancellationException
suspend fun refreshArticles(): Result<Unit> = try {
val data = api.getArticles()
dao.insertAll(data)
Result.Success(Unit)
} catch (e: CancellationException) {
throw e // ALWAYS re-throw
} catch (e: Exception) {
Result.Error(e)
}
Mistake 5: One giant Repository for everything
// ❌ God Repository — knows about articles, users, settings, analytics
class AppRepository {
fun getArticles() = ...
fun getUser() = ...
fun getSettings() = ...
fun logEvent() = ...
// Grows endlessly, impossible to test, violates single responsibility
}
// ✅ One Repository per domain concept
class ArticleRepository { fun getArticlesFlow(): Flow<...> }
class UserRepository { fun getUserFlow(): Flow<...> }
class SettingsRepository { fun getSortOrder(): Flow<...> }
// Each is focused, testable, and reusable
Summary
- The Repository pattern coordinates multiple data sources behind a clean API — ViewModel doesn’t know where data comes from
- Single source of truth: the database (Room) is the source, network refreshes update it, UI reads from it via Flow
- Repository uses Data Sources: Remote (Retrofit), Local (Room), Preferences (DataStore) — each wraps one external source
- Use separate models per layer: DTO (network), Entity (database), Domain (business logic) with mapper extension functions
- Caching strategies: network-first, cache-first, cache-with-expiry, offline-first (recommended)
- Repository should be stateless — the database is the cache, not an in-memory variable
- Return domain models from Repository, never DTOs or Entities — keeps layers decoupled
- Use combine() (top-level function) to merge multiple Flows (articles + sort order → sorted articles)
- Handle errors with a Result wrapper or exception propagation; pick one approach and be consistent
- Always re-throw CancellationException in catch blocks — breaking structured concurrency causes subtle bugs
- One Repository per domain concept (ArticleRepository, UserRepository) — not one giant AppRepository
- Test with Fake data sources — Repository is easy to test because it depends on interfaces
The Repository pattern isn’t complicated — it’s a coordinator between data sources with the database as the single source of truth. The key insight is that the UI should never care where data comes from. It observes a Room Flow, and the Repository makes sure that Flow always has the right data — whether it came from the network, a cache, or an offline fallback. Get this right, and your app works reliably online, offline, and everything in between.
Happy coding!