Compose Gestures Deep Dive — pointerInput, Multi-Touch, and Conflict Resolution
The thing about gestures in Compose is that the high-level APIs make 80% of cases trivial — Modifier.clickable, Modifier.draggable, swipeable all just work. And then someone hands you a photo editor that needs pinch-to-zoom, two-finger rotation, draggable layers over a draw-path canvas, and the whole gesture stack collapses into a tangle of conflicting handlers. The high-level APIs run out of road, and you discover that gestures are actually built on a lower-level primitive that nobody explained to you.
That primitive is pointerInput, and once you understand how it works, every gesture — from a custom drag-to-dismiss to a multi-touch transform — becomes a question of writing the right loop, not finding the right modifier. This post: how Compose actually delivers pointer events, the high-level gesture detectors and where they fall short, building custom gesture handlers from scratch, conflict resolution between nested gestures, and the new SuspendingPointerInputModifierNode APIs that replaced the old composed { } approach. Examples come from a photo-editing app, because zoom-rotate-pan is the canonical problem that breaks naive gesture code.
Pointer Events — What’s Actually Flowing Through Your Composables
Every touch on a Compose UI generates a stream of PointerInputChange events — one per pointer (finger), per frame, with position, pressure, and a flag indicating whether the pointer was pressed/released this frame. These events flow down the composable tree from root to leaf, in three passes:
Initial pass (root → leaf). Parents see the event first. Useful for “intercept this gesture” behavior — the parent can consume the event before the child sees it.
Main pass (leaf → root). The default. Children see the event first; if they don’t consume it, parents get a chance.
Final pass (root → leaf). Parents get a last look at events that nothing else consumed. Rarely needed, but it’s how some scroll-conflict resolution works.
You don’t usually think about these passes — the high-level detectors handle them. But when nested gestures conflict (a draggable inside a scrollable container), the pass system is what governs who wins. Mention this in an interview and you’ve flagged that you understand the underlying model, not just the API surface.
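The pass parameter is exposed directly when you write your own event loop. A minimal sketch of a parent observing events in the Initial pass, before its children get them in the Main pass (the onIntercept callback is a made-up placeholder, not a Compose API):

```kotlin
// Sketch: a parent peeking at the Initial (root → leaf) pass.
// By default, awaitPointerEvent() delivers the Main pass.
Modifier.pointerInput(Unit) {
    awaitPointerEventScope {
        while (true) {
            // Request the Initial pass explicitly
            val event = awaitPointerEvent(PointerEventPass.Initial)
            if (event.changes.any { it.pressed }) {
                onIntercept(event) // hypothetical callback
                // Consuming here would hide the event from children:
                // event.changes.forEach { it.consume() }
            }
        }
    }
}
```

The same loop with PointerEventPass.Final would instead see only what survived both earlier passes.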
The High-Level Gesture Modifiers
Most apps need none of the underlying machinery. Compose ships purpose-built modifiers for common gestures:
// Tap detection
Modifier.clickable { onTap() }

Modifier.combinedClickable(
    onClick = { onTap() },
    onLongClick = { onLongPress() },
    onDoubleClick = { onDoubleTap() }
)

// Single-axis drag
Modifier.draggable(
    state = rememberDraggableState { delta -> offset += delta },
    orientation = Orientation.Horizontal,
    onDragStarted = { ... },
    onDragStopped = { velocity -> ... }
)

// Two-axis drag
Modifier.draggable2D(
    state = rememberDraggable2DState { delta -> offset += delta }
)

// Scrolling
Modifier.verticalScroll(rememberScrollState())
Modifier.horizontalScroll(rememberScrollState())

// Material-style swipe-to-dismiss / swipe-to-reveal
Modifier.anchoredDraggable(
    state = anchoredDraggableState,
    orientation = Orientation.Horizontal
)
Decision tree for which to reach for:
- Tap, long-press, double-tap → combinedClickable
- One-finger drag along an axis → draggable
- One-finger drag in any direction → draggable2D
- Multi-finger gestures (pinch, rotate) → you need pointerInput directly — no high-level alternative
- Snap-to-anchor gestures (drawer, bottom sheet) → anchoredDraggable
The moment you need pinch-to-zoom or rotation, you’re below the high-level API. That’s where most teams hit the wall.
pointerInput — The Primitive Everything Is Built On
Modifier.pointerInput gives you a coroutine scope that receives raw pointer events. Inside, you write a loop that awaits events and reacts. It looks foreign at first because it’s coroutine-based pointer handling — but once you grasp the pattern, it’s consistent across every gesture you’ll write.
// The skeleton: a custom drag handler from scratch
Modifier.pointerInput(Unit) {
    // pointerInput is a MODIFIER FUNCTION from compose.ui.input.pointer
    // The lambda is a SUSPEND lambda — you can use coroutine APIs
    // The KEY (Unit here) controls when the input loop is restarted
    awaitEachGesture {
        // awaitEachGesture is a SUSPEND FUNCTION
        // Each iteration handles ONE gesture from press-down to all-fingers-up
        val down = awaitFirstDown()
        // Suspends until the user touches the screen
        // Returns the PointerInputChange for the down event
        do {
            val event = awaitPointerEvent()
            // Suspends until the next pointer event (move, up, additional down)
            event.changes.forEach { change ->
                if (change.positionChanged()) {
                    val delta = change.position - change.previousPosition
                    onDrag(delta)
                    change.consume()
                    // consume() marks this event as handled
                    // Prevents parent gestures from also reacting
                }
            }
        } while (event.changes.any { it.pressed })
        // Loop until all fingers are lifted
    }
}
Three concepts to internalize:
1. The key parameter. pointerInput(Unit) means “the input loop runs once for the lifetime of this composable.” Pass a real key (like a state object) and the loop restarts when the key changes — useful when your gesture handling depends on changing parameters. Be careful with this: restarting the loop mid-gesture cancels in-progress drags.
2. awaitEachGesture. Each iteration is one full gesture: from first finger down to all fingers up. Inside the iteration, you have a suspend-based event loop. Outside it, you’re between gestures — a clean state.
3. consume(). Calling change.consume() tells parent modifiers “this event is handled, don’t process it further.” Skip the consume and you’ll get unexpected behavior — both your handler and a parent verticalScroll reacting to the same drag, for instance.
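A sketch of how the key and state-reading concerns combine in practice: keep the key stable so the loop survives recomposition, and route changing values through rememberUpdatedState (DraggableThing and onDrag are illustrative names, not a Compose API):

```kotlin
// Sketch: stable key (Unit) means the input loop never restarts,
// but rememberUpdatedState lets it see the latest callback anyway.
@Composable
fun DraggableThing(onDrag: (Offset) -> Unit) {
    val currentOnDrag by rememberUpdatedState(onDrag)
    Box(
        Modifier.pointerInput(Unit) { // stable key: loop survives recomposition
            detectDragGestures { change, dragAmount ->
                change.consume()
                currentOnDrag(dragAmount) // always the latest lambda
            }
        }
    )
}
```

Passing onDrag itself as the pointerInput key would also "work," but would cancel any in-progress drag whenever the caller recomposes with a new lambda.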
Building a Real Gesture — Pinch-Zoom-Rotate-Pan
The photo editor case. The user can pan with one finger, pinch to zoom with two, rotate with two. All three should compose — rotating mid-pinch should work, panning should work during a zoom.
The good news: Compose ships Modifier.transformable for exactly this. The lower-level alternative is detectTransformGestures, which you wrap in a pointerInput:
@Composable
fun ZoomableImage(painter: Painter, modifier: Modifier = Modifier) {
    var scale by remember { mutableFloatStateOf(1f) }
    var rotation by remember { mutableFloatStateOf(0f) }
    var offset by remember { mutableStateOf(Offset.Zero) }

    Image(
        painter = painter,
        contentDescription = null,
        modifier = modifier
            .pointerInput(Unit) {
                detectTransformGestures(
                    // detectTransformGestures is a SUSPEND FUNCTION
                    // Built on top of pointerInput — handles all the math for you
                    onGesture = { centroid, pan, zoom, rotationChange ->
                        scale *= zoom
                        rotation += rotationChange
                        offset += pan
                    }
                )
            }
            .graphicsLayer {
                // Apply transformations in the draw phase — cheap
                scaleX = scale
                scaleY = scale
                rotationZ = rotation
                translationX = offset.x
                translationY = offset.y
            }
    )
}
Three things worth lingering on:
The four parameters of the gesture lambda — centroid (the center of the gesture), pan (translation delta), zoom (scale multiplier), rotationChange (degrees this frame). They’re computed from raw pointer positions but you don’t need to do the math.
State applied via graphicsLayer, not Modifier.scale or Modifier.offset. The animations post drilled this and it matters even more for gestures: graphicsLayer { scaleX = scale } reads the state in the draw phase. Using Modifier.scale(scale) reads in composition — meaning every frame of a pinch triggers a recomposition of the image. On a slow device, that’s the difference between buttery zoom and stuttering jank.
Constraints not enforced here. In production you’d clamp scale between 0.5 and 5.0, prevent offset from panning the image entirely off-screen, and possibly snap rotation to 0/90/180/270 degree increments after the gesture ends. The bare gesture detector gives you raw deltas; the constraints are your responsibility.
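Both the constraints and the underlying transform math are plain geometry. A pure-Kotlin sketch; clampScale, snapRotation, zoomBetween, and rotationBetween are made-up helper names, not Compose APIs:

```kotlin
import kotlin.math.atan2
import kotlin.math.hypot
import kotlin.math.roundToInt

// Hypothetical constraint helpers (not part of Compose).
// Clamp the accumulated scale so a pinch can't zoom past sane limits.
fun clampScale(current: Float, zoom: Float, min: Float = 0.5f, max: Float = 5f): Float =
    (current * zoom).coerceIn(min, max)

// Snap rotation to the nearest 90-degree increment (call when the gesture ends).
fun snapRotation(degrees: Float): Float =
    ((degrees / 90f).roundToInt() * 90f) % 360f

// For intuition: the zoom/rotation values the detector reports reduce to
// simple geometry on two pointer positions (a = previous frame, b = current).
data class Pt(val x: Float, val y: Float)

fun zoomBetween(a1: Pt, a2: Pt, b1: Pt, b2: Pt): Float =
    hypot(b2.x - b1.x, b2.y - b1.y) / hypot(a2.x - a1.x, a2.y - a1.y)

fun rotationBetween(a1: Pt, a2: Pt, b1: Pt, b2: Pt): Float {
    val before = atan2(a2.y - a1.y, a2.x - a1.x)
    val after = atan2(b2.y - b1.y, b2.x - b1.x)
    return Math.toDegrees((after - before).toDouble()).toFloat()
}
```

In the onGesture lambda you would then write scale = clampScale(scale, zoom) instead of scale *= zoom, and apply snapRotation once the last finger lifts.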
The Drag Detectors — Tap, Drag, and Their Hybrids
Beyond detectTransformGestures, Compose ships several detectors as suspend functions on the pointer-input scope:
// All called inside pointerInput {} blocks
detectTapGestures(
    onPress = { offset -> ... },     // Touch down
    onTap = { offset -> ... },       // Quick tap
    onDoubleTap = { offset -> ... },
    onLongPress = { offset -> ... }
)

detectDragGestures(
    onDragStart = { offset -> ... },
    onDragEnd = { ... },
    onDragCancel = { ... },
    onDrag = { change, dragAmount -> ... }
)

detectVerticalDragGestures(...)
detectHorizontalDragGestures(...)

detectDragGesturesAfterLongPress(...)
// Drag only starts after a long-press — useful for “long-press to grab” UIs

detectTransformGestures(
    panZoomLock = false, // If true, rotation is ignored once the gesture starts as pan/zoom
    onGesture = { centroid, pan, zoom, rotation -> ... }
)
Why these exist as separate detectors instead of one big one: each consumes events in a different way. detectTapGestures needs to wait briefly to distinguish tap from drag (you tapped vs. started a drag). detectDragGestures needs a minimum-distance threshold before deciding it’s a drag, not a jittery tap. The right detector handles those decisions for you. Combining them in one pointerInput requires careful ordering — we’ll cover that next.
Combining Gestures — The Conflict Problem
The photo editor needs both: tap a layer to select it, drag a layer to move it. Naive approach — both detectors in the same pointerInput — doesn’t work, because they fight for the same events.
// ❌ This doesn’t work the way you’d hope
Modifier.pointerInput(Unit) {
    detectTapGestures(onTap = { onSelect() })
    detectDragGestures(onDrag = { _, delta -> onMove(delta) })
}
// detectTapGestures suspends forever waiting for the next gesture
// detectDragGestures never gets called
The fix: separate pointerInput blocks. Each runs in its own coroutine and gets its own pass at events.
// ✅ Two separate pointerInput modifiers, two separate handlers
Modifier
    .pointerInput(Unit) {
        detectTapGestures(onTap = { onSelect() })
    }
    .pointerInput(Unit) {
        detectDragGestures(onDrag = { _, delta -> onMove(delta) })
    }
This works because Compose’s pointer system handles the disambiguation: a quick touch-and-release fires the tap handler; a touch-and-move fires the drag handler. The detectors internally coordinate via the pointer-event consumption protocol.
For combining tap-or-drag with custom logic (e.g., “long-press to start drag, regular tap to select”), detectDragGesturesAfterLongPress is the purpose-built tool. For more exotic combinations — “tap-tap-hold-then-drag,” “swipe direction matters” — you’ll write the loop yourself with awaitEachGesture and full control.
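When you do write the loop yourself, the slop-aware building blocks in compose.foundation.gestures handle the tap-vs-drag disambiguation for you. A sketch (onSelect and onMove are assumed callbacks; a production version would also distinguish cancellation from a clean lift):

```kotlin
// Sketch: tap-or-drag in one pointerInput, built from lower-level helpers.
Modifier.pointerInput(Unit) {
    awaitEachGesture {
        val down = awaitFirstDown()
        // Suspends until the pointer crosses touch slop (returns the change)
        // or lifts/cancels first (returns null)
        val slopChange = awaitTouchSlopOrCancellation(down.id) { change, overSlop ->
            change.consume() // accept the drag
            onMove(overSlop) // first movement beyond slop
        }
        if (slopChange != null) {
            // It's a drag: follow the pointer until it lifts
            drag(slopChange.id) { change ->
                onMove(change.positionChange())
                change.consume()
            }
        } else {
            // Pointer went up before slop: treat it as a tap
            onSelect()
        }
    }
}
```

This is roughly what the two-modifier version above does internally, collapsed into one loop you fully control.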
Nested Scrolling and the Conflict That Eats Apps
The hardest gesture problem is: a horizontally-swiping carousel inside a vertically-scrolling page. You start touching the carousel; if you swipe horizontally, the carousel should scroll. If you swipe vertically, the page should scroll. The handlers fight, and naive code makes either one work but not both.
Compose’s answer is the NestedScrollConnection protocol — a way for nested scrollable areas to negotiate over each event.
val connection = remember {
    object : NestedScrollConnection {
        // NestedScrollConnection is an INTERFACE from compose.ui.input.nestedscroll
        override fun onPreScroll(available: Offset, source: NestedScrollSource): Offset {
            // Called BEFORE the inner scroll consumes the gesture
            // Return how much YOU consumed
            return Offset.Zero // Let the inner handle it
        }

        override fun onPostScroll(
            consumed: Offset,
            available: Offset,
            source: NestedScrollSource
        ): Offset {
            // Called AFTER the inner scroll has consumed what it can
            // `available` is what’s left over
            return Offset.Zero
        }

        override suspend fun onPreFling(available: Velocity): Velocity = Velocity.Zero

        override suspend fun onPostFling(consumed: Velocity, available: Velocity): Velocity =
            Velocity.Zero
    }
}

// Apply to a parent that wants to coordinate with nested scrollables
Modifier.nestedScroll(connection)
The pattern: parent intercepts the gesture in onPreScroll if it has priority (e.g., a collapsing app bar that should consume vertical scroll before the inner list). Otherwise, the inner scroll handles it, and the parent gets the leftover via onPostScroll if it wants. Velocity is similarly negotiated for fling animations.
Most apps don’t need to write a custom NestedScrollConnection — LazyColumn, verticalScroll, and Material’s scaffold integrations all coordinate automatically. You write one when you have a custom collapsing header, a swipe-to-dismiss inside a scrollable list, or some product-specific scroll behavior. When you do need it, the protocol is the right tool — trying to roll your own with pointerInput conflicts becomes a maintenance disaster.
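For the collapsing-header case, a sketch of a connection that consumes upward scroll before the inner list sees it (headerHeight, minHeight, and maxHeight are assumed app state, not Compose APIs):

```kotlin
// Sketch: the parent collapses its header first on upward scroll,
// then lets the inner list scroll; downward scroll goes to the list.
val connection = remember {
    object : NestedScrollConnection {
        override fun onPreScroll(available: Offset, source: NestedScrollSource): Offset {
            val delta = available.y
            if (delta < 0) { // scrolling up: shrink the header first
                val previous = headerHeight.floatValue
                headerHeight.floatValue = (previous + delta).coerceIn(minHeight, maxHeight)
                val consumed = headerHeight.floatValue - previous
                return Offset(0f, consumed) // report only what we actually took
            }
            return Offset.Zero // scrolling down: the list gets first chance
        }
    }
}
```

A symmetric onPostScroll branch would re-expand the header once the list has scrolled back to its top.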
Velocity, Fling, and Spring-Back
A drag that ends with the finger still moving has velocity. Real apps use that velocity for fling animations — the swipe-the-card-away motion that continues after release.
@Composable
fun SwipeableCard(onDismiss: () -> Unit) {
    val offsetX = remember { Animatable(0f) }
    val scope = rememberCoroutineScope()
    val velocityTracker = remember { VelocityTracker() }
    // VelocityTracker is a CLASS from compose.ui.input.pointer.util
    // It tracks position-over-time and computes velocity at release

    Box(
        modifier = Modifier
            .offset { IntOffset(offsetX.value.roundToInt(), 0) }
            .pointerInput(Unit) {
                detectHorizontalDragGestures(
                    onDragStart = {
                        velocityTracker.resetTracking()
                        // Clear samples from the previous gesture
                    },
                    onDragEnd = {
                        val velocity = velocityTracker.calculateVelocity().x
                        scope.launch {
                            if (abs(velocity) > 1000f || abs(offsetX.value) > 200) {
                                // Fling-dismiss: animate off-screen using the captured velocity
                                offsetX.animateDecay(
                                    initialVelocity = velocity,
                                    animationSpec = exponentialDecay()
                                )
                                onDismiss()
                            } else {
                                // Spring back to center
                                offsetX.animateTo(
                                    targetValue = 0f,
                                    animationSpec = spring(
                                        dampingRatio = Spring.DampingRatioMediumBouncy
                                    )
                                )
                            }
                        }
                    },
                    onHorizontalDrag = { change, dragAmount ->
                        velocityTracker.addPosition(change.uptimeMillis, change.position)
                        // Track positions for velocity computation
                        change.consume()
                        scope.launch {
                            offsetX.snapTo(offsetX.value + dragAmount)
                            // snapTo is INSTANT — follow-the-finger feel
                        }
                    }
                )
            }
    ) { /* card content */ }
}
The interesting part isn’t the gesture detection — it’s the decision at onDragEnd: do we have enough velocity or distance to dismiss? Two thresholds, both arbitrary, both critical to feel. Real apps tune these for a long time.
The connection to the Animations post: animateDecay for fling, animateTo with a spring for snap-back, snapTo for follow-the-finger. The gesture provides input; Animatable provides controlled output.
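One way to make those thresholds less arbitrary is to ask the decay spec where the fling would naturally settle, via the calculateTargetValue extension from compose.animation.core. A sketch of an alternative onDragEnd decision; the half-width threshold is illustrative:

```kotlin
// Sketch: dismiss only if the fling's natural endpoint is past half
// the width. Runs inside the onDragEnd coroutine; `size` is the
// pointerInput scope's bounds, `velocity` the tracked release velocity.
val decay = exponentialDecay<Float>()
val projected = decay.calculateTargetValue(
    initialValue = offsetX.value,
    initialVelocity = velocity
)
if (abs(projected) > size.width / 2f) {
    offsetX.animateDecay(velocity, decay)
    onDismiss()
} else {
    offsetX.animateTo(0f, spring())
}
```

This replaces two hand-tuned numbers with one: how far past center counts as "gone."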
The New SuspendingPointerInputModifierNode (2025+)
The classic pointerInput modifier has been the standard for years. As of recent Compose versions, there’s a newer, lower-level API: SuspendingPointerInputModifierNode, which is the Modifier.Node-based equivalent. Same coroutine model, same suspend functions, but it integrates with the modifier-node system the Modifiers post covered.
// You generally DON’T write this directly — the existing pointerInput modifier
// already uses it under the hood. But for a library author writing a custom
// gesture node:
private class CustomDragNode : DelegatingNode(), PointerInputModifierNode {

    private val pointerInputNode = delegate(SuspendingPointerInputModifierNode {
        awaitEachGesture {
            val down = awaitFirstDown()
            // ... rest of gesture logic
        }
    })

    override fun onPointerEvent(
        pointerEvent: PointerEvent,
        pass: PointerEventPass,
        bounds: IntSize
    ) {
        pointerInputNode.onPointerEvent(pointerEvent, pass, bounds)
    }

    override fun onCancelPointerInput() {
        pointerInputNode.onCancelPointerInput()
    }
}
For app-level code, Modifier.pointerInput { ... } is still the right API and isn’t going anywhere — it’s implemented in terms of SuspendingPointerInputModifierNode. The new API matters mainly to library authors writing reusable gesture handlers, where the Modifier.Node performance characteristics matter (no allocation per recomposition, stable equality, can implement multiple capabilities).
Practical takeaway: for app code, keep using pointerInput { }. If you ever wrote a gesture utility with the old composed { } + pointerInput sandwich, that’s the case worth migrating to the node-based API for the same reasons covered in the Modifiers post.
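To actually expose such a node as a modifier, you pair it with a ModifierNodeElement, the standard Modifier.Node entry-point pattern. A sketch continuing the hypothetical CustomDragNode, here assumed to take a mutable onDrag property:

```kotlin
// Sketch: the element that creates and updates the node. A data class
// gives the stable equals/hashCode that ModifierNodeElement requires.
private data class CustomDragElement(
    val onDrag: (Offset) -> Unit
) : ModifierNodeElement<CustomDragNode>() {
    override fun create() = CustomDragNode(onDrag)
    override fun update(node: CustomDragNode) {
        node.onDrag = onDrag // update in place; no node re-creation
    }
}

// Public entry point, analogous to Modifier.pointerInput
fun Modifier.customDrag(onDrag: (Offset) -> Unit): Modifier =
    this then CustomDragElement(onDrag)
```

Because equality is structural, recomposing with the same lambda skips update() entirely; that is the no-allocation, stable-equality win over the old composed { } approach.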
Pitfalls That Bite
Forgetting to consume events. Your gesture handler reacts but parent scroll also reacts. The fix is change.consume() at the right point in the loop — usually right after you decide “yes, I’m handling this drag.”
Reading state in pointerInput(state) { ... }. If state changes during a gesture, your input loop restarts mid-gesture and the current drag is cancelled. Use a stable key (often Unit) and read changing values through rememberUpdatedState or a captured State reference inside the lambda.
Putting expensive work inside the gesture lambda. Pointer events fire at frame rate. If onDrag does anything heavy (recomputing layout, allocating, hitting a database), every frame stutters. Update state synchronously, defer work to a coroutine launched outside the gesture.
Applying transforms via composition modifiers. Modifier.scale(scale.value) recomposes on every change. Modifier.graphicsLayer { scaleX = scale.value } doesn’t. For gestures driving 60fps animations, the difference is jank vs. smooth.
Trying to do multi-touch with high-level APIs. draggable handles single-finger drag. The moment a second finger lands, undefined-ish behavior. For multi-touch, drop to pointerInput + detectTransformGestures or awaitEachGesture with manual pointer tracking.
Forgetting that Modifier.scrollable intercepts gestures. A pointerInput nested inside a verticalScroll may never see drags — the scroll consumes them. Use NestedScrollConnection for proper coordination, or the nestedScroll integration patterns.
Memory leaks via captured state in long-running gesture loops. The lambda inside pointerInput can run for the lifetime of the composable. Capturing references to large state objects keeps them alive. Be aware of what your gesture loop holds onto.
When to Reach for What
┌─────────────────────────────────────────┬─────────────────────────────────────┐
│ I want to...                            │ Use                                 │
├─────────────────────────────────────────┼─────────────────────────────────────┤
│ Detect a tap/click                      │ Modifier.clickable                  │
│ Tap, long-press, double-tap combined    │ Modifier.combinedClickable          │
│ Single-finger drag (axis or 2D)         │ Modifier.draggable / draggable2D    │
│ Snap-to-anchor drag (drawer/sheet)      │ Modifier.anchoredDraggable          │
│ Scrolling content                       │ verticalScroll, LazyColumn          │
│ Multi-touch (pinch, rotate, transform)  │ detectTransformGestures             │
│ Custom gesture loop                     │ awaitEachGesture                    │
│ Fling-with-velocity ending              │ Animatable.animateDecay             │
│ Snap-back ending                        │ Animatable.animateTo + spring       │
│ Nested scroll coordination              │ NestedScrollConnection              │
│ Library-level gesture component         │ SuspendingPointerInputModifierNode  │
└─────────────────────────────────────────┴─────────────────────────────────────┘
Closing
The trick to gestures in Compose is realizing the high-level APIs are convenience layers over a small set of primitives, not magic. Once you can write a custom gesture loop with awaitEachGesture, every “impossible” gesture becomes routine. The photo editor scenario from the opening — pinch, rotate, pan, draggable layers, draw paths — is solvable with the patterns in this post: separate pointerInput blocks for non-conflicting gestures, detectTransformGestures for the multi-touch transform, NestedScrollConnection when scroll-vs-gesture conflicts arise, Animatable driven by velocity for the polish.
That’s the Compose Advanced quartet wrapped: animations (when state moves smoothly), custom layouts (when measure/place needs custom logic), modifiers (the chain that ties everything together), and gestures (how user input enters the system). These four interlock: a gesture drives an animation that updates state, the modifier chain projects that state to layout and draw, the layout decides where things go. Once those four feel automatic, you can build any UI Compose can express.
Happy coding!