
Sling for engineers

Phases, modes, overrides — what's safe to delegate, what isn't.

When NOT to use Sling

Easier to learn from than "when to use." These are the failure-mode patterns we keep seeing.

The STIV pipeline

Seven phases. Each writes a row to task_steps. You can replay individual phases on retry.

  1. workspace — clones repo, checks out base branch
  2. spec — turns the issue into structured acceptance criteria
  3. tests — generates tests for each acceptance criterion (skipped when skip-tests is set)
  4. implement — writes code to pass the tests; up to 3 attempts per file
  5. quality — runs lints, types, the existing test suite
  6. review — AI review, score 0–10, finds nits
  7. push + pr — pushes branch, opens PR, posts the rolling comment

On retry, completed phases are skipped (checkpointed in Postgres). Workspace + branch are reused.
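The skip-on-retry behaviour can be pictured with a short sketch. It assumes the names of completed phases have already been read from the checkpoint store. PHASES mirrors the list above; phasesToRun, its arguments, and the skipTests option are hypothetical names for illustration, not Sling's actual API.

```javascript
// Phase order as listed above. The string names are illustrative.
const PHASES = ["workspace", "spec", "tests", "implement", "quality", "review", "push + pr"];

// Hypothetical helper: given the phases already checkpointed as complete,
// return the phases a retry still has to run, in pipeline order.
function phasesToRun(completed, { skipTests = false } = {}) {
  const done = new Set(completed);
  return PHASES.filter((phase) => {
    if (done.has(phase)) return false; // checkpointed, so skipped on retry
    if (phase === "tests" && skipTests) return false; // tests phase skipped on skip-tests
    return true;
  });
}

// A run that failed during implement resumes from implement.
console.log(phasesToRun(["workspace", "spec", "tests"]));
```

A real run would also reuse the workspace and branch; the sketch only covers which phases are selected.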

Execution modes

Mode               Trigger                  Best for
stiv (default)     No special label         Well-scoped, single-file, TDD-friendly
claude-code        Label claude-code        Multi-file, exploratory; recommended today
claude-code-tmux   Label claude-code-tmux   Same as above, plus live tmux attach for debugging

Mode resolution: task label → repo config (execution_mode) → task property → default (stiv).
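A minimal sketch of that precedence, assuming the task's labels, the repo config, and the task property are already loaded. The function name resolveMode and the input shape are hypothetical; only the mode names, the execution_mode key, and the ordering come from the text above.

```javascript
const KNOWN_MODES = new Set(["stiv", "claude-code", "claude-code-tmux"]);
// Only these two modes are triggered by a label; stiv has no special label.
const LABEL_MODES = new Set(["claude-code", "claude-code-tmux"]);

// Hypothetical resolver following the stated precedence:
// task label -> repo config (execution_mode) -> task property -> default (stiv).
function resolveMode({ labels = [], repoConfig = {}, taskProperty } = {}) {
  // Assumption: if several mode labels are present, the first one listed wins.
  const fromLabel = labels.find((label) => LABEL_MODES.has(label));
  const candidates = [fromLabel, repoConfig.execution_mode, taskProperty];
  // The first candidate naming a known mode wins; otherwise fall back to stiv.
  return candidates.find((mode) => KNOWN_MODES.has(mode)) ?? "stiv";
}

console.log(resolveMode({ labels: ["bug", "claude-code-tmux"], repoConfig: { execution_mode: "claude-code" } }));
console.log(resolveMode({ repoConfig: { execution_mode: "claude-code" } }));
console.log(resolveMode({}));
```

Under these assumptions, a label on the task overrides whatever the repo config says, which is the practical consequence of the ordering.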

Overrides you can set

Failure modes you'll see

Categories from failure-classifier.js. Each is reported on the PR + dashboard.

tests-failing

The generated tests pass but the existing suite fails. Most often the cause is a brittle existing test that references an old field name.

typecheck-failing

tsc errors. Usually a missing type import on a refactor.

lint-failing

Usually easy: a re-run fixes it. If it keeps failing, the lint config has drifted.

budget-exceeded

Per-ticket cost cap hit. The brief is too big: split it, or raise the cap for this one ticket.

runner-timeout

Hit the worker timeout. Usually a long compile or test suite; bump CLAUDE_CODE_TIMEOUT_MS for the run.

tier3-rejected

Risk assessment refused the task because it touches auth, payments, or PHI. Implement by hand.
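For quick triage, the categories above can be folded into a lookup. This is only a sketch: the category names come from this section and the suggested actions paraphrase it, but the Map, the nextStep helper, and the fallback message are hypothetical and are not part of failure-classifier.js.

```javascript
// Category -> suggested next step, paraphrasing the descriptions above.
const NEXT_STEP = new Map([
  ["tests-failing", "Check existing tests for stale references, such as an old field name"],
  ["typecheck-failing", "Look for a missing type import after a refactor"],
  ["lint-failing", "Re-run; if it repeats, check the lint config for drift"],
  ["budget-exceeded", "Split the brief, or raise the cost cap for this ticket"],
  ["runner-timeout", "Raise CLAUDE_CODE_TIMEOUT_MS for the run"],
  ["tier3-rejected", "Implement by hand"],
]);

// Hypothetical helper: returns the suggested action, or a fallback for
// categories this sketch does not know about.
function nextStep(category) {
  return NEXT_STEP.get(category) ?? `Unknown category: ${category}`;
}

console.log(nextStep("runner-timeout"));
```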

Reading the trace
