50 questions across four roles — Junior, Mid, Senior, and Lead — grounded in real projects: the DSAR (Realworld Suite) Spring Boot service, Helfa’s visa state machine, Nova Schilda’s graph algorithms, and what I learned the hard way along the way.
Each question is colour-coded:
Technical: precise, code-grounded answer.
Database: SQL/JPA/Postgres specifics.
Story: told through my own path.
Tip: the angle most candidates miss.
Watch out: the follow-up the interviewer is queueing up.
Technical Source compiles to .class bytecode. The JVM loads it (class loaders: bootstrap → platform → application), verifies it for type-safety, then interprets it and JIT-compiles hot methods to native code via C1/C2. Memory is split into the heap (objects, garbage-collected, generational by default) and per-thread stacks (frames holding local primitives and references). Modern collectors do much of their work concurrently: G1 still takes short stop-the-world pauses, while ZGC and Shenandoah keep pauses very brief even on large heaps, so a healthy app rarely shows a noticeable pause.
== and .equals().
Technical == compares references for objects and values for primitives. .equals() is logical equality as defined by the class. The catch: when you override equals you must also override hashCode, otherwise HashSet and HashMap break in subtle ways. String trips juniors because "x" == "x" is true for literals (they are interned), but strings built at runtime (concatenation, user input, new String) are separate objects, so == on them is not reliable.
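A minimal sketch of the equals/hashCode contract described above. PassportNumber and EqualityDemo are hypothetical names introduced only for illustration; they are not from any of the projects mentioned.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class: two instances are equal when their values match.
final class PassportNumber {
    private final String value;

    PassportNumber(String value) {
        this.value = Objects.requireNonNull(value);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof PassportNumber)) return false;
        return value.equals(((PassportNumber) other).value);
    }

    // Must agree with equals: equal objects return equal hash codes.
    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

final class EqualityDemo {
    // Reference comparison: true only when both arguments are the same object.
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    // A hash-based lookup only finds the element because hashCode matches equals.
    static boolean foundInSet() {
        Set<PassportNumber> seen = new HashSet<>();
        seen.add(new PassportNumber("X123"));
        return seen.contains(new PassportNumber("X123"));
    }
}
```

Deleting the hashCode override would typically make foundInSet() return false even though equals still says the two objects match, which is the subtle breakage mentioned above.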
Technical Checked exceptions extend Exception (but not RuntimeException): the compiler forces you to catch or declare them. Unchecked exceptions extend RuntimeException: handle them when meaningful, let them propagate when not. Modern Spring code leans hard on unchecked exceptions because checked ones don’t compose with lambdas. In real code: throw a typed unchecked exception per failure mode (VisaNotFoundException, InvalidStatusTransitionException) and translate at the controller boundary with @ControllerAdvice.
final, finally, finalize?
Technical final: a variable can’t be reassigned, a method can’t be overridden, a class can’t be subclassed. finally: a block that runs whether the try block completed normally, threw, or returned. finalize(): a GC hook deprecated since Java 9; never use it. The modern equivalent is AutoCloseable with try-with-resources.
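A small sketch of try-with-resources next to catch and finally, to show the order in which things run. TrackedResource is a hypothetical class that only records events in a list.

```java
import java.util.ArrayList;
import java.util.List;

final class TryWithResourcesDemo {
    // Hypothetical resource that logs when it is opened, used, and closed.
    static final class TrackedResource implements AutoCloseable {
        private final List<String> log;

        TrackedResource(List<String> log) {
            this.log = log;
            log.add("open");
        }

        void use(boolean fail) {
            log.add("use");
            if (fail) throw new IllegalStateException("boom");
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    // Returns the sequence of events for a successful or a failing run.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (TrackedResource resource = new TrackedResource(log)) {
            resource.use(fail);
        } catch (IllegalStateException e) {
            log.add("caught"); // the resource is already closed at this point
        } finally {
            log.add("finally"); // runs on both paths
        }
        return log;
    }
}
```

The resource is closed before the catch and finally blocks execute, so cleanup does not depend on anyone remembering to write it in finally.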
ArrayList and LinkedList.
Technical ArrayList: array-backed, O(1) random access, O(n) insert at the front. LinkedList: doubly-linked nodes, O(1) insert or remove at either end (and at an iterator’s current position), O(n) random access. In practice ArrayList wins almost every benchmark, because modern CPU caches love contiguous memory and most workloads aren’t insert-at-front. LinkedList is mostly a textbook curiosity.
Technical A pipeline of operations (filter, map, reduce, collect) over a sequence. They’re lazy — nothing runs until a terminal operation. Use them for declarative data processing; avoid them inside hot inner loops where the indirection cost shows up. I lean on Collectors.groupingBy a lot when reporting.
Map<VisaStatus, Long> byStatus = visas.stream()
    .collect(Collectors.groupingBy(VisaApplication::getStatus, Collectors.counting()));
Technical Use Optional as a return type for “might not be there” lookups. Don’t use it as a field, parameter, or in a collection. Prefer orElseThrow, map, ifPresent over .get() — calling .get() without checking is just a NullPointerException with extra steps.
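A short sketch of those Optional habits, assuming a hypothetical in-memory lookup (OptionalDemo and its map are illustrative stand-ins, not a real repository).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

final class OptionalDemo {
    // Hypothetical stand-in for a repository: visa id -> status name.
    private final Map<Long, String> statusById = new HashMap<>();

    OptionalDemo() {
        statusById.put(1L, "DRAFT");
    }

    // Optional as the return type of a lookup that may find nothing.
    Optional<String> findStatus(long id) {
        return Optional.ofNullable(statusById.get(id));
    }

    // orElseThrow makes the missing case explicit instead of calling get().
    String statusOrThrow(long id) {
        return findStatus(id)
                .orElseThrow(() -> new NoSuchElementException("No visa with id " + id));
    }

    // map plus orElse transforms the value when present and falls back otherwise.
    String statusLabel(long id) {
        return findStatus(id).map(String::toLowerCase).orElse("unknown");
    }
}
```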
Technical synchronized — intrinsic lock, simple but coarse. ReentrantLock — explicit lock with try-lock, fairness, interruptibility. ReadWriteLock — many readers, one writer. java.util.concurrent.atomic.* — lock-free CAS for counters and flags. ConcurrentHashMap for shared maps. Threads themselves: ExecutorService always, never raw new Thread().
Technical JEP 444 (Java 21) made “cheap threads” production-ready. A virtual thread is a fiber managed by the JVM, not the OS. Blocking I/O on a virtual thread parks it instead of blocking a kernel thread, so a request-per-thread server can scale to hundreds of thousands of concurrent connections without rewriting to async. Spring Boot 3.2+ supports them via spring.threads.virtual.enabled=true.
Technical Generational hypothesis: most objects die young. The young generation (Eden + survivor spaces) is collected often and fast (minor GC). Objects that survive enough collections are promoted to the old generation, which is collected less often (major GC). G1 (the default since Java 9) divides the heap into regions and prioritises the regions with the most garbage. ZGC and Shenandoah do almost all their work concurrently and target very short pauses on big heaps (sub-millisecond for recent ZGC).
Technical The principle: a class declares what it needs, an external system supplies it. The benefits are testability (substitute a fake), composability (the same service works in different contexts), and reduced coupling. Spring’s container reads bean definitions, wires the graph, manages lifecycles, and exposes hooks for cross-cutting concerns (transactions, security, caching) at well-defined extension points.
Technical Constructor every time. Three reasons: (1) fields can be final — class is immutable after construction; (2) easy to instantiate in unit tests without a Spring context; (3) circular dependencies fail loudly at startup instead of crashing in production.
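A plain-Java sketch of why constructor injection tests so easily. NotificationSender, VisaSubmissionService, and RecordingSender are hypothetical names; in a Spring app the container would call the same constructor with a real bean.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical dependency; a real adapter would send email or push messages.
interface NotificationSender {
    void send(String recipient, String message);
}

final class VisaSubmissionService {
    private final NotificationSender sender; // final: fixed after construction

    // The class declares what it needs; whoever constructs it supplies it.
    VisaSubmissionService(NotificationSender sender) {
        this.sender = Objects.requireNonNull(sender);
    }

    void submit(String applicantEmail) {
        sender.send(applicantEmail, "Your application was submitted");
    }
}

// Test double that records calls; no Spring context is needed to use it.
final class RecordingSender implements NotificationSender {
    final List<String> recipients = new ArrayList<>();

    @Override
    public void send(String recipient, String message) {
        recipients.add(recipient);
    }
}
```

A unit test simply calls new VisaSubmissionService(new RecordingSender()) and inspects the recorded recipients afterwards.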
Technical Servlet container (Tomcat) → servlet filters, including the Spring Security filter chain → Spring’s DispatcherServlet → handler mapping picks the controller → HandlerInterceptors → controller method (with @RequestParam / @RequestBody binding via converters) → service layer (transactional) → repository → JDBC, then back out: return value → message converters → JSON response. @ControllerAdvice translates exceptions thrown by controllers into ProblemDetail responses.
@Component, @Service, @Repository, @Controller?
Technical All four are @Component under the hood. The specialisations exist for two reasons: clarity for the reader, and Spring hooks. @Repository activates persistence-exception translation. @Controller participates in the MVC handler lookup. @Service is purely semantic.
@Transactional actually do?
Technical Wraps the annotated method in a database transaction managed by Spring’s PlatformTransactionManager. Default propagation is REQUIRED: join an existing transaction or start a new one. By default it rolls back on unchecked exceptions (RuntimeException and Error) and commits on checked ones. Two big gotchas: (a) self-invocation skips the proxy, so this.method() inside the same class runs without a transaction of its own; (b) with proxy-based AOP only calls that go through the proxy are intercepted, which in practice means public methods (JDK proxies only see interface methods).
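The self-invocation gotcha can be reproduced without Spring. The sketch below uses a plain JDK dynamic proxy as a stand-in for Spring’s transactional proxy; TransferService and the other names are hypothetical, and the handler only records which calls it saw instead of opening a transaction.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface with two methods that would both be "transactional".
interface TransferService {
    void outer();
    void inner();
}

final class PlainTransferService implements TransferService {
    @Override
    public void outer() {
        inner(); // this.inner(): a direct call that never reaches the proxy
    }

    @Override
    public void inner() {
    }
}

final class SelfInvocationDemo {
    // Returns the method names the proxy intercepted for one external call.
    static List<String> intercepted(boolean callOuter) {
        List<String> seen = new ArrayList<>();
        TransferService target = new PlainTransferService();
        InvocationHandler handler = (proxy, method, args) -> {
            seen.add(method.getName()); // stand-in for "begin transaction"
            return method.invoke(target, args);
        };
        TransferService proxied = (TransferService) Proxy.newProxyInstance(
                TransferService.class.getClassLoader(),
                new Class<?>[] { TransferService.class },
                handler);
        if (callOuter) {
            proxied.outer();
        } else {
            proxied.inner();
        }
        return seen;
    }
}
```

Calling outer() through the proxy records only "outer": the nested inner() call happens on the target itself, so any advice attached to inner() is skipped. The usual workaround is to move inner() into a separate bean so the call crosses a proxy boundary.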
IOException?” Because IOException is checked. Fix with @Transactional(rollbackFor = Exception.class) or by re-throwing as unchecked.
Technical A chain of servlet filters runs before your controller. Each filter handles one concern: CSRF, CORS, authentication (JWT, session, OAuth2), authorization, exception translation. The SecurityContextHolder stores the authenticated principal for the duration of the request. On Helfa I configure the chain to validate JWTs on /api/v1/** and let public endpoints (login, signup) through.
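http
    // public endpoints (login, signup) are open to everyone
    .authorizeHttpRequests(auth -> auth
        .requestMatchers("/api/v1/auth/**").permitAll()
        .anyRequest().authenticated())
    // everything else must carry a valid bearer JWT
    .oauth2ResourceServer(oauth -> oauth.jwt(Customizer.withDefaults()))
    .sessionManagement(s -> s.sessionCreationPolicy(SessionCreationPolicy.STATELESS));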
Technical When config diverges by environment in a way that’s more than a value change — different bean wiring, different DataSource, different fake adapters. Pure value differences belong in environment variables, not profile-specific YAMLs. On Realworld DSAR, test profile swaps Postgres for H2; everything else is identical.
Technical @SpringBootTest + Testcontainers for a real Postgres in Docker. Don’t mock the database — mocks lie about migrations and SQL semantics. MockMvc for the HTTP layer, WebTestClient for reactive. Always reset state between tests (transactional rollback or schema truncation).
Technical Flyway runs versioned SQL files (V1__init.sql, V2__add_x.sql) once each, in order, and records them in flyway_schema_history. For a big backfill: never block writes. (1) Ship the schema change with the new column nullable; (2) backfill in batches in a separate job, monitoring lag; (3) once filled, ship the constraint (NOT NULL, FK) in a follow-up migration. Big-bang migrations on production tables are how you cause an outage.
Technical A single @RestControllerAdvice that translates domain exceptions into RFC 7807 ProblemDetail responses. Validation errors → 400 with field details. Not-found → 404. Conflict → 409. Auth → 401/403. Anything unexpected → 500 with a logged correlation id and a sanitised message. Never leak stack traces.
Database JDBC is the raw API. Hibernate is an ORM. JPA is the standard interface, Hibernate is its most-used implementation. I reach for JPA when the model is genuinely entity-shaped; for read-heavy reporting I drop to JdbcTemplate or jOOQ because the hand-written SQL is faster, clearer, and avoids N+1.
Database You query a list of VisaApplications (1 query), then in a loop access app.getDocuments() on each (N queries). Solutions: JOIN FETCH in JPQL, @EntityGraph on the repository method, or DTO projections that select only what you need. Hibernate’s @BatchSize is a workaround, not a fix.
Database Read uncommitted (allows dirty reads), Read committed (Postgres default), Repeatable read (no non-repeatable reads), Serializable (no phantoms). Trade-off is correctness vs throughput. I default to Read committed and bump to Serializable on the specific transaction that needs it (financial transfers, inventory decrements).
Database Add one when a query is hot and the planner is doing a Seq Scan or an expensive Sort. Don’t add indexes blindly: every index costs write overhead and disk space. Composite index column order follows the query: columns filtered by equality first (most selective leading), then the range or sort column. Partial indexes (WHERE deleted_at IS NULL) suit soft-deleted tables. Always confirm with EXPLAIN ANALYZE.
Database
-- inner: only visas that have at least one document
SELECT v.* FROM visa_application v
INNER JOIN document d ON d.visa_id = v.id;

-- left: every visa, with documents if any (NULL otherwise)
SELECT v.*, d.* FROM visa_application v
LEFT JOIN document d ON d.visa_id = v.id;
Rule of thumb: if you’re using LEFT JOIN and then filtering on the right side’s column with WHERE d.x = ..., you’ve effectively turned it back into an INNER JOIN — usually a bug.
Database Two tables: visa_application(id, status, ...) and visa_status_history(id, visa_id, from_status, to_status, actor, occurred_at, reason). Status column carries the current state, history is the append-only audit trail. Transitions enforced in service code with a typed enum, not a DB trigger. On Helfa this is exactly the shape — every status change ends up in the history table within the same transaction.
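A minimal sketch of transitions enforced with a typed enum. The status names and the allowed moves are assumptions for illustration (only DRAFT and the submit/approve/reject actions appear in this guide); writing the history row in the same transaction is left out.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Assumed status names; the real Helfa enum may differ.
enum VisaStatus { DRAFT, SUBMITTED, APPROVED, REJECTED }

final class VisaTransitions {
    // Allowed target states per source state (hypothetical rules).
    private static final Map<VisaStatus, Set<VisaStatus>> ALLOWED = new EnumMap<>(VisaStatus.class);

    static {
        ALLOWED.put(VisaStatus.DRAFT, EnumSet.of(VisaStatus.SUBMITTED));
        ALLOWED.put(VisaStatus.SUBMITTED, EnumSet.of(VisaStatus.APPROVED, VisaStatus.REJECTED));
        ALLOWED.put(VisaStatus.APPROVED, EnumSet.noneOf(VisaStatus.class));
        ALLOWED.put(VisaStatus.REJECTED, EnumSet.noneOf(VisaStatus.class));
    }

    static boolean canTransition(VisaStatus from, VisaStatus to) {
        return ALLOWED.get(from).contains(to);
    }

    // Returns the new status, or throws when the move is not in the table.
    static VisaStatus transition(VisaStatus from, VisaStatus to) {
        if (!canTransition(from, to)) {
            throw new IllegalStateException("Invalid transition " + from + " -> " + to);
        }
        return to;
    }
}
```

In a service method, the call to transition would sit next to the insert into visa_status_history so both happen inside one transaction.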
Database Two flavours. Offset pagination (LIMIT 20 OFFSET 200) is simple but degrades badly on deep pages (past page ~100 or so) because the DB still has to walk and discard every skipped row. Keyset pagination (WHERE id > :last_id ORDER BY id LIMIT 20) stays an O(log n) index seek however deep you go, provided the sort key is indexed. Use offset for admin tables, keyset for infinite scroll and feeds.
Database Default Postgres. JSONB, full-text search, geospatial via PostGIS, partial indexes, materialised views, generated columns — all first-class. MySQL is fine for simple workloads and has marginally faster bulk-insert throughput in some configurations, but everything I build needs at least one of the Postgres features sooner or later.
Technical Resource-first. POST /api/v1/visas creates DRAFT. GET /api/v1/visas/{id} reads. PATCH /api/v1/visas/{id} updates fields while still DRAFT. State transitions are explicit sub-resources: POST /api/v1/visas/{id}/submit, POST /api/v1/visas/{id}/approve, POST /api/v1/visas/{id}/reject — never a generic PATCH on status, because that would let clients invent transitions. Documents at POST /api/v1/visas/{id}/documents. History at GET /api/v1/visas/{id}/history.
Technical REST: external partners, caching, public APIs, simple CRUD. GraphQL: many UIs over the same data with widely-varying field needs (BFF pattern especially). gRPC: internal service-to-service where you control both sides and want strong typing + streaming. Most products are 90% REST + 10% something else; pick REST as the default.
Technical URL prefix (/api/v1) is the pragmatic choice. Clearer in logs, easier to route, easier for partners to reason about. Header versioning (Accept: vnd.app.v1+json) is purer but the operational pain isn’t worth it. Whatever you pick: deprecate by date, never break v1 once it’s public, and instrument deprecated-endpoint usage so you know who to email.
Technical Don’t — use Auth0/Clerk/Cognito. If you must roll your own: signup flow stores Argon2id-hashed passwords; login issues a short-lived access JWT (15 min) and a long-lived refresh token (rotated on use, stored httpOnly). Sensitive endpoints require recent re-auth. Never store reversible passwords. Always rate-limit login.
Technical Sessions: state on the server, opaque cookie to client. Easy to revoke, harder to scale horizontally without sticky sessions or shared store. JWTs: state in the token, stateless server. Hard to revoke before expiry — solved with short access tokens + a refresh-token blacklist. For a single-tenant SaaS I usually pick sessions; for mobile + multi-service I pick JWT.
Technical Client-supplied Idempotency-Key header. Server stores {key, request_hash, response}. On retry: same key + same hash → return cached response; same key + different hash → 409. TTL is a business decision, often 24 hours. Critical for payments and any “create” that the client may retry on network failure.
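An in-memory sketch of that lookup logic. IdempotencyStore and Result are hypothetical names, TTL expiry is omitted, and a production version would persist entries in a database or Redis rather than a map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class IdempotencyStore {
    // HTTP-like outcome of a request (hypothetical shape).
    static final class Result {
        final int status;
        final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    private static final class Entry {
        final String requestHash;
        final Result result;

        Entry(String requestHash, Result result) {
            this.requestHash = requestHash;
            this.result = result;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    // Runs the action at most once per key and replays its result on retries.
    Result handle(String key, String requestHash, Supplier<Result> action) {
        // computeIfAbsent is atomic per key, so concurrent retries do not run the action twice.
        // A real handler would avoid long-running work inside this call.
        Entry entry = entries.computeIfAbsent(key, k -> new Entry(requestHash, action.get()));
        if (!entry.requestHash.equals(requestHash)) {
            return new Result(409, "Idempotency-Key reused with a different request");
        }
        return entry.result;
    }
}
```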
Technical Token bucket (the classic): X tokens replenished per second, each request consumes one. Sliding window log: precise but expensive. Fixed window counter: cheap but bursty at boundaries. I use token bucket via Redis (Bucket4j on the JVM, or a Lua script for atomicity).
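A single-process sketch of a token bucket, to make the refill arithmetic concrete. This is not Bucket4j or a Redis script; TokenBucket is a hypothetical class, and the clock is injected so the behaviour can be tested deterministically.

```java
import java.util.function.LongSupplier;

final class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefill;

    // nanoClock would be System::nanoTime in production; tests pass a fake clock.
    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefill = nanoClock.getAsLong();
    }

    // Consumes one token if available; returns false when the caller should be throttled.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Add tokens for the elapsed time, never exceeding the bucket capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Across several instances the same state has to live in shared storage, which is where the Redis-backed options mentioned above come in.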
Technical Three layers, cheapest first. (1) HTTP cache headers (Cache-Control, ETag) — push the work to the client/CDN. (2) Application cache (Caffeine or Redis) keyed by query parameters with a sensible TTL. (3) DB-level materialised views for expensive aggregations. Always have an invalidation story before you add a cache.
Technical Modular monolith until proven otherwise. Microservices buy you independent deploy and team autonomy at the cost of distributed-systems complexity (consistency, tracing, network failure modes). Most teams of <30 engineers are better off with a well-modularised monolith and fewer 3am pages.
Technical Event-driven: business code emits domain events to a queue (Postgres LISTEN/NOTIFY for small scale, Kafka or SQS for big). A notification service consumes them, applies user preferences, batches by channel and time window, dedupes by event-key+user. Critical: idempotency key on every send so retries don’t double-deliver.
Technical Single-source shortest path on non-negative weights. Maintain a min-heap keyed by tentative distance, pop the smallest, relax its neighbours, repeat. O((V + E) log V) with a binary heap. Breaks on negative edge weights — use Bellman-Ford instead. On Nova Schilda I implemented Dijkstra for our F2 (efficient routes) feature; the trick was returning the path itself, not just the distance, by carrying a predecessor map alongside the heap.
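A compact sketch of Dijkstra with a predecessor map, so the route itself can be reconstructed. This is an illustration under assumed types (string node names, int weights), not the Nova Schilda code; DijkstraSketch, Edge, and Item are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class DijkstraSketch {
    static final class Edge {
        final String to;
        final int weight; // must be non-negative

        Edge(String to, int weight) {
            this.to = to;
            this.weight = weight;
        }
    }

    private static final class Item {
        final String node;
        final int distance;

        Item(String node, int distance) {
            this.node = node;
            this.distance = distance;
        }
    }

    // Returns the cheapest path from source to target, or an empty list if unreachable.
    static List<String> shortestPath(Map<String, List<Edge>> graph, String source, String target) {
        Map<String, Integer> dist = new HashMap<>();
        Map<String, String> previous = new HashMap<>();
        PriorityQueue<Item> heap = new PriorityQueue<>((a, b) -> Integer.compare(a.distance, b.distance));
        dist.put(source, 0);
        heap.add(new Item(source, 0));
        while (!heap.isEmpty()) {
            Item current = heap.poll();
            if (current.distance > dist.get(current.node)) continue; // stale heap entry
            if (current.node.equals(target)) break;
            for (Edge edge : graph.getOrDefault(current.node, Collections.emptyList())) {
                int candidate = current.distance + edge.weight;
                if (candidate < dist.getOrDefault(edge.to, Integer.MAX_VALUE)) {
                    dist.put(edge.to, candidate);
                    previous.put(edge.to, current.node); // remember how we got here
                    heap.add(new Item(edge.to, candidate));
                }
            }
        }
        if (!dist.containsKey(target)) return Collections.emptyList();
        List<String> path = new ArrayList<>();
        for (String node = target; node != null; node = previous.get(node)) {
            path.add(node);
        }
        Collections.reverse(path);
        return path;
    }
}
```

Instead of a decrease-key operation, the sketch pushes a fresh heap entry on every improvement and skips stale ones when they are popped, which keeps the code short with java.util.PriorityQueue.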
Technical BFS: shortest path in unweighted graph, level-order tasks, “closest” queries. DFS: detect cycles, topological sort, connectivity, anywhere recursion fits the problem (e.g. expression-tree evaluation). On Nova Schilda F1 (reachability from a hub), BFS was the right call because we wanted minimum-hop reach, not just connectedness.
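A minimal BFS sketch for minimum-hop reachability from a hub. BfsSketch and the adjacency-map representation are assumptions for illustration, not the project’s actual code.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BfsSketch {
    // Returns the minimum number of hops from hub to every node reachable from it.
    static Map<String, Integer> hopsFrom(Map<String, List<String>> graph, String hub) {
        Map<String, Integer> hops = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        hops.put(hub, 0);
        queue.add(hub);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            for (String next : graph.getOrDefault(node, Collections.emptyList())) {
                // The first visit is via a shortest route because BFS expands level by level.
                if (!hops.containsKey(next)) {
                    hops.put(next, hops.get(node) + 1);
                    queue.add(next);
                }
            }
        }
        return hops;
    }
}
```

Nodes missing from the returned map are unreachable from the hub, so the same pass answers both the reachability and the hop-count question.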
Technical Given a network with capacities, find the maximum throughput from source to sink. Ford-Fulkerson finds augmenting paths and saturates them; Edmonds-Karp uses BFS for the path so the runtime is O(VE^2). Use case: F3 on Nova Schilda — given drone-corridor capacities, what’s the maximum number of packages we can deliver per hour from hub s to district t?
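A sketch of Edmonds-Karp on an adjacency matrix, chosen for brevity. EdmondsKarpSketch is a hypothetical name, nodes are assumed to be numbered 0..n-1, and a matrix makes each BFS cost O(V^2); adjacency lists are what give the O(VE^2) bound quoted above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class EdmondsKarpSketch {
    // capacity[u][v] is the capacity of edge u -> v (0 if absent). The input is not modified.
    static int maxFlow(int[][] capacity, int source, int sink) {
        int n = capacity.length;
        int[][] residual = new int[n][];
        for (int i = 0; i < n; i++) {
            residual[i] = capacity[i].clone();
        }
        int[] parent = new int[n];
        int flow = 0;
        while (findAugmentingPath(residual, source, sink, parent)) {
            // The bottleneck is the smallest residual capacity along the path.
            int bottleneck = Integer.MAX_VALUE;
            for (int v = sink; v != source; v = parent[v]) {
                bottleneck = Math.min(bottleneck, residual[parent[v]][v]);
            }
            // Push flow forward and open the reverse edges so later paths can undo it.
            for (int v = sink; v != source; v = parent[v]) {
                residual[parent[v]][v] -= bottleneck;
                residual[v][parent[v]] += bottleneck;
            }
            flow += bottleneck;
        }
        return flow;
    }

    // BFS in the residual graph; fills parent[] and reports whether the sink was reached.
    private static boolean findAugmentingPath(int[][] residual, int source, int sink, int[] parent) {
        Arrays.fill(parent, -1);
        parent[source] = source;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.poll();
            for (int v = 0; v < residual.length; v++) {
                if (parent[v] == -1 && residual[u][v] > 0) {
                    parent[v] = u;
                    if (v == sink) return true;
                    queue.add(v);
                }
            }
        }
        return false;
    }
}
```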
Technical An edge whose removal disconnects the graph. Tarjan’s algorithm finds them in O(V+E) by tracking discovery times and the lowest reachable ancestor. Real-world meaning on Nova Schilda F4: a single corridor whose failure splits the network — those are the corridors you want redundancy on.
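A recursive sketch of Tarjan’s bridge detection using discovery times and low-link values. BridgeFinder is a hypothetical name; the sketch assumes an undirected graph without parallel edges, with nodes numbered 0..nodeCount-1, and a very deep graph would need an iterative DFS to avoid stack overflow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BridgeFinder {
    private final List<List<Integer>> adjacency = new ArrayList<>();
    private final int[] discovery;
    private final int[] low;
    private final List<int[]> bridges = new ArrayList<>();
    private int timer = 0;

    private BridgeFinder(int nodeCount, int[][] edges) {
        for (int i = 0; i < nodeCount; i++) {
            adjacency.add(new ArrayList<>());
        }
        for (int[] edge : edges) {
            adjacency.get(edge[0]).add(edge[1]);
            adjacency.get(edge[1]).add(edge[0]);
        }
        discovery = new int[nodeCount];
        low = new int[nodeCount];
        Arrays.fill(discovery, -1); // -1 marks an unvisited node
    }

    // Returns every bridge as a {parent, child} pair in DFS order.
    static List<int[]> find(int nodeCount, int[][] edges) {
        BridgeFinder finder = new BridgeFinder(nodeCount, edges);
        for (int node = 0; node < nodeCount; node++) {
            if (finder.discovery[node] == -1) {
                finder.dfs(node, -1);
            }
        }
        return finder.bridges;
    }

    private void dfs(int node, int parent) {
        discovery[node] = low[node] = timer++;
        for (int next : adjacency.get(node)) {
            if (next == parent) continue; // skip the edge we arrived by (no parallel edges assumed)
            if (discovery[next] == -1) {
                dfs(next, node);
                low[node] = Math.min(low[node], low[next]);
                // Nothing in next's subtree reaches node or above: the edge is a bridge.
                if (low[next] > discovery[node]) {
                    bridges.add(new int[] { node, next });
                }
            } else {
                low[node] = Math.min(low[node], discovery[next]); // back edge
            }
        }
    }
}
```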
Technical ArrayList: O(1) get, O(n) insert-front. LinkedList: O(1) push/pop at the ends, O(n) random access. HashMap: O(1) average get/put; the worst case, when many keys collide in one bucket, is O(n), improved to O(log n) since Java 8 because large buckets are converted to trees. TreeMap: O(log n) for get/put/remove, with ordered traversal. PriorityQueue (binary heap): O(log n) push/pop, O(1) peek. Knowing these is table stakes; explaining why the worst case happens is what actually impresses.
Story “On a Spring Boot service we had once, requests would intermittently 504 under load. Locally everything was fine. The trail was in three places: APM showed thread-pool saturation, logs showed slow queries on a specific endpoint, and pg_stat_activity on production showed a missing index causing seq scans. None of those alone would have told me; the trick was correlating them with a request id. I made structured logging with request ids the default after that.”
Technical Log: request boundaries (id, user, latency, status), state transitions, errors with context, key business events. Don’t log: PII, secrets, full request bodies in production, every line of every method. Use levels properly — INFO for state changes, WARN for retried failures, ERROR for things a human needs to look at.
Technical Three pillars. Metrics: Micrometer + Prometheus, RED method (Rate, Errors, Duration) per endpoint. Logs: structured JSON with traceId/requestId, shipped to Loki/ELK. Traces: OpenTelemetry, sampled. Connect them with the same id so you can pivot from a failed span to its log line in one click.
Technical Blue-green: two environments, switch traffic atomically. Fast rollback, but binary. Canary: send 1% → 10% → 50% → 100% to the new version, watching error rates. Lower blast radius, but needs traffic-shaping and a real metric pipeline. I default to canary in production, blue-green in staging.
Technical Never in code, never in application.properties. Inject via environment from a secret manager (AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault). Rotate regularly. Audit access. For local dev: .env files not committed, plus a .env.example with stub values.
Story “When I started Helfa, Stripe checkout was new to me. I gave myself two days. Day one was reading their docs end-to-end and building the smallest checkout that worked against test mode. Day two was webhooks and idempotency. I shipped the scaffold on day three. The actual lesson wasn’t Stripe — it was that for any third-party integration, the docs are usually the fastest path; tutorials are slower and Stack Overflow is for stuck moments only.”
Story “Backend is where the truth of the system lives. Migrations, transactions, state machines, audit trails — those decisions outlast every UI redesign. I came to architecture because I liked load-bearing structures; backend is the same instinct in a different medium. If a frontend is ten weeks of work, a wrong schema decision is ten years of work.”