What SQL questions should I expect in a data analyst interview?

Core SQL topics: JOINs (inner, left, right, full outer), GROUP BY with HAVING, window functions (ROW_NUMBER, RANK, LEAD/LAG, SUM OVER), subqueries and CTEs, and performance considerations such as indexing. In technical coding screens, the most common formats are: 'write a query to find X from this table', cohort analysis, funnel queries, and self-join problems. Know how to write rolling averages and running totals with window functions.

Do I need to know Python or just SQL for a data analyst role?

For entry- to mid-level data analyst roles, strong SQL is the primary requirement. Python (pandas, matplotlib, seaborn) is increasingly expected for senior and lead roles, especially for data cleaning, automation, and visualisation beyond BI tools. Statistics (A/B testing, hypothesis testing, regression) is tested more heavily at companies with mature data cultures (tech, fintech, healthcare).

How do I answer 'Tell me about a time you used data to change a decision'?

Use a specific STAR story: the decision that was about to be made or was already made (Situation), what you were asked or chose to investigate (Task), the specific analysis you ran — what data, what method, what you found (Action), and what decision changed and what happened as a result (Result). Quantify the impact if possible: 'This led to a 14% reduction in churn over the following quarter'. Avoid vague stories about 'dashboards I built'.

What's the hardest type of data analyst interview question?

Business case / product sense questions are where most candidates struggle: 'DAU dropped 15% last Tuesday. Walk me through how you'd investigate.' This tests structured thinking, knowledge of analytical frameworks, and communication. Structure the answer as an ordered series of checks, starting with the cheapest to rule out: data pipeline issues? → product bug? → external event? → metric definition change? → user behavior change? Demonstrate systematic elimination, not just listing possibilities.

Data Analyst Interview Questions 2025 — SQL, Stats & Behavioral Answers

🗄️

SQL & Technical questions

Write a query to find the second highest salary in a table.

Hint

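Use the DENSE_RANK() window function and filter for rank 2, or a subquery with SELECT DISTINCT salary ... ORDER BY salary DESC LIMIT 1 OFFSET 1. Without DISTINCT, a tie for the top salary makes the OFFSET version return the highest salary again.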
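
A minimal sketch of both approaches, run through Python's built-in sqlite3 module so it is self-contained. The employees table, its columns, and the sample rows are made up for illustration, and window functions assume SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    # Two people tie for the top salary on purpose.
    [("Ana", 120), ("Ben", 120), ("Cy", 100), ("Di", 90)],
)

# Approach 1: DENSE_RANK gives tied salaries the same rank with no gaps.
ranked = conn.execute(
    """
    SELECT salary
    FROM (
        SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
        FROM employees
    )
    WHERE rnk = 2
    LIMIT 1
    """
).fetchone()
second_by_rank = ranked[0] if ranked else None

# Approach 2: DISTINCT removes ties before skipping the top value.
offset_row = conn.execute(
    "SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 1"
).fetchone()
second_by_offset = offset_row[0] if offset_row else None

print(second_by_rank, second_by_offset)
```

Both queries return no row when the table has fewer than two distinct salaries, so the sketch maps that case to None. Interviewers often ask about that edge case as a follow-up.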

What's the difference between HAVING and WHERE?

Hint

WHERE filters rows before aggregation; HAVING filters after. Use HAVING to filter on aggregated values (SUM, COUNT).
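
A small sketch showing both filters in one query, using Python's built-in sqlite3 module. The orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, status TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", "paid", 60),
        ("alice", "paid", 50),
        ("alice", "refunded", 500),
        ("bob", "paid", 40),
        ("bob", "paid", 30),
    ],
)

big_spenders = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE status = 'paid'        -- row filter, applied before grouping
    GROUP BY customer
    HAVING SUM(amount) > 100     -- group filter, applied after aggregation
    ORDER BY customer
    """
).fetchall()

print(big_spenders)
```

The refunded row is dropped by WHERE before any sums are computed, so it never inflates alice's total. The HAVING clause then removes bob, whose paid total falls under the threshold.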

How would you write a rolling 7-day average for daily revenue?

Hint

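Use AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW). This assumes exactly one row per day with no gaps; otherwise aggregate to daily totals and fill in missing dates first.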
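
The window frame can be sketched as follows with Python's built-in sqlite3 module. The daily_revenue table, its column names, and the ten sample days are assumptions for illustration; the data has exactly one row per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_revenue (day TEXT, revenue REAL)")
# Ten consecutive days with revenue 10, 20, ..., 100.
sample_rows = [(f"2025-01-{d:02d}", float(d * 10)) for d in range(1, 11)]
conn.executemany("INSERT INTO daily_revenue VALUES (?, ?)", sample_rows)

rolling = conn.execute(
    """
    SELECT day,
           AVG(revenue) OVER (
               ORDER BY day
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS avg_7d
    FROM daily_revenue
    ORDER BY day
    """
).fetchall()

for day, avg_7d in rolling:
    print(day, avg_7d)
```

Note that the first six rows average over fewer than seven days because the frame is cut off at the start of the data. Whether to report or suppress those partial windows is worth stating out loud in an interview.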

What's a CTE and when would you use it over a subquery?

Hint

CTEs improve readability and allow re-use of the same result set multiple times. Subqueries are fine for single-use, inline logic.
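
A sketch of the re-use case, where one CTE is referenced twice in the same query. It runs on Python's built-in sqlite3 module; the orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 100), ("alice", 50), ("bob", 30), ("cara", 60)],
)

above_average = conn.execute(
    """
    WITH customer_totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM customer_totals                                      -- first use
    WHERE total > (SELECT AVG(total) FROM customer_totals)    -- second use
    ORDER BY customer
    """
).fetchall()

print(above_average)
```

Written with plain subqueries, the GROUP BY logic would have to be repeated in both places, which is where the readability argument for the CTE comes from.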

How would you find duplicate records in a table?

Hint

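GROUP BY all relevant columns + HAVING COUNT(*) > 1, or compute ROW_NUMBER() OVER (PARTITION BY ... ORDER BY id) AS rn in a subquery or CTE and filter WHERE rn > 1 in the outer query, since window functions cannot be used directly in WHERE.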
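
Both techniques side by side, as a sketch on Python's built-in sqlite3 module. The users table, its columns, and the choice of (email, name) as the duplicate key are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "a@x.com", "Ana"),
        (2, "b@x.com", "Ben"),
        (3, "a@x.com", "Ana"),
        (4, "a@x.com", "Ana"),
    ],
)

# Which key combinations occur more than once, and how often?
dup_groups = conn.execute(
    """
    SELECT email, name, COUNT(*) AS n
    FROM users
    GROUP BY email, name
    HAVING COUNT(*) > 1
    """
).fetchall()

# Which specific rows are the extra copies (everything after the first id)?
extra_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY email, name ORDER BY id) AS rn
            FROM users
        )
        WHERE rn > 1
        ORDER BY id
        """
    )
]

print(dup_groups, extra_ids)
```

The GROUP BY version answers "which values are duplicated", while the ROW_NUMBER version identifies the exact rows to delete or inspect while keeping one copy per group.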

📊

Statistics & A/B Testing questions

What is a p-value and what does it tell you?

Hint

Probability of seeing results at least as extreme as yours if the null hypothesis were true. NOT the probability the null is true. The threshold is arbitrary (0.05 is convention, not magic).
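
To make the definition concrete, here is a sketch that computes an exact two-sided p-value for a coin-flip style experiment. The function name and the example of 60 successes in 100 trials are illustrative. "At least as extreme" is taken to mean "at least as far from the expected count", which is one common convention and a natural one when the null probability is 0.5.

```python
from math import comb

def two_sided_binomial_p_value(successes, trials, p_null=0.5):
    """Probability, under the null, of a count at least as far from the
    expected count as the one observed (exact binomial calculation)."""
    expected = trials * p_null
    observed_gap = abs(successes - expected)
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
        if abs(k - expected) >= observed_gap
    )

p_value = two_sided_binomial_p_value(60, 100)
print(round(p_value, 4))
```

With these numbers the p-value lands just above 0.05. That makes it a useful talking point: the same data would be "significant" at a slightly looser threshold, which is why the cutoff should not be treated as magic.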

How would you design an A/B test for a new checkout button colour?

Hint

Define success metric (conversion rate). Calculate sample size for given MDE and power. Randomise at user level. Set end date before looking at results. Consider novelty effect.
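
The sample size step can be sketched with the standard normal-approximation formula for comparing two proportions. The function name, the 10% baseline conversion rate, and the 2 percentage point MDE are assumptions for illustration; real tools may apply continuity corrections or pooled variances and give slightly different numbers.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate users needed per arm for a two-sided test of two proportions.

    mde_abs is the minimum detectable effect as an absolute difference,
    e.g. 0.02 for a lift from 10% to 12%.
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / mde_abs**2)

n_per_arm = sample_size_per_arm(0.10, 0.02)
n_smaller_mde = sample_size_per_arm(0.10, 0.01)
print(n_per_arm, n_smaller_mde)
```

Halving the MDE roughly quadruples the required sample, which is the trade-off to mention when a stakeholder asks for a test that can detect very small lifts.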

What's the difference between Type I and Type II errors?

Hint

Type I = false positive (rejecting the null when it's true). Type II = false negative (failing to reject the null when it's false). The false positive rate is the significance level (α). The false negative rate is β, which equals 1 - power.

How do you detect if an A/B test has sample ratio mismatch?

Hint

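Run a chi-squared goodness-of-fit test comparing the observed assignment counts with the intended split. SRM means the observed split differs from the design by more than chance would explain, which usually points to a bug in assignment, logging, or filtering. Treat any result from a test with SRM with caution.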
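
A sketch of the check for a two-arm test, using only the standard library. The function name and the user counts are hypothetical. With two groups the test has one degree of freedom, so the p-value can be computed with the complementary error function instead of a statistics package.

```python
from math import erfc, sqrt

def srm_check(control_n, treatment_n, expected_control_share=0.5):
    """Chi-squared goodness-of-fit test of observed counts vs the intended split.

    Returns (chi2_statistic, p_value) for a two-arm experiment.
    """
    total = control_n + treatment_n
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = (
        (control_n - expected_control) ** 2 / expected_control
        + (treatment_n - expected_treatment) ** 2 / expected_treatment
    )
    # For 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

chi2_stat, srm_p = srm_check(50_500, 49_500)
print(chi2_stat, srm_p)
```

A 50.5% vs 49.5% split looks harmless, but at 100,000 users it is very unlikely under a true 50/50 assignment. That is the point of running the test rather than eyeballing the ratio.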

📉

Product / Business Case questions

DAU dropped 15% last Tuesday. Walk me through how you'd investigate.

Hint

Structure your investigation: (1) Is the data correct? Check pipeline/logging. (2) Is it global or segment-specific? (3) Is it correlated with a deploy or external event? (4) Which funnel steps show the drop? Work systematically, not randomly.
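
Step (2) can be sketched as a per-segment comparison against a baseline day. The function name, the platform segments, and the DAU counts below are invented for illustration; in practice the baseline would be something like the same weekday in the previous week.

```python
def segment_breakdown(baseline, current):
    """Compare DAU per segment between a baseline day and the drop day.

    Returns (segment, pct_change, share_of_total_change) tuples,
    largest percentage drop first. Assumes every baseline count is positive.
    """
    total_change = sum(current.values()) - sum(baseline.values())
    rows = []
    for segment, before in baseline.items():
        after = current.get(segment, 0)
        change = after - before
        pct_change = 100.0 * change / before
        share = change / total_change if total_change else 0.0
        rows.append((segment, pct_change, share))
    return sorted(rows, key=lambda row: row[1])

# Hypothetical counts: overall DAU falls from 1000 to 850 (a 15% drop).
baseline_dau = {"ios": 400, "android": 500, "web": 100}
drop_day_dau = {"ios": 400, "android": 350, "web": 100}

breakdown = segment_breakdown(baseline_dau, drop_day_dau)
for segment, pct_change, share in breakdown:
    print(segment, round(pct_change, 1), round(share, 2))
```

In this made-up example the entire drop sits in one platform, which would shift the investigation toward a platform-specific release or logging change rather than a global cause.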

How would you measure the success of a new feature?

Hint

Define a primary metric tied to the feature's goal. Define guardrail metrics (things that shouldn't get worse). Use A/B test or time-series comparison. Consider leading vs lagging indicators.

How would you prioritise 10 dashboards with limited engineering time?

Hint

Framework: (1) Who uses it and how often? (2) What decision does it drive? (3) What's the cost of getting it wrong? Prioritise high-impact, high-frequency, decision-driving dashboards.

💬

Behavioral questions

Tell me about a time you found an insight that surprised stakeholders.

Hint

Specific STAR story. What was the analysis, what did you expect vs find, how did you communicate it, what changed? Quantify the outcome.

How do you communicate uncertainty in your analysis?

Hint

Talk about confidence intervals, scenario ranges, data quality caveats, and sample size limitations. Show that you can be precise about uncertainty, which is more valuable than false confidence.

Tell me about a time you pushed back on a metric or methodology.

Hint

Good data analysts challenge the question, not just answer it. A story where you identified that the wrong metric was being used or that sampling bias would invalidate the conclusion is very strong.

What separates good from great data analysts in interviews

Structured thinking over pattern matching

Great analysts structure their investigation before diving in. When asked to diagnose a metric drop, they frame a hypothesis tree, not a list. They say 'my first hypothesis is X because Y, let me check Z' — not 'I'd look at everything'.

Communicate uncertainty explicitly

Average candidates state conclusions. Great ones state conclusions with confidence intervals and caveats: 'The data suggests X, but this is based on 3 weeks of data which may not account for seasonal patterns. I'd want to validate with a longer window before acting on this.'

Challenge the question

The best data analyst interview answer sometimes starts with 'That's an interesting metric choice — is that actually what we want to optimise?' Showing you can think about whether you're solving the right problem is more impressive than perfectly solving the stated problem.
