Although compiling SQL queries is not the most interesting thing for Data Scientists, a good understanding of SQL is extremely important for anyone who wants to succeed in any activity related to data processing. The point here is that SQL is not only SELECT
, FROM
and WHERE
. The more SQL constructions a specialist knows, the easier it will be for him to create requests for obtaining from the databases everything that he may need. The author of the article, the translation of which we are publishing today, says that it is aimed at solving two problems:
- Exploring mechanisms that go beyond basic SQL knowledge.
- Consideration of several practical tasks for working with SQL.
This article covers 5 SQL questions from Leetcode. They represent practical tasks that are often encountered in interviews.Question No. 1: second place in terms of salary
Write an SQL query to obtain from the table with employee salary information ( Employee
) an entry containing the second highest salary.For example, such a query executed for the table below should return 200
. If the table does not have a value lower than the highest salary, the request should return null
.+
| Id | Salary |
+
| 1 | 100 |
| 2 | 200 |
| 3 | 300 |
+
▍ Decision A: use IFNULL
andOFFSET
Here are the main mechanisms that will be used in this solution to the problem:IFNULL(expression, alt)
: this function returns its argument expression
if it is not equal null
. Otherwise, the argument is returned alt
. We will use this function to return null
if the table does not contain the desired value.OFFSET
: This operator is used with an expression ORDER BY
to discard the first n
lines. This is useful to us for the reason that we are interested in the second row of the result (that is, the second largest salary, the data on which is in the table).
Here is a ready-made request:SELECT
IFNULL(
(SELECT DISTINCT Salary
FROM Employee
ORDER BY Salary DESC
LIMIT 1 OFFSET 1
), null) as SecondHighestSalary
FROM Employee
LIMIT 1
▍ Solution B: use MAX
The query below uses the function MAX
. Here, the highest salary value is selected, not equal to the maximum salary received throughout the table. As a result, we get what we need - the second largest salary.SELECT MAX(salary) AS SecondHighestSalary
FROM Employee
WHERE salary != (SELECT MAX(salary) FROM Employee)
Question # 2: duplicate email addresses
Write an SQL query that detects Person
all duplicate email addresses in the table .+
| Id | Email |
+
| 1 | a@b.com |
| 2 | c@d.com |
| 3 | a@b.com |
+
▍ Decision A: COUNT
in the subquery
First, we create a subquery in which the frequency of occurrence of each address in the table is determined. Then the result returned by the subquery is filtered using the instruction WHERE count > 1
. The query will return information about addresses found in the source table more than once.SELECT Email
FROM (
SELECT Email, count(Email) AS count
FROM Person
GROUP BY Email
) as email_count
WHERE count > 1
▍ Solution B: expression HAVING
HAVING
: This is an expression that allows you to use an instruction WHERE
with an expression GROUP BY
.
SELECT Email
FROM Person
GROUP BY Email
HAVING count(Email) > 1
Question No. 3: rising temperature
Write an SQL query that finds in the table Weather
all dates (date identifiers) when the temperature would be higher than the temperature on the dates preceding it. That is, we are interested in dates on which the “today's” temperature is higher than the “yesterday's”.+
| Id(INT) | RecordDate(DATE) | Temperature(INT) |
+
| 1 | 2015-01-01 | 10 |
| 2 | 2015-01-02 | 25 |
| 3 | 2015-01-03 | 20 |
| 4 | 2015-01-04 | 30 |
+
▍Solution: DATEDIFF
DATEDIFF
: This function calculates the difference between two dates. It is used to provide a comparison of precisely “today's” and “yesterday's” temperatures.
If we formulate the following query in ordinary language, it turns out that it expresses the following idea: we need to choose identifiers such that the temperature corresponding to the dates they represent is greater than the temperature for the “yesterday's” dates with respect to them.SELECT DISTINCT a.Id
FROM Weather a, Weather b
WHERE a.Temperature > b.Temperature
AND DATEDIFF(a.Recorddate, b.Recorddate) = 1
Question number 4: the highest salary in the unit
The table Employee
stores information about company employees. Each record in this table contains information about the identifier ( Id
) of the employee, his name ( Name
), salary ( Salary
), and the division of the company where he works ( Department
).+
| Id | Name | Salary | DepartmentId |
+
| 1 | Joe | 70000 | 1 |
| 2 | Jim | 90000 | 1 |
| 3 | Henry | 80000 | 2 |
| 4 | Sam | 60000 | 2 |
| 5 | Max | 90000 | 1 |
+
The table Department
contains information about the divisions of the company.+
| Id | Name |
+
| 1 | IT |
| 2 | Sales |
+
Write an SQL query that finds in each of the departments of employees with the maximum salary. For example, for the above tables, a similar query should return the results represented by the following table (the order of the rows in the table does not matter):+
| Department | Employee | Salary |
+
| IT | Max | 90000 |
| IT | Jim | 90000 |
| Sales | Henry | 80000 |
+
▍Solution: team IN
The command IN
allows you to set WHERE
conditions in the instruction that correspond to the use of several commands OR
. For example, the two following constructs are identical:WHERE country = ‘Canada’ OR country = ‘USA’
WHERE country IN (‘Canada’, ’USA’).
Here we want to get a table containing the name of the department ( Department
), the name of the employee ( Employee
) and his salary ( Salary
). To do this, we create a table that contains information about the unit identifier ( DepartmentID
) and the maximum salary for this unit. Then we combine the two on a condition table according to which entry in the resulting table fall only if DepartmentID
and Salary
have previously formed in the table.SELECT
Department.name AS 'Department',
Employee.name AS 'Employee',
Salary
FROM Employee
INNER JOIN Department ON Employee.DepartmentId = Department.Id
WHERE (DepartmentId , Salary)
IN
( SELECT
DepartmentId, MAX(Salary)
FROM
Employee
GROUP BY DepartmentId
)
Question No. 5: transplanting students
Mary is a high school teacher. She has a table seat
that stores the names of students and information about their places in the classroom. The value id
in this table is constantly increasing. Mary wants to swap neighboring students.Here is a table of students' initial placement:+
| id | student |
+
| 1 | Abbot |
| 2 | Doris |
| 3 | Emerson |
| 4 | Green |
| 5 | Jeames |
+
Here's what should happen after transplanting neighboring students:+
| id | student |
+
| 1 | Doris |
| 2 | Abbot |
| 3 | Green |
| 4 | Emerson |
| 5 | Jeames |
+
Write a request that will allow the teacher to solve the above problem.Please note that if the number of students is odd, you don’t need to transfer the last student anywhere.▍Solution: using an operator WHEN
An SQL construct CASE WHEN THEN
can be considered an operator if
in programming.In our case, the first operator is WHEN
used to check whether the odd identifier is assigned to the last row in the table. If so, the line is not subject to change. The second operator WHEN
is responsible for adding 1 to each odd identifier (for example, 1, 3, 5 turns into 2, 4, 6) and for subtracting 1 from each even identifier (2, 4, 6 turn into 1, 3, 5).SELECT
CASE
WHEN((SELECT MAX(id) FROM seat)%2 = 1) AND id = (SELECT MAX(id) FROM seat) THEN id
WHEN id%2 = 1 THEN id + 1
ELSE id - 1
END AS id, student
FROM seat
ORDER BY id
Summary
We examined several SQL tasks, discussing along the way some advanced tools that can be used to compile SQL queries. We hope that what you learned today will come in handy for you during interviews in SQL and will prove useful in everyday work.PS In our marketplace there is a Docker image with SQL Server Express, which is installed in one click. You can check the operation of containers on VPS. All new customers are given 3 days free of charge for testing.Dear readers! What can you advise to those who want to master the art of creating SQL queries?