Get the big picture, determine scope, define success, find resources

Big picture: know daily SQL queries, how to use them, what are the tricks and how to optimize them. Scope: 80% most freqent SQL usage. Success: Finish the online quiz on Hackerrank on SQL. Resources: SQL in 10 Minutes, Sams Teach Yourself (4th Edition) by Ben Forta

Get started

Database vs. Table vs. Column: Container vs. structured file for some specific character vs. one characteristic

Row: Observation == record Primary key: unique identifier as a column for each row: cannot be the same for different row, cannot change, delete, reuse and leave in NULL SQL: Structured Query Language

SELECT: the most freqent used keyword.

  • SELECT: What columns you want to get
    • DISTINCT: Get unique value (works on all columns)
    • TOP Num: Select top X observations
    • AS Str: give alias for variabble name and for table name.
    • CONCAT: Use comma and char to create calcualted field
  • FROM: Which table you are referencing to
    • LIMIT Num1 OFFSET Num2: Search from Num2 observations and get Num1 results. Could be simplified as LIMIT Num2, Num1.
    • ORDER BY Var1, Var2 (DESC): Could use variable name to order, or use relative position. Default order is ASC, use DESC for descending order. Could put DESC under each variable. Put it after WHERE is there is WHERE existing.
  • WHERE: filter the conditions
    • Conditions ( = / != / > / >= … / BETWEEN / IS NULL)
    • Use logical words (AND, OR, NOT) to combine conditions, use IN (Val1, Val2) to filter results in the brackets. Combine Sub-queries using SELECT and IN.
    • Use LIKE to do regular expression: use * / % for any characters appears any time, use _ / ? for any characters appears one time, use [Char] for match specific characters appears one time.
    • GROUP BY: Group observations. Put it after WHERE but in front of ORDER BY.
    • HAVING: Similiar to WHERE, but use HAVING after GROUP BY.

Order of the keywords: SELECT -> (FROM) -> (WHERE) -> (GROUP BY) -> (HAVING) -> (ORDER BY)

Joining Tables: - Basic Syntax: FROM Table1 (JOIN) Table2 WHERE Table1.Var1 = Table2.Var1 - Different JOIN methods: INNER JOIN, SELF JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN. This graph illustrate the difference between those JOIN methods.

Use UNION to combine results from different SELECT. Put ORDER BY in the last SELECT statement.

Appendix: Common functions for SQL - LEN: Get length of the string - NOW / CURDATE: Get current date - TRIM / LTRIM / RTRIM: Trim the space in different sides - UPPER / LOWER: Covert to uppercase or lowercase - TODATE: Convert string to date

Aggregate functions: - AVG: - COUNT: - MAX / MIN / SUM:

INSERT: insert records to the table

  • INSERT INTO Table1 (Var list) VALUES (Value list)
  • Could use SELECT instead of VALUES
  • USE SELECT INTO NewTable to create new table with results from SELECT statement

UPDATE: update the observations in the table

  • Basic Syntax: UPDATE Table1 SET Var1 = “Value1” WHERE CONDITIONS

DELETE: remove the observations

  • Basic Syntax: DELETE FROM Table1 WHERE CONDITIONS

CREATE TABLE: create your own table

  • Basic Syntax: CREATE TABLE NewTable (Var TYPE NULL(NOT NULL))
  • Change table: ALTER TABLE Table1 ADD Var TYPE for adding new variable, ALTER TABLE Table1 DROP COLUMN Var for delete existing variable.
  • Delete the whole table: DROP TABLE Table1

CREATE VIEW

PROCEDURE

COMMIT & ROLLBACK

CURSOR


Solving problems:

  • For this problem, I stucked at combining two SELECT statements together with seprate ORDER BY. I figured out that I need to filter the table and UNION ALL resulted tables, instead of merging several SELECT statements. The code is shown here:
result.sqlresult.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
SELECT * FROM (
    SELECT CITY, LENGTH(CITY)
    FROM STATION
    ORDER BY LENGTH(CITY), CITY
    LIMIT 1
) AS A
UNION ALL
SELECT * FROM (
    SELECT CITY, LENGTH(CITY)
    FROM STATION
    ORDER BY LENGTH(CITY) DESC, CITY
    LIMIT 1
) AS B
  • Use LEFT(Var, 1) to get the first character, and use () for IN statement.

  • This problem uses sub-select statement to generate the aggregated value on the new variable total_income, I believe there are multiple ways to solve it, and here is my version.

result.sqlresult.sql
1
2
3
4
5
6
7
8
9
SELECT total_income, COUNT(total_income)
FROM (
    SELECT months * salary AS total_income
    FROM Employee
) AS A
GROUP BY total_income
ORDER BY total_income DESC
LIMIT 1
  • Usage of BETWEEN: WHERE var1 BETWEEN val1 AND val2
  • Use CASE WHEN CONDITION THEN Str1 ELSE Str2 END to create value based on current value, see my answer to that question.
result.sqlresult.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
SELECT
    CASE
        WHEN 
            A + B <= C OR A + C <= B OR B + C <= A
                THEN
                    'Not A Triangle'
        WHEN
            A = B AND B = C
                THEN
                    'Equilateral'
        WHEN
            A = B OR B = C OR A = C
                THEN 
                    'Isosceles'
        ELSE
            'Scalene'
    END AS TRIANGLE
FROM TRIANGLES
  • It took me a while for this question: I found that it is actually useless to have the grade table, and by using some if conditions and string manipulation, I finally got the answer. However, I didn’t pay attention to the order of the result, and I struggled for 5 min figuring what was wrong. Be careful on the questions and how to answer it! Here is my code:
result.sqlresult.sql
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SELECT
    CASE
        WHEN 
            Students.Marks >= 70
                THEN Students.Name
        ELSE
            NULL
    END AS Name,
    CASE
        WHEN
            Students.Marks < 100
                THEN LEFT(Students.Marks, 1) + 1
        ELSE
            10
        END AS Grade, Students.Marks
FROM Students
ORDER BY Grade DESC, Students.Name, Marks