Day 2: Refreshing SQL
Get the big picture, determine scope, define success, find resources
Big picture: know daily SQL queries, how to use them, what are the tricks and how to optimize them. Scope: 80% most freqent SQL usage. Success: Finish the online quiz on Hackerrank on SQL. Resources: SQL in 10 Minutes, Sams Teach Yourself (4th Edition) by Ben Forta
Database vs. Table vs. Column: Container vs. structured file for some specific character vs. one characteristic
Row: Observation == record Primary key: unique identifier as a column for each row: cannot be the same for different row, cannot change, delete, reuse and leave in NULL SQL: Structured Query Language
SELECT: the most freqent used keyword.
- SELECT: What columns you want to get
- DISTINCT: Get unique value (works on all columns)
- TOP Num: Select top X observations
- AS Str: give alias for variabble name and for table name.
- CONCAT: Use comma and char to create calcualted field
- FROM: Which table you are referencing to
- LIMIT Num1 OFFSET Num2: Search from Num2 observations and get Num1 results. Could be simplified as LIMIT Num2, Num1.
- ORDER BY Var1, Var2 (DESC): Could use variable name to order, or use relative position. Default order is ASC, use DESC for descending order. Could put DESC under each variable. Put it after WHERE is there is WHERE existing.
- WHERE: filter the conditions
- Conditions ( = / != / > / >= … / BETWEEN / IS NULL)
- Use logical words (AND, OR, NOT) to combine conditions, use IN (Val1, Val2) to filter results in the brackets. Combine Sub-queries using SELECT and IN.
- Use LIKE to do regular expression: use * / % for any characters appears any time, use _ / ? for any characters appears one time, use [Char] for match specific characters appears one time.
- GROUP BY: Group observations. Put it after WHERE but in front of ORDER BY.
- HAVING: Similiar to WHERE, but use HAVING after GROUP BY.
Order of the keywords: SELECT -> (FROM) -> (WHERE) -> (GROUP BY) -> (HAVING) -> (ORDER BY)
Joining Tables: - Basic Syntax: FROM Table1 (JOIN) Table2 WHERE Table1.Var1 = Table2.Var1 - Different JOIN methods: INNER JOIN, SELF JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN. This graph illustrate the difference between those JOIN methods.
Use UNION to combine results from different SELECT. Put ORDER BY in the last SELECT statement.
Appendix: Common functions for SQL - LEN: Get length of the string - NOW / CURDATE: Get current date - TRIM / LTRIM / RTRIM: Trim the space in different sides - UPPER / LOWER: Covert to uppercase or lowercase - TODATE: Convert string to date
Aggregate functions: - AVG: - COUNT: - MAX / MIN / SUM:
INSERT: insert records to the table
- INSERT INTO Table1 (Var list) VALUES (Value list)
- Could use SELECT instead of VALUES
- USE SELECT INTO NewTable to create new table with results from SELECT statement
UPDATE: update the observations in the table
- Basic Syntax: UPDATE Table1 SET Var1 = “Value1” WHERE CONDITIONS
DELETE: remove the observations
- Basic Syntax: DELETE FROM Table1 WHERE CONDITIONS
CREATE TABLE: create your own table
- Basic Syntax: CREATE TABLE NewTable (Var TYPE NULL(NOT NULL))
- Change table: ALTER TABLE Table1 ADD Var TYPE for adding new variable, ALTER TABLE Table1 DROP COLUMN Var for delete existing variable.
- Delete the whole table: DROP TABLE Table1
COMMIT & ROLLBACK
- For this problem, I stucked at combining two SELECT statements together with seprate ORDER BY. I figured out that I need to filter the table and UNION ALL resulted tables, instead of merging several SELECT statements. The code is shown here:
SELECT * FROM ( SELECT CITY, LENGTH(CITY) FROM STATION ORDER BY LENGTH(CITY), CITY LIMIT 1 ) AS A UNION ALL SELECT * FROM ( SELECT CITY, LENGTH(CITY) FROM STATION ORDER BY LENGTH(CITY) DESC, CITY LIMIT 1 ) AS B
Use LEFT(Var, 1) to get the first character, and use () for IN statement.
This problem uses sub-select statement to generate the aggregated value on the new variable total_income, I believe there are multiple ways to solve it, and here is my version.
SELECT total_income, COUNT(total_income) FROM ( SELECT months * salary AS total_income FROM Employee ) AS A GROUP BY total_income ORDER BY total_income DESC LIMIT 1
- Usage of BETWEEN: WHERE var1 BETWEEN val1 AND val2
- Use CASE WHEN CONDITION THEN Str1 ELSE Str2 END to create value based on current value, see my answer to that question.
SELECT CASE WHEN A + B <= C OR A + C <= B OR B + C <= A THEN 'Not A Triangle' WHEN A = B AND B = C THEN 'Equilateral' WHEN A = B OR B = C OR A = C THEN 'Isosceles' ELSE 'Scalene' END AS TRIANGLE FROM TRIANGLES
- It took me a while for this question: I found that it is actually useless to have the grade table, and by using some if conditions and string manipulation, I finally got the answer. However, I didn’t pay attention to the order of the result, and I struggled for 5 min figuring what was wrong. Be careful on the questions and how to answer it! Here is my code:
SELECT CASE WHEN Students.Marks >= 70 THEN Students.Name ELSE NULL END AS Name, CASE WHEN Students.Marks < 100 THEN LEFT(Students.Marks, 1) + 1 ELSE 10 END AS Grade, Students.Marks FROM Students ORDER BY Grade DESC, Students.Name, Marks