Skip to main content

My attempt (5) at learning a coding language (SQL) in my mid 30s

I read through my previous page and did my exercises. It really drives the point to make notes while learning something. Many times I will be watching the video and copying down the examples but you only know how the code works if you try it out at the exercises. I know if I have to replay the video to find out the codes example again, I will be very frustrated so luckily I had done so in my notes. I take down the definition and examples shown by the speaker. Another way I use this is as a recap resource since I learn the code every few days and obviously, I would have forgotten how the code works. Hope it will be useful to you if you found it. Here on to the next page of syntax.

1) Aggregate functions - take in multiple inputs and return a single output. 
AVG() -returns average value as a float (with decimal places)
COUNT() - returns the number of values
MAX() - returns the maximum value
MIN() - returns minimum value
SUM() - returns the sum of all values
ROUND() - returns a rounded value by how many decimal places as specified.

eg. SELECT MIN(column_name) FROM table;
cannot choose to return different content from different columns but can do two aggregate functions on the same column since only one output will be returned. 
eg. SELECT MAX(column_name), MIN(column_name) FROM table;

eg. SELECT ROUND (AVG(column_name), 2) FROM table; 
this will return round to 2 decimal places for the average value. 

AGGREGATE functions can only be used after SELECT or HAVING. 

2) GROUP BY - allow you to consolidate the data based on the chosen categories.
Categorical columns are non-continuous and can be numerical or alphabetically, it can be red, blue, green or with numbers like cat1, cat2 and so on.  We can select the categories then do aggregate functions on them.

eg. SELECT category_col, AGG(data_col) FROM table
WHERE category_col !=A 
GROUP BY category_col
This will ignore all the A category and perform an aggregate function on the other categories.

It is important to note that the category_col used in GROUP BY is also used in the SELECT statement. However, it is ok to use the aggregate function on another data_col.

Can group by many different columns, decide the order like how we use ORDER BY. However both columns must be seen in GROUP BY. Should not use filtering on the aggregated data column, we will use HAVING instead. This is because the aggregated amount will only be executed after the GROUP BY statement while WHERE (the filtering statement will execute before GROUP BY so essentially it means there is nothing to however we can called upon directly ORDER BY and WHERE if it is data not involved in the aggregate data column.  

3) to remove time from timestamp using DATE(date column with timestamp entries) as this will create unique categories for every minute and seconds.

4) HAVING - allows us to use aggregate results as a filter with a GROUP BY. 

eg. SELECT col1, sum(col2) FROM table
WHERE col1 != 'Red' 
HAVING sum(col2) > 2
This will return the sum of more than 2 of the same colours that are not red. 

5) AS - allows you to rename a column whatever you want

eg1. SELECT col1, col2 AS new_name FROM table
only col2 is renamed so placed AS directly behind the col you want to rename.  

eg2. SELECT sum(column) AS new_name FROM table
make the output more readable. 
AS gets executed at the very end of a query so you cannot use inside the WHERE or HAVING clause. 

JOIN - allow us to combine multiple tables. Different type of join exist for different joining criteria. 

6) INNER JOIN - combine for exact match that appear on both tables.

eg. SELECT * FROM TableA
ON TableA.col_match = TableB.col_match

this will return a table that repeat the col_match data so to avoid this, we can 
SELECT col1, TableA.col_match, col3 From TableA 
we need to specify which table to take col_match from since it can be found in both tables. 
By default JOIN = INNER JOIN

OUTER JOIN - allow us to specify how to deal with values only present in one of the tables being joined. 

7) FULL OUTER JOIN will combine everything in both tables together no matter if data is present in one or both tables.

ON TableA.col_match = TableB.col_match
It will fill in "null" values where there is match in the other table. 

Adding this WHERE IS null OR is null will result in a table that only contains data that are uniquely found in either one of the table. 

8) LEFT OUTER JOIN - results in a set of records that are in the left table. If there is no match with the right table then the results are null.

LEFT OUTER JOIN is the same as LEFT JOIN and will be processed the same way.
The order of the table selected matters as the first table will be used and nothing found exclusively on the second table will be returned. 

eg. SELECT * FROM TableA
     ON TableA.col_match = TableB.col_match;
This will grab everything in table A or had a match with table B but if found exclusively in table B then it is not going to be shown in our table. 

If we want to find entries unique to table A only, we can filter out to show the null entries by adding this statement at the end.
eg. WHERE TableB.col1 is null.

9) RIGHT OUTER JOIN - same as LEFT OUTER JOIN except the tables are switched and have the same outcome if the we switched the table order in a LEFT OUTER JOIN. 

10) UNION - concatenate the result data sets of two or more SELECT statements. 

eg. SELECT column_name(s) FROM table1
      SELECT column_name(s) FROM table2; 

This will paste the results of table1 above table2. 

We can join multiple tables together by using JOIN in the same query. 

JOIN is really useful and it is worthwhile to really try out the JOIN syntax. On wards to learn other little tricks at the next post!