Onwards to creating a database!
Understanding 1: Setting data type for each data to be stored.
These are pretty similar like other coding languages. There are the boolean (true/false), character(char,varchar(no of characters) and text), numeric(integer/float), temporal(date,time,timestamp). Not as common ones are UUID (universally unique identifiers, a unique code to identify the row), array(stores a list of strings or numbers), JSON(a data table file), Hstore key-value pair, network address and geometric data.
eg. To store phone number. Consideration should extend to what do we do with phone numbers. Since we do not need to perform mathematical calculations on them, it is easier to store as text-based data type. And this also avoids the issue that 2 and 02 have the same numerical values but can mean a totally different phone number in terms of area code.
It is good to google the best practices on what data type to store for each kind of data. It is also vital to plan for long term storage, to store more information than needed since it is easy to remove the information but we are unable to add in information that was not captured.
Understanding 2: Primary Keys and Foreign Keys
Primary key is a column or group of columns used to identify a row uniquely in a table. Eg. ID numbers (integer-based) and there must be no null numbers in the column. This is important when we are joining tables together.
SERIAL - automatically creates and assigns a unique number to each row by creating a sequence object (a special kind of database object that generates a sequence of integers). If a row is later removed, the number will not adjust. It will be shown as 1,2,4,5 meaning that row 3 had been removed. You will choose either of these smallserial (1 to 32767), serial(1 to 2147483647) and bigserial(1 to 9223372036854775807) depending on the size of your data.
Foreign key is a field or group of fields in a table that uniquely identifies a row in another table. It references the primary key of the other table. The table that contains the foreign key is called the child table or referencing table. And the table that the foreign key references to is called referenced table or parent table.
A table can have multiple foreign keys depending on its relationships with other tables.
When creating tables and defining columns, we can use constraints to define columns as being a primary key or attaching a foreign key relationship to another table.
pgAdmin will only alert you to the primary key with [PK] along with the type in the column name but not the foreign keys. However we can check the type of keys through the Browser>> database>>schema>>table>>constraints. Constraints will highlight the primary key column in gold with "_pkey" while the foreign keys in grey with "_fkey" at the column names.
Select the fkey and click on "Dependencies", at the right side of the pgAdmin (usually where we type the query section), to see its primary table. Or you can right-click on the fkey>>properties>>Columns to check where it is referencing from.
Understanding 3: Constraints
Constraints are rules enforced on data columns to prevent invalid data from being entered into the database. This ensures the accuracy and reliability of the data in the database.
We can set constraints to certain columns or to the entire table and here are some examples.
NOT NULL - ensures no empty value
UNIQUE - ensures all values are different (eg ID)
PRIMARY key - uniquely identifies each row
FOREIGN key - constrains data based on columns in other tables
CHECK - ensures all values satisfy certain conditions (eg >20)
EXCLUSION - ensures that if any two rows are compared on the specified column or expression using the specified operator, not all of these comparisons will return TRUE.
CHECK(condition) - to check a condition when inserting or updating data.
REFERENCES - to constrain the value stored in the column that must exist in a column in another table.
UNIQUE(column_list) - forces the values stored in the columns listed inside the parentheses to be unique.
PRIMARY key(column_list) - allows you to define the primary key that consists of multiple columns.
After learning the 3 understandings, we can now create a table in SQL.
Creating a table
CREATE TABLE table_name (
column_name TYPE column_constraint,
column_name TYPE column_constraint,
table_constraint table_constraint )
eg. CREATE TABLE results (
student_id SERIAL PRIMARY KEY,
student_name VARCHAR(100) NOT NULL,
marks SMALLINT NOT NULL);
At pgAdmin, check that you are not working on any existing database. If you are >>right-clicked on name of database>> Remove Panel.
Very important to have ',' after each column, it is ok to not have constraints for every column and constraints will vary depending on the data you are storing in the column. You can only run the command to create the table once since you do not want to accidentally overwrite an entire table. Once done so, it will appear under the Schemas. So if you had made a mistakes then you need to use ALTER clause to add or remove the columns.
To create tables that references to others.
eg. CREATE TABLE cohort (
student_id INTEGER REFERENCES results(student_id),
class VARCHAR(50) )
Notice that student_id data type is not SERIAL. SERIAL is only used for creating unique identifier for primary key, when we are referencing to it, it is an INTEGER data.
Refresh after each creation of table.
Adding entries into the table
eg1. INSERT INTO table(column1,column2,...)
eg2. allows you to insert values from another table)
INSERT INTO table (column1,column2,...)
SELECT column1, column2,....
It is important to know that inserted row values must match up for the table like their datatype including constraints and SERIAL columns do not need to be provided with a value since it is automatically provided.
eg3. INSERT INTO results(student_name, marks)
Once this is run, you need to SELECT * FROM cohort to check out the row of entry. Notice that student_id was not required since it was auto-generated by the SERIAL type and Maria will be assigned the student_id 1.
eg4. INSERT INTO cohort(student_id, class)
Take note that we will run into an error if we assigned an empty student_id no such as 10. I had coloured all my examples for a table in purple.
Updating a table
eg1. UPDATE table
SET column1 = value1,
column2 = value 2,....
Or to just overwrite all the values in the table to a certain value or replace them with data from another column (using the column name) without the WHERE condition.
eg2. UPDATE results
SET student_name = MariaA
WHERE student_id = 1;
eg3. UPDATE JOIN though there is no need to use the JOIN syntax here.
SET original_col = TableB.new_col
WHERE TableA.id = TableB.id
Placing RETURNING at the last row allow us to return the affected rows back to check if the updates had been done correctly.
Deleting a row -remove rows from table (duh!)
eg1. DELETE FROM table
WHERE row_id = 1
eg2. DELETE JOIN (similar concept to the update join)
DELETE FROM tableA
WHERE tableA.id = tableB.id
This will remove those rows in tableA that are found in tableB.
Placing RETURNING column_name1, column_name2 will show the row that matches the WHERE condition that is to be deleted. Running the command twice will not return anything since that row had been deleted.
Altering - allows for changes to an existing table structure such as:
ADD COLUMN, DROP COLUMN, RENAME COLUMN old_name TO new_name to rename column, ALTER COLUMN col_name DROP (to remove a constraints) / SET (set new constraints);
Changing column data type;
SET DEFAULT/DROP DEFAULT/NOT NULL values for a column;
Add CHECK constraints;
RENAME TO new_name to rename table.
ALTER TABLE table_name action
ALTER TABLE table_name
ADD COLUMN new_col TYPE
Drop table - allows complete removal of a column in a table along with the indexes and constraints involving the column. However it will not remove columns used in views, trigger or stored procedures unless you add the CASCADE clause.
eg. ALTER TABLE table_name
DROP COLUMN col_name CASCADE
We can check if a column is present before carrying on with the removal using the following:
ALTER TABLE table_name
DROP COLUMN IF EXISTS col_name
This will only drop the column if the column exists.
To drop multiple columns, we can just add in multiple DROP COLUMN clause with a ','.
Check constraints allows us to create more customized constraints that adhere to a certain condition such as all inserted integer values fall below a certain threshold.
use CHECK constraints inside a CREATE table clause.
eg. birthdate DATE (datatype) CHECK (birthdate > '2000-01-01')
so this will check that the child is above a certain years old otherwise it will be an error entry.
And this marks the end of table creation!
Post a Comment
I would love to hear from you.