# Databases

# SQL
* __Structured Query Language__
* A language used to organize and manipulate data in a relational database
* Originally developed by IBM in the 70s
* Quickly became the most popular database language

SELECT id, email
FROM users
WHERE first\_name = 'Teppo';

# Relational Database Management Systems
* In relational databases, values are stored in __tables__
* Each table has __rows__ and __columns__
* Data is displayed as a two-dimensional matrix
* Values in a table are related to each other
* Values can also be related to values in other tables
* A relational database management system (RDBMS) is a program that executes queries against relational databases

![](imgs/3-databases-with-docker_0.png)

[https://db-engines.com/en/ranking](https://db-engines.com/en/ranking)

# PostgreSQL
A free, open-source, cross-platform relational database management system
Emphasizes extensibility and SQL compliance
Fully [ACID](https://en.wikipedia.org/wiki/ACID)-compliant (atomicity, consistency, isolation and durability)

# pgAdmin
An administration and development platform for PostgreSQL
Cross-platform, features a web interface
Basically a control panel application for your PostgreSQL database

# Running Postgres in Docker
Using the official [Postgres Docker image](https://hub.docker.com/_/postgres), let's create a locally running Postgres instance. You can choose the values for POSTGRES\_PASSWORD and POSTGRES\_USER freely.
__docker run --name my-postgres --env POSTGRES\_PASSWORD=pgpass --env POSTGRES\_USER=pguser -p 5432:5432 -d postgres:15.2__

![](imgs/3-databases-with-docker_1.png)

# PostgreSQL: Using psql
If you have PostgreSQL installed, use the Windows search to find _psql_ and open _SQL Shell (psql)_. Press Enter four times to accept the default values (or enter values matching your configuration), and enter your password at the fifth prompt.

If you do not have PostgreSQL installed, you can use psql directly from the container running the database:

docker exec -it my-postgres psql -U pguser

* If you have connected to the server as the user "pguser", you are by default connected to the "pguser" database.
* This is a database for user information.
* You should not use it for storing program data!
* All databases can be listed with the _list_ command: __\\l__
* PostgreSQL uses a default database named "postgres"
* Users can connect to a different database with the _connect_ command __\\c__ :
  * __\\c database\_name__
* A new database is created with the CREATE command:
  * __CREATE DATABASE database\_name;__
  * Do not forget the __;__ at the end
* After creating a new database, you still need to connect to it!
* Exit psql with the command __exit__

# Exercise 1: Postgres Server
Start a local instance of Postgres in Docker
Connect to the server using psql
Use the command \\l to see which databases have already been created on the server.
Create a new database called "sqlpractice".
Connect to the newly created database.

# Running pgAdmin in Docker
Using the official [pgAdmin](https://hub.docker.com/r/dpage/pgadmin4) image, we'll run pgAdmin alongside Postgres:

__docker run --name my-pgadmin -p 5050:80 -e PGADMIN\_DEFAULT\_EMAIL=&lt;email&gt; -e PGADMIN\_DEFAULT\_PASSWORD=&lt;password&gt; -d dpage/pgadmin4__

![](imgs/3-databases-with-docker_2.png)

# Logging into pgAdmin
With pgAdmin running, navigate your web browser to [http://localhost:5050](http://localhost:5050) and use the email and password you provided to log in.
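The psql steps from Exercise 1 can also be run non-interactively with psql's -c flag; this is a sketch that reuses the container and user names from the earlier docker run example:

```shell
# create the database without opening an interactive session
docker exec -it my-postgres psql -U pguser -c "CREATE DATABASE sqlpractice;"

# open an interactive psql session connected directly to the new database
docker exec -it my-postgres psql -U pguser -d sqlpractice
```

The -d flag saves you from typing \c after the session opens.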
![](imgs/3-databases-with-docker_3.png)

# PostgreSQL internal IP Address
Both __PostgreSQL__ and __pgAdmin__ are now running in our _local_ Docker. To connect pgAdmin to PostgreSQL, we need the IP address used _inside_ Docker.

Using Docker's inspect command, we'll get Docker's internal IP address for the my-postgres container. Since the command produces quite a lot of information, we pipe the result to grep (or findstr on Windows) to see only the rows that contain the word "IPAddress":

__docker inspect my-postgres | grep IPAddress__ <- unix
__docker inspect my-postgres | findstr IPAddress__ <- windows

In the example output, the IP address is 172.17.0.2

![](imgs/3-databases-with-docker_4.png)

# Connecting PgAdmin to our DB
Now we have everything we need for a connection. In pgAdmin, select "Object" > "Register" > "Server". In the "General" tab, give the server a name that identifies the connection.

![](imgs/3-databases-with-docker_5.png)

If the Object menu is greyed out, click on Servers first.

![](imgs/3-databases-with-docker_6.png)

* In the "Connection" tab, enter
  * Host name/address: the PostgreSQL internal Docker address
  * Port, Username, Password: the values defined when running the PostgreSQL container
* Then click Save. You should now see all the databases available on this server.

![](imgs/3-databases-with-docker_7.png)

![](imgs/3-databases-with-docker_8.png)

# Exercise 2: pgAdmin
Start a local instance of pgAdmin in Docker
Following the lecture instructions, connect pgAdmin to your already running PostgreSQL server.
Verify that you can see the database created in the previous exercise.

# PostgreSQL: Querying
With psql: after connecting to a database, just type a query and hit enter.
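For instance, typed directly at the psql prompt (NOW() is just an illustrative built-in; any statement works):

```sql
-- a statement ends with a semicolon;
-- psql keeps waiting for more input until it sees one
SELECT NOW();
```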
With pgAdmin: right-click a database > _Query Tool_, insert a query into the _Query Editor_ and hit _Execute_ (F5)

![](imgs/3-databases-with-docker_9.png)

![](imgs/3-databases-with-docker_10.png)

---

# Editing Data with pgAdmin
* Tables of data in a database are found under
  * Database > Schemas > Tables
* Inspect and edit data in pgAdmin by right-clicking a table and selecting _View/Edit Data_

![](imgs/3-databases-with-docker_11.png)

![](imgs/3-databases-with-docker_12.png)

Individual values in the table can be modified directly by double-clicking a value and editing it in the visual user interface. Save the changes with the _Save Data Changes_ button.

![](imgs/3-databases-with-docker_13.png)

# Exercise 3: Preparing the Database
Using either pgAdmin or psql, run the [provided query](https://gitea.buutti.com/education/academy-assignments/src/branch/master/Databases/initdb.txt) against the database you created previously. Verify that the query has created new tables in your database.

# Types of queries
Select
Insert
Delete
Update
Create & Drop

# Querying data with SELECT
Syntax:
__SELECT column1, column2, column3 FROM table\_name;__
Examples:
SELECT full\_name, email FROM users;
SELECT full\_name AS name, email FROM users;
SELECT * FROM users;

# Filtering data with WHERE
Syntax:
__SELECT column1, column2 FROM table\_name WHERE condition;__
Text is captured in _single quotes_. In a LIKE condition, the _%_ sign acts as a wildcard. IS and IS NOT are also valid comparison operators.
Examples:
SELECT full\_name FROM users WHERE full\_name = 'Teppo Testaaja';
SELECT * FROM books WHERE name LIKE '%rr%';
SELECT * FROM books WHERE author IS NOT NULL;

# Ordering data with ORDER BY
Syntax:
__SELECT column1 FROM table\_name ORDER BY column1 ASC;__
Examples:
SELECT full\_name FROM users ORDER BY full\_name ASC;
SELECT full\_name FROM users ORDER BY full\_name DESC;

# Combining with JOIN
Also known as INNER JOIN
Corresponds to an intersection in set theory

![](imgs/3-databases-with-docker_14.png)

# JOIN examples
SELECT users.id, users.full\_name, borrows.id, borrows.user\_id, borrows.due\_date, borrows.returned\_at
FROM users
JOIN borrows ON users.id = borrows.user\_id;

SELECT U.full\_name AS name, B.due\_date AS due\_date, B.returned\_at AS returned\_at
FROM users AS U
JOIN borrows AS B ON U.id = B.user\_id;

# Combining with LEFT JOIN
Also known as LEFT OUTER JOIN
Example:
SELECT U.full\_name AS name, B.due\_date AS due\_date, B.returned\_at AS returned\_at
FROM users AS U
LEFT JOIN borrows AS B ON U.id = B.user\_id;

![](imgs/3-databases-with-docker_15.png)

# Exercise 4: Querying the Library
Using SQL queries, get
All columns of the loans that were borrowed before 1.3.2000
All columns of the loans that have been returned
Columns user.full\_name and borrows.borrowed\_at of the user with an id of 1
Columns book.name, book.release\_year and language.name of all books released after 1960

# INSERT
Syntax:
__INSERT INTO table\_name (column1, column2, column3)__
__VALUES (value1, value2, value3);__
Example:
INSERT INTO users (full\_name, email, created\_at)
VALUES ('Pekka Poistuja', 'pekka.poistuja@buutti.com', NOW());
Since id is not provided, it will be generated automatically.

# UPDATE
Syntax:
__UPDATE table\_name__
__SET column1 = value1, column2 = value2__
__WHERE condition;__
__Notice:__ if a condition is not provided, all rows will be updated! If updating only one row, it is usually best to use the id.
Example:
UPDATE users
SET email = 'taija.testaaja@gmail.com'
WHERE id = 2;

# DELETE
Syntax:
__DELETE FROM table\_name WHERE condition;__
Again, if the _condition_ is not provided, DELETE affects _all_ rows. Before deleting, it is good practice to execute an equivalent SELECT query to make sure that only the intended rows will be affected.
Example:
SELECT * FROM users WHERE id = 5;
DELETE FROM users WHERE id = 5;

# Exercise 5: Editing Data
Postpone the due date of the loan with an id of 2 by two days in the _borrows_ table
Add a couple of new books to the _books_ table
Delete one of the loans.

# CREATE TABLE
Before data can be manipulated, a database and its tables need to be initialized.
Syntax:
__CREATE TABLE table\_name (__
__column1 datatype,__
__column2 datatype,__
__…__
__);__
Example:
CREATE TABLE "users" (
"id" SERIAL PRIMARY KEY,
"full\_name" varchar NOT NULL,
"email" varchar UNIQUE NOT NULL,
"created\_at" timestamp NOT NULL
);

# DROP
To remove tables or databases, we use a DROP statement:
__DROP TABLE table\_name;__
__DROP DATABASE database\_name;__
These statements do not ask for confirmation, and there is no undo feature. Take care when using a DROP statement.

# NoSQL
* In addition to SQL databases, there are also NoSQL databases
* There are many differing definitions, but...
  * most agree that NoSQL databases store data in a format other than tables
  * they can still store relational data, just differently
* Four common database types:
  * Document databases
  * Key-value databases
  * Wide-column stores
  * Graph databases
* Example database engines include MongoDB, Redis and Cassandra

---

Document databases store data in documents similar to JSON (JavaScript Object Notation) objects. Each document contains pairs of fields and values. The values can typically be a variety of types, including strings, numbers, booleans, arrays, or objects, and their structures typically align with the objects developers are working with in code.
Because of their variety of field value types and powerful query languages, document databases are great for a wide variety of use cases and can be used as a general-purpose database. They can scale out horizontally to accommodate large data volumes. MongoDB is consistently ranked as the world's most popular NoSQL database according to DB-Engines and is an example of a document database.

Key-value databases are a simpler type of database where each item contains keys and values. A value can typically only be retrieved by referencing its key, so learning how to query for a specific key-value pair is typically simple. Key-value databases are great for use cases where you need to store large amounts of data but don't need to perform complex queries to retrieve it. Common use cases include storing user preferences or caching. Redis and DynamoDB are popular key-value databases.

Wide-column stores store data in tables, rows, and dynamic columns. They provide a lot of flexibility over relational databases because each row is not required to have the same columns. Many consider wide-column stores to be two-dimensional key-value databases. Wide-column stores are great for when you need to store large amounts of data and can predict what your query patterns will be. They are commonly used for storing Internet of Things data and user profile data. Cassandra and HBase are two of the most popular wide-column stores.

Graph databases store data in nodes and edges. Nodes typically store information about people, places, and things, while edges store information about the relationships between the nodes. Graph databases excel in use cases where you need to traverse relationships to look for patterns, such as social networks, fraud detection, and recommendation engines. Neo4j and JanusGraph are examples of graph databases.
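To make the document model described above concrete, a library user and their loans could be stored as a single JSON document; the field names here are illustrative, not taken from the exercise schema:

```json
{
  "_id": "f47ac10b",
  "full_name": "Teppo Testaaja",
  "email": "teppo.testaaja@example.com",
  "borrows": [
    {
      "due_date": "2000-03-01",
      "returned_at": null
    }
  ]
}
```

Note how the nested borrows array takes the place of the separate borrows table and the JOIN used on the relational side.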
# Object-Relational Mappers
* ORMs allow developers to manipulate databases with code instead of SQL queries
  * For example, for performing CRUD operations on their database
* Some popular ORMs:
  * Hibernate (Java)
  * EF Core (.NET)
  * Sequelize (Node.js)
  * TypeORM (TypeScript)
* We recommend using TypeORM instead of writing SQL yourself. You can find a good step-by-step guide in [their documentation](https://typeorm.io/#installation).

# PostgreSQL with Node

# Local environment
Using our previously created, Dockerized Postgres instance, we'll create a Node.js application that connects to our database. If you have deleted your Postgres container, a new one can be created with the following command:

__docker run --name my-postgres -e POSTGRES\_PASSWORD=mypassword -e POSTGRES\_USER=pguser -e POSTGRES\_DB=mydb -p 5432:5432 -d postgres:15.2__

If the container already exists, this command will start it:

__docker start my-postgres__

# Preparing our Node application
Initialize the application with:
__npm init -y__
Install [express](https://www.npmjs.com/package/express) and the [PostgreSQL client for Node.js](https://www.npmjs.com/package/pg):
__npm install express pg__
Install [dotenv](https://www.npmjs.com/package/dotenv) and nodemon as development dependencies:
__npm install --save-dev dotenv nodemon__
If using TypeScript, install types for express and pg, plus ts-node for running nodemon:
__npm install --save-dev @types/express @types/pg ts-node__

# Dotenv
In development, it's customary to store ports, database login info, etc. as _environment variables_ in a separate .env file. The previously installed package _dotenv_ is used to load the variables from .env into the main program.

Example of a .env file:

PORT=3000
PG\_HOST=localhost
PG\_PORT=5432
PG\_USERNAME=pguser
PG\_PASSWORD=mypassword
PG\_DATABASE=postgres

These values must match the values declared when running the PostgreSQL container. The database must exist as well.
Note that the example uses the default "postgres" database, but you can use any database you want.

dotenv can be imported with:

import dotenv from 'dotenv'
dotenv.config()

# Dotenv (continued)

{
  "name": "products\_api",
  "version": "1.0.0",
  "scripts": {
    "dev": "nodemon -r dotenv/config ./src/index.ts"
  },
  "dependencies": {
    "express": "^4.18.2",
    "pg": "^8.9.0"
  },
  "devDependencies": {
    "@types/express": "^4.17.17",
    "@types/pg": "^8.6.6",
    "dotenv": "^16.0.3",
    "nodemon": "^2.0.20",
    "ts-node": "^10.9.1",
    "typescript": "^4.9.5"
  }
}

Dotenv is usually only used in development, not in production. In a professional setting, the dotenv config is often preloaded in the development startup script: you can require dotenv when running _npm run dev_, as in the "dev" script above (-r is short for --require).

# Dotenv and Git
* .env files usually contain sensitive data that we do __not__ want to store in Git repositories.
* Thus, the __.env__ file is usually excluded from the Git repository
  * Add __.env__ to __.gitignore__
* If you have auto-generated a .gitignore with npx gitignore Node, environment files are excluded automatically

![](imgs/3-databases-with-docker_16.png)

# Connecting to PostgreSQL
Our database file contains functions and configuration for initializing the Postgres pool, creating tables and running queries. At the moment, we have only one query. It is a single-use query that creates a products table in the database if such a table does not yet exist.
// db.ts
import pg from "pg";

const { PG\_HOST, PG\_PORT, PG\_USERNAME, PG\_PASSWORD, PG\_DATABASE } = process.env;

const pool = new pg.Pool({
  host: PG\_HOST,
  port: Number(PG\_PORT),
  user: PG\_USERNAME,
  password: String(PG\_PASSWORD),
  database: PG\_DATABASE,
});

export const executeQuery = async (query: string, parameters?: any[]) => {
  const client = await pool.connect();
  try {
    const result = await client.query(query, parameters);
    return result;
  } catch (error: any) {
    console.error(error.stack);
    error.name = "dbError";
    throw error;
  } finally {
    client.release();
  }
};

export const createProductsTable = async () => {
  const query = \`
    CREATE TABLE IF NOT EXISTS "products" (
      "id" SERIAL PRIMARY KEY,
      "name" VARCHAR(100) NOT NULL,
      "price" REAL NOT NULL
    )\`;
  await executeQuery(query);
  console.log("Products table initialized");
};

At the moment, our __index.ts__ does nothing but create a single table in our database and launch the Express server. It doesn't even have any endpoints, so it's not much of a server yet.

// index.ts
import express from "express";
import { createProductsTable } from "./db";

const server = express();

createProductsTable();

const { PORT } = process.env;
server.listen(PORT, () => {
  console.log("Products API listening to port", PORT);
});

# Launching the application
Let's use our predefined _npm run dev_ script

![](imgs/3-databases-with-docker_17.png)

…and check with psql that our application succeeds in connecting to the database and creating the table. Epic success!

![](imgs/3-databases-with-docker_18.png)

# Exercise 6: Node & PostgreSQL
Following the lecture example, create an Express server that connects to your local PostgreSQL instance. The database information should be stored in environment variables. When the server starts, it should create a products table with three columns: id (serial, primary key), name (varchar) and price (real).

# Creating Queries
* Next, we will create an actual CRUD API for communicating with the database.
* For that, we need endpoints for creating, reading, updating and deleting products.
* Each of these needs its own query.

# Using queries
* We'll use the pre-made executeQuery() function from a few slides back for querying the database
* It takes two arguments:
  * the actual query string
  * an optional parameters array
* When supplying parameters, the query string should have placeholders $1, $2, etc.
  * These will be replaced with the contents of the parameters array.

# Parameterized queries example
When running executeQuery(query, parameters) with the values defined above, the query would be parsed as

![](imgs/3-databases-with-docker_19.png)

SELECT * FROM cats WHERE color = 'yellow' AND age > 10;

# Why not just use string templating?
…Because of [SQL injections](https://fi.wikipedia.org/wiki/SQL-injektio). Always use the database library's built-in parameterization!
_NEVER DO THIS!!!_

![](imgs/3-databases-with-docker_20.png)

![](imgs/3-databases-with-docker_21.png)

# Creating queries
We will create a Data Access Object, __dao.js__, that will handle interacting with our database. The idea is that we just tell our DAO what we want done (e.g. "add this customer to the database") and the DAO handles the details of that action. The DAO also returns any additional information that was created during the action.

Our _insertProduct_ function
* generates a new, unique ID for the product using [uuid](https://www.npmjs.com/package/uuid)
* constructs a parameters array containing said id, the name of the product and the price of the product
* executes the query using the _db.executeQuery_ method
* returns the database result object

![](imgs/3-databases-with-docker_22.png)

![](imgs/3-databases-with-docker_23.png)

![](imgs/3-databases-with-docker_24.png)

The rest of the DAO operations work in a similar fashion. The router that declares the endpoints uses the DAO to interact with the database.
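To make the earlier injection warning concrete, here is a small self-contained sketch (no database needed). The malicious input is hypothetical; it shows how a template literal splices attacker text into the SQL itself, while the parameterized form keeps the statement fixed:

```typescript
// Hypothetical attacker-controlled input, e.g. from an HTTP request body
const userInput = "'; DROP TABLE products; --";

// BAD: string templating makes the input part of the SQL statement itself
const templated = `SELECT * FROM products WHERE name = '${userInput}'`;
// templated now contains a second, destructive statement

// GOOD: the SQL stays constant; the value travels separately, and the
// pg driver substitutes $1 on the server side, treating it as plain data
const parameterized = "SELECT * FROM products WHERE name = $1";
const parameters = [userInput];
// in the lecture's code, this pair would be passed to executeQuery(parameterized, parameters)

console.log(templated);
```

Printing the templated string makes the problem visible: the attacker's quote character terminates the intended string literal, and everything after it is executed as SQL.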
# Testing the API

![](imgs/3-databases-with-docker_25.png)

Now we can use Insomnia to verify that all the endpoints work as expected. We can also use psql to observe the changes in the database.

![](imgs/3-databases-with-docker_26.png)

# Exercise 7: Creating Queries
Continue following the lecture example. Create a router and a data access object to handle
Creating a product
Reading a product
Updating a product
Deleting a product
Listing all products

# Dockerized PostgreSQL App

# Setting Environment Variables
Docker has two kinds of environment variables: run-time and build-time. In this scenario, we want to set our environment variables at __build time__. This means that _the Docker image will contain all the environment variable information_, including sensitive things like passwords. This might be an issue in some scenarios; in those cases, the environment variables need to be set at __run time__ instead.

In the Dockerfile, we set the build-time values with ARG parameters. Then we use these values to set the run-time environment variables with ENV parameters. More information: [https://vsupalov.com/docker-arg-env-variable-guide/](https://vsupalov.com/docker-arg-env-variable-guide/)

When the ARGs and ENVs have been set in the Dockerfile, we provide the ARG values when building the Docker image by using __--build-arg name=value__ flags.
To build an image with these parameters, we'd use something like

__docker build__
__--build-arg PORT=3000__
__--build-arg PG\_HOST=https://my.postgres.server__
__--build-arg PG\_PORT=5432__
__--build-arg PG\_USERNAME=pguser__
__--build-arg PG\_PASSWORD=pgpass__
__--build-arg PG\_DATABASE=my-database__
__-t my-app .__

The corresponding Dockerfile declarations:

ARG PORT
ARG PG\_HOST
ARG PG\_PORT
ARG PG\_USERNAME
ARG PG\_PASSWORD
ARG PG\_DATABASE

ENV PORT=${PORT}
ENV PG\_HOST=${PG\_HOST}
ENV PG\_PORT=${PG\_PORT}
ENV PG\_USERNAME=${PG\_USERNAME}
ENV PG\_PASSWORD=${PG\_PASSWORD}
ENV PG\_DATABASE=${PG\_DATABASE}

[Docker documentation here!](https://www.docker.com/blog/how-to-use-the-postgres-docker-official-image/)

# Exercise 8: Dockerized PG App
Dockerize the application you have built. Build the Docker image, run the app and test that it works using Insomnia/Postman. Remember that when you run the application in your local Docker, both the app and the database are in the same Docker network, so you have to check the database IP address just like when running pgAdmin.
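Putting the pieces together, a complete Dockerfile for the app might look like the sketch below. This is only an outline under stated assumptions: the base image, file layout and the ts-node start command are not from the lecture material.

```dockerfile
# minimal sketch; base image and paths are assumptions
FROM node:18-alpine
WORKDIR /app

# build-time arguments, provided with --build-arg
ARG PORT
ARG PG_HOST
ARG PG_PORT
ARG PG_USERNAME
ARG PG_PASSWORD
ARG PG_DATABASE

# bake them into the image as run-time environment variables
ENV PORT=${PORT}
ENV PG_HOST=${PG_HOST}
ENV PG_PORT=${PG_PORT}
ENV PG_USERNAME=${PG_USERNAME}
ENV PG_PASSWORD=${PG_PASSWORD}
ENV PG_DATABASE=${PG_DATABASE}

# install dependencies first so Docker can cache this layer
COPY package*.json ./
RUN npm ci
COPY . .

# run the TypeScript entry point directly with ts-node
CMD ["npx", "ts-node", "src/index.ts"]
```

In a production image you would typically compile the TypeScript and run plain Node instead of ts-node, but this keeps the sketch close to the development setup used in the lecture.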