PostgreSQL

From Bauman National Library
This page was last modified on 21 January 2019, at 13:34.
PostgreSQL
Postgresql.png
Developer(s) community PostgreSQL
Initial release August 1995 (1995-08)
Repository {{#property:P1324}}
Development status active
Written in C
Operating system Crossplatform
License license PostgreSQL
Website postgresql.org

PostgreSQL is a freely distributed object-relational database management system (ORDBMS), the most advanced open database system in the world and a real alternative to commercial databases. [Reference 1]

History

PostgreSQL is based on the Postgres non-profit database developed as an open-source project at the University of California at Berkeley. The development of Postgres, which began in 1986, was directly related to Michael Stounbraker, head of the earlier project Ingres, at that time already acquired by Computer Associates. The name was deciphered as “Post Ingres”, and many of the previously made developments were applied when creating Postgres. Stonebraker and his students developed a new DBMS for eight years from 1986 to 1994. During this period, procedures, rules, user-defined types and other components were introduced into the syntax. In 1995, the development was again divided: Stonebreaker used the experience to create a commercial DBMS Illustra, promoted by his own eponymous company (later acquired by Informix), and his students developed a new version of Postgres - Postgres95, in which the Ingres legacy was replaced on SQL. The development of Postgres95 was taken outside the university and transferred to a team of enthusiasts. The new DBMS has received the name by which it is known and is currently being developed - PostgreSQL. [Reference 2]

The development of Postgres95 was taken outside the university and transferred to a team of enthusiasts. The new DBMS has received the name by which it is known and is currently being developed - PostgreSQL.

About the product

PostgreSQL is supported on all modern Unix systems (34 platforms), including the most common ones, such as Linux, FreeBSD, NetBSD, OpenBSD, SunOS, Solaris, DUX, and also under macOS. Starting from version 8.X, PostgreSQL runs in "native" mode under MS Windows NT, Win2000, WinXP, Win2003. It is known that there are successful attempts to work with PostgreSQL under Novell Netware 6 and OS2. PostgreSQL has repeatedly been recognized as the base of the year, for example, Linux New Media AWARD 2004, 2003 Editors 'Choice Awards, 2004 Editors' Choice Awards. [Reference 3]

PostgreSQL is used as a testing ground for research of a new type of database focused on working with data streams - this is the TelegraphCQ project, which started in 2002 in Berkeley after the successful Telegraph project (the name of the main street in Berkeley).

Key features and functionality

Reliability

PostgreSQL reliability is a proven and proven fact and is provided by the following features:

  • full compliance with the principles of ACID - atomicity, consistency, isolation, data integrity.
  • Multi-versioning (Multiversion Concurrency Control, MVCC) is used to maintain data consistency in a competitive environment, while in traditional databases, locks are used. MVCC means that each transaction sees a copy of the data (the version of the database) at the time the transaction begins, even though the state of the database may have changed. This protects the transaction from inconsistent data changes that might have been caused by (another) concurrent transaction, and provides transaction isolation.
  • Write Ahead Logging (WAL) is a common mechanism for recording all transactions, which allows you to restore the system after possible failures.
  • Point in Time Recovery (PITR) - the ability to restore the database (using WAL) at any time in the past, which allows continuous backup of the PostgreSQL cluster.
  • Replication also improves the reliability of PostgreSQL. There are several replication systems, for example, Slony (version 1.1 is being tested), which is the free and most used solution that supports master-slaves replication.
  • Data integrity is the heart of PostgreSQL. In addition to MVCC, PostgreSQL maintains data integrity at the schema level — these are foreign keys (foreign keys), constraints.
  • The openness of PostgreSQL codes means their absolute availability to anyone, and the liberal BSD license does not impose any restrictions on the use of the code. [Reference 4]

Performance

PostgreSQL performance is based on the use of indexes, an intelligent query scheduler, a fine locking system, a memory buffer and caching management system, and excellent scalability in competitive work.

  • Index support:
    • Standard indexes - B-tree, hash, R-tree, GiST (generalized search tree)
    • Partial indices (partial indices) - you can create an index on a limited subset of values, for example, create index idx_partial on foo (x) where x> 0;
    • Functional indexes (expressional indices) allow you to create indexes using function values ​​from a parameter, for example, create index idx_functional on foo (length (x));
  • The query planner is based on the cost of various plans, taking into account many factors. It provides the user with the ability to debug queries and configure the system.
  • The lock system maintains locks at a lower level, which allows you to maintain a high level of competition while protecting the integrity of data. The lock is maintained at the table and record level. At the lower level, the lock for shared resources is optimized for a specific OS and architecture.
  • Buffer management and caching use complex algorithms to maintain efficient use of allocated memory resources.
  • Tablespaces (tablespaces) for managing data storage at the object level, such as databases, schemas, tables, and indexes. This allows flexible use of disk space and improves reliability, performance, and also contributes to the scalability of the system.
  • Scalability is based on the capabilities described above. PostgreSQL’s low resource requirements and flexible locking system ensure its scaling, while indexes and buffer management ensure good system controllability even at high loads. [Reference 4]

Extensibility

PostgreSQL extensibility means that the user can customize the system by defining new functions, aggregates, types, languages, indexes, and operators. PostgreSQL object-orientedness allows you to transfer application logic to the database level, which greatly simplifies client development, since all business logic is in the database. Functions in PostgreSQL are uniquely identified by the name, number, and types of arguments. [Reference 4]

SQL support

In addition to the basic features of any SQL database, PostgreSQL supports:

  • Very high compliance with ANSI SQL 92, ANSI SQL 99 and ANSI SQL 2003.
  • Schemas that provide a namespace at the SQL level. Schemes contain tables, in which you can define data types, functions, and operators. Using the full name of the object, you can simultaneously work with several schemes. Schemes allow you to organize a database of multiple logical parts, each of which has its own access policy, data types.
  • Subqueries - subselects, full support for SQL92. Subqueries make SQL language more flexible and often more efficient.
  • Outer Joins - external bundles (LEFT, RIGHT, FULL)
  • Rules - the rules according to which the original request is modified. The main difference from the triggers is that the rule works at the request level and before the execution of the request, and the trigger is the system’s response to data changes, i.e. The trigger is triggered during the execution of the request for each modified record (PER ROW). Rules are used to tell the system what actions to take when attempting to update the view.
  • Cursors - cursors, allow you to reduce traffic between the client and the server, as well as memory on the client, if you want to get not the entire result of the query, but only part of it.
  • Table Inheritance - the inheritance of tables, allowing you to create objects that inherit the structure of the parent object and add their specific attributes.
  • Stored Procedures - server (stored) procedures allow you to implement the business logic of the application on the server side. In addition, they can greatly reduce the traffic between the client and the server.
  • Triggers allow you to control the system's response to data changes (INSERT, UPDATE, DELETE), both before the operation itself (BEFORE) and after (AFTER). During the execution of the trigger, special variables NEW are available (the record to be inserted or updated) and the OLD (the record before the update).
  • Cluster table - ordering the records of the table on the disk according to the index, which sometimes by reducing the access to the disk speeds up the execution of the query. [Reference 4]

Data Types

PostgreSQL supports a large set of built-in data types:

  • Boolean type
  • Numeric types
    • Whole
    • fixed point
    • Float
    • Money type (different special output format, but otherwise similar to the numbers with a fixed point with two decimal places)
  • Character types of arbitrary length
  • Binary types
  • Date / time types (fully supporting various formats, accuracy, output formats, including recent changes in time zones)
  • Enumeration
  • Types of text search
  • Composite types
  • HStore (extension adding key value to PostgreSQL)
  • Arrays (of various lengths and any data type, including textual and composite types) of up to 1 GB in size
  • Geometric primitives
  • Network types
  • XML data with support for XPath queries
  • UUID identifier
  • JSON (from version 9.2) and faster JSONB (from version 9.4)

In addition, users can create their own data types, which can usually be fully indexed through the PostgreSQL indexing infrastructure - GiST, GIN, SP-GiST. These include geographic information system (GIS) data types from the PostGIS PostgreSQL project.

There is also a data type called “domain”, which is the same as any other type of data, but with optional restrictions defined by the creator of this domain. This means that any data entered into a column using a domain must comply with the restrictions that were defined as part of the domain.

Beginning with PostgreSQL 9.2, a data type representing a range of data, called range types, can be used. These can be discrete ranges (for example, all integer values ​​from 1 to 10) or continuous ranges (for example, any time between 10:00 and 11:00). Available types of available ranges include integer ranges, large integers, decimal numbers, time stamps (with and without time zone) and dates.

Custom range types can be created to make new range types available, such as IP address ranges, using the inet type as a base or floating point ranges, using the float data type as the base. Range types support included and exclusive range boundaries using the symbols and (), respectively. (for example, it represents all integers, starting from 4 inclusive, but not including 9.) Range types are also compatible with existing operators used for overlap, deterrence, entitlement, etc. [Reference 4]

Inheritance

Tables can inherit their characteristics from the "parent" table. The data in the child tables will exist in the parent tables if the data is not selected from the parent table using the ONLY keyword, i.e. SELECT * FROM ONLY parent_table; . Adding a column to the parent table will cause the column to appear in the child table.

Inheritance can be used to implement the partitioning of tables, using either triggers or rules to direct inserts in the parent table to the corresponding child tables.

As of 2010, this feature is not fully supported, but, in particular, the table of restrictions is not currently inheritable. All check constraints and non-zero constraints on the parent table are automatically inherited by its children. Other types of constraints (unique, primary key, and foreign key constraints) are not inherited.

Inheritance provides a way to display features of generalization hierarchies depicted on entity-relationship diagrams (ERD) directly in the PostgreSQL database. [Reference 4]

Query functions

  1. Operations
  2. Full-text search
  3. Views
  4. Materialized views
  5. Updatable views
  6. Recursive looks
  7. Internal, external (full, left and right) and cross- connections
  8. Sub selects
  9. Correlated Subqueries
  10. Regular expressions
  11. Common table expressions and writeable common table expressions
  12. Encrypted connections via TLS (current versions do not use vulnerable SSL, even with this configuration)
  13. Domains
  14. Save points
  15. Biphasic adoption
  16. TOAST (storage method with limited capabilities) is used to transparently store large table attributes (such as large attachments MIME ​​or XML messages) in a separate area with automatic compression.
  17. Built-in SQL is implemented using a preprocessor. The code SQL is first written to C code. The code is then run through the ECPG preprocessor, which replaces SQL with calls to the code library. The code can then be compiled using the C compiler. The embedding also works with C++, but does not recognize all the C++ constructs. [Reference 4]

Standards support and features

PostgreSQL supports most of the features of the standard SQL: 2011, ACID-compatible and transactional (including most DDL statements) avoids the problem of locking using the Multi-Versioned Parallel Access Control (MVCC) mechanism, provides immunity to dirty reads and full serialization; manages complex SQL queries using a variety of indexed methods that are not available in other databases; has updated views and materialized views, triggers, foreign keys; supports functions and stored procedures, and other extensibility options, and has many extensions written by third parties. In addition to the ability to work with the main branded and open source databases, PostgreSQL supports migration from them, through its extensive support for the SQL standard and the available migration tools. Branded extensions in databases such as Oracle can be emulated using built-in and third-party open source compatibility extensions. Latest versions also provide replication of the database itself for availability and scalability.

PostgreSQL is cross-platform and runs on a variety of operating systems, including Linux, FreeBSD, macOS, Solaris, and Microsoft Windows. Beginning with Mac OS X 10.7 Lion Server, PostgreSQL is the standard default database, and PostgreSQL client tools come with the desktop version. The vast majority of Linux distributions have PostgreSQL available in supported packages.

PostgreSQL is developed by the PostgreSQL Global Development Group, a diverse group of many companies and individual contributors. This is free and open source software distributed under the terms of the PostgreSQL License, the free software license.

Because DBMS PostgreSQL is released under a liberal license, it can be used, modified, and distributed free of charge for any purpose, including personal, commercial, or academic.

Currently (version 9.4.5), PostgreSQL has the following limitations: [Reference 5]

Maximum database size No restrictions
Maximum table size 32 TB
Maximum record size 1,6 TB
Maximum field size 1 Gb
Maximum records in the table No restrictions
Maximum fields in the record 250—1600, depending on field types
Maximum indexes in the table No restrictions


The strengths of PostgreSQL are:

  • high-performance and reliable transaction and replication mechanisms;
  • Extensible embedded programming language system: PL/pgSQL, [PL/Perl, PL/Python and PL/Tcl are supported as standard; in addition, you can use PL/Java, PL/ PHP, PL/Py, PL/R, PL/[[Ruby (programming language)|Ruby], PL/Scheme , PL/sh and PL/V8, as well as support for loading C-compatible modules [Reference 4];
  • support from many programming languages: C\ C++, Java, Perl, Python, Ruby, ECPG, Tcl, PHP and others.
  • inheritance;
  • easy extensibility.

Development

PostgreSQL is developed by the international development team (PGDG), which includes both programmers themselves and those responsible for promoting PostgreSQL (Public Relation), maintaining servers and services, writing and translating documentation, for a total of about 200 people in 2005 . In other words, PGDG is an established team that is completely self-sufficient and stable. The project develops according to the generally accepted scheme among open projects, when priorities are determined by real needs and opportunities. At the same time, a public discussion of all the issues on the mailing list is practiced, which practically excludes the possibility of incorrect and inconsistent decisions.

This also applies to those proposals that already have or are counting on the financial support of commercial companies.

The cycle of working on a new version usually lasts 10-12 months (now there is a discussion about a shorter cycle of 2-3 months) and consists of several stages. [Reference 1]

Other storage features

Referential integrity constraints, including foreign key constraints, column constraints, and row checks Binary and text large object storage Tabular Column Comparison Online backup Time recovery implemented using record based record In-place upgrade with pg_upgrade in less downtime (supports upgrades from 8.3.x and higher) [Reference 1]


Installation and Setup

This section provides instructions for installing and configuring PostgreSQL for various OS [Reference 6]

Installation

If the installation takes place on macOS, then the installation process can be started with the command: [Reference 7]

brew install postgresql

On Linux DBMS is set as follows:

sudo apt-get install postgresql postgresql-contrib

After everything is loaded and installed, you can check whether everything is in order and what version of PostgreSQL is worth. To do this, run the following command:

postgres --version

Installation instructions in digital format

Customization

Working with PostgreSQL can be done via the command line (terminal) using the psql utility, the PostgreSQL command line tool. [Reference 7] You must enter the following command:

psql postgres (to exit the interface, use \q)

This command starts the psql utility. Although there are many third-party tools for administering PostgreSQL, there is no need to install them, since psql is convenient and works fine.

If you need help, enter \help (или -h) in psql terminal. A list of all available help options appears. You can enter \help [name of command], if you need help with a specific team. For example, if you enter \help UPDATE in the psql console, the command syntax is displayed update.

Description: update rows of a table
[ WITH [ RECURSIVE ] with_query [, ...] ]
UPDATE [ ONLY ] table_name [ * ] [ [ AS ] alias ]
    SET { column_name = { expression | DEFAULT } |
          ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] ) |
          ( column_name [, ...] ) = ( sub-SELECT )
        } [, ...]
    [ FROM from_list ]
    [ WHERE condition | WHERE CURRENT OF cursor_name ]
    [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]

First you need to check the availability of existing users and databases. Run the following command to list all databases:

\list or \l
Figure 1 - The result of the operation \l

In the picture above, you see the three default databases and the postgres superuser, which are created when you install PostgreSQL.

To list all users, run the command \du. The postgres user attribute tells us that it is the superuser.

Figure 2 - The result of the operation \du

Basic database operations

To perform basic actions in DBMS, you need to know the database query language SQL.

Database creation

To create a database, use the command: [Reference 7]

create database

In the example below, a database called proglib_db is created.

Figure 3 - Creating a database named proglib_db

If you forget the semicolon at the end of the request, the “=” sign in the postgres prompt is replaced by “-”. This often indicates that you need to complete (add) the request.

Figure 4 - error output when creating the database

Figure 4 shows the error message due to the fact that in our case the base has already been created.

Creating a new user

To create a user there is a command:

create user

The following example creates a user named author.

Figure 5 - Creating a user named author

When the user is created, the message CREATE ROLE appears. Each user has his own rights (access to databases, editing, creating databases / users, etc.). You may have noticed that the Attributes column for the author user is empty. This means that the user author does not have administrator rights. It can only read data and cannot create another user or database.

You can set a password for an existing user. [Reference 7] The command will cope with this task. \password:

postgres=#\password author

To set a password when creating a user, you can use the following command:

postgres=#create user author with login password 'qwerty';

Remove base or user

For this operation, use the command drop: she can delete both user and database. [Reference 7]

drop database <database_name>
drop user <user_name>

This command should be used very carefully, otherwise the deleted data will be lost, and you can restore it only from the backup (if it was).

If you specify psql postgres (without a username), then postgreSQL will launch you under the standard superuser (postgres). To log in to the database with a specific user, you can use the following command:

psql [database_name] [user_name]

Log in to the proglib_db database under the user author. Run the \q command to exit the current database, and then execute the following command:

Figure 6 - Entering the proglib_db database

References

  1. 1.0 1.1 1.2 What is PostgreSQL? // Sternberg Astronomical Institute Moscow University. Date updated: 01.01.2005. URL: http://www.sai.msu.su/~megera/postgres/talks/what_is_postgresql.html (reference date: 17.01.2019).
  2. PostgreSQL // Wikipedia. [2019-2019]. Date updated: 07.04.2018. URL: https://ru.wikipedia.org/wiki/PostgreSQL (reference date: 17.01.2019).
  3. 2004 Editors' Choice Awards | Linux Journal // Linux Journal, LLC. [2019-2019]. Date updated: 01.08.2004. URL: https://www.linuxjournal.com/article/7564 (reference date: 17.01.2019).
  4. 4.0 4.1 4.2 4.3 4.4 4.5 4.6 4.7 PostgreSQL 11.1 Documentation // PostgreSQL. [1996-2019]. URL: https://www.postgresql.org/docs/current/ (reference date: 17.01.2019).
  5. About // PostgreSQL. [1996-2019]. URL: http://www.postgresql.org/about/ (reference date: 17.01.2019).
  6. Introduction to PostgreSQL // Metanit. [2012-2019]. Date updated: 15.03.2018. URL: https://metanit.com/sql/postgresql/1.1.php (reference date: 17.01.2019).
  7. 7.0 7.1 7.2 7.3 7.4 Working with PostgreSQL: from full zero to full enlightenment // Programmer’s Library. [2016-2018]. URL: https://proglib.io/p/learn-postgresql (reference date: 22.12.2018).