Today I am glad to announce the course "Hacking PostgreSQL": 16 sessions in which we will explore the architecture of this open-source DBMS together and learn to make changes at the source code level. The course will be held in Moscow, at the Postgres Professional office, and is planned to start in February 2016. Lectures will begin right after the February pgconf.ru conference and will take place once a week in the evening. We will publish the videos and lecture materials as they are processed.
The course is built on the personal experience of our company's developers, conference materials, articles, and a careful reading of the documentation and source code. It is aimed first of all at beginning PostgreSQL core developers. It should also interest DBAs who sometimes have to dig into the code, and anyone curious about the architecture of a large system who wants to know "How does it actually work?"
We are announcing the course well in advance so that we can prepare the lectures with your comments in mind, and also because there are prerequisites:
- SQL. No specific knowledge is required, but you should know how indexes, transactions and so on work from the SQL side. Suggested materials: PostgreSQL Tutorial and SQL Tutorial
- C. There will be a lot of code. If you only want to watch, it is enough to be able to read it. To do the practical assignments, it is better to refresh your knowledge, for example with the course Practical Programming in C
- Basic data structures and algorithms. You can read up on them here and watch a course here.
We want to create an open knowledge base about DBMS internals, using PostgreSQL as the example. We hope that lively discussions with administrators will take mutual understanding to a new level, and that users and developers will finally be able to speak the same language. Even while preparing the course, we re-read the code, fix small defects and outdated comments, and find interesting problems that can be added to the TODO list.
Nobody can cover everything, especially in 16 sessions, though we tried very hard. As a result, the course program is as follows:
1. Architecture overview. The first lecture gives a general idea of the PostgreSQL subsystems and how they interact, and briefly defines the terms used in the following lectures.
2. The PostgreSQL community and developer tools. A lyrical digression on how the PostgreSQL community is organized, how international development is coordinated, what steps a patch must go through to be accepted, and some small tips that are useful to a beginning core developer. It also includes an overview of the tools you need to be able to use.
3. Extensibility. The third lecture walks step by step through creating an extension (contrib). This is the easiest and most natural way to join development and add new functionality to PostgreSQL. We will also look at the internal extensibility of Postgres, using the addition of a new data type as an example.
4. Source code overview. In this lecture we will trace the execution path of different queries, from receiving the query text to returning the result.
5. Code conventions. To understand the source code of the system, let alone develop it, you need to become familiar with the conventions it follows. This lecture covers the Datum data type, handling of variable-length objects, various macros, and function calling conventions.
6. System catalog. The system catalog contains metadata about all objects in the system. Besides information on user tables, functions and triggers, it also stores data on data types, operators, access methods, and much more. The lecture gives an overview of the main catalog tables and the interface for working with them.
7. Physical data representation. After this lecture you will know how attributes are laid out in a row, rows on a page, pages in a table, and tables in a database, as well as how Postgres handles data alignment and the storage of large attributes. This will help you see some limitations of the architecture and understand (and perhaps invent new) techniques for designing an optimal database schema.
8. Memory management. For convenience and efficiency, PostgreSQL uses its own analogs, palloc/pfree, instead of the standard C functions malloc/free. This lecture covers how memory contexts are organized, how to use them correctly in code, and which situations can lead to unexpectedly high memory consumption.
9. Shared memory and locking. In this lecture you will learn how the memory manager in PostgreSQL is organized, how many different types of locks are used to make concurrent transactions work correctly, and what the memory settings in postgresql.conf actually change.
10. Nodes & Trees. In PostgreSQL, information about an SQL query is held in Node-like structures. They serve as the nodes of the parse tree, the query tree, and the plan tree produced by the planner. In this lecture we will look at the main node types and go through the algorithms that work on these trees.
11. A PostgreSQL core patch
12. Debugging. Functional and performance testing. In this lecture we will talk about the debugging and testing tools that can be used to investigate PostgreSQL, and about how to stop comparing apples to oranges and start testing performance properly.
13. Transactions. MVCC and VACUUM
14. WAL. Recovery and replication. WAL stands for Write-Ahead Log. In this lecture we will try to cover the most useful things about its format, usage, and settings.
15. Indexes. A talk on how the different index types in PostgreSQL are organized, with a discussion of the underlying data structures and advice on using and maintaining indexes effectively. It is also an invitation to discuss prospects and new development ideas with the developers.
16. Development trends in DBMSs in general and PostgreSQL in particular. The final lecture of the course, in which we will discuss the prospects for PostgreSQL development: different kinds of clusters, column storage, in-memory structures, parallel query processing, hints, spatial data, and other interesting problems that are too large in scope for this course.
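To give a taste of the kind of code the course deals with, here is a toy sketch in plain C of the idea behind memory contexts from lecture 8. It is not PostgreSQL's implementation: the names ToyContext, toy_alloc, toy_context_reset and toy_context_delete are hypothetical, and the real palloc/pfree machinery is considerably more elaborate. The sketch only shows the core idea that every allocation belongs to a context, so the whole context can be released at once instead of freeing each pointer by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunk header; the union keeps the payload suitably aligned. */
typedef union ToyChunk {
    union ToyChunk *next;   /* next chunk owned by the same context */
    max_align_t     align;  /* forces worst-case alignment (C11) */
} ToyChunk;

/* Hypothetical context: just a list of the chunks it owns. */
typedef struct ToyContext {
    ToyChunk *chunks;       /* singly linked list of live chunks */
    size_t    nchunks;      /* number of chunks currently owned */
} ToyContext;

/* Create an empty context; returns NULL if malloc fails. */
ToyContext *toy_context_create(void)
{
    ToyContext *ctx = malloc(sizeof(ToyContext));
    if (ctx != NULL) {
        ctx->chunks = NULL;
        ctx->nchunks = 0;
    }
    return ctx;
}

/* Allocate size bytes owned by ctx; returns NULL if malloc fails. */
void *toy_alloc(ToyContext *ctx, size_t size)
{
    assert(ctx != NULL);
    ToyChunk *chunk = malloc(sizeof(ToyChunk) + size);
    if (chunk == NULL)
        return NULL;
    chunk->next = ctx->chunks;  /* push onto the context's list */
    ctx->chunks = chunk;
    ctx->nchunks++;
    return chunk + 1;           /* payload starts right after the header */
}

/* Free every chunk owned by ctx but keep the context itself usable. */
void toy_context_reset(ToyContext *ctx)
{
    assert(ctx != NULL);
    ToyChunk *chunk = ctx->chunks;
    while (chunk != NULL) {
        ToyChunk *next = chunk->next;
        free(chunk);
        chunk = next;
    }
    ctx->chunks = NULL;
    ctx->nchunks = 0;
}

/* Free all chunks and then the context itself. */
void toy_context_delete(ToyContext *ctx)
{
    toy_context_reset(ctx);
    free(ctx);
}
```

A caller would create one context per unit of work, allocate freely with toy_alloc while processing it, and call toy_context_reset at the end. A forgotten individual free then cannot leak memory beyond the lifetime of the context, but a context that lives too long can also accumulate memory unexpectedly, which is the kind of situation the lecture description mentions.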
The course will be coordinated and taught by me, Anastasia Lubennikova, a PostgreSQL core developer.
Please leave your suggestions and wishes in the comments.
This article is a translation of the original post at habrahabr.ru/post/273623/