
For decades, relational database management systems were conceived as a one-size-fits-all solution for storing and retrieving data. But the growing need for scalability and new application requirements created new problems for traditional RDBMSs, including dissatisfaction with the one-size-fits-all approach in a number of scalable applications.

The answer was a new generation of lightweight, high-performance databases created to challenge the dominance of relational databases.

A major driver of the NoSQL movement is the fact that different web, enterprise, and cloud applications place different requirements on their databases.

For example, for high-traffic websites such as eBay, Amazon, Twitter, or Facebook, scalability and high availability are core requirements that cannot be compromised. For these applications, even the slightest downtime can have significant financial consequences and affect customer trust.

Thus, an off-the-shelf database solution often has to address not only transactional integrity, but also higher data volumes, increasing data velocity and throughput, and a growing variety of formats. New technologies emerged that specialize in optimizing one or two of these aspects at the expense of the others. Postgres with JSON takes a more comprehensive approach to user needs, handling the majority of NoSQL operational workloads more successfully.

Comparing document-oriented and relational databases


A smart approach to new technology relies on a careful assessment of your requirements and of the tools available to meet them. The table below compares the characteristics of a non-relational, document-oriented database (such as MongoDB) with the characteristics of Postgres's relational / document-oriented capabilities, to help you find the right solution for your needs.
Feature | MongoDB | PostgreSQL
--- | --- | ---
Start of open-source development | 2009 | 1995
Schemas | Dynamic | Static and dynamic
Hierarchical data support | Yes | Yes (since 2012)
Key-value data support | Yes | Yes (since 2006)
Relational data / normalized storage support | No | Yes
Data constraints | No | Yes
Data joins and foreign keys | No | Yes
Powerful query language | No | Yes
Transaction support and multiversion concurrency control (MVCC) | No | Yes
Atomic transactions | Within a document | Across the whole database
Supported web development languages | JavaScript, Python, Ruby, and others … | JavaScript, Python, Ruby, and others …
Support for common data formats | JSON (document), key-value, XML | JSON (document), key-value, XML
Spatial data support | Yes | Yes
Easiest way to scale | Horizontal scaling | Vertical scaling
Sharding | Simple | Difficult
Server-side programming | No | Many procedural languages, such as Python, JavaScript, C, C++, Tcl, Perl, and many, many others
Easy integration with other data sources | No | Foreign data wrappers for Oracle, MySQL, MongoDB, CouchDB, Redis, Neo4j, Twitter, LDAP, files, Hadoop, and others …
Business logic | Distributed across client applications | Centralized with triggers and stored procedures, or distributed across client applications
Availability of training resources | Hard to find | Easy to find
Primary use case | Big data (billions of records) with many parallel updates, where data integrity and consistency are not required | Transactional and operational applications that benefit from a normalized form, joins, data constraints, and transaction support

Source: the EnterpriseDB website.

A document in MongoDB is automatically given an _id field if one is not present. When you want to retrieve that document, you can use _id — it behaves exactly like a primary key in a relational database. PostgreSQL stores data in table columns; MongoDB stores it as JSON documents. On the one hand, MongoDB looks like a great solution, since all the data spread across several PostgreSQL tables can live in a single JSON document. This flexibility comes from the absence of constraints on the data structure, which can be really attractive at first and truly terrifying on a large database where some records have wrong values or empty fields.

PostgreSQL 9.3 ships with excellent functionality that lets you turn it into a NoSQL database, with full transaction support and storage of JSON documents with constraints on data fields.

Simple example


I will show how to do this using a very simple Employees table. Each employee has a name, a description, an id number, and a salary.

PostgreSQL version

A simple table in PostgreSQL might look like this:

CREATE TABLE emp (
     id SERIAL PRIMARY KEY,
     name TEXT,
     description TEXT,
     salary DECIMAL(10,2)
);

This table lets us add employees like this:

INSERT INTO emp (name, description, salary) VALUES ('raju', 'HR', 25000.00);

Unfortunately, the table above also lets us add rows with meaningless values:

INSERT INTO emp (name, description, salary) VALUES (null, '-34', -100.00);

This can be avoided by adding constraints to the database. Suppose we always want a non-empty unique name, a non-empty description, and a non-negative salary. The table with such constraints looks like this:

CREATE TABLE emp (
    id SERIAL PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    description TEXT NOT NULL,
    salary DECIMAL(10,2) NOT NULL,
    CHECK (length(name) > 0),
    CHECK (description IS NOT NULL AND length(description) > 0),
    CHECK (salary >= 0.0)
);

Now all operations, such as inserting or updating a record, that violate any of these constraints will fail with an error. Let's check:

INSERT INTO emp (name, description, salary) VALUES ('raju', 'HR', 25000.00);
--INSERT 0 1
INSERT INTO emp (name, description, salary) VALUES ('raju', 'HR', -1);
--ERROR: new row for relation "emp" violates check constraint "emp_salary_check"
--DETAIL: Failing row contains (2, raju, HR, -1).

NoSQL version

In MongoDB, a row from the table above would look like the following JSON document:

{
    "id": 1,
    "name": "raju",
    "description": "HR",
    "salary": 25000.00
}

In the same way, in PostgreSQL we can save this record as a row in the emp table:

CREATE TABLE emp (
     data TEXT
);

This works the way most non-relational databases do: no checks, no errors about bad fields. As a result, you can transform the data however you want; the problems begin when your application expects the salary to be a number while in practice it is either a string or missing altogether.
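To see the problem concretely, here is a sketch (against the plain-TEXT emp table above) of inserts that this schema happily accepts:

```sql
-- Nothing is validated in a plain TEXT column:
-- truncated, syntactically broken JSON is accepted
INSERT INTO emp(data) VALUES ('{"name": "raju", "salary": ');
-- a completely non-JSON string is accepted too
INSERT INTO emp(data) VALUES ('not JSON at all');
```

Both statements succeed, and the application only discovers the damage later, when it tries to parse the stored value.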

Checking JSON

PostgreSQL 9.2 has a good data type for this purpose, called JSON. This type can only store valid JSON: values are checked for validity before being converted to this type.

Let's change the table definition to:

CREATE TABLE emp (
     data JSON
);

We can insert some valid JSON into this table:

INSERT INTO emp(data) VALUES('{
    "id": 1,
    "name": "raju",
    "description": "HR",
    "salary": 25000.00
}');
--INSERT 0 1
SELECT * FROM emp;
 { +
    "id": 1, +
    "name": "raju", +
    "description": "HR",+
    "salary": 25000.00 +
 }
--(1 row)

This will work, whereas inserting invalid JSON will end with an error:

INSERT INTO emp(data) VALUES('{
    "id": 1,
    "name": "raju",
    "description": "HR",
    "salary": 25000.00,
}');
--ERROR: invalid input syntax for type json

The formatting problem can be hard to spot (I added a comma after the last field, which JSON does not allow).

Checking fields

So we have a solution that looks almost like the first, pure-PostgreSQL one: our data is validated. That does not mean the data makes sense. Let's add checks to validate it. PostgreSQL 9.3 has powerful new functionality for manipulating JSON objects. There are dedicated operators for the JSON type that give you easy access to fields and values. I will use only the "->>" operator, but you can find more in the Postgres documentation.

Besides, I need to check the types of the fields, including the id field — something Postgres normally does for us thanks to column data type definitions. I will use a different syntax for these checks because I want to name them: that makes it much easier to find a problem in a specific field, rather than searching through the whole huge JSON document.

The table with constraints looks like this:

CREATE TABLE emp (
    data JSON,
    CONSTRAINT validate_id CHECK ((data->>'id')::integer >= 1 AND (data->>'id') IS NOT NULL ),
    CONSTRAINT validate_name CHECK (length(data->>'name') > 0 AND (data->>'name') IS NOT NULL )
);

The "->>" operator lets me extract the value of the required JSON field and check whether it exists and whether it is valid.
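As a minimal sketch of how "->>" is used for retrieval (field names as in the example above; note that "->>" always returns text, so typed comparisons need a cast):

```sql
-- "->>" extracts a field from the JSON document as text
SELECT data->>'name'               AS name,
       (data->>'salary')::numeric  AS salary
FROM emp
WHERE (data->>'id')::integer = 1;
```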

Let's insert JSON with an empty name:

INSERT INTO emp(data) VALUES('{
    "id": 1,
    "name": "", 
    "salary": 1.0
}');

--ERROR: new row for relation "emp" violates check constraint "validate_name"

There is one more problem: the name and id fields must be unique. This is easy to achieve as follows:

CREATE UNIQUE INDEX ui_emp_id ON emp((data->>'id'));
CREATE UNIQUE INDEX ui_emp_name ON emp((data->>'name'));

Now, if you try to insert a JSON document whose id already exists in the database, you will get the following error:

--ERROR: duplicate key value violates unique constraint "ui_emp_id"
--DETAIL: Key ((data ->> 'id'::text))=(1) already exists.
--ERROR: current transaction is aborted, commands ignored until end of transaction block
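A side benefit of such expression indexes is that lookups on those fields can use them. A minimal sketch (whether the planner actually chooses the index depends on table size and statistics):

```sql
-- The WHERE clause matches the indexed expression (data->>'id'),
-- so ui_emp_id can be used for the lookup
SELECT data FROM emp WHERE data->>'id' = '1';
```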

Performance

PostgreSQL handles the most demanding queries of the world's largest insurance companies, banks, brokerages, government institutions, and defense contractors today, just as it has for many years. PostgreSQL performance improvements are continuous, with annual releases, and they include improvements for its unstructured data types as well.



Source: EnterpriseDB white paper on using the NoSQL capabilities of Postgres.

Want to test the NoSQL performance of PostgreSQL yourself? Download pg_nosql_benchmark from GitHub.

This article is a translation of the original post at habrahabr.ru/post/272735/

