Friday, 25 May 2018

The SAP HANA JSON Document Store – Introduction Part-1

Overview


The SAP HANA JSON Document Store (also known as DocStore or Document Store) is a feature introduced with SAP HANA 2.0 SPS 01. It combines a relational and a document-oriented database into a hybrid technology that stands out for several reasons: it is ACID compliant and fully integrated with SAP HANA in terms of access, querying and administration.

The embedded Document Store belongs to the group of “NoSQL” databases, more precisely to the document-oriented ones. This type of store keeps semi-structured documents (mostly JSON or XML) in collections without an explicit structure, which offers high flexibility and compactness.


Besides making a document-oriented database available directly inside SAP HANA, fully integrated and without the need to operate another independent database in parallel, the Document Store features full ACID properties. A single transaction may therefore span all stores of SAP HANA and offer the same qualities in terms of atomicity, consistency, transaction isolation and durability. Given that the Document Store is a regular SAP HANA service, the known features Backup & Recovery, System Replication and Failover work out of the box without additional administrative overhead. SAP HANA allows interactions, especially joins, between collections and relational database objects like tables. Furthermore, with complex path expressions it’s possible to extract relevant portions of a document.
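
To make the cross-store behaviour more concrete, here is a minimal sketch of one transaction that writes to a relational table and to a collection and then joins the two. The table Orders, the collection Customers and all field names are hypothetical. The sketch also assumes that auto-commit is switched off and that a collection has to be exposed through a WITH clause before it can be joined, so check the documentation of your revision before relying on either point. The individual statements are explained in the sections that follow.

```sql
-- Hypothetical objects: a relational table and a collection in the same schema.
CREATE COLUMN TABLE Orders (customer_name NVARCHAR(100), amount DECIMAL(10,2));
CREATE COLLECTION Customers;

-- Both writes belong to the same transaction (assuming auto-commit is off).
INSERT INTO Orders VALUES ('John Doe', 99.90);
INSERT INTO Customers VALUES({"name": 'John Doe', "address": {"city": 'Berlin'}});
COMMIT;

-- Join: the collection is wrapped in a WITH clause first (assumption, see above).
WITH c AS (SELECT "name", "address"."city" AS "city" FROM Customers)
SELECT c."name", c."city", o.amount
  FROM c INNER JOIN Orders AS o ON c."name" = o.customer_name;
```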

Terms


Besides the known terms like tables or schemas, this blog and the documentation of the Document Store use some (new) terms, which are explained below:

Semi-structured data: Data which is not fixed in its structure but carries the structure information in itself. In contrast, structured data like tables has a constant or fixed structure which must be defined before inserting data.

Collection: A collection holds multiple documents and is assigned to a schema. This is comparable to a table with the difference that a collection doesn’t have a predefined structure (column definition).

Document: A document in the Document Store is a semi-structured document in the JSON format. Such a document is like a row in a table. In this analogy the keys of the JSON document are the columns of the table.

Statement Examples


Since the Document Store is used in a relational database context, SQL is used as the query language. For that purpose, some new expressions and keywords were introduced to extend SQL for the needs of the Document Store. The following sections illustrate the most commonly used statements.

Enablement of the Document Store


Since the document store is implemented as an additional store in SAP HANA that comes with its own process, it has to be enabled by the administrator in the SYSTEMDB for a specific tenant.

ALTER DATABASE <database> ADD 'docstore';

Create a collection

This statement creates a new collection called MyCollection in the current schema. This is like CREATE TABLE, but without defining the column characteristics. Users can create as many collections as needed.

CREATE COLLECTION MyCollection;

Drop collection

By using the DROP COLLECTION statement, the whole collection will be deleted. This statement behaves like the known DROP statements.

DROP COLLECTION MyCollection;

Insert

The insert statement of the document store takes one JSON document as its argument; unlike a relational INSERT, it has no optional column list. The new document must be valid JSON, but documents may differ in their keys and structure.

INSERT INTO MyCollection VALUES({
  "name":'John Doe',
  "address": {
    "city": 'Berlin',
    "street": 'Street 22'
  }
});
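
Because a collection has no column definition, a second document with different keys can go into the same collection. A small sketch in the same syntax as above; the name and the "age" field are made up for illustration:

```sql
-- No "address" this time, but an additional "age" field.
INSERT INTO MyCollection VALUES({
  "name": 'Jane Roe',
  "age": 42
});
```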

Select

Selecting values from a collection is similar to the selection from a table. Furthermore, it is possible to access nested fields via a path by using the dot operator. The statement is tolerant to non-existing fields.

SELECT "name", "address"."city" AS "city" FROM MyCollection WHERE "name" = 'John Doe';

This returns a result set with the columns name and city where the name equals John Doe.
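
The tolerance towards non-existing fields can be tried with a path that no document contains. In this sketch "address"."zip" is a hypothetical field that was never inserted. The statement is expected to run without an error and to leave that column empty (presumably NULL) for every document that lacks the field.

```sql
-- "zip" does not exist in any document of MyCollection.
SELECT "name", "address"."zip" AS "zip" FROM MyCollection;
```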

Update

To change existing data, use the update statement. Besides simply updating values, this operation can be used for adding or deleting fields or for replacing whole documents.

UPDATE MyCollection SET "address"."city" = 'Munich' WHERE "name" = 'John Doe';
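
Adding and removing a field might look like the following sketch. The "zip" field is hypothetical, and the UNSET keyword for removing a field is an assumption that should be verified against the SQL reference of your revision.

```sql
-- Add a field the document did not have before by setting a new path.
UPDATE MyCollection SET "address"."zip" = '80331' WHERE "name" = 'John Doe';

-- Remove the field again (UNSET is assumed to be available).
UPDATE MyCollection UNSET "address"."zip" WHERE "name" = 'John Doe';
```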

Delete

As the statement name implies, it deletes documents from a collection.

DELETE FROM MyCollection WHERE "name" = 'John Doe';

Conclusion


SAP HANA already provides capabilities for graph, spatial and hierarchical data, and of course for relational tables. The document store adds to this set of capabilities. Applications built on SAP HANA can therefore use the best of each database technology and, in particular, mix the different technologies with the well-known relational world in an intuitive way. This brings several advantages, such as a flexible and dynamic way of storing data and the ability to use both database technologies at the same time. Overall, it reduces administration overhead, since only one database needs to be maintained, and it opens up new options for application development.