SAP HANA Architecture & Overview
SAP HANA is an in-memory database.
- It is a combination of hardware and software designed to process massive amounts of real-time data using in-memory computing.
- It combines row-based and column-based database technology.
- It’s best suited for performing real-time analytics, and developing and deploying real-time applications.
An in-memory database means all the data
is stored in memory (RAM). No time is wasted loading
data from hard disk to RAM, or keeping some data in RAM
and some temporarily on disk during processing. Everything is in memory all the time,
which gives the CPUs quick access to data for processing.
SAP HANA is equipped with a multi-engine
query processing environment that supports relational as well as
graph and text data within the same system. It provides features that
support high processing speed, very large data volumes and text
mining.
SAP has partnered with leading hardware vendors (HP, Fujitsu, IBM, Dell, etc.) to sell SAP-certified hardware for HANA.
SAP sells licenses and related
services for the SAP HANA product, which includes the SAP HANA database,
SAP HANA Studio and other software to load data into the database.
The SAP HANA database is developed in C++.
Currently SUSE Linux Enterprise Server x86-64 (SLES) 11 SP1 is the Operating System supported by SAP HANA.
No. Adding more memory to your current Oracle/Microsoft/Teradata
database may give you performance gains,
but HANA is not just a database with bigger RAM.
It is a combination of a lot of hardware
and software technologies. The way data is stored and processed by the
In-Memory Computing Engine (IMCE) is the true differentiator. Having
that data available in RAM is just the icing on the cake.
Row based tables:
- It is the traditional Relational Database approach
- It stores a table as a sequence of rows
Column based tables:
- It stores a table as a sequence of columns, i.e. the entries of a column are stored in contiguous memory locations.
- SAP HANA is particularly optimized for column-order storage.
SAP HANA supports both the row-based and the column-based approach.
The following figure explains the difference between the two storage mechanisms.
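As a rough illustration of the two storage mechanisms, the sketch below flattens the same small table in row order and in column order. The table contents and the function names `row_layout` and `column_layout` are made up for this example; it only models the order in which values would sit next to each other, not how SAP HANA actually manages memory.

```python
# A tiny table with three columns: (id, country, amount).
rows = [
    (1, "DE", 100),
    (2, "US", 250),
    (3, "DE", 75),
]


def row_layout(table):
    """Flatten the table record by record, as a row store would order it."""
    return [value for record in table for value in record]


def column_layout(table):
    """Flatten the table column by column, as a column store would order it."""
    return [value for column in zip(*table) for value in column]


print(row_layout(rows))     # values of one record are neighbours
print(column_layout(rows))  # values of one column are neighbours
```

In the row layout, reading one complete record is a single contiguous read. In the column layout, reading all amounts is a single contiguous read.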
Row based tables have advantages in the following circumstances:
- The application needs to only process a single record at one time (many selects and/or updates of single records).
- The application typically needs to access a complete record (or row).
- Neither aggregations nor fast searching are required.
- The table has a small number of rows (e.g. configuration tables, system tables).
Column based tables have advantages in the following circumstances:
- Analytic applications where aggregations are used and fast search and processing are required. In row based tables, all data in a row has to be read even if the query only needs data from a few columns.
Advantages:
- Faster data access:
Only affected columns have to be read during the selection process of a query. Any of the columns can serve as an index.
- Better compression:
Columnar data storage allows highly
efficient compression because the majority of the columns contain only
a few distinct values (compared to the number of rows).
- Better parallel processing:
In a column store, data is already
vertically partitioned. This means that operations on different columns
can easily be processed in parallel. If multiple columns need to be
searched or aggregated, each of these operations can be assigned to a
different processor core.
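To make the compression point concrete, here is a minimal sketch of dictionary encoding, one common way to compress a column that has few distinct values. The text above does not say which compression schemes are applied, so treat this as a generic illustration; the function names and the sample `country` column are assumptions made for the example.

```python
def dictionary_encode(column):
    """Replace each value with a small integer ID into a sorted dictionary."""
    dictionary = sorted(set(column))
    position = {value: i for i, value in enumerate(dictionary)}
    value_ids = [position[value] for value in column]
    return dictionary, value_ids


def dictionary_decode(dictionary, value_ids):
    """Rebuild the original column from the dictionary and the value IDs."""
    return [dictionary[i] for i in value_ids]


# Eight rows, but only three distinct values.
country = ["DE", "US", "DE", "DE", "FR", "US", "DE", "DE"]
dictionary, value_ids = dictionary_encode(country)

print(dictionary)  # each distinct value is stored once
print(value_ids)   # one small integer per row
```

The fewer distinct values a column has relative to its row count, the smaller the dictionary and the fewer bits each value ID needs.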
SQL queries involving aggregation
functions take a lot of time on huge amounts of data in a row store because every
single row is touched to collect the data for the query response.
In columnar tables, the values of a column are
stored physically next to each other, significantly increasing the speed
of certain data queries. Data is also compressed, enabling shorter
loading times.
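The difference can be sketched with a simplified cost model that counts how many stored values an aggregation has to read. The sample orders, the field names and the "values touched" counter are assumptions for illustration only; a real engine reads memory pages, not Python objects.

```python
orders = [
    {"order_id": 1, "customer": "A", "country": "DE", "amount": 100},
    {"order_id": 2, "customer": "B", "country": "US", "amount": 250},
    {"order_id": 3, "customer": "A", "country": "DE", "amount": 75},
]


def sum_row_store(records, field):
    """Sum one field when every record has to be read in full."""
    total = 0
    touched = 0
    for record in records:
        touched += len(record)  # the whole row is read
        total += record[field]
    return total, touched


def sum_column_store(columns, field):
    """Sum one field when each column is stored on its own."""
    values = columns[field]  # only the requested column is read
    return sum(values), len(values)


# The same data, pivoted into one list per column.
columns = {name: [record[name] for record in orders] for name in orders[0]}

print(sum_row_store(orders, "amount"))
print(sum_column_store(columns, "amount"))
```

Both calls return the same total, but in this model the row-based scan reads every field of every record while the column-based scan reads only the `amount` values.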
Conclusion:
To enable fast on-the-fly aggregations,
ad-hoc reporting, and to benefit from compression mechanisms it is
recommended that transaction data is stored in a column-based table.
The SAP HANA database allows joining
row-based tables with column-based tables. However, it is more efficient
to join tables that are located in the same row or column store. For
example, master data that is frequently joined with transaction data
should also be stored in column-based tables.
Since the SAP HANA database resides
entirely in-memory all the time, additional complex calculations,
functions and data-intensive operations can happen on the data directly
in the database. Hence materialized aggregations are not required.
It also provides benefits like:
- Simplified data model
- Simplified application logic
- Higher level of concurrency
With the availability of multi-core CPUs,
higher CPU execution speeds can be achieved. HANA column-based storage
makes it easy to execute operations in parallel using multiple processor
cores. In a column store, data is already vertically partitioned. This
means that operations on different columns can easily be processed in
parallel. If multiple columns need to be searched or aggregated, each of
these operations can be assigned to a different processor core. In
addition, operations on one column can be parallelized by partitioning
the column into multiple sections that can be processed by different
processor cores. With the SAP HANA database, queries can be executed
rapidly and in parallel.
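The idea of partitioning one column into sections can be sketched as follows. This is only a structural illustration with hypothetical helper names (`split_into_sections`, `parallel_sum`); Python threads do not give true CPU parallelism for this kind of work, and nothing here reflects how SAP HANA schedules its cores.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_sections(column, sections):
    """Cut a column into at most `sections` contiguous slices."""
    size = max(1, -(-len(column) // sections))  # ceiling division
    return [column[i:i + size] for i in range(0, len(column), size)]


def parallel_sum(column, workers=4):
    """Aggregate each section independently, then combine the partial sums."""
    sections = split_into_sections(column, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, sections))
    return sum(partial_sums)


amounts = list(range(1, 1001))
print(parallel_sum(amounts))
```

Each worker only needs its own slice of the column, so the sections can be processed without coordination until the final combining step.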
The SAP HANA database is developed in C++
and runs on SUSE Linux Enterprise Server. It consists of
multiple servers: the Index Server, Name Server, Statistics
Server, Preprocessor Server and XS Engine. The most important
component is the Index Server.
Index Server:
- The index server is the main SAP HANA database component.
- It contains the actual data stores and the engines for processing the data.
- The index server processes incoming SQL or MDX statements in the context of authenticated sessions and transactions.
Persistence Layer:
The database persistence layer is
responsible for durability and atomicity of transactions. It ensures
that the database can be restored to the most recent committed state
after a restart and that transactions are either completely executed or
completely undone.
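A generic way to get this behaviour is to replay a log after a restart and apply only the changes of committed transactions. The sketch below shows that general technique; the log format, the `recover` function and the sample entries are invented for the example, and the text above does not describe how SAP HANA implements its persistence layer.

```python
def recover(log):
    """Rebuild state from a log, applying only committed transactions."""
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    state = {}
    for entry in log:
        if entry[0] == "write" and entry[1] in committed:
            _, _txn, key, value = entry
            state[key] = value
    return state


log = [
    ("write", "t1", "balance_a", 50),
    ("write", "t1", "balance_b", 150),
    ("commit", "t1"),
    ("write", "t2", "balance_a", 0),  # t2 had not committed before the restart
]

print(recover(log))
```

Transaction `t1` is restored completely, while the uncommitted write of `t2` is dropped, so the recovered state matches the most recent committed state.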
Preprocessor Server:
The index server uses the preprocessor
server for analyzing text data and extracting the information on which
the text search capabilities are based.
Name Server:
The name server owns the information
about the topology of SAP HANA system. In a distributed system, the name
server knows where the components are running and which data is located
on which server.
Statistics Server:
The statistics server collects
information about status, performance and resource consumption from the
other servers in the system. The statistics server also provides a
history of measurement data for further analysis.
Session and Transaction Manager:
The transaction manager coordinates
database transactions, and keeps track of running and closed
transactions. When a transaction is committed or rolled back, the
transaction manager informs the involved storage engines about this
event so they can execute necessary actions.
XS Engine:
XS Engine is an optional component. Using XS Engine, clients can connect to the SAP HANA database to fetch data via HTTP.
In traditional data warehouses, such as
SAP BW, a lot of pre-aggregation is done for quick results. That is, the
administrator (IT department) decides which information might be needed
for analysis and prepares the result for the end users. This results in
fast performance, but the end user does not have flexibility.
Performance drops dramatically if
the user wants to do analysis on some data that is not already
pre-aggregated. With SAP HANA and its speedy engine, no pre-aggregation
is required. Users can perform any kind of operation in their
reports and do not have to wait hours to get the data ready for
analysis.
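The trade-off can be sketched in a few lines. The sample `sales` records and the `aggregate` helper are assumptions made for this example: a pre-aggregated result answers only the question it was built for, while aggregating the detail data at query time can answer any grouping.

```python
from collections import defaultdict

sales = [
    {"region": "EMEA", "product": "A", "amount": 100},
    {"region": "EMEA", "product": "B", "amount": 50},
    {"region": "APJ", "product": "A", "amount": 70},
]


def aggregate(records, key):
    """Total the amounts per distinct value of `key`."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


# Pre-aggregation: prepared ahead of time for one anticipated question.
sales_by_region = aggregate(sales, "region")

# On-the-fly: any other grouping is computed from detail data when asked.
sales_by_product = aggregate(sales, "product")

print(sales_by_region)
print(sales_by_product)
```

If only `sales_by_region` had been stored, the per-product question could not be answered without going back to the detail records, which is the flexibility problem described above.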
131072
127 characters
127 characters
1000
1000
1000
Limited by storage size. RS (row store): 1TB/sizeof(row); CS (column store): 2^31 * number of partitions
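As a worked example of the two expressions, the sketch below assumes they are per-table row limits for the row store (RS) and the column store (CS), and that "1TB" means 2^40 bytes. Both points are assumptions, as are the function names; the authoritative values for a given system come from M_SYSTEM_LIMITS.

```python
def column_store_row_limit(partitions):
    """CS: 2^31 rows per partition, times the number of partitions."""
    return 2**31 * partitions


def row_store_row_limit(row_size_bytes, storage_bytes=2**40):
    """RS: 1TB / sizeof(row), assuming 1TB = 2^40 bytes."""
    return storage_bytes // row_size_bytes


print(column_store_row_limit(1))   # a single, unpartitioned table
print(column_store_row_limit(4))   # the same table split into 4 partitions
print(row_store_row_limit(1024))   # rows of 1 KiB each
```

Under these assumptions, partitioning a column store table is what raises its row limit, while the row store limit depends only on how wide each row is.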
M_SYSTEM_LIMITS