
OLAP program. OLAP Systems

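1. The place of OLAP in the enterprise information structure.

2. Operational analytical data processing.

3. Requirements for online analytical processing tools.

4. Classification of OLAP products.

5. How OLAP clients work.

6. Choosing the OLAP application architecture.

7. Areas of application of OLAP technologies.

8. An example of using OLAP technologies for sales analysis.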

1. The place of OLAP in the enterprise information structure.

The term "OLAP" is inextricably linked with the term "data warehouse" (Data Warehouse).

Data enters the warehouse from operational systems (OLTP systems), which are designed to automate business processes. The warehouse can also be replenished from external sources, such as statistical reports.

The task of the warehouse is to provide the "raw material" for analysis in one place and in a simple, understandable structure.

There is another reason that justifies a separate warehouse: complex analytical queries against operational data slow down the day-to-day work of the company, locking tables for long periods and consuming server resources.

A data warehouse need not be understood as a gigantic accumulation of data; what matters is that it is convenient for analysis.

Centralization and convenient structuring are not all that an analyst needs. The analyst also requires a tool for viewing and visualizing information. Traditional reports, even those built on a single warehouse, lack one thing: flexibility. They cannot be "rotated", "expanded", or "collapsed" to obtain the desired view of the data. What is needed is a tool that makes expanding and rotating data simple and convenient. OLAP is such a tool.

Although OLAP is not a required attribute of a data warehouse, it is increasingly used to analyze the information accumulated in the warehouse.

The place of OLAP in the enterprise information structure is shown in Fig. 1.

Figure 1. The place of OLAP in the enterprise information structure

Operational data is collected from various sources, cleaned, integrated, and loaded into a relational warehouse. At this stage it is already available for analysis with various reporting tools. The data is then prepared, in full or in part, for OLAP analysis. It can be loaded into a dedicated OLAP database or left in the relational warehouse. The most important element is metadata, that is, information about the structure, location, and transformation of the data. Metadata ensures that the various components of the warehouse interact effectively.

In summary, OLAP can be defined as a set of tools for multidimensional analysis of the data accumulated in the warehouse.

2. Operational analytical data processing.

The concept of OLAP rests on the principle of multidimensional data representation. In 1993 E. F. Codd examined the shortcomings of the relational model, pointing above all to the impossibility of "combining, viewing, and analyzing data from the point of view of multiple dimensions, that is, in the way most understandable to corporate analysts," and defined general requirements for OLAP systems that extend the functionality of relational DBMSs and include multidimensional analysis as one of their characteristics.

According to Codd, a multidimensional conceptual view of data is a multiple perspective consisting of several independent dimensions along which particular sets of data can be analyzed.

Simultaneous analysis along several dimensions is defined as multidimensional analysis. Each dimension includes consolidation paths, each consisting of a series of successive levels of generalization, where each higher level corresponds to a greater degree of aggregation of data along that dimension.

For example, the Performer dimension can be defined by a consolidation path consisting of the levels "enterprise - division - department - employee". The Time dimension may even include two consolidation paths, "year - quarter - month - day" and "week - day", because counting time by months and by weeks is incompatible. This makes it possible to choose freely the desired level of detail for each dimension.

The drill-down operation corresponds to moving from higher levels of consolidation to lower ones; conversely, the roll-up operation means moving from lower levels to higher ones (Fig. 2).


Figure 2. Dimensions and data consolidation paths
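The roll-up and drill-down operations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fact rows, the `LEVELS` list, and the function names are hypothetical, and a single "year - quarter - month - day" consolidation path of the time dimension is used.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, month, day, amount).
FACTS = [
    (2023, 1, 1, 15, 100.0),
    (2023, 1, 2, 3, 50.0),
    (2023, 2, 4, 20, 70.0),
    (2024, 1, 1, 9, 30.0),
]

# One consolidation path of the time dimension, most aggregated level first.
LEVELS = ["year", "quarter", "month", "day"]


def aggregate(facts, level):
    """Sum amounts at the given level of the consolidation path."""
    depth = LEVELS.index(level) + 1  # number of key parts that identify a member
    totals = defaultdict(float)
    for *key, amount in facts:
        totals[tuple(key[:depth])] += amount
    return dict(totals)


def roll_up(level):
    """Return the next higher (more aggregated) level, or None at the top."""
    i = LEVELS.index(level)
    return LEVELS[i - 1] if i > 0 else None


def drill_down(level):
    """Return the next lower (more detailed) level, or None at the bottom."""
    i = LEVELS.index(level)
    return LEVELS[i + 1] if i + 1 < len(LEVELS) else None


by_quarter = aggregate(FACTS, "quarter")
by_year = aggregate(FACTS, roll_up("quarter"))
```

Here `by_quarter` holds totals keyed by (year, quarter), and rolling up one level produces totals keyed by year alone. A second path such as "week - day" would need its own level list, since weeks do not nest inside months.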

3. Requirements for online analytical processing tools.

The multidimensional approach arose almost simultaneously with the relational one. However, only from the mid-1990s, or more precisely from 1993, did interest in multidimensional DBMSs become widespread. That year a new programmatic article appeared by one of the founders of the relational approach, E. Codd, in which he formulated 12 basic requirements for OLAP tools (Table 1).

Table 1.

1. Multidimensional data representation. Tools must support a multidimensional view of the data at the conceptual level.

2. Transparency. The user should not need to know which specific tools are used to store and process the data, how the data is organized, or where it comes from.

3. Accessibility. Tools must themselves select and connect to the data source best suited to answering a given query. Tools must automatically map their own logical schema onto various heterogeneous data sources.

4. Consistent performance. Performance should be practically independent of the number of dimensions in the query.

5. Client-server architecture support. Tools must work in a client-server architecture.

6. Equality of all dimensions. No dimension should be the base one; all must be equal (symmetric).

7. Dynamic handling of sparse matrices. Undefined values must be stored and processed in the most efficient way.

8. Multi-user support. Tools must allow more than one user to work with the data.

9. Support for operations across dimensions. All multidimensional operations (for example, aggregation) must apply uniformly and consistently to any number of any dimensions.

10. Ease of data manipulation. Tools must have the most convenient, natural, and comfortable user interface possible.

11. Advanced data presentation tools. Tools must support various ways of visualizing (presenting) data.

12. Unlimited number of dimensions and aggregation levels. There should be no restrictions on the number of supported dimensions.

Rules for evaluating OLAP software products

This set of requirements, which served as the de facto definition of OLAP, should be treated as a recommendation, and specific products should be assessed by how closely they approach full compliance with all the requirements.

Later, Codd's definition was reworked into the so-called FASMI test, which requires an OLAP application to provide fast analysis of shared multidimensional information.

Remembering Codd's 12 rules is too burdensome for most people. It turned out that the definition of OLAP can be summed up in just five keywords: Fast Analysis of Shared Multidimensional Information, or FASMI for short.

This definition was first formulated in early 1995 and has not needed revision since.

Fast means that the system should deliver most responses to users within about five seconds. The simplest queries are processed within one second, and very few take more than 20 seconds. Studies have shown that end users regard a process as having failed if results are not received within 30 seconds.

At first glance it may seem surprising that a user who receives in a minute a report that used to take days very quickly grows bored while waiting, and that the project turns out far less successful than one with an instant response, even at the cost of less detailed analysis.

Analysis means that the system can cope with any logical and statistical analysis relevant to the application and can save it in a form accessible to the end user.

It does not matter much whether this analysis is performed in the vendor's own tools or in a linked external product such as a spreadsheet; what matters is that all the required analysis functionality is provided in a way that is intuitive for end users. Analysis tools may include specific procedures such as time series analysis, cost allocation, currency translation, goal seeking, changes to multidimensional structures, non-procedural modeling, exception alerting, data mining, and other application-dependent operations. The availability of such capabilities varies widely among products, depending on their target orientation.

Shared means that the system implements all confidentiality requirements (possibly down to the cell level) and, if multiple write access is needed, provides locking of modifications at the appropriate level. Not all applications need to write data back. However, the number of such applications is growing, and the system must be able to handle multiple updates in a timely and secure manner.

Multidimensional is the key requirement. If OLAP had to be defined in one word, this would be it. The system must provide a multidimensional conceptual view of the data, including full support for hierarchies and multiple hierarchies, since this is certainly the most logical way to analyze businesses and organizations. The minimum number of dimensions that must be handled is not specified, since it also depends on the application, and most OLAP products support enough dimensions for the markets they target.

Information is everything. The required information must be obtainable wherever it is needed. However, much depends on the application. The capacity of products is measured by how much input data they can process, not by how many gigabytes they can store. Capacity varies greatly: the largest OLAP products can handle at least a thousand times more data than the smallest. Many factors must be considered here, including data duplication, RAM requirements, disk space usage, performance, integration with data warehouses, and so on.

The FASMI test is a reasonable and understandable definition of the goals that OLAP aims to achieve.

4. Classification of OLAP products.

So, the essence of OLAP is that the source information for analysis is presented as a multidimensional cube, which can be manipulated arbitrarily to obtain the required slices of information, that is, reports. The end user sees the cube as a multidimensional dynamic table that automatically summarizes the data (facts) along various slices (dimensions) and allows calculations and the report layout to be controlled interactively. These operations are carried out by the OLAP engine (also called the OLAP calculation engine).
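The idea of a cube as a dynamic table that sums facts along chosen dimensions can be illustrated with a minimal sketch. The rows, field names, and the `pivot` helper below are hypothetical and stand in for what an OLAP engine does on a much larger scale.

```python
from collections import defaultdict

# Hypothetical sales facts: dimension values plus one numeric measure.
ROWS = [
    {"product": "tea", "region": "north", "month": "2024-01", "amount": 10},
    {"product": "tea", "region": "south", "month": "2024-01", "amount": 5},
    {"product": "coffee", "region": "north", "month": "2024-02", "amount": 7},
    {"product": "tea", "region": "north", "month": "2024-02", "amount": 3},
]


def pivot(rows, dims, measure="amount", where=None):
    """Sum `measure` grouped by `dims`, keeping only rows that match `where`."""
    where = where or {}
    cells = defaultdict(int)
    for row in rows:
        if all(row[d] == v for d, v in where.items()):
            cells[tuple(row[d] for d in dims)] += row[measure]
    return dict(cells)


# Two different "slices" of the same facts.
by_product = pivot(ROWS, ["product"])
north_by_month = pivot(ROWS, ["month"], where={"region": "north"})
```

Changing the `dims` list or the `where` filter yields a different report from the same facts, which is the kind of interactive regrouping the text attributes to the OLAP engine.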

Many products implementing OLAP technologies have been developed to date. To make it easier to navigate among them, OLAP products are classified by the method of storing data for analysis and by the location of the OLAP engine. Let us consider each category of OLAP products.

Classification by data storage method

Multidimensional cubes are built from source and aggregate data. Both the source and the aggregate data for cubes can be stored in either relational or multidimensional databases. Three storage methods are therefore in use today: MOLAP (Multidimensional OLAP), ROLAP (Relational OLAP), and HOLAP (Hybrid OLAP). Accordingly, OLAP products fall into three corresponding categories by storage method:

1. In MOLAP products, source and aggregate data are stored in a multidimensional database or in a local multidimensional cube.

2. In ROLAP products, source data is stored in relational databases or in flat local tables on a file server. Aggregate data can be placed in service tables in the same database. Data from the relational database is transformed into multidimensional cubes at the request of the OLAP tool.

3. In the HOLAP architecture, source data remains in the relational database, while aggregates are placed in a multidimensional one. The OLAP cube is built at the request of the OLAP tool from both relational and multidimensional data.

Classification by location of the OLAP engine

By this criterion, OLAP products are divided into OLAP servers and OLAP clients:

· In server-side OLAP tools, aggregate data is computed and stored by a separate process, the server. The client application receives only the results of queries against the multidimensional cubes stored on the server. Some OLAP servers support data storage only in relational databases, others only in multidimensional ones. Many modern OLAP servers support all three storage methods: MOLAP, ROLAP, and HOLAP.

MOLAP.

MOLAP stands for Multidimensional On-Line Analytical Processing, that is, multidimensional OLAP. It means that the server stores data in a multidimensional database (MDB). The rationale for using an MDB is straightforward: it can efficiently store data that is multidimensional by nature and provides fast access to it. Data is transferred from the data source into the multidimensional database, and the database is then aggregated. Precalculation is what speeds up OLAP queries, because the summary data has already been computed. Query time becomes a function solely of the time needed to access a single piece of data and perform the calculation. This method supports the idea that the work is done once and the results are then used again and again. Multidimensional databases are a relatively new technology, and using an MDB has the same drawbacks as most new technologies: MDBs are not as stable as relational databases (RDBs) and are not optimized to the same degree. Another weak point is that most multidimensional databases cannot be used while data aggregation is in progress, so it takes time before new information becomes available for analysis.
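The "work is done once, results are reused" idea can be sketched as follows. This is a toy illustration, not how any particular multidimensional database stores its data: the `ALL` marker, the `build_cube` function, and the facts are assumptions made for the example. Every combination of kept and aggregated dimensions is summed in advance, so a query becomes a single lookup.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # hypothetical marker meaning "aggregated over this dimension"


def build_cube(facts, n_dims):
    """Precompute sums for every combination of kept/aggregated dimensions."""
    cube = defaultdict(float)
    for *coords, value in facts:
        for r in range(n_dims + 1):
            for kept in combinations(range(n_dims), r):
                # Keep the coordinates in `kept`, replace the rest with ALL.
                key = tuple(coords[i] if i in kept else ALL for i in range(n_dims))
                cube[key] += value
    return dict(cube)


# Hypothetical facts: (product, region, amount).
FACTS = [
    ("tea", "north", 10.0),
    ("tea", "south", 5.0),
    ("coffee", "north", 7.0),
]

cube = build_cube(FACTS, n_dims=2)

# Queries are now plain dictionary lookups, with no summing at query time.
tea_total = cube[("tea", ALL)]
north_total = cube[(ALL, "north")]
grand_total = cube[(ALL, ALL)]
```

The price is visible even here: three facts produce eight stored cells, and the cube has to be rebuilt or updated whenever the facts change.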

ROLAP.

ROLAP stands for Relational On-Line Analytical Processing, that is, relational OLAP. The term indicates that the OLAP server is based on a relational database. Source data is loaded into the relational database, usually in a star schema or a snowflake schema, which helps reduce retrieval time. The server provides the multidimensional data model by means of optimized SQL queries.
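A small star schema and the kind of SQL a ROLAP layer might generate can be sketched with Python's built-in `sqlite3` module. The table names, column names, and the `sales_by` helper are hypothetical; a real ROLAP server generates and optimizes its queries in its own way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_sales (product_id INTEGER, date_id INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'tea', 'drinks'), (2, 'bread', 'bakery');
INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 4.0), (2, 1, 6.0);
""")

# Dimension attributes the sketch is willing to group by.
ALLOWED = {"name", "category", "year", "month"}


def sales_by(conn, *columns):
    """Translate a 'group sales by these attributes' request into SQL."""
    if not columns or not set(columns) <= ALLOWED:
        raise ValueError("unknown dimension attribute")
    cols = ", ".join(columns)
    sql = f"""
        SELECT {cols}, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY {cols}
        ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()


by_category = sales_by(conn, "category")
by_month = sales_by(conn, "year", "month")
```

The fact table sits at the center of the "star" and each dimension table is joined to it by key; grouping by different dimension attributes gives different views of the same facts without any multidimensional storage.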

There are a number of reasons for choosing a relational rather than a multidimensional database. The RDB is a mature technology with many opportunities for optimization, and use in real-world conditions has produced a more refined product. In addition, RDBs support larger data volumes than MDBs; they are designed for such volumes. The main argument against RDBs is the complexity of the queries needed to obtain information from a large database using SQL. An inexperienced SQL programmer could easily tie up valuable system resources trying to run a query that would be much simpler in an MDB.

Aggregated / pre-aggregated data.

Fast query execution is imperative for OLAP. It is one of the basic principles of OLAP: the ability to manipulate data intuitively requires rapid retrieval of information. In general, the more computation is needed to produce a piece of information, the slower the response. Therefore, to keep query times short, pieces of information that are accessed most often but require calculation are pre-aggregated. That is, they are computed and then stored in the database as new data. An example of data that can reasonably be calculated in advance is summary data, such as sales figures by month, quarter, or year, when the figures actually entered are daily.

Vendors use different methods for selecting which parameters to pre-aggregate and how many values to precalculate. The approach to aggregation affects both the size of the database and query execution time. The more values are precalculated, the higher the likelihood that a user requests a value that has already been computed, and so the shorter the response time, since the value does not have to be derived from the source data. Precalculating all possible values, however, is not the best solution: the database grows significantly, which makes it unmanageable, and aggregation takes too long. Moreover, when numerical values are added to the database or changed, the change must be reflected in the precalculated values that depend on the new data. Updating the database can therefore also take a long time when there are many precalculated values. Since the database is usually offline during aggregation, it is desirable that aggregation not take too long.
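How quickly "all possible values" grows can be estimated with a short calculation. The sketch below assumes a single consolidation path per dimension and counts distinct group-by views rather than individual cells; the function name and the example level counts are hypothetical.

```python
from math import prod


def count_aggregate_views(levels_per_dimension):
    """Count the distinct group-by views a cube could precalculate.

    Assumes one consolidation path per dimension; each dimension can be
    shown at any of its levels or aggregated away entirely (the "+ 1").
    """
    return prod(levels + 1 for levels in levels_per_dimension)


# Hypothetical cube: time (year/quarter/month/day), organization
# (enterprise/division/department/employee), product (category/item).
views = count_aggregate_views([4, 4, 2])  # 5 * 5 * 3 = 75
```

Even this small three-dimensional example yields 75 views, and each additional dimension multiplies the count, which is why products precalculate only a selected subset.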

· The OLAP client is designed differently. The multidimensional cube is built and OLAP calculations are performed in the memory of the client computer. OLAP clients are also divided into ROLAP and MOLAP, and some support both data access options.

Each of these approaches has its pros and cons. Contrary to the common opinion that server tools are superior to client tools, in a number of cases an OLAP client can be more effective and more cost-efficient for users than an OLAP server.

Developing analytical applications with client OLAP tools is fast and does not require special training of the developer. A user who knows the physical implementation of the database can develop an analytical application independently, without involving an IT specialist.

With an OLAP server, two different systems must be learned, sometimes from different vendors: one to create cubes on the server and one to develop the client application.

An OLAP client provides a single visual interface for describing cubes and configuring user interfaces for them.

So in which cases can an OLAP client be more effective and more cost-efficient for users than an OLAP server?

· An OLAP server becomes economically justified when the volume of data is very large and beyond the capacity of an OLAP client; otherwise the client is the better choice. In that case the OLAP client combines high performance with low cost.

· Powerful analyst PCs are another argument in favor of OLAP clients. With an OLAP server, this computing power goes unused.

The advantages of OLAP clients also include the following:

· The costs of deploying and maintaining an OLAP client are significantly lower than those of an OLAP server.

· When an OLAP client with a built-in engine is used, data is transferred over the network only once. No new data streams are generated while OLAP operations are performed.

5. How OLAP clients work.

Consider the process of creating an OLAP application with a client tool (Fig. 1).

Figure 1. Creating an OLAP application with a ROLAP client tool

A ROLAP client works from a preliminary description of a semantic layer that hides the physical structure of the source data. Data sources can be local tables or RDBMSs; the list of supported data sources depends on the specific software product. After that, users can independently manipulate objects they understand, expressed in terms of the subject area, to create cubes and analytical interfaces.

An OLAP server client works differently. When creating cubes in an OLAP server, the user manipulates physical descriptions of the database, and user-facing descriptions are created in the cube itself. The OLAP server client is merely configured to work with the cube.

When the semantic layer is created, the data sources, the SALES and DEAL tables, are described in terms the end user understands and become "Products" and "Transactions". The "ID" field of the "Products" table is renamed "Code", "Name" becomes "Product", and so on.

Then the "Sales" business object is created. A business object is a flat table on the basis of which a multidimensional cube is built. When the business object is created, the "Products" and "Transactions" tables are joined on the product "Code" field. Since not all table fields need to appear in the report, the business object uses only the "Product", "Date", and "Amount" fields.

In our example, a report on sales of goods by month is created on the basis of the "Sales" business object.
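The steps of this example (semantic layer, business object, monthly report) can be sketched in Python. Only the SALES and DEAL table names and the ID and Name fields come from the text; the DEAL column names, the sample rows, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Physical tables. The DEAL column names are hypothetical.
SALES = [{"ID": 1, "Name": "tea"}, {"ID": 2, "Name": "coffee"}]
DEAL = [
    {"ProductID": 1, "Date": "2024-01-15", "Sum": 10.0},
    {"ProductID": 2, "Date": "2024-01-20", "Sum": 4.0},
    {"ProductID": 1, "Date": "2024-02-03", "Sum": 6.0},
    {"ProductID": 1, "Date": "2024-01-28", "Sum": 2.5},
]

# Semantic layer: user-facing table and field names over physical ones.
SEMANTIC_LAYER = {
    "products": {"source": SALES, "fields": {"ID": "code", "Name": "product"}},
    "transactions": {
        "source": DEAL,
        "fields": {"ProductID": "code", "Date": "date", "Sum": "amount"},
    },
}


def business_table(name):
    """Return a table with its columns renamed to user-facing terms."""
    spec = SEMANTIC_LAYER[name]
    return [
        {alias: row[col] for col, alias in spec["fields"].items()}
        for row in spec["source"]
    ]


def sales_business_object():
    """Join products and transactions on 'code'; keep product, date, amount."""
    names = {p["code"]: p["product"] for p in business_table("products")}
    return [
        {"product": names[t["code"]], "date": t["date"], "amount": t["amount"]}
        for t in business_table("transactions")
    ]


def sales_by_month(rows, product=None):
    """Report: total amount per (month, product), optionally filtered."""
    totals = defaultdict(float)
    for r in rows:
        if product is None or r["product"] == product:
            totals[(r["date"][:7], r["product"])] += r["amount"]
    return dict(totals)


rows = sales_business_object()
report = sales_by_month(rows)
coffee_only = sales_by_month(rows, product="coffee")
```

The report code never refers to the physical names `ID`, `Name`, or `Sum`; it works only with the terms defined in the semantic layer, which is the separation the text describes.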

When working with an interactive report, the user can set filtering and grouping conditions with equally simple mouse movements. At this point the ROLAP client accesses data in its cache, whereas the client of an OLAP server generates a new query to the multidimensional database. For example, by applying a product filter to the sales report, one can obtain a report on sales of the goods of interest.

All OLAP application settings can be stored in a dedicated metadata repository, in the application, or in the system repository of the multidimensional database. The implementation depends on the specific software product.

All that such applications include is a standard interface, predefined functions and structure, and quick solutions for more or less standard situations. Financial packages, for example, are popular. Prebuilt financial applications allow specialists to use familiar financial tools without having to design a database structure or standard forms and reports.

The Internet is a new form of client. It also bears the imprint of new technologies; many Internet solutions differ significantly in their capabilities in general and as OLAP solutions in particular. Generating OLAP reports over the Internet has many advantages. The most significant is that no specialized software is needed to access the information, which saves the enterprise a great deal of time and money.

6. Choosing the OLAP application architecture.

When implementing an information and analytical system, it is important not to make a mistake in choosing the OLAP application architecture. The literal translation of the term On-Line Analytical Processing, "operational analytical processing", is often taken literally to mean that data entering the system is analyzed promptly. This is a misconception: the promptness of analysis has nothing to do with how quickly data is actually updated in the system. It refers to the response time of the OLAP system to user queries. The analyzed data is often a snapshot of information "as of yesterday", for example when the data in the warehouse is updated once a day.

In this context OLAP is more accurately translated as "interactive analytical processing". It is the ability to analyze data interactively that distinguishes OLAP systems from systems for preparing routine reports.

Another feature of interactive processing, in the formulation of Codd, the founder of OLAP, is the ability to "combine, view, and analyze data from the point of view of multiple dimensions, that is, in the way most understandable to corporate analysts." For Codd himself, the term OLAP denotes only a specific way of representing data at the conceptual level: multidimensional. At the physical level data can be stored in relational databases, but in practice OLAP tools generally work with multidimensional databases in which data is organized as a hypercube (Fig. 1).

Figure 1. OLAP cube (hypercube, metacube)

The currency of this data is determined by the moment the hypercube was last filled with new data.

Obviously, the time needed to build a multidimensional database depends significantly on the volume of data loaded into it, so it is reasonable to limit this volume. But how can this be done without narrowing the possibilities for analysis or depriving users of access to all the information they are interested in? There are two alternative paths: Analyze Then Query ("first analyze, then request additional information") and Query Then Analyze ("first request the data, then analyze").

Followers of the first path propose loading generalized information into the multidimensional database, for example monthly, quarterly, and annual totals by division. If these data need to be broken down, the user is invited to build a report on the relational database containing the required selection, for example by day for a given division, or by month and by employee of a selected division.

Supporters of the second path, by contrast, propose that users first decide which data they are going to analyze and load exactly that data into a microcube, a small multidimensional database. The two approaches differ at the conceptual level, and each has its advantages and disadvantages.

The advantages of the second approach include the "freshness" of the information that the user receives in the form of a multidimensional report, the microcube. The microcube is built from information just requested from the current relational database. Work with the microcube is interactive: slices of information and drill-down within the microcube are obtained instantly. Another advantage is that the user designs the structure of the microcube and fills it "on the fly", without the involvement of a database administrator. However, the approach has serious shortcomings. The user does not see the overall picture and must decide on the direction of the investigation in advance. Otherwise the requested microcube may be too small and may not contain all the data of interest, and the user will have to request a new microcube, then another, and then yet another. The Query Then Analyze approach is implemented by the BusinessObjects tool from the company of the same name and by the tools of the Contour platform from Intersoft Lab.

With the Analyze Then Query approach, the volume of data loaded into the multidimensional database can be quite large, loading must be performed on a schedule, and it can take a long time. However, these drawbacks pay off later, when the user has access to almost all the necessary data in any combination. The source data in the relational database is accessed only as a last resort, when detailed information is needed, for example on a specific invoice.

The operation of a single multidimensional database is practically unaffected by the number of users accessing it. They only read the data available there, in contrast to the Query Then Analyze approach, in which the number of microcubes can, in the limiting case, grow as fast as the number of users.

With this approach the load on IT services increases, since in addition to relational databases they must also maintain multidimensional ones. It is these services that are responsible for the timely automatic updating of data in the multidimensional databases.

The most prominent representatives of the Analyze Then Query approach are the PowerPlay and Impromptu tools from Cognos.

The choice of both the approach and the tool that implements it depends above all on the goal pursued: one always has to balance budget savings against improving the quality of service for end users. It should be borne in mind that, strategically, information and analytical systems are created to achieve a competitive advantage, not to avoid the cost of automation. For example, a corporate information and analytical system can provide necessary, timely, and reliable information about the company, and publishing it for potential investors will ensure the company's transparency and predictability, which will inevitably become a condition of its investment attractiveness.

7. Areas of application of OLAP technologies.

OLAP is applicable wherever there is a need to analyze multifactor data. In general, if there is a table of data with at least one descriptive column (a dimension) and one column of numbers (measures, or facts), an OLAP tool will usually be an effective means of analysis and report generation.

Consider some real-life applications of OLAP technologies.

1. Sales.

Analysis of the sales structure helps resolve questions that underlie management decisions: changing the product range or prices, closing and opening stores and branches, terminating and signing contracts with dealers, running or ending advertising campaigns, and so on.

2. Procurement.

This task is the opposite of sales analysis. Many enterprises buy components and materials from suppliers. Trading enterprises buy goods for resale. There are many possible tasks in procurement analysis, from planning funds on the basis of past experience to monitoring the managers who choose suppliers.

3. Prices.

Analysis of market prices is closely tied to procurement analysis. Its purpose is to optimize costs and select the most advantageous offers.

4. Marketing.

By marketing analysis we mean here only the analysis of buyers or customers who consume services. The goal of this analysis is correct positioning of the product, identification of buyer groups for targeted advertising, and optimization of the product range. The task of OLAP in this case is to give the user a tool for getting answers quickly, at the speed of thought, to questions that arise intuitively in the course of data analysis.

5. Warehouse.

Analysis of the structure of warehouse stock by type of goods and by warehouse, analysis of storage times, analysis of shipments by recipient, and many other important kinds of analysis are possible if the organization keeps warehouse records.

6. Money movement.

This is a whole field of analysis with many schools and methodologies. OLAP technology can serve as a tool for implementing or improving these methodologies, but not as a replacement for them. Turnovers of cash and non-cash funds are analyzed by business operation, counterparty, currency, and time in order to optimize flows, ensure liquidity, and so on. The set of dimensions depends heavily on the specifics of the business, the industry, and the methodology.

7. Budget.

This is one of the most fertile areas for applying OLAP technologies. It is no accident that no modern budgeting system is considered complete without OLAP tools for budget analysis. Most budget reports are easily built on OLAP systems. These reports answer a very wide range of questions: analysis of the structure of expenses and income, comparison of expenses on particular items across divisions, analysis of the dynamics and trends of expenses on particular items, and analysis of costs and profits.

8. Accounting accounts.

A classic balance sheet report, consisting of account numbers with opening balances, turnovers, and closing balances, lends itself well to analysis in an OLAP system. In addition, an OLAP system can automatically and very quickly calculate consolidated balances of a multi-branch organization, balances by month, quarter, and year, balances aggregated over the account hierarchy, and analytical balances based on analytical attributes.

9. Financial statements.

A technologically built reporting system has nothing like a set of named indicators with values \u200b\u200bon the date that is required to be grouped and summed up in various cuts to obtain specific reports. When it is so, the display and printing reports are most simple and cheap are implemented in OLAP systems. In any case, the internal reporting system of the enterprise is not as conserved and can be rebuilt in order to save funds for technical work on the creation of reports and the possibilities of multidimensional operational analysis.

10. Website traffic.

A web server's log file is multidimensional by nature and is therefore suitable for OLAP analysis. The facts are the number of visits, the number of hits, the time spent on a page, and other information available in the log.
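As a minimal sketch of why a server log is multidimensional, the Python snippet below treats each log record as one fact (a hit) described by several dimensions and counts hits along any chosen combination of them. The record layout and the field names page, hour, and referrer are assumptions made for illustration; they are not taken from any particular server's log format.

```python
from collections import Counter

# Hypothetical, already-parsed log records: each record is one hit
# together with the values of its dimensions.
log_records = [
    {"page": "/index", "hour": 9, "referrer": "search"},
    {"page": "/index", "hour": 9, "referrer": "direct"},
    {"page": "/price", "hour": 9, "referrer": "search"},
    {"page": "/index", "hour": 10, "referrer": "search"},
]

def count_hits(records, *dimensions):
    """Count hits grouped by the chosen dimensions (the fact is 'number of hits')."""
    return Counter(tuple(r[d] for d in dimensions) for r in records)

hits_by_page_hour = count_hits(log_records, "page", "hour")
hits_by_referrer = count_hits(log_records, "referrer")
```

The same records can be regrouped along any other set of dimensions without changing the data, which is the property that makes a log a natural input for OLAP.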

11. Production volumes.

This is another example of statistical analysis. For instance, one can analyze the volumes of potatoes grown, steel smelted, or goods produced.

12. Consumption of consumables.

Imagine a plant consisting of dozens of shops that consume coolants, cleaning fluids, oils, rags, sandpaper: hundreds of kinds of consumables. Accurate planning and cost optimization require a thorough analysis of the actual consumption of consumables.

13. Use of premises.

Another type of statistical analysis. Examples: analysis of the occupancy of classrooms, of leased buildings and premises, of the use of conference halls, and so on.

14. Staff turnover in the enterprise.

Analysis of staff turnover in the enterprise by branch, department, profession, level of education, gender, age, and time.

15. Passenger traffic.

Analysis of the number of tickets sold and of revenue by season, route, car type (class), and type of train (aircraft).

The scope of OLAP technology is not limited to this list. As an example, consider OLAP analysis in sales.

8. An example of using OLAP technologies for sales analysis.

Designing a multidimensional data representation for OLAP analysis begins with drawing up a map of dimensions. For example, when analyzing sales, it may be useful to single out individual parts of the market (developing, stable, large and small consumers, the likelihood of new consumers, and so on) and to evaluate sales by product, territory, buyer, market segment, sales channel, and order size. These directions form the coordinate grid of a multidimensional view of sales: the structure of its dimensions.

Since any enterprise operates over time, the first question that arises in analysis is how the business develops over time. A well-organized time axis makes it possible to answer this question properly. The time axis is usually divided into years, quarters, and months. It can be subdivided further into weeks and days. The structure of the time dimension is shaped by how often data arrives; it can also be determined by how often the information is requested.

The "product group" dimension is designed to reflect the structure of the products sold as closely as possible. It is important to strike a balance: on the one hand, avoid excessive detail (the number of groups should remain manageable), and on the other, not miss a significant market segment.

The "customers" dimension reflects the structure of sales by territory and geography. Each dimension can have its own hierarchy; in this dimension, for example, it may be: countries - regions - cities - customers.
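A hierarchy inside a dimension can be sketched as follows. This is an illustrative Python example, not a description of any OLAP product: the level names follow the "countries - regions - cities - customers" structure mentioned above, while the member names and amounts are made up.

```python
from collections import defaultdict

# Levels of the "customers" dimension, from the most general to the most detailed.
LEVELS = ("country", "region", "city", "customer")

# Hypothetical sales facts: the customer's full path in the hierarchy and an amount.
sales = [
    (("Country A", "Region 1", "City X", "Customer 1"), 100.0),
    (("Country A", "Region 1", "City X", "Customer 2"), 50.0),
    (("Country A", "Region 1", "City Y", "Customer 3"), 30.0),
    (("Country A", "Region 2", "City Z", "Customer 4"), 20.0),
]

def roll_up(facts, level):
    """Sum the amounts at the given hierarchy level (for example 'region')."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for path, amount in facts:
        totals[path[:depth]] += amount
    return dict(totals)

by_city = roll_up(sales, "city")
by_region = roll_up(sales, "region")
by_country = roll_up(sales, "country")
```

Each higher level is obtained by truncating the path, so totals at the region or country level are always consistent with the city-level totals beneath them.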

To analyze the performance of business units, a dedicated dimension must be created. For example, two hierarchy levels can be distinguished: departments and the sections within them, which should be reflected in the "units" dimension.

In fact, the "Time", "Products", and "Customers" dimensions define the space of the subject area quite fully.

It is also useful to partition this space into notional regions based on calculated characteristics, for example ranges of transaction value. The whole business can then be divided into a number of value ranges in which it operates. In this example, we can limit ourselves to the following measures: sales amount, number of goods sold, income, number of transactions, number of customers, and volume of purchases from manufacturers.

The OLAP cube for the analysis will look as follows (Fig. 2):


Figure 2. OLAP cube for sales analysis

Such a three-dimensional array is what OLAP calls a cube. Strictly speaking, in mathematical terms such an array is not always a cube: a true cube must have the same number of elements along every dimension, and OLAP cubes have no such restriction. An OLAP cube also need not be three-dimensional. It can be two-dimensional or multidimensional, depending on the problem being solved. Serious OLAP products are designed for about 20 dimensions. Simpler desktop applications support around 6 dimensions.

Not all cells of the cube need to be filled: if there is no information about sales of Product 2 to Customer 3 in the third quarter, the value in the corresponding cell is simply undefined.
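One simple way to picture a cube with unfilled cells is a dictionary keyed by the coordinates of each filled cell. The sketch below is only an illustration in Python; the product, customer, and quarter labels and the values are hypothetical, and real OLAP servers use their own storage structures.

```python
# A sparse cube: only the cells that actually hold data are stored.
# Keys are (product, customer, quarter); all labels and values are made up.
cube = {
    ("Product 1", "Customer 3", "Q3"): 120.0,
    ("Product 2", "Customer 1", "Q3"): 75.0,
    ("Product 2", "Customer 4", "Q1"): 40.0,
}

def cell(cube, product, customer, quarter):
    """Return the cell value, or None if the cell is undefined (empty)."""
    return cube.get((product, customer, quarter))

def fill_ratio(cube, n_products, n_customers, n_quarters):
    """Share of filled cells, given the number of members in each dimension."""
    return len(cube) / (n_products * n_customers * n_quarters)

missing = cell(cube, "Product 2", "Customer 3", "Q3")   # no sales recorded
present = cell(cube, "Product 2", "Customer 1", "Q3")
ratio = fill_ratio(cube, n_products=3, n_customers=4, n_quarters=4)
```

Here an undefined cell is reported as None rather than as zero, which keeps "no data" distinct from "sales of zero".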

However, the cube itself is not suitable for analysis. While a three-dimensional cube can still be adequately pictured or drawn, with a six- or nineteen-dimensional one the situation is much worse. Therefore, ordinary two-dimensional tables are extracted from the multidimensional cube before use. This operation is called "slicing" the cube. The analyst, as it were, takes the cube and "cuts" it along the dimensions of interest. In this way the analyst obtains a two-dimensional slice of the cube (a report) and works with it. The report structure is shown in Figure 3.

Figure 3. Structure of an analytical report

Slicing our OLAP cube, we obtain a sales report for the third quarter, which looks as follows (Fig. 4).

Figure 4. Sales Report for Third Quarter

You can cut the cube along another axis and get a report on sales of product group 2 over the year (Fig. 5).

Figure 5. Quarterly report on sales of product 2

Similarly, you can analyze the relationship with Customer 4 by slicing the cube at that customer's label (Fig. 6).

Figure 6. Report on deliveries of goods to Customer 4

You can drill the report down to months, or look at deliveries of goods to a particular branch of the customer.
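The three slices described above (by quarter, by product, and by customer) can be sketched in Python as one operation that fixes a single dimension and keeps the other two. The cube contents, the axis names, and the function slice_cube are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical cube cells keyed by (product, customer, quarter).
cube = {
    ("Product 1", "Customer 1", "Q3"): 10.0,
    ("Product 1", "Customer 4", "Q3"): 5.0,
    ("Product 2", "Customer 4", "Q1"): 7.0,
    ("Product 2", "Customer 4", "Q3"): 8.0,
}
AXES = ("product", "customer", "quarter")

def slice_cube(cube, axis, label):
    """Fix one dimension at `label` and return a 2-D report keyed by the other two."""
    i = AXES.index(axis)
    return {
        key[:i] + key[i + 1:]: value
        for key, value in cube.items()
        if key[i] == label
    }

q3_report = slice_cube(cube, "quarter", "Q3")                   # products x customers
product2_report = slice_cube(cube, "product", "Product 2")      # customers x quarters
customer4_report = slice_cube(cube, "customer", "Customer 4")   # products x quarters
```

Each result is an ordinary two-dimensional table, which is the form in which the analyst actually reads the data.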

Data warehouses are formed from snapshots of the operational information system's databases, taken at fixed points in time, and possibly from various external sources. Data warehouses use database technology, OLAP, data mining, and data visualization.

The main characteristics of data warehouses.

  • contains historical data;
  • stores detailed data, as well as partially and fully summarized data;
  • the data is mainly static;
  • unregulated, unstructured, and heuristic data processing;
  • medium and low transaction processing intensity;
  • unpredictable pattern of data use;
  • intended for analysis;
  • oriented toward subject areas;
  • supports strategic decision-making;
  • serves a relatively small number of managers.

The term OLAP (On-Line Analytical Processing) describes a model for representing data and, accordingly, a technology for processing it in data warehouses. OLAP uses a multidimensional view of aggregated data to provide fast access to strategically important information for in-depth analysis. OLAP applications must have the following basic properties:

  • multidimensional data representation;
  • support for complex calculations;
  • correct handling of the time factor.

OLAP advantages:

  • higher productivity of operational staff and of application software developers; timely access to strategic information;
  • sufficient opportunities for users to make their own changes to the schema;
  • OLAP applications rely on data warehouses and OLTP systems and obtain up-to-date data from them, which preserves control over the integrity of corporate data;
  • reduced load on OLTP systems and data warehouses.

OLAP and OLTP. Characteristics and main differences

OLAP: The data warehouse must include both internal corporate data and external data.
OLTP: The main source of information entering the operational database is the corporation's own activity, whereas data analysis requires external sources of information (for example, statistical reports).

OLAP: The volume of an analytical database is at least an order of magnitude larger than that of an operational one. Reliable analysis and forecasting require the data warehouse to hold information about the corporation's activity and the state of the market over several years.
OLTP: Operational processing requires data for the last few months.

OLAP: The data warehouse must contain uniformly represented and consistent information that matches the content of the operational databases as closely as possible. A component is needed to extract and "clean" information from different sources. In many large corporations several operational information systems with their own databases exist side by side (for historical reasons).
OLTP: Operational databases may contain semantically equivalent information represented in different formats, with different indications of when it arrived, and sometimes even contradictory.

OLAP: The set of queries to an analytical database cannot be predicted. Data warehouses exist precisely to answer analysts' ad hoc queries. One can only count on queries arriving not too often and touching large volumes of information. The size of the analytical database encourages queries with aggregates (sum, minimum, maximum, mean, etc.).
OLTP: Data processing systems are built to solve specific tasks. Information is selected from the database frequently and in small portions. The set of queries to an operational database is usually known at design time.

OLAP: Because analytical databases change little (only when data is loaded), ordered arrays, faster indexing methods for bulk retrieval, and storage of pre-aggregated data are all reasonable.
OLTP: Data processing systems are highly volatile by nature, which is reflected in the DBMS used (normalized database structure, rows stored unordered, B-trees for indexing, transactions).

OLAP: The information in an analytical database is so critical to the corporation that fine-grained protection is required (individual access rights to particular rows and/or columns of a table).
OLTP: For data processing systems, protection at the table level is usually sufficient.

Codd's rules for OLAP systems

In 1993, Codd published a paper titled "Providing OLAP to User-Analysts: An IT Mandate". It outlined the basic concepts of online analytical processing and defined 12 rules that products offering online analytical processing must satisfy.

  1. Multidimensional conceptual view. The OLAP model must be multidimensional at its core. A multidimensional conceptual schema or user view makes modeling and analysis easier, as well as calculation.
  2. Transparency. The user can obtain all the necessary data from the OLAP engine without even suspecting where it comes from. Whether or not the OLAP product is part of the user's tools, this fact must be invisible to the user. If OLAP is provided by client-server computing, this fact too should, if possible, be invisible to the user. OLAP should be provided in the context of a truly open architecture that lets the user, wherever they are, communicate with the server through an analytical tool. Transparency should also be achieved in the interaction of the analytical tool with homogeneous and heterogeneous database environments.
  3. Accessibility. OLAP must present its own logical schema for access in a heterogeneous database environment and perform the appropriate conversions to deliver data to the user. Moreover, it is necessary to decide in advance where, how, and in what types of physical organization the data will actually be used. An OLAP system must access only the data actually required, rather than apply the general "kitchen sink" principle, which entails unnecessary input.
  4. Consistent reporting performance. Reporting performance should not degrade significantly as the number of dimensions and the size of the database grow.
  5. Client-server architecture. The product must not only be client-server; the server component must also be intelligent enough for various clients to connect with minimal effort and programming.
  6. Generic dimensionality. All dimensions must be equivalent; every dimension must be equivalent in both structure and operational capabilities. Additional operational capabilities may be granted to individual dimensions (time is presumably meant), but such additional functions must be grantable to any dimension. The basic data structures and the computational or reporting formats should not be biased toward any one dimension.
  7. Dynamic sparse matrix handling. OLAP systems must automatically adjust their physical schema depending on the type of model, the data volumes, and the sparsity of the database.
  8. Multi-user support. The OLAP tool must provide concurrent access (query and update), integrity, and security.
  9. Unrestricted cross-dimensional operations. All kinds of operations must be allowed across any dimensions.
  10. Intuitive data manipulation. Data manipulation should be carried out through direct actions on cells in view mode, without menus or multi-step operations.
  11. Flexible reporting. Dimensions must be placed in the report the way the user needs.
  12. Unlimited

Introduction

Today, almost no organization, especially one traditionally focused on interaction with customers, can do without database management systems. Banks, insurance companies, airlines and other transport companies, supermarket chains, telecommunications and marketing firms, service-sector organizations, and others all collect and store gigabytes of data about customers, products, and services in their databases. The value of such information is beyond doubt. Such databases are called operational or transactional, since they are characterized by a huge number of small transactions, or read-write operations. The computer systems that record operations and actually access the transactional databases are usually called online transaction processing systems (OLTP, On-Line Transactional Processing) or accounting systems.

Accounting systems are configured and optimized to perform the maximum number of transactions in short intervals. Individual operations are usually very small and unrelated to one another. However, every data record that describes an interaction with a customer (a call to the support service, a cash transaction, a catalog order, a visit to the company's website, and so on) can be used to obtain qualitatively new information, namely to create reports and analyze the company's activity.

The set of analytical functions in accounting systems is usually very limited. The schemas used in OLTP applications complicate the creation of even simple reports, since the data is usually spread across many tables, and aggregating it requires complex join operations. As a rule, attempts to create comprehensive reports demand a lot of computing power and lead to a loss of performance.
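The point about joins can be illustrated with a small sketch that uses Python's built-in sqlite3 module. The three-table schema, the column names, and the data are assumptions invented for this example; even so, the simple report "sales by region and product group" already needs two joins and a grouping step.

```python
import sqlite3

# A tiny normalized schema of the kind an accounting system might use
# (hypothetical tables and data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, grp TEXT);
    CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER,
                            product_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South');
    INSERT INTO products  VALUES (1, 'Group A'), (2, 'Group B');
    INSERT INTO orders    VALUES (1, 1, 1, 10.0), (2, 1, 2, 20.0),
                                 (3, 2, 1, 5.0),  (4, 1, 1, 15.0);
""")

# The report needs data from all three tables before it can aggregate.
report = conn.execute("""
    SELECT c.region, p.grp, SUM(o.amount)
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    JOIN products  p ON p.id = o.product_id
    GROUP BY c.region, p.grp
    ORDER BY c.region, p.grp
""").fetchall()
conn.close()
```

On a real operational database with many more tables and rows, queries of this shape compete with ordinary transactions for the same resources.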

In addition, accounting systems store constantly changing data. As transactions accumulate, totals change very quickly, so two analyses run a few minutes apart can give different results. Most often, analysis is performed at the end of the reporting period; otherwise the picture may be distorted. Moreover, the data needed for analysis may be stored in several systems.

Some types of analysis require structural changes that are unacceptable in a live operational environment. For example, you may need to find out what will happen if the company introduces new products. Such a study cannot be carried out on a live database. Consequently, effective analysis can rarely be performed directly in the accounting system.

Decision support systems usually have the means to give the user aggregate data for various samples from the source set in a form convenient for perception and analysis. As a rule, such aggregate functions form a multidimensional (and therefore non-relational) dataset, often called a hypercube or metacube, whose axes contain parameters and whose cells contain the aggregate data that depend on them; such data can, however, also be stored in relational tables. Along each axis the data can be organized as a hierarchy representing different levels of detail. Thanks to this data model, users can formulate complex queries, generate reports, and obtain subsets of data.

It is decision support systems that have become the main field of application of OLAP (On-Line Analytical Processing, online analytical processing of data), which turns the "ore" of OLTP systems into a finished "product" that managers and analysts can use directly. This method allows analysts, managers, and executives to "get to the essence" of the accumulated data through fast and consistent access to a wide range of views of information.

The purpose of this term paper is to examine OLAP technology.


Main part

1 Basic information about OLAP

The concept of OLAP rests on the principle of multidimensional data representation. The term OLAP was introduced in 1993 by Edgar Codd. Having examined the shortcomings of the relational model, he pointed first of all to the inability "to combine, view, and analyze data from the point of view of multiple dimensions, that is, in the way most natural for corporate analysts," and defined general requirements for OLAP systems that extend the functionality of relational DBMSs and include multidimensional analysis as one of their characteristics.

In many publications the abbreviation OLAP denotes not only a multidimensional view of data but also storage of the data itself in a multidimensional database. Generally speaking, this is incorrect, since Codd himself notes that "relational databases were, are, and will be the most suitable technology for storing corporate data. The need is not for a new database technology but rather for analysis tools that complement the functions of existing DBMSs and are flexible enough to provide for and automate the various kinds of intellectual analysis inherent in OLAP." Such confusion leads to oppositions like "OLAP or ROLAP", which is not quite correct, because ROLAP (Relational OLAP) supports, at the conceptual level, all the functionality defined by the term OLAP. It seems preferable to use the special term MOLAP for OLAP based on multidimensional DBMSs.

According to Codd, a multidimensional conceptual view (Multi-Dimensional Conceptual View) is a multiple perspective consisting of several independent dimensions along which particular sets of data can be analyzed. Simultaneous analysis along several dimensions is defined as multidimensional analysis. Each dimension includes directions of data consolidation consisting of a series of successive levels of generalization, where each higher level corresponds to a greater degree of data aggregation along that dimension. Thus, the Performer dimension can be defined by a consolidation direction consisting of the generalization levels "enterprise - division - department - employee". The Time dimension may even include two consolidation directions, "year - quarter - month - day" and "week - day", since counting time by months and by weeks is incompatible. This makes it possible to choose freely the desired level of detail for each dimension. The drill-down operation corresponds to moving from higher levels of consolidation to lower ones; conversely, the roll-up operation means moving from lower levels to higher ones.
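The two consolidation directions of the Time dimension can be sketched in Python as follows. This is an illustrative example under stated assumptions: the daily facts are made up, and weeks are taken to be ISO weeks as returned by the standard datetime module. It shows that a week can span two months, which is why the "week - day" path cannot be folded into "year - quarter - month - day".

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily facts: (day, amount).
daily = [
    (date(2023, 1, 30), 10.0),  # Monday of ISO week 5
    (date(2023, 2, 1), 20.0),   # Wednesday of the same ISO week, but in February
    (date(2023, 4, 3), 5.0),
]

# Each level maps a day to the member it belongs to at that level of generalization.
LEVELS = {
    "year":    lambda d: (d.year,),
    "quarter": lambda d: (d.year, (d.month - 1) // 3 + 1),
    "month":   lambda d: (d.year, d.month),
    "day":     lambda d: (d.year, d.month, d.day),
    "week":    lambda d: tuple(d.isocalendar())[:2],  # (ISO year, ISO week number)
}

def aggregate(facts, level):
    """Total the amounts at the requested level of either consolidation path."""
    key = LEVELS[level]
    totals = defaultdict(float)
    for day, amount in facts:
        totals[key(day)] += amount
    return dict(totals)

by_month = aggregate(daily, "month")      # drill-down relative to quarters
by_quarter = aggregate(daily, "quarter")  # roll-up relative to months
by_week = aggregate(daily, "week")        # the separate "week - day" path
```

Moving from by_quarter to by_month is a drill-down and the reverse is a roll-up; by_week has to be computed from days directly because its members do not nest inside months.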

Codd defined 12 rules that a software product of the OLAP class must satisfy.

1.2 Requirements for online analytical processing tools

Multidimensional conceptual view (Multi-Dimensional Conceptual View). The conceptual representation of the data model in an OLAP product must be multidimensional in nature, that is, it must let analysts perform intuitive "slice and dice" operations, rotation (rotate), and placement (pivot) of consolidation directions.

Transparency. The user should not need to know what specific means are used to store and process the data, how the data is organized, or where it comes from.

Accessibility. The analyst must be able to perform analysis within a common conceptual schema, while the data may remain under the control of legacy DBMSs and still be tied to the common analytical model. That is, the OLAP toolkit must impose its logical schema on the physical data stores, performing all the transformations required to provide a single, consistent, and coherent user view of the information.

Consistent reporting performance (Consistent Reporting Performance). As the number of dimensions and the size of the database grow, analysts should not experience any degradation in performance. Consistent performance is necessary to maintain the ease of use and freedom from complexity that are required to bring OLAP to the end user.

Client-server architecture (Client-Server Architecture). Most of the data that requires online analytical processing is stored on mainframe systems and retrieved from personal computers. One of the requirements is therefore the ability of OLAP products to work in a client-server environment. The main idea is that the server component of the OLAP tool must be intelligent enough to build a common conceptual schema by generalizing and consolidating the various logical and physical schemas of corporate databases, so as to achieve the effect of transparency.

Generic dimensionality (Generic Dimensionality). All data dimensions must be equal. Additional characteristics may be granted to individual dimensions, but since all dimensions are symmetric, this additional functionality can be granted to any dimension. The basic data structure, formulas, and report formats should not be biased toward any one dimension.

Dynamic sparse matrix handling (Dynamic Sparse Matrix Handling). The OLAP tool must handle sparse matrices optimally. Access speed must be maintained regardless of where data cells are located and must be constant for models with different numbers of dimensions and different data sparsity.

Multi-user support (Multi-User Support). Several analysts often need to work with one analytical model at the same time or to create different models from the same corporate data. The OLAP tool must give them concurrent access and ensure data integrity and protection.

Unrestricted cross-dimensional operations (Unrestricted Cross-Dimensional Operations). Calculations and data manipulation across any number of dimensions must not forbid or restrict any relationships between data cells. Transformations that require arbitrary definition must be specified in a functionally complete formula language.

Intuitive data manipulation (Intuitive Data Manipulation). Reorientation of consolidation directions, drilling into data in columns and rows, aggregation, and other manipulations inherent in the hierarchy of consolidation directions must be performed through the most convenient, natural, and comfortable user interface.

Flexible reporting (Flexible Reporting). Various ways of visualizing data must be supported; that is, reports must be presentable in any possible orientation.

Unlimited dimensions and aggregation levels (Unlimited Dimensions and Aggregation Levels). It is strongly recommended that every serious OLAP tool allow at least fifteen, and preferably twenty, dimensions in the analytical model.

2 Components of OLAP systems

2.1 Server. Client. Internet

OLAP lets you analyze large amounts of data quickly and efficiently. The data is stored in a multidimensional form that most closely reflects the natural state of real business data. In addition, OLAP lets users obtain summary data faster and more easily. With it they can, when necessary, drill down into this data to obtain more detailed information.

An OLAP system consists of many components. At the highest level, the system includes a data source, an OLAP server, and a client. The data source is where the data for analysis is taken from. Data from the source is transferred or copied to the OLAP server, where it is organized and prepared so that responses to queries can be produced faster. The client is the user interface to the OLAP server. This section describes the functions of each component and the role of the system as a whole.

Sources. The source in an OLAP system is the server that supplies data for analysis. Depending on how the OLAP product is used, the source may be a data warehouse, a legacy database containing general data, a set of tables combining financial data, or any combination of these. The ability of an OLAP product to work with data from various sources is very important. Requiring a single format or a single database in which all source data is stored does not suit database administrators. In addition, such an approach reduces the flexibility and power of the OLAP product. Administrators and users find that OLAP products that can extract data not only from different sources but also from many sources at once are more flexible and useful than those with stricter requirements.

Server. The application part of an OLAP system is the OLAP server. This component does all the work (depending on the system model) and stores all the information that is actively accessed. The server architecture is governed by various concepts. In particular, the main functional characteristic of an OLAP product is whether it uses a multidimensional database (MDDB) or a relational database (RDB) for data storage.

Aggregated / pre-aggregated data

Fast query execution is imperative for OLAP. This is one of the basic principles of OLAP: the ability to manipulate data intuitively requires fast retrieval of information. In general, the more calculation is needed to produce a piece of information, the slower the response. Therefore, to keep query times short, pieces of information that are accessed most often but require calculation are pre-aggregated. That is, they are computed and then stored in the database as new data. An example of data that can be computed in advance is summary data, such as sales figures by month, quarter, or year, for which the data actually entered are daily figures.
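Pre-aggregation of daily figures into monthly, quarterly, and yearly totals can be sketched as follows. This is a simplified Python illustration with made-up data; the names pre_aggregate and query are hypothetical and do not correspond to the API of any OLAP server.

```python
from collections import defaultdict

# Hypothetical daily sales as actually entered: (year, month, day) -> amount.
daily_sales = {
    (2023, 1, 10): 100.0,
    (2023, 1, 20): 50.0,
    (2023, 2, 5): 30.0,
    (2023, 7, 1): 20.0,
}

def pre_aggregate(daily):
    """Compute monthly, quarterly, and yearly totals once and keep them as new data."""
    store = {
        "month": defaultdict(float),
        "quarter": defaultdict(float),
        "year": defaultdict(float),
    }
    for (year, month, _day), amount in daily.items():
        store["month"][(year, month)] += amount
        store["quarter"][(year, (month - 1) // 3 + 1)] += amount
        store["year"][(year,)] += amount
    return {level: dict(totals) for level, totals in store.items()}

aggregates = pre_aggregate(daily_sales)

def query(level, key):
    """Answer a summary query by lookup, without scanning the daily data."""
    return aggregates[level].get(key, 0.0)

q1_total = query("quarter", (2023, 1))
year_total = query("year", (2023,))
```

The cost of the calculation is paid once at load time, so each later query is a dictionary lookup; the price is extra storage and the need to refresh the totals when daily data changes.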

Different vendors use different methods for selecting which parameters to pre-aggregate and how many values to precompute. The approach to aggregation affects both the database and query execution time. If more values are precomputed, the likelihood that the user requests an already computed value increases, and the response time drops, since the value does not have to be computed from the source data. However, computing all possible values is not the best solution: the size of the database grows considerably, which makes it unmanageable, and the aggregation time becomes too long. In addition, when numeric values are added to the database or changed, this must be reflected in the precomputed values that depend on the new data. Thus, updating the database can also take a long time when there are many precomputed values. Since the database usually works offline during aggregation, it is desirable that aggregation not take too long.
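A rough count shows why precomputing everything scales badly. Under the simplifying assumption that every aggregate either collapses a dimension entirely or keeps it at one of its hierarchy levels, a cube whose dimension i has L_i levels admits (L_1 + 1)(L_2 + 1)...(L_n + 1) distinct groupings. The Python sketch below evaluates this product; the level counts are illustrative only.

```python
from math import prod

def count_aggregate_views(levels_per_dimension):
    """Number of distinct groupings: each dimension is either collapsed to 'all'
    or kept at one of its hierarchy levels."""
    return prod(levels + 1 for levels in levels_per_dimension)

# Hypothetical cubes; the numbers of levels are assumptions for illustration.
small = count_aggregate_views([4, 2, 3])     # e.g. time, product, customer
flat_20 = count_aggregate_views([1] * 20)    # 20 dimensions without hierarchies
```

With three modest dimensions there are 60 possible groupings, while 20 flat dimensions already give 2**20 = 1,048,576, each of which may contain many cells. This is the reason vendors precompute only a selected subset.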

Client. The client is what is used to present and manipulate the data in the database. The client may be quite simple, in the form of a table with OLAP capabilities such as pivoting and drilling; it may be a specialized but equally simple report viewer; or it may be a tool as powerful as a custom-built application designed for complex data manipulation. The Internet is a new form of client. It also bears the stamp of new technologies; many Internet solutions differ considerably in their capabilities in general and as OLAP solutions in particular. This section discusses the functional properties of each type of client.

Although the server is the "backbone" of an OLAP solution, the client is no less important. The server can provide a solid foundation for data manipulation, but if the client is complicated or limited in functionality, the user cannot take full advantage of a powerful server. The client is so important that many vendors focus exclusively on client development. What these applications include is a standard interface view, predefined functions and structure, and quick solutions for more or less standard situations. For example, financial packages are popular. Prebuilt financial applications let specialists use familiar financial tools without having to design a database structure or standard forms and reports.

Query tools / report generators. A query tool or report generator offers easy access to OLAP data. Such tools have an easy-to-use graphical interface and let users create reports by moving objects into the report with drag and drop. Whereas a traditional report generator lets the user quickly produce formatted reports, OLAP-enabled report generators produce up-to-date reports. The end product is a report that supports drilling down to the level of detail, rotation (pivoting), hierarchies, and so on.

Spreadsheet add-ins

Today, in many lines of business, various forms of corporate data analysis are performed with spreadsheets. In a sense, this is an ideal tool for creating reports and viewing data. An analyst can create macros that work with the data in a chosen direction, and a template can be designed so that, when data is entered, formulas calculate the correct values, eliminating the need to enter simple calculations repeatedly.

Nevertheless, all this yields a "flat" report, which means that once it is created, it is difficult to view it from different angles. For example, a chart displays information for some period, say a month. If someone wants to see daily figures (as opposed to monthly data), an entirely new chart has to be created. New data sets have to be defined, new labels added to the chart, and many other simple but laborious changes made. In addition, there are many places where errors can creep in, which reduces overall reliability. When OLAP is added to the spreadsheet, a single chart can be created and then manipulated in various ways to give the user the necessary information, without the burden of creating every possible view.

The Internet as a client. The newest member of the OLAP client family is the Internet. Producing OLAP reports over the Internet has many advantages. The most significant is that no specialized software is needed to access the information. This saves the enterprise a great deal of time and money.

Each Internet product is specific. Some simplify the creation of Web pages, but have fewer flexibility. Others allow you to create data views, and then save them as static HTML files. All this makes it possible to view data via the Internet, but no more. It is impossible to actively manipulate data with their help.

There is another type of product - interactive and dynamic, turning such products into full-featured tools. Users can deepen into data, pivoting, measurement limit, etc. Before selecting a means of implementing the Internet, it is important to understand which functionality is required from the Web solution, and then determine which product the best way will embody this functionality.
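The difference between a "flat" report and an OLAP view can be illustrated with a minimal Python sketch. It is not taken from any particular product: the SALES records and the aggregate function are hypothetical, and the point is only that one data set yields a monthly view, a daily drill-down and a pivoted view without rebuilding the report by hand.

```python
from collections import defaultdict

# Hypothetical daily sales records: (date "YYYY-MM-DD", region, amount).
SALES = [
    ("2024-01-05", "North", 100),
    ("2024-01-05", "South", 40),
    ("2024-01-20", "North", 60),
    ("2024-02-03", "South", 90),
]

def aggregate(rows, row_key, col_key):
    """Sum amounts into a {row: {column: total}} table for the chosen axes."""
    table = defaultdict(lambda: defaultdict(int))
    for date, region, amount in rows:
        # Each record is described at several levels of detail at once.
        record = {"day": date, "month": date[:7], "region": region}
        table[record[row_key]][record[col_key]] += amount
    return {row: dict(cols) for row, cols in table.items()}

by_month = aggregate(SALES, "month", "region")  # consolidated view
by_day = aggregate(SALES, "day", "region")      # drill-down to days
pivoted = aggregate(SALES, "region", "month")   # same data, axes swapped
```

Changing the level of detail or swapping the axes is a change of arguments here, which is the kind of manipulation an OLAP client offers interactively.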

Applications. Applications are a type of client that uses OLAP databases. They are similar to the query tools and report generators described above but bring broader functionality to the product. An application is usually more powerful than a query tool.

Development. OLAP vendors usually provide a development environment in which users can create their own customized applications. The development environment is generally a graphical interface that supports object-oriented application development. In addition, most vendors provide an API that can be used to integrate OLAP databases with other applications.

2.2 OLAP Clients

OLAP clients with a built-in OLAP engine are installed on users' PCs. They do not require a server for computation and are characterized by zero administration. Such a client lets the user connect to existing databases; as a rule, a dictionary is created that hides the physical structure of the data behind a subject-area description that a specialist can understand. The OLAP client then executes arbitrary queries and displays the results in an OLAP table, in which the user can manipulate the data and obtain hundreds of different reports on screen or on paper.

OLAP clients designed to work with relational DBMSs make it possible to analyze data that already exists in the corporation, for example data stored in an OLTP database. Their second purpose can be the quick and cheap creation of data warehouses or data marts: in this case the organization's programmers only need to create a set of star-type tables in relational databases and the data loading procedures. The most time-consuming part of the work, writing interfaces with numerous options for user queries and reports, is implemented in the OLAP client in literally a few hours. An end user needs about 30 minutes to learn such a program.

OLAP clients are supplied by the database developers themselves, both multidimensional and relational. Examples are SAS Corporate Reporter, which is almost a benchmark for convenience and elegance, Oracle Discoverer, MS Pivot Services and Pivot Table, and others. Many programs designed to work with MS OLAP Services are supplied as part of the "OLAP for the masses" campaign run by Microsoft. As a rule, they are improved versions of Pivot Table and are intended for use in MS Office or a Web browser. These include products from Matryx, Knosys and others, which have become very popular in the West thanks to their simplicity, low cost and efficiency.
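The role of the dictionary that hides the physical data structure can be sketched roughly as follows. This is an assumption-laden illustration, not the mechanism of any named product: the table t_sales, its columns and the build_query helper are all invented for the example.

```python
# Hypothetical dictionary: subject-area terms mapped to physical columns.
DICTIONARY = {
    "Region": "t_sales.reg_cd",
    "Product": "t_sales.prod_id",
    "Revenue": "SUM(t_sales.amt)",
}

def build_query(dimensions, measure, table="t_sales"):
    """Translate business terms chosen by the user into a SQL query string."""
    dim_cols = [DICTIONARY[name] for name in dimensions]
    select = ", ".join(dim_cols + [DICTIONARY[measure]])
    group_by = ", ".join(dim_cols)
    return f"SELECT {select} FROM {table} GROUP BY {group_by}"

# The user picks "Region", "Product" and "Revenue"; the client builds the SQL.
query = build_query(["Region", "Product"], "Revenue")
```

The user works only with the names on the left of the dictionary, while the client generates the query against the physical schema.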

3 OLAP Product Classification

3.1 Multidimensional OLAP

A large number of products currently on the market provide OLAP functionality to varying degrees. All OLAP products present a multidimensional conceptual view of the source database through the user interface, and they are divided into three classes according to the type of source database.

1. The very first operational analytical processing systems (for example, Essbase from Arbor Software and Express Server from Oracle) belonged to the MOLAP class, that is, they could work only with their own multidimensional databases. They are based on proprietary multidimensional DBMS technologies and are the most expensive. These systems provide a complete OLAP processing cycle. They either include their own integrated client interface in addition to the server component, or rely on external spreadsheet programs to communicate with the user. Maintaining such systems requires dedicated staff to install and support the system and to build data views for end users.

2. Relational OLAP (ROLAP) systems present data stored in a relational database in multidimensional form, mapping the information onto a multidimensional model through an intermediate metadata layer. This class includes DSS Suite from MicroStrategy, MetaCube from Informix, DecisionSuite from Information Advantage, and others. The Infovist software package, developed in Russia at Ivanovo State Energy University, also belongs to this class. ROLAP systems are well suited to large warehouses. Like MOLAP systems, they require considerable maintenance effort from IT specialists and support multi-user operation.

3. Finally, hybrid systems (Hybrid OLAP, HOLAP) are designed to combine the advantages and minimize the shortcomings of the previous two classes. Media/MR from Speedware belongs to this class. According to its developers, it combines the analytical flexibility and response speed of MOLAP with the constant access to real data characteristic of ROLAP.

In addition to the tools listed, there is one more class: desktop query and reporting tools supplemented with OLAP functions or integrated with external tools that perform such functions. These well-developed systems select data from the original sources, transform it, and place it in a dynamic multidimensional database that runs on the end user's client workstation. The main representatives of this class are BusinessObjects from the company of the same name, BrioQuery from Brio Technology, and PowerPlay from Cognos. An overview of some OLAP products is given in the appendix.

In specialized DBMSs based on a multidimensional data representation, data is organized not as relational tables but as ordered multidimensional arrays:

1) hypercubes (all cells stored in the database must have the same dimensionality, that is, they lie in the most complete basis of dimensions), or

2) polycubes (each variable is stored with its own set of dimensions, and all the associated processing complexity is delegated to the internal mechanisms of the system).
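The two ways of organizing cells can be contrasted with a small Python sketch. The data, the variable names (sales, headcount) and the polycube_lookup helper are hypothetical; real multidimensional DBMSs use far more elaborate internal structures.

```python
# Hypercube: every cell is addressed by the full basis (product, region, month).
# "headcount" does not depend on product, yet it must be stored per product.
HYPERCUBE = {
    ("Tea", "North", "2024-01"): {"sales": 120, "headcount": 7},
    ("Coffee", "North", "2024-01"): {"sales": 80, "headcount": 7},
}

# Polycube: each variable carries its own set of dimensions.
POLYCUBE = {
    "sales": {
        "dims": ("product", "region", "month"),
        "cells": {
            ("Tea", "North", "2024-01"): 120,
            ("Coffee", "North", "2024-01"): 80,
        },
    },
    "headcount": {
        "dims": ("region", "month"),
        "cells": {("North", "2024-01"): 7},
    },
}

def polycube_lookup(variable, **coords):
    """Address a cell using only the dimensions that the variable actually has."""
    var = POLYCUBE[variable]
    key = tuple(coords[dim] for dim in var["dims"])
    return var["cells"].get(key)

tea_sales = polycube_lookup("sales", product="Tea", region="North", month="2024-01")
north_staff = polycube_lookup("headcount", region="North", month="2024-01")
```

In the hypercube the addressing is uniform but values are repeated; in the polycube storage is compact but the lookup logic has to know each variable's own dimensions.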

The use of multidimensional databases in operational analytical processing systems has the following advantages.

1. With a multidimensional DBMS, searching and retrieving data is much faster than with a multidimensional conceptual view of a relational database, because the multidimensional database is denormalized, contains pre-aggregated measures, and provides optimized access to the requested cells.

2. Multidimensional DBMSs easily handle the inclusion of various built-in functions in the information model, whereas the objective limitations of the SQL language make such tasks quite difficult, and sometimes impossible, on relational DBMSs.

On the other hand, there are significant limitations.

1. Multidimensional DBMSs do not cope with large databases. Moreover, because of denormalization and pre-computed aggregation, a given volume of data in a multidimensional database typically corresponds (by Codd's estimate) to a volume of original detailed data that is 2.5 to 100 times smaller.

2. Compared with relational DBMSs, multidimensional DBMSs use external storage very inefficiently. In the overwhelming majority of cases the information hypercube is highly sparse, and because the data is stored in ordered form, undefined values can be eliminated only by choosing an optimal sort order that groups the data into the largest possible contiguous blocks. Even then the problem is only partly solved. In addition, the sort order that is optimal for storing sparse data will most likely not coincide with the order most often used in queries. In real systems, therefore, a compromise must be found between speed and the redundancy of the disk space occupied by the database.
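How sparse a hypercube can be is easy to estimate with a few lines of Python. The dimension sizes and the number of recorded facts below are invented purely for illustration.

```python
from math import prod

# Hypothetical dimension sizes of a sales hypercube.
dimension_sizes = {"product": 1000, "store": 200, "day": 365}

# A dense array reserves one cell per combination of dimension values.
dense_cells = prod(dimension_sizes.values())

# Assumed number of combinations for which a sale was actually recorded.
recorded_facts = 1_460_000

# Share of the hypercube that holds defined values.
fill_ratio = recorded_facts / dense_cells
empty_cells = dense_cells - recorded_facts
```

With these assumed figures only about 2% of the cells carry data, so a dense layout would reserve space for tens of millions of undefined values.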

Consequently, the use of a multidimensional DBMS is justified only under the following conditions.

1. The volume of source data for analysis is not too large (no more than a few gigabytes), that is, the level of data aggregation is fairly high.

2. The set of dimensions is stable (since any change in their structure almost always requires a complete restructuring of the hypercube).

3. The response time of the system to ad hoc queries is the most critical parameter.

4. Extensive use of complex built-in functions is required for cross-dimensional calculations over hypercube cells, including the ability to write user-defined functions.

Using relational databases directly in operational analytical processing systems has the following advantages.

1. In most cases corporate data warehouses are implemented with relational DBMSs, and ROLAP tools make it possible to run the analysis directly on them. The size of the warehouse is then not as critical a parameter as it is for MOLAP.

2. When the dimensionality of the problem is variable and the dimension structure has to be changed fairly often, a ROLAP system with a dynamic representation of dimensions is the optimal solution, because such modifications do not require physical reorganization of the database.

3. Relational DBMSs provide a significantly higher level of data protection and good facilities for controlling access rights.

The main drawback of ROLAP compared with multidimensional DBMSs is lower performance. To achieve performance comparable to MOLAP, relational systems require careful design of the database schema and tuning of indexes, that is, considerable effort from database administrators. Only with star schemas can the performance of well-tuned relational systems approach that of systems based on multidimensional databases.

A description of the star schema and recommendations for its use are the subject of dedicated work. The idea is that there is a table for each dimension, and all the facts are placed in a single table indexed by a composite key made up of the keys of the individual dimensions (Appendix A). Each ray of the star defines, in Codd's terminology, a direction of data consolidation along the corresponding dimension.
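A star schema of the kind described can be sketched with Python's built-in sqlite3 module. The table and column names (dim_product, dim_region, fact_sales) and the sample rows are assumptions made for the example, not the schema from Appendix A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table per dimension.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);

-- A single fact table keyed by the combination of dimension keys.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    amount     REAL,
    PRIMARY KEY (product_id, region_id)
);

INSERT INTO dim_product VALUES (1, 'Tea', 'Drinks'), (2, 'Coffee', 'Drinks'), (3, 'Bread', 'Bakery');
INSERT INTO dim_region  VALUES (1, 'North'), (2, 'South');
INSERT INTO fact_sales  VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0);
""")

# Consolidation along the product "ray": facts are rolled up to categories.
rows = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

Each join from the fact table to a dimension table corresponds to one ray of the star, and grouping by a dimension attribute consolidates the data along that ray.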

In complex problems with multi-level dimensions, it makes sense to turn to extensions of the star schema: the fact constellation schema and the snowflake schema. In these cases separate fact tables are created for the possible combinations of aggregation levels of the different dimensions (Appendix B). This gives better performance but often leads to data redundancy and to a considerably more complicated database structure containing a large number of fact tables.

The number of fact tables in the database can grow not only because the dimensions have many levels, but also because, in general, different facts have different sets of dimensions. When some dimensions are abstracted away, the user must obtain a projection of the most complete hypercube, and the values of the measures in it are by no means always the result of simple summation. Thus, with a large number of independent dimensions, many fact tables must be maintained, one for each possible combination of the dimensions chosen in a query. This too leads to uneconomical use of external storage, longer loading of data into the star-schema database from external sources, and administrative difficulties.

SQL extensions (the GROUP BY CUBE, GROUP BY ROLLUP and GROUP BY GROUPING SETS operators) partly solve this problem. In addition, a mechanism has been proposed for finding a compromise between redundancy and speed: it recommends creating fact tables not for all possible combinations of dimensions, but only for those whose cells cannot be obtained by further aggregation of more complete fact tables (Appendix B).
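What GROUP BY CUBE computes can be emulated in plain Python, which makes the growth in the number of aggregates visible. This is an illustrative sketch only: the FACTS rows and the cube function are hypothetical, and a real DBMS evaluates these operators far more efficiently.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical detailed facts.
FACTS = [
    {"product": "Tea", "region": "North", "amount": 120},
    {"product": "Tea", "region": "South", "amount": 30},
    {"product": "Coffee", "region": "North", "amount": 80},
]

def cube(facts, dims, measure):
    """Emulate GROUP BY CUBE: totals for every subset of the dimensions."""
    result = {}
    for size in range(len(dims) + 1):
        for subset in combinations(dims, size):
            totals = defaultdict(int)
            for fact in facts:
                totals[tuple(fact[d] for d in subset)] += fact[measure]
            result[subset] = dict(totals)
    return result

totals = cube(FACTS, ("product", "region"), "amount")
grand_total = totals[()][()]          # no dimensions: one overall total
by_product = totals[("product",)]     # aggregated over regions
by_region = totals[("region",)]       # aggregated over products
```

For n dimensions the sketch produces 2^n groupings, which is why the text recommends materializing only those fact tables that cannot be derived from more detailed ones. A ROLLUP would keep only the prefixes of the dimension list instead of every subset.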

In any case, if a multidimensional model is implemented as a relational database, one should create long, "narrow" fact tables and relatively small, "wide" dimension tables. The fact tables contain the numerical values of the hypercube cells, and the other tables define the multidimensional basis of dimensions in which those cells lie. Some of the information can be obtained by dynamically aggregating data distributed across normalized structures, although it should be remembered that queries involving aggregation over a highly normalized database structure can run rather slowly.

Representing multidimensional information with star-shaped relational models removes the problem of optimizing the storage of sparse matrices, which is acute for multidimensional DBMSs (where the sparsity problem is addressed by a special choice of schema). Although an entire record is used to store each cell, containing not only the value itself but also secondary keys (references to the dimension tables), nonexistent values are simply not included in the fact table.

Conclusion

Having considered how OLAP technology works and where it is applied, we find that companies face several questions whose answers make it possible to choose the product that best meets the user's needs.

These are the following questions:

Where does the data come from? - The data to be analyzed may reside in various places. The OLAP database may receive it from the corporate data warehouse or from an OLTP system. If the OLAP product can already access one of these data sources, the effort spent on categorizing and cleaning the data is reduced.

What manipulations does the user perform on the data? - Once the user has access to the database and has begun the analysis, it is important that he or she can work with the data appropriately. Depending on the user's needs, a powerful report generator may be required, or the ability to create and publish dynamic Web pages. Alternatively, the user may prefer a simple and quick means of creating custom applications.

What is the total volume of data? - This is the most important factor in choosing an OLAP database. Relational OLAP products handle large volumes of data better than multidimensional ones. If the volume of data does not call for a relational database, a multidimensional product can be used just as successfully.

Who is the user? - When choosing an OLAP client, the user's skill level matters. Some users find OLAP integrated with a spreadsheet more convenient, while others prefer a specialized application. The user's qualifications also determine whether training is needed. A large company may be willing to pay for user training, while a smaller one may decline. The client should be one that users feel confident with and can use effectively.

Today most of the world's companies have moved to OLAP as the base technology for supplying information to decision makers. The fundamental question is therefore not whether to continue using spreadsheets as the main platform for reporting, budgeting and forecasting. Companies should ask themselves whether they are prepared to lose competitive advantage by relying on inaccurate, out-of-date and incomplete information before they mature enough to consider alternative technologies.

Finally, it should be noted that the analytical capabilities of OLAP technologies increase the value of the data stored in the corporate data warehouse, allowing the company to interact with its customers more effectively.

Glossary

Concept - Definition

1. BI tools - Tools and technologies used to access information. They include OLAP technology, data mining and advanced analysis; end-user tools and tools for building ad hoc queries; dashboards for monitoring business activity; and corporate report generators.

2. On-Line Analytical Processing, OLAP (operational analytical processing) - Technology for analytical processing of information in real time, including the compilation and dynamic publication of reports and documents.

3. Slice and dice (literally "cutting into slices and cubes") - A term describing the advanced data analysis function provided by OLAP tools: selecting data from a multidimensional cube with specified values and a specified arrangement of dimensions.

4. Data pivot (rotation) - The process of rotating a table of data, that is, converting columns into rows and vice versa.

5. Calculated member - A dimension member whose value is determined by the values of other members (for example, by mathematical or logical expressions). A calculated member may be part of the OLAP server or may be defined by the user during an interactive session. A calculated member is any member that is computed rather than entered.

6. Global business models - A type of data warehouse that provides access to information distributed across various enterprise systems and controlled by different departments with different databases and data models. This type of warehouse is difficult to build because users from the various divisions must combine their efforts to develop a common data model for the warehouse.

7. Data mining - Techniques that use software tools intended for a user who, as a rule, cannot say in advance exactly what he or she is looking for and can indicate only certain patterns and directions of search.

8. Client/server - A technological approach that divides a process into separate functions. The server performs several functions: communication management, database services, and so on. The client performs individual user functions: providing the appropriate interfaces, navigating between screens, providing help functions, and so on.

9. Multidimensional database, multidimensional DBMS (MDBS, MDBMS) - A powerful database that lets users analyze large volumes of data. A database with a special storage organization, cubes, that provides high-speed access to data stored as a set of facts, dimensions and precomputed aggregates.

10. Drill down - A method of examining detailed data used when analyzing summary-level data. The available levels of drill-down depend on the granularity of the data in the warehouse.

11. Central warehouse - (1) A database containing data collected from the organization's operational systems. It has a structure convenient for data analysis and is intended to support decision making and to create a single information space for the corporation. (2) An automation method that covers all information systems managed from one place.


The concept of multidimensional data analysis is closely tied to operational analysis, which is performed by OLAP systems.

OLAP (On-Line Analytical Processing) is a technology for operational analytical data processing that uses methods and tools for collecting, storing and analyzing multidimensional data in support of decision-making processes.

The main purpose of OLAP systems is to support analytical work and the arbitrary (ad hoc) queries of analyst users. The goal of OLAP analysis is to test hypotheses as they arise.

OLAP technology originates with E. Codd, the founder of the relational approach. In 1993 he published an article entitled "OLAP for analyst users: what should it be". The paper sets out the main concepts of operational analytical processing and defines 12 requirements that products supporting operational analytical processing must satisfy (Tokmakov G.P. Database. Concept of databases, relational data model, SQL languages. P. 51).

The rules formulated by Codd that define OLAP are listed below.

1. Multidimensionality - at the conceptual level, an OLAP system must present data as a multidimensional model, which simplifies the analysis and comprehension of information.

2. Transparency - an OLAP system must hide from the user the actual implementation of the multidimensional model: how the data is organized, where it comes from, and how it is processed and stored.

3. Accessibility - an OLAP system must give the user a single, consistent and coherent data model, providing access to data regardless of how and where it is stored.

4. Consistent reporting performance - the performance of an OLAP system must not degrade significantly as the number of dimensions under analysis increases.

5. Client-server architecture - an OLAP system must be able to work in a client-server environment, because most of the data that requires operational analytical processing today is stored in distributed form. The main idea is that the server component of the OLAP tool must be intelligent enough to build a common conceptual schema by generalizing and consolidating the various logical and physical schemas of the corporate databases, thereby providing transparency.

6. Equality of dimensions - an OLAP system must support a multidimensional model in which all dimensions are equal. Additional characteristics may be granted to individual dimensions where necessary, but it must be possible to grant them to any dimension.

7. Dynamic handling of sparse matrices - an OLAP system must process sparse matrices optimally. Access speed must be maintained regardless of where the data cells are located and must remain constant for models with different numbers of dimensions and different degrees of sparsity.

8. Multi-user support - an OLAP system must allow several users to work together on one analytical model or to create different models for them from the same data. Both reading and writing are possible, so the system must ensure data integrity and security.

9. Unrestricted cross-dimensional operations - an OLAP system must preserve the functional relationships, described in some formal language, between hypercube cells during any slice, rotation, consolidation or drill-down operation. The system must transform the defined relationships itself (automatically), without requiring the user to redefine them.

10. Intuitive data manipulation - an OLAP system must provide a way to perform slice, rotation, consolidation and drill-down operations on the hypercube without requiring numerous actions in the interface. The dimensions defined in the analytical model must contain all the information needed to perform these operations.

11. Flexible reporting - an OLAP system must support various ways of visualizing data, that is, reports must be presentable in any possible orientation. Reporting tools must present synthesized data, or information that follows from the data model, in any possible orientation. This means that rows, columns or pages must be able to show from 0 to N dimensions at once, where N is the number of dimensions in the whole analytical model. In addition, each dimension shown in a row, column or page must allow any subset of the members (values) it contains to be displayed in any order.

12. Unlimited dimensions and aggregation levels - a study of the number of dimensions that an analytical model may require showed that up to 19 dimensions can be in use at the same time. Hence the firm recommendation that an analytical tool be able to provide at least 15, and preferably 20, dimensions simultaneously. Moreover, none of the dimensions should be limited in the number of user-defined aggregation levels and consolidation paths.

Codd's additional rules.

This set of requirements, which served as the de facto definition of OLAP, quite often draws criticism: for example, rules 1, 2, 3 and 6 are requirements, whereas rules 10 and 11 are unformalized wishes (Tokmakov G.P. Database. Concept of databases, relational data model, SQL languages. P. 68). Thus the 12 requirements listed do not allow OLAP to be defined precisely. In 1995 Codd added the following six rules to the list:

13. Batch extraction versus interpretation - an OLAP system must provide equally efficient access to both its own and external data.

14. Support for all OLAP analysis models - an OLAP system must support all four data analysis models defined by Codd: categorical, exegetical, contemplative and formulaic.

15. Handling of non-normalized data - an OLAP system must integrate with non-normalized data sources. Data modifications made in the OLAP environment must not change the data stored in the original external systems.

16. Saving OLAP results separately from the source data - an OLAP system operating in read-write mode must save results separately after modifying the source data. In other words, the safety of the source data is ensured.

17. Exclusion of missing values - when presenting data to the user, an OLAP system must set aside all missing values. In other words, missing values must be distinguishable from zero values.

18. Handling of missing values - an OLAP system must ignore all missing values regardless of their source. This feature is related to rule 17.
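The practical consequence of rules 17 and 18 can be shown with a short Python sketch. The monthly_sales list and the average helper are hypothetical; None is used here to mark a missing value, which is an assumption of the example rather than anything prescribed by Codd.

```python
def average(values):
    """Average over the values that are present; None marks a missing value."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None

# One month had zero sales (0); for another month no data was received (None).
monthly_sales = [100, 0, None, 200]

# Missing value ignored, zero kept: (100 + 0 + 200) / 3.
avg_ignoring_missing = average(monthly_sales)

# Missing value wrongly treated as zero: (100 + 0 + 0 + 200) / 4.
avg_missing_as_zero = sum(v or 0 for v in monthly_sales) / len(monthly_sales)
```

Treating a missing value as zero shifts the average from 100.0 to 75.0, which is why the rules require the two cases to be kept apart.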

In addition, Codd divided all 18 rules into the following four groups, which he called features. The groups are named B, S, R and D.

Basic features (B) include the following rules:

Multidimensional conceptual view of data (rule 1);

Intuitive data manipulation (rule 10);

Accessibility (rule 3);

Batch extraction versus interpretation (rule 13);

Support for all OLAP analysis models (rule 14);

Client-server architecture (rule 5);

Transparency (rule 2);

Multi-user support (rule 8).

Special features (S):

Handling of non-normalized data (rule 15);

Saving OLAP results separately from the source data (rule 16);

Exclusion of missing values (rule 17);

Handling of missing values (rule 18).

Reporting features (R):

Flexible reporting (rule 11);

Consistent reporting performance (rule 4);

Automatic adjustment of the physical level (modified original rule 7).

Dimension control (D):

Universality of dimensions (rule 6);

Unlimited number of dimensions and aggregation levels (rule 12);

Unrestricted cross-dimensional operations (rule 9).

Intense competition and the growing dynamism of the external environment place increased demands on enterprise management systems. The development of management theory and practice has been accompanied by the emergence of new methods, technologies and models aimed at improving efficiency. These methods and models have in turn contributed to the emergence of analytical systems. Demand for analytical systems in Russia is high. Their application is of greatest interest in the financial sector: banks, insurance and investment companies. The results produced by analytical systems are needed above all by the people on whose decisions the company's development depends: managers, experts and analysts. Analytical systems make it possible to solve consolidation, reporting, optimization and forecasting tasks. No definitive classification of analytical systems exists to date, just as there is no common system of definitions for the terms used in this field. The information structure of an enterprise can be represented as a sequence of levels, each characterized by its own way of processing and managing information and having its own function in the management process. Analytical systems are thus arranged hierarchically at different levels of this infrastructure.

Level of transactional systems

Data warehouse level

Data mart level

OLAP systems level

Level of analytical applications

OLAP systems (Online Analytical Processing, analytical processing in real time) are a technology for comprehensive multidimensional data analysis. OLAP systems are applicable wherever multifactor data has to be analyzed. They are an effective means of analysis and report generation. The data warehouses, data marts and OLAP systems described above belong to business intelligence (BI) systems.

Very often, information and analytical systems created on the direct use of decision-making persons are extremely simple in use, but are rigidly limited in functionality. Such static systems are called in the literature. Information systems Head (IPR), or Executive Information Systems (EIS). They contain predefined multiple requests and, being sufficient for everyday review, is unable to respond to all questions to available data that may arise when making decisions. The result of such a system, as a rule, are multi-page reports, after a thorough study of which the analyst appears new series questions. However, each new request, unforeseen when designing such a system, should be formally described formally, encoded by a programmer and is then executed. Waiting time in this case can make hours and days that is not always acceptable. Thus, the external simplicity of static SPPR, for which most of the customers of information and analytical systems are actively fighting, turns on the catastrophic loss of flexibility.



Dynamic DSS, by contrast, are oriented toward processing unplanned (ad hoc) analyst queries against the data. The requirements for such systems were examined most thoroughly by E. F. Codd in the article that laid the foundation for the OLAP concept. Analysts work with these systems through an interactive sequence of formulating queries and studying their results.

But dynamic DSS can operate not only in the area of online analytical processing (OLAP); support for management decisions based on accumulated data can be provided in three basic areas.

The sphere of detailed data. This is the domain of most systems aimed at information retrieval. In most cases relational DBMSs cope well with the tasks that arise here. The generally accepted standard language for manipulating relational data is SQL. Information retrieval systems, which provide the end-user interface for finding detailed information, can be used as front ends both over individual transactional system databases and over a common data warehouse.

The sphere of aggregated indicators. A comprehensive view of the information collected in the data warehouse, its generalization and aggregation, its hypercube representation, and its multidimensional analysis are the tasks of online analytical processing (OLAP) systems. Here one can either rely on special multidimensional DBMSs or stay within relational technology. In the latter case, pre-aggregated data can be collected in a star-schema database, or aggregation can be performed on the fly while scanning the detailed tables of the relational database.

The sphere of patterns. Intelligent processing is carried out by data mining methods, whose main tasks are to find functional and logical patterns in the accumulated information and to build models and rules that explain the anomalies found and/or predict the development of certain processes.
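The two relational options named for aggregated indicators can be illustrated with a minimal sketch. It assumes a hypothetical `sales_detail` table held in an in-memory SQLite database. The table names, columns, and figures are invented for illustration, and a plain summary table stands in for a full star-schema database.

```python
import sqlite3

# Hypothetical detailed sales records: (region, product, amount).
DETAIL_ROWS = [
    ("North", "Tea", 120.0),
    ("North", "Coffee", 80.0),
    ("South", "Tea", 50.0),
    ("South", "Coffee", 200.0),
    ("South", "Tea", 25.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_detail (region TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales_detail VALUES (?, ?, ?)", DETAIL_ROWS)

# Option 1: aggregate on the fly by scanning the detailed table.
on_the_fly = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales_detail GROUP BY region"
    ).fetchall()
)

# Option 2: materialize the aggregate once, then answer queries from it.
conn.execute(
    "CREATE TABLE sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM sales_detail GROUP BY region"
)
pre_aggregated = dict(
    conn.execute("SELECT region, total FROM sales_by_region").fetchall()
)

print(on_the_fly)
print(pre_aggregated)
```

Both paths return the same totals. The difference is where the cost falls: the first scans the detailed rows on every query, while the second pays for the scan once and must be refreshed whenever the detailed data changes.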

Online analytical data processing

The OLAP concept is based on the principle of multidimensional data representation. In a 1993 article, E. F. Codd examined the shortcomings of the relational model, pointing first of all to the inability "to combine, view, and analyze data from the point of view of multiple dimensions, that is, in the way most understandable to corporate analysts," and defined general requirements for OLAP systems, which extend the functionality of relational DBMSs and include multidimensional analysis as one of their characteristics.

Classification of OLAP products according to the data representation method.

A large number of products that provide OLAP functionality to varying degrees are currently on the market. About 30 of the best known are listed on the review website http://www.olapreport.com/. While all OLAP products provide a multidimensional conceptual view through the user interface to the source database, they are divided into three classes by the type of source database.

The very first online analytical processing systems (for example, Essbase from Arbor Software and Oracle Express Server from Oracle) belonged to the MOLAP class, that is, they could work only with their own multidimensional databases. They are based on proprietary multidimensional DBMS technologies and are the most expensive. These systems provide a complete OLAP processing cycle. In addition to the server component, they either include their own integrated client interface or rely on external spreadsheet programs to communicate with the user. Maintaining such systems requires dedicated staff to install and maintain the system and to build data views for end users.

Relational online analytical processing (ROLAP) systems present data stored in a relational database in multidimensional form, transforming the information into a multidimensional model through an intermediate metadata layer. ROLAP systems are well suited to working with large warehouses. Like MOLAP systems, they require considerable maintenance effort from information technology specialists and support multi-user operation.

Finally, hybrid systems (Hybrid OLAP, HOLAP) are designed to combine the advantages and minimize the shortcomings of the previous classes. Speedware's Media/MR belongs to this class. According to its developers, it combines the analytical flexibility and response speed of MOLAP with the constant access to real data characteristic of ROLAP.

Multidimensional OLAP (MOLAP)

In specialized DBMSs based on a multidimensional data representation, data is organized not as relational tables but as ordered multidimensional arrays:

1) hypercubes (all cells stored in the database must have the same dimensionality, that is, lie in the fullest possible basis of dimensions) or

2) polycubes (each variable is stored with its own set of dimensions, and all the associated processing complexity is shifted to the internal mechanisms of the system).
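The difference between the two layouts can be sketched with plain Python dictionaries. This is only an illustration under stated assumptions: the dimension names, the two variables (`sales`, `headcount`), and the helper functions `make_hypercube` and `make_polycube` are hypothetical and do not reflect the internals of any particular multidimensional DBMS.

```python
from itertools import product

# Hypothetical dimensions and their members.
DIMENSIONS = {
    "region": ["North", "South"],
    "product": ["Tea", "Coffee"],
    "month": ["Jan", "Feb", "Mar"],
}

def make_hypercube(variables):
    """Every variable gets a cell for every combination of all dimensions."""
    coords = list(product(*DIMENSIONS.values()))
    return {var: {c: None for c in coords} for var in variables}

def make_polycube(variable_dims):
    """Each variable is stored only over its own subset of dimensions."""
    cube = {}
    for var, dims in variable_dims.items():
        coords = product(*(DIMENSIONS[d] for d in dims))
        cube[var] = {c: None for c in coords}
    return cube

hyper = make_hypercube(["sales", "headcount"])
poly = make_polycube({
    "sales": ["region", "product", "month"],
    "headcount": ["region"],  # assumed not to vary by product or month
})

hyper_cells = sum(len(cells) for cells in hyper.values())
poly_cells = sum(len(cells) for cells in poly.values())
print(hyper_cells, poly_cells)  # 24 14
```

In the hypercube, `headcount` occupies 12 cells even though it is assumed to depend only on region. The polycube stores it in 2 cells, at the price of the system having to reconcile variables with different dimension sets.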

The use of multidimensional databases in online analytical processing systems has the following advantages.

With a multidimensional DBMS, searching and retrieving data is much faster than with a multidimensional conceptual view over a relational database, because the multidimensional database is denormalized, contains pre-aggregated indicators, and provides optimized access to the requested cells.

Multidimensional DBMSs easily handle the inclusion of a variety of built-in functions in the information model, whereas the objective limitations of SQL make such tasks quite difficult, and sometimes impossible, on relational DBMSs.

On the other hand, there are significant limitations.

Multidimensional DBMSs do not allow work with large databases. In addition, because of denormalization and pre-computed aggregation, the volume of data in a multidimensional database typically corresponds (by Codd's estimate) to a volume of source detailed data that is 2.5 to 100 times smaller.

Compared with relational DBMSs, multidimensional DBMSs use external memory very inefficiently. In the overwhelming majority of cases the information hypercube is highly sparse, and since the data is stored in ordered form, undefined values can be eliminated only by choosing an optimal sort order that groups the data into the largest possible contiguous blocks. Even then the problem is solved only partially. Moreover, the sort order that is optimal for storing sparse data will most likely not coincide with the order most often used in queries. Real systems therefore have to seek a compromise between speed and the redundancy of the disk space occupied by the database.
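The effect of sort order on a sparse hypercube can be shown with a small sketch. It assumes a hypothetical two-dimensional cube of customers by products in which only two of four customers have any data. The names and the helper `count_runs` are invented for illustration. Fewer, longer runs of filled cells mean the empty cells are easier to skip in storage.

```python
from itertools import product

CUSTOMERS = ["c0", "c1", "c2", "c3"]
PRODUCTS = ["p0", "p1", "p2"]

# Hypothetical filled cells: only two customers bought anything.
FILLED = {(c, p) for c in ("c0", "c2") for p in PRODUCTS}

def density(filled, n_cells):
    """Share of cells that actually hold a value."""
    return len(filled) / n_cells

def count_runs(flags):
    """Count maximal runs of consecutive True values."""
    runs, previous = 0, False
    for flag in flags:
        if flag and not previous:
            runs += 1
        previous = flag
    return runs

# Customer-major layout: products vary fastest.
customer_major = [(c, p) in FILLED for c, p in product(CUSTOMERS, PRODUCTS)]
# Product-major layout: customers vary fastest.
product_major = [(c, p) in FILLED for p, c in product(PRODUCTS, CUSTOMERS)]

cube_density = density(FILLED, len(CUSTOMERS) * len(PRODUCTS))
runs_customer_major = count_runs(customer_major)
runs_product_major = count_runs(product_major)
print(cube_density, runs_customer_major, runs_product_major)  # 0.5 2 6
```

The same half-empty cube falls into 2 contiguous blocks in one ordering and 6 in the other. If queries mostly read by product, the layout that stores best is not the one that reads best, which is the compromise described above.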

Consequently, the use of a multidimensional DBMS is justified only under the following conditions.

The amount of source data for analysis is not too large (no more than a few gigabytes), that is, the data aggregation level is quite high.

The set of information dimensions is stable (since any change in their structure almost always requires a complete restructuring of the hypercube).

The system's response time to ad hoc queries is the most critical parameter.

Extensive use of complex built-in functions is required for cross-dimensional calculations over hypercube cells, including the ability to write user-defined functions.

Relational OLAP (ROLAP)

The direct use of relational databases in online analytical processing systems has the following advantages.

In most cases corporate data warehouses are implemented with relational DBMSs, and ROLAP tools make it possible to perform analysis directly on top of them. Warehouse size is then not as critical a parameter as it is for MOLAP.

When the dimensionality of the problem is variable and the dimension structure has to be changed fairly often, a ROLAP system with a dynamic representation of dimensions is the best solution, since such modifications do not require physical reorganization of the database.

Relational DBMSs provide a significantly higher level of data protection and good facilities for differentiating access rights.

The main drawback of ROLAP compared with multidimensional DBMSs is lower performance. To achieve performance comparable to MOLAP, relational systems require careful design of the database schema and tuning of indexes, that is, considerable effort from database administrators. Only with star schemas can the performance of well-tuned relational systems approach that of systems based on multidimensional databases.
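A star schema and a very small metadata layer over it might look as follows. This is a sketch under stated assumptions, not a description of any specific ROLAP product: the tables `fact_sales`, `dim_region`, and `dim_product`, the `METADATA` mapping, and the `rollup` helper are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region_name  TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales (
    region_id  INTEGER REFERENCES dim_region(region_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
-- Indexes on the fact table's foreign keys, as part of index tuning.
CREATE INDEX idx_fact_region  ON fact_sales(region_id);
CREATE INDEX idx_fact_product ON fact_sales(product_id);
""")
conn.executemany("INSERT INTO dim_region VALUES (?, ?)", [(1, "North"), (2, "South")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "Tea"), (2, "Coffee")])
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 1, 120.0), (1, 2, 80.0), (2, 1, 75.0), (2, 2, 200.0)],
)

# Metadata layer: maps a dimension name to its table, key, and label column.
METADATA = {
    "region": ("dim_region", "region_id", "region_name"),
    "product": ("dim_product", "product_id", "product_name"),
}

def rollup(dimension):
    """Total sales grouped by one dimension, resolved through METADATA."""
    # Identifiers come only from the fixed METADATA mapping, never from user
    # input; an unknown dimension name raises KeyError.
    table, key, label = METADATA[dimension]
    sql = (
        f"SELECT d.{label}, SUM(f.amount) "
        f"FROM fact_sales AS f JOIN {table} AS d ON f.{key} = d.{key} "
        f"GROUP BY d.{label} ORDER BY d.{label}"
    )
    return conn.execute(sql).fetchall()

by_region = rollup("region")
by_product = rollup("product")
print(by_region)
print(by_product)
```

Adding a dimension in this sketch means creating one more dimension table, one more key column in the fact table, and one more `METADATA` entry, which hints at why changes to the dimension structure are cheaper here than restructuring a hypercube.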