How are OLAP, OLTP, data warehouses, analytics, analysis and data mining related?
Asked Answered
L

2

5

I'm trying to understand what OLAP, OLTP, data mining, analytics etc. are about, and I feel like my understanding about some of these concepts is still a bit vague. Information about these subjects tend to be explained in a very complex manner on the internet.


I feel like a question like this is likely to be closed since it's a very broad one, so I'll try to narrow it down into two questions:

Question 1:

After doing research I understand the following about these concepts, is it correct?

  • Analysis is decomposing something complex, to understand the inner workings better.
  • Analytics is predictive analysis on information that requires alot of math and statistics.
  • There's many type of databases, but they are either OLTP (transactional) or OLAP (analytical).
  • OLTP databases use ER diagrams, and are therefore easier to update because they are in normalized form.
  • In contrast, OLAP uses the denormalized star schema's and is therefore easier to query
  • OLAP is used for predictive analysis and OLTP is usually used in more practical situations since theres no redundancy.
  • Data warehouses is a type of OLAP database, and usually consists out of multiple other databases.
  • Data mining is a tool used in analytics, where u use computer software to find out relationships between data so you can predict things (e.g. customer behavior).

Question 2:

I'm especially confused about the difference between analytics and analysis. They say analytics is multidimensional analysis, but what is that supposed to mean?

Latinist answered 19/9, 2014 at 12:25 Comment(1)
Your confusion is easily understood, considering that some of these terms cross into marketspeak. Don't get hung up on the term analytics unless you are selling softwareRozellarozelle
L
13

I will try to explain you from the top of the pyramid:

Business Intelligence (what you didn't mentioned) is term in IT which stands for a complex system and gives useful informations about company from data.

So, BI systems has target: Clean, accurate and meaningful informations. Clean means there is no tech problems (missing keys, incomplete data ect). Accurate means accurate - BI systems are also used as fault checker of production database (logical faults - i.e invoice bill is too high, or inactive partner is used ect). It has been accomplished with rules. Meaningful is hard to explain, but in simple english, it's all your data (even excel table from the last meeting), in way you want.

So, BI system has back-end: It's data warehouse. DWH is nothing else than a database (instance, not software). It can be stored in RDBMS, analytical db (columnar or document store types), or NoSQL databases.

Data warehouse is term used usually for whole database that I explained above. There could be number of data-marts (if Kimball model is used) - more often, or relational system in 3rd normalized form (Inmon model) called enterprise data warehouse.

Data marts are tables inside DWH that are related (star schema, snowflake schema). Fact table (business process in denormalized form ) and dimension tables.

Each data mart represents one business process. Example: DWH has 3 data marts. One is retail sales, second is export, and third is import. In retail you can see total sales, qty sold, import price, profit (measures) by SKU, date, store, city ect (dimensions).

Loading data in DWH is called ETL(extract, transform, load).

  1. Extract data from multiple sources (ERP db, CRM db, excel files, web service...)

  2. Transform data (clean data, connect data from diff sources, match keys, mine data)

  3. Load data (Load transformed data in specific data marts)

edit because of comment: ETL process is usually created with ETL tool, or manually with some programming language (python, c# ect) and APIs.

ETL process is group of SQLs, procedures, scripts and rules related and separated in 3 parts (look above), controlled by meta data. It's either scheduled (every night, every few hours) or live (change data capture, triggers, transactions).

OLTP and OLAP are types of data processing. OLTP is used in transaction purpose, between database and software (usually only one way of input/output data). OLAP is for analitical purpose, and this means there is multiple sources, historical data, high select query performance, mined data.

edit because of comment: Data Processing is way how data is stored and accessed from database. So, based on of your needs, database is set in different way.

Image from http://datawarehouse4u.info/:

enter image description here

Data mining is the computational process of discovering patterns in large data sets. Mined data can give you more insight view of business process or even forecast.

Analysis is a verb, which in BI world means simplicity of getting asked information from data. Multidimensional analysis actually says how system is slicing your data (with dimensions inside cube). Wikipedia said that analysis of data is a process of inspecting data with the goal of discovering useful information.

Analytics is a noun and it represent a result of analysis process.

Don't get so much fuss about those two words.

Lough answered 20/9, 2014 at 18:22 Comment(5)
Wow thanks for the effort! Can DWH's perform ETL automatically on live databases, or does the database administrator have to perform ETL manually each time up-to-date data is required? The thing that confuses me is that both BI, data mining and analytics are focused on data processing, so it's hard for me to see the difference.Latinist
I edited my answer. Comment and ask more until you catch everything. It will be good for me, you and community.Lough
Sure :) Since ETL is done with a programming language I'll assume you have to execute a program manually to transfer data from multiple sources to a single one. I guess ETL is not limited to just DWH's? So if I have a badly designed transactional database, I could design a normalised one and perform ETL to transfer data from the first to the second (normalized) database?Latinist
Wiki defines "data processing" as "the collection and manipulation of items of data to produce meaningful information". I thought this was only the case with DWH's (i.e. OLAP database)? So is OLTP really a type of data processing?Latinist
In very rare cases ETL is done with programming language, almost always in ETL tool, but I wrote that in generic case. Some etl's can be done with stored procedures only. Even a custom writen software you can schedule it. Exactly, like I said, DWH is database instance, and ETL tool cannot recognize what is database for, so, yes, you can transfer data from one db to another. DWH is modern way to examine the data. Almost every erp, crm have reports. It's meaningful information from OLTP db. So, clearly, it fits in definition.Lough
Z
-1

I can tell you about Data mining as i had project on Data mining. Data mining is not a tool ,Its a method of mining data and different tools used for data mining is WEKA ,RAPID MINER etc. Data mining follows many algorithms which are inbuilt in tools like Weka ,Rapid miner. Algorithms like Clustering algorithm , assosiation algorithm etc. A simple example i can give you of data mining . Teacher is teaching science subject in a class by using different methods of teaching like using chalkboard,presentation,Practical. So now our aim is to find which method is suitable for students. Then we do survey and take students opinion 40 students like chalk board ,30 likes presentation and 20 likes practical method. So with help of this data we can make the rules for example Science subject should be taught by chalk board method. To knw different algorithms you can use google :D.

Zinazinah answered 19/9, 2014 at 12:40 Comment(4)
But how does it differ from analytics? Isn't analytics the same thing then? From what I understand, your example is both an example for analytics and data mining.Latinist
Analytics are the result of analysis and the form of presentation of those results. Analysis is the method or methods that can be used to analyze data and the process of analyzing it. Data mining is a specific statistical technique that traditional statisticians call fishing (and look down on). Data mining allows you to search through enormous quantities of data without having any idea what you are looking for. It identifies correlations simply through brute force analysis and neural network (learning) techniques.Zinazinah
I don't understand how you explained analytics, sorry... Could you try and explain it less formal? I understand what data mining is now :)Latinist
quora.com/… Thhis will help youZinazinah

© 2022 - 2025 — McMap. All rights reserved.