Apache Hadoop vs Google Bigdata [closed]

1

7
  1. Can anyone explain the key differences between Apache Hadoop and Google Bigdata?
  2. Which one is better (Hadoop or Google big data)?
Eliseoelish answered 16/5, 2015 at 13:33 Comment(1)
Or BigQuery, but Google's platform allows one to use either. - Jernigan
16

The simple answer is: it depends on what you want to do with your data.

Hadoop is used for massive data storage and for batch processing of that data. It is mature and popular, and many libraries support it. But if you want to run real-time analysis or interactive queries on your data, Hadoop is not well suited to that.
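To make "batch processing" concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce steps behind a classic word count. It is plain Python, not the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and a real cluster would spread each step across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the whole batch: map every line, group by word, then reduce."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big cluster", "data lake"])
```

The whole input is read before any result is available, which is why this model fits batch jobs better than interactive queries.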

Google's BigQuery was developed specifically to address this. With BigQuery you can run interactive, near-real-time queries over large datasets.

You can use BigQuery in place of Hadoop, or you can use BigQuery alongside Hadoop to query datasets produced by MapReduce jobs.
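As a rough sketch of the second option, the snippet below queries a table that an earlier MapReduce job might have loaded into BigQuery. The table name and its `word` and `total` columns are hypothetical, and running the query assumes the `google-cloud-bigquery` package is installed and credentials are already configured.

```python
# Hypothetical table holding word counts written by an earlier MapReduce job.
TABLE = "my_project.my_dataset.word_counts"


def top_words_sql(table, limit=10):
    """Build a standard-SQL query for the most frequent words."""
    return (
        f"SELECT word, total FROM `{table}` "
        f"ORDER BY total DESC LIMIT {int(limit)}"
    )


def run_top_words(table=TABLE, limit=10):
    """Run the query and return (word, total) tuples."""
    # Imported lazily so the SQL builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up application default credentials
    rows = client.query(top_words_sql(table, limit)).result()
    return [(row["word"], row["total"]) for row in rows]
```

Calling `run_top_words()` sends the SQL to the service and waits for the result, so there is no cluster to manage on your side.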

So it depends entirely on how you want to process your data. If a batch processing model is required and sufficient, you can use Hadoop; if you want real-time processing, choose Google's BigQuery.

Edit: You can also explore other technologies that work with Hadoop, such as Spark, Storm, and Hive, and choose among them depending on your use case.

Some useful links for more exploration:

1: gavinbadcock's blog

2: cloudacademy's blog

Pahang answered 16/5, 2015 at 14:12 Comment(2)
That's true, but it's also worth noting that many efforts in the Hadoop space are trying to close that gap: Spark, Storm, Hive, Impala, Flink, and probably others I'm not thinking of. "Hadoop" is a bit of a loaded term: some people mean the original MapReduce framework (the batch processing you mention), while others mean any technology that is compatible with a Hadoop cluster. - Quaker
Another key difference is that most Hadoop clusters are locally provisioned (though cloud solutions are available too), whereas BigQuery is only a cloud service. For some users, that is not acceptable, either for legal reasons or because the data is generated outside the cloud and is too large to move into the cloud easily. - Quaker
