I have an Apache Spark cluster (2.2.0) in standalone mode. Until now it has been running with HDFS to store the Parquet files. I'm using the Hive Metastore service of Apache Hive 1.2 to access Spark over JDBC through the Thrift Server.
Now I want to use S3 object storage instead of HDFS. I have added the following configuration to my hive-site.xml:
<property>
  <name>fs.s3a.access.key</name>
  <value>access_key</value>
  <description>ProfitBricks Access Key</description>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>secret_key</value>
  <description>ProfitBricks Secret Key</description>
</property>
<property>
  <name>fs.s3a.endpoint</name>
  <value>s3-de-central.profitbricks.com</value>
  <description>ProfitBricks S3 Object Storage Endpoint</description>
</property>
<property>
  <name>fs.s3a.endpoint.http.port</name>
  <value>80</value>
  <description>ProfitBricks S3 Object Storage Endpoint HTTP Port</description>
</property>
<property>
  <name>fs.s3a.endpoint.https.port</name>
  <value>443</value>
  <description>ProfitBricks S3 Object Storage Endpoint HTTPS Port</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>s3a://dev.spark.my_bucket/parquet/</value>
  <description>ProfitBricks S3 Object Storage Hive Warehouse Location</description>
</property>
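As a sanity check (assuming this configuration is picked up and hadoop-aws plus the AWS SDK are on the Hadoop classpath), the bucket can be listed directly with the Hadoop CLI before involving the metastore at all:

hadoop fs -ls s3a://dev.spark.my_bucket/parquet/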
The Hive metastore is in a MySQL 5.7 database. I have added the following JAR files to Hive's lib folder:
- aws-java-sdk-1.7.4.jar
- hadoop-aws-2.7.3.jar
I deleted the old Hive metastore schema from MySQL (a fresh schema can be initialized with Hive's schematool, as sketched below) and then started the metastore service with: hive --service metastore &
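For reference, the schema re-initialization uses Hive's schematool, which reads the MySQL connection settings from hive-site.xml:

schematool -dbType mysql -initSchema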
On startup I get the following error:
java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/ObjectMapper
at com.amazonaws.util.json.Jackson.<clinit>(Jackson.java:27)
at com.amazonaws.internal.config.InternalConfig.loadFrom(InternalConfig.java:182)
at com.amazonaws.internal.config.InternalConfig.load(InternalConfig.java:199)
at com.amazonaws.internal.config.InternalConfig$Factory.<clinit>(InternalConfig.java:232)
at com.amazonaws.ServiceNameFactory.getServiceName(ServiceNameFactory.java:34)
at com.amazonaws.AmazonWebServiceClient.computeServiceName(AmazonWebServiceClient.java:703)
at com.amazonaws.AmazonWebServiceClient.getServiceNameIntern(AmazonWebServiceClient.java:676)
at com.amazonaws.AmazonWebServiceClient.computeSignerByURI(AmazonWebServiceClient.java:278)
at com.amazonaws.AmazonWebServiceClient.setEndpoint(AmazonWebServiceClient.java:160)
at com.amazonaws.services.s3.AmazonS3Client.setEndpoint(AmazonS3Client.java:475)
at com.amazonaws.services.s3.AmazonS3Client.init(AmazonS3Client.java:447)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:391)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:371)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:235)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2811)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:104)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:140)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:146)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:601)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5757)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:5990)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5915)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.databind.ObjectMapper
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
The missing class belongs to the Jackson library, so I copied the jackson-*.jar files from my spark-2.2.0-bin-hadoop2.7/jars/ folder into the Hive lib folder, namely:
- jackson-annotations-2.6.5.jar
- jackson-core-2.6.5.jar
- jackson-core-asl-1.9.13.jar
- jackson-databind-2.6.5.jar
- jackson-jaxrs-1.9.13.jar
- jackson-mapper-asl-1.9.13.jar
- jackson-module-paranamer-2.6.5.jar
- jackson-module-scala_2.11-2.6.5.jar
- jackson-xc-1.9.13.jar
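To confirm that the previously missing class is actually among these, the JAR contents can be inspected (ObjectMapper should live in jackson-databind):

unzip -l jackson-databind-2.6.5.jar | grep ObjectMapper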
But then I got the following error:
2018-01-05 17:51:00,819 ERROR [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(5920)) - Metastore Thrift Server threw an exception...
java.lang.NumberFormatException: For input string: "100M"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1319)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:248)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2811)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:100)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2848)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2830)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:389)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:104)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:140)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:146)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:601)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5757)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:5990)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5915)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
I think this error has something to do with a JAR version incompatibility, but I'm not able to find the correct versions.
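Judging by the trace, Configuration.getLong fails on the string "100M" inside S3AFileSystem.initialize, so one of the S3A size properties apparently has a unit-suffixed default ("100M") that this combination of hadoop-aws and Hadoop cannot parse as a plain long. Assuming the offending property is fs.s3a.multipart.size (my guess from the getLong call during initialization), one thing I could try is pinning it to a plain byte value in hive-site.xml:

<property>
  <name>fs.s3a.multipart.size</name>
  <value>104857600</value>
  <description>100 MiB expressed in bytes, to avoid the unparsable "100M" default</description>
</property>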
Can someone help me here?