I am running multiple queries on the hive. I have a Hadoop cluster with 6 nodes. Total vcores in the cluster is 21.
I need only 2 cores to be allocated to a python process so that the rest of the available cores will be used by another main process.
Code
from pyhive import hive
hive_host_name = "subdomain.domain.com"
hive_port = 20000
hive_user = "user"
hive_password = "password"
hive_database = "database"
conn = hive.Connection(host=hive_host_name, port=hive_port,username=hive_user, database=hive_database, configuration={})
cursor = conn.cursor()
cursor.execute('select count(distinct field) from somedata')