Questions tagged [cloudera-cdp]

The tag has no usage guidance, but it has a tag wiki.

3 votes
0 answers
792 views

How do I create a Hive external table on top of ECS S3 object storage using the "s3a://" protocol?

I am trying to create a Hive external table using Beeline on top of S3 object storage using the "s3a://" scheme. I have followed the official Cloudera documentation and configured the below ...
krupamay ghosal
1 vote
1 answer
4k views

Unable to create Managed Hive Table after Hortonworks (HDP) to Cloudera (CDP) migration

We are testing our Hadoop applications as part of migrating from Hortonworks Data Platform (HDP v3.x) to Cloudera Data Platform (CDP) version 7.1. While testing, we found the below issue while trying to ...
Vasanth Subramanian
1 vote
1 answer
1k views

Read/Write with NiFi to Kafka in Cloudera Data Platform (CDP) public cloud

NiFi and Kafka are now both available in Cloudera Data Platform (CDP) public cloud. NiFi is great at talking to everything and Kafka is a mainstream message bus, so I just wondered: What are the minimal ...
Dennis Jaheruddin
1 vote
1 answer
580 views

How to migrate roles from one Apache Ranger instance to another instance?

We are planning to make a replica cluster of an existing CDP cluster. I can import/export policies but cannot import/export roles. We have around 2k+ roles; using the following API I can create a role but ...
potatoaim
  • 115
1 vote
0 answers
38 views

How can I process FHIR data in Cloudera CDP environment?

I'm developing a NiFi pipeline in CDP to process FHIR data that are stored in an external DB. Is there any specific tool that I can use in Apache NiFi to read and manipulate FHIR data? Or, as ...
Oppie
  • 13
1 vote
0 answers
18 views

In CDP, how to update OneViewofProfile Id through (VisitorId, browserID)?

Can anyone who is CDP certified guide me on this? What is the better approach to update the CDP profile ID, and how are the below use cases the same? When 2 users are using the same device, do they get the same ...
Touristic Indian
1 vote
1 answer
1k views

Hive managed table issue when creating a Hive table from an HDFS location in CDP

I have a CDP 7.3.1 cluster where, using Sqoop, I have loaded data from a Postgres database table into the HDFS location /ts/gp/node. Now I am trying to create a Hive table on this. I get the below error. Please ...
stacktesting
1 vote
1 answer
526 views

Scala - How to read an MQ message which exceeds 4096 characters

Application Information: IBM MQ 9.2, Cloudera CDP 7.1.6, Spark 2.4.5. I am upgrading the Spark code from Spark 1.6 to Spark 2.4.5. I have JSON content (complex schema) pushed to the MQ queue which the ...
Chia
  • 11
0 votes
1 answer
287 views

Programmatic way to find the cluster version from CDSW - Cloudera Data Science Workbench

Is there any programmatic way to find out the cluster version (CDH6 or CDP7) from a CDSW session? Could any environment variable give a fool-proof way to determine the cluster version?
Blessy
  • 500
0 votes
1 answer
174 views

Connecting to Impala DB using Dask Library

I am trying to connect to Impala DB through the Dask library to fetch all data from a table using read_sql_table(). I need the connection string to connect with; I have tried using the connection string ...
harish
  • 21
0 votes
1 answer
322 views

Connect HBase via Knox using HBase Java Client on CDP

I need to connect to HBase via Knox using the HBase Java Client. I have the Knox details as follows: Knox_Url: https://knox-host:port/gateway/cdp-proxy-api/hbase Username: knox_user_name Password: ...
saravanan
  • 171
0 votes
0 answers
3 views

COMPUTE STATS IMPALA results in DiskErrorException

I'm trying to execute a compute stats (COMPUTE STATS db.table;) on one of my tables via Impala (on a Cloudera Data Platform), but for this table only I'm encountering the following error: ...
Voxeldoodle
0 votes
0 answers
7 views

Getting NoSuchMethodError while executing a Spark job in CDP

I am getting the below error while executing a Spark job in Cloudera. I have placed a jar file in HDFS and am trying to initiate the Spark job through the Oozie client. ...
BhairavaJosyula Raghunath
0 votes
0 answers
63 views

Cloudera Enterprise (Community Edition) for RHEL 8

We are running a mini DWH platform with the Cloudera Enterprise community version. The underlying operating system is RHEL7. Version: Cloudera Express 6.0.1 (#610811 built by jenkins on 20181002-0044 git: ...
Ashu Rawat
0 votes
1 answer
225 views

Is CDF feature possible using delta-spark on Cloudera distribution?

Our application uses the on-premise CDP (Cloudera) cluster for submitting PySpark jobs. The Spark version is 2.x. We are now exploring the option to have CDC datasets processed and merged with ...
lbvirgo
  • 384
0 votes
1 answer
247 views

How to find the time difference between two timestamps in seconds and milliseconds in Hive and Impala

I need help finding the time difference between two timestamps in seconds and milliseconds in Hive and Impala. We are using a CDP cluster. The two columns are of string datatype with values in the format yyyy-...
Code Heaven
0 votes
0 answers
87 views

Hue PySpark connector using Livy - Increase Spark driver memory for interactive sessions

We are using CDP Private Cloud 7.1.7 and have configured the Hue connector for PySpark using Livy. By default I can see the driver launches with 1GB memory and I need to increase this as some of the code ...
sajinma
0 votes
1 answer
330 views

Issue of container OOM when writing Dataframe to parquet files in Spark Job

I'm using a Machine Learning Workspace in Cloudera Data Platform (CDP). I created a session with 4 vCPU/16 GiB memory and enabled Spark 3.2.0. I'm using Spark to load data of one month (the whole month ...
Ryan
  • 63
0 votes
1 answer
158 views

Connection to a remote Hadoop cluster (CDP) through a Linux server

I'm new to PySpark and I want to connect to a remote Hadoop cluster (CDP) through a Linux server using the spark-submit command. Any help would be appreciated. I need the spark-submit command to connect to the remote ...
Sakib Sakharkar
0 votes
1 answer
213 views

CDP Spark cluster mode reading a Hive table: Delegation Token can be issued only with kerberos or web authentication

My env: CDP version: 7.4.4, Spark version: 2.4.7.7.1.7.0-551. My Java code is this. My submit cmd: ./spark-submit --class com.abc.bdms.sparksql.SparkSQLDriver --master yarn --deploy-mode cluster --executor-...
hehe
  • 313
0 votes
1 answer
240 views

Migration from an HDP non-secure cluster to a CDP secure cluster

We are running a migration of HDFS data from an HDP non-secure cluster to a CDP secure cluster. When I read the Cloudera documentation, it mentions "distcp" as a tool to handle the ...
YasbyM
  • 1
0 votes
1 answer
74 views

Hive - create table - missing EOF at 'SORT' near ')'

I get this error when I try to execute the query (CREATE) below. Any suggestions? ERROR: ------------------------------------------------------------------------- [sshexec] 2022-08-22 11:48:36: >> ...
Luca Archetti
0 votes
1 answer
53 views

Apache NiFi on Cloudera Changing variables from Unauthorized Referencing Components to Referencing Processors

The goal is to move the processors that are using a variable from "Unauthorized Referencing Components" to "Referencing Processors". I've recently moved from HDP to CFM for Apache NiFi ...
Christopher Fowler
0 votes
0 answers
487 views

Case-insensitive comparison in Hive

I have a requirement where I need to do case-insensitive joins across the system and I don't wish to apply upper/lower functions. I tried setting TBLPROPERTIES('serialization.encoding'='...
HIMANSHU JAIN
0 votes
1 answer
1k views

Apache Tez tasks on hold at the Application Master

I have a Tez problem: when running about 14 queries at the same time, some of them get delays of more than 5 minutes, but the cluster utilization is just 14%. This is the message that I am talking ...
Marco
  • 1,202
0 votes
1 answer
396 views

Cloudera CDP Private Cloud - Installation failed on hosts

I get an "installation failed on hosts" error while using Cloudera Manager to install the CDP 7.1.4 runtime on a trial basis. For this purpose I have spun up two VMs (Ubuntu 18), which use a NatNetwork to ...
Aleksandar Milosevic