CDH 5 (Cloudera) Single User Mode Installation
I am not going to write too much about Cloudera as there are tons of information available in the world of Google. What I am going to write here is the simple steps to set up Cloudera in Single User Mode...
Installing Google Chrome on Oracle Linux...
These steps are for Oracle Linux 64 bit: Create a file google.repo under /etc/yum.repos.d/ folder with the following content. [google-chrome] name=google-chrome –...
How to disable "information" messages in pyspark (python spark) console?
You may notice a bunch of messages popping up, as shown below, on the console when you initiate the Spark console using pyspark. How do you disable these messages and only show the errors? [santhosh@localhost...
Exception in secureMain java.io.IOException: failed to stat a path component:...
This is in CDH 5.8.2. Data Node not able to start due to: FATAL org.apache.hadoop.hdfs.server.datanode.DataNode Exception in secureMain java.io.IOException: failed to stat a path component:...
CDH 5.8.2 Login Page error 500
You may get this error when there is any kind of interruption in the Cloudera server process. In my case, I get this every time I bring up my PC after either hibernate or sleep. Just restart the cloudera...
Spark Aggregate function with examples...
Aggregate function takes three parameters (arguments). 1st parameter is the seed value, which is (0,1) in most cases. 2nd parameter is the computation reduce function. 3rd parameter is the combine reduce...
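To make the three parameters concrete, here is a minimal pure-Python sketch of how an aggregate of this shape puts them together. It does not use Spark: the helper name `aggregate`, the hand-made partitions, and the (0, 0) seed for a sum-and-count are all assumptions for illustration, not the post's own example.

```python
from functools import reduce


def aggregate(partitions, zero, seq_op, comb_op):
    """Emulate an RDD-style aggregate over a list of partitions (hypothetical helper)."""
    # Fold each partition on its own, starting from the seed value.
    partials = [reduce(seq_op, part, zero) for part in partitions]
    # Merge the per-partition results, again starting from the seed value.
    return reduce(comb_op, partials, zero)


# 1st parameter: the seed, here (running sum, running count). Assumed for this example.
zero = (0, 0)


def seq_op(acc, x):
    # 2nd parameter: fold one element into a partition's accumulator.
    return (acc[0] + x, acc[1] + 1)


def comb_op(a, b):
    # 3rd parameter: merge two accumulators coming from different partitions.
    return (a[0] + b[0], a[1] + b[1])


partitions = [[1, 2, 3], [4, 5]]  # made-up data standing in for RDD partitions
total, count = aggregate(partitions, zero, seq_op, comb_op)
print(total, count)
```

Because the seed is applied once per partition and once more in the final merge, it should be a neutral value for both functions; otherwise the result would depend on how the data happens to be partitioned.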
Convert xlsx to csv in Linux...
You can use Open Office for Linux to open an xlsx file and then save it as csv. But it takes a bit of time to load the file in Open Office and then convert it. Also, what if there are multiple files that...
My First Amazon Skill...
I know this is irrelevant in my Oracle BLOB but hey this is my blog so I can blog anything I want :) Anyways, I got hooked on this new gadget, Amazon Echo, over Thanksgiving and we are loving it. So, I...
Databases - RDBMS and NoSQL
If you are reading this blog post then you must have used at least one database in your career and are wondering what's up with this new trend of "NoSQL". Before getting into NoSQL databases, let’s first...
AWS (Amazon Web Services) Product Listing - Data(base) Related: as of June 2018
There is nothing much to blog about in this post as there is already way too much information listed below. Thought of putting it together after spending some decent amount of time reviewing AWS...
Loading data from datafile into PostgreSQL
Loading data from a data file into PostgreSQL is easy with the command "copy". Steps involved in loading data: 1] Create a table with the columns in the data file 2] Use COPY to load data. That is simple....
Import Fixed Width data file into PostgreSQL
As stated in my previous post, loading a data file into PostgreSQL is easy with the COPY command. But the problem I faced was with loading a data file which has no delimiter!!! Meaning the data file that...
AWS Athena Vs Redshift...
In recent years, data warehouse architecture has seen a huge shift towards the cloud. Here are a few benefits of cloud-based data warehousing compared to traditional on-premise: Scalability, Cost,...
How to Add a shell script to Mac Applications...
Came across this situation where I keep forgetting which folder "that" particular shell script I used a while ago is in... Damn, I should have placed it under "Applications" so that I...
AWS Storage Systems Comparison - S3 vs EBS vs EFS
The table below shows the major differences (not all) between available AWS storage systems. Note I am not including Glacier here as Glacier is purely meant for archiving and it doesn't make sense to...
Concatenated string to individual rows in Spark SQL...
I had this column named age_band which will have values like "55-64|65-74|75+". As you can see, it contains age groups stored as a string concatenated with '|', and each age group needs to be compared...
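The post's own query is cut off in this excerpt, so as a rough illustration of the target shape, here is a small pure-Python sketch that turns one concatenated age_band value into one row per age group. The function name `explode_age_bands`, the `(id, age_band)` row layout, and the sample ids are assumptions. In Spark SQL the same shape is usually reached by combining `split` with `explode`; note that `split` takes a regular expression there, so the pipe has to be escaped.

```python
def explode_age_bands(rows, sep="|"):
    """Yield one (row_id, age_band) pair per age group in the concatenated string."""
    for row_id, age_band in rows:
        # str.split treats the separator literally, so "|" needs no escaping here.
        for band in age_band.split(sep):
            yield row_id, band


# Made-up sample rows: (id, age_band)
rows = [(1, "55-64|65-74|75+"), (2, "18-24")]
exploded = list(explode_age_bands(rows))
for row in exploded:
    print(row)
```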
Connecting DBeaver with Spark Databricks Cluster
Got the Simba JDBC drivers from Databricks. Extracted the zip and then SimbaSparkJDBC41-2.6.3.1003.zip. Adding Simba driver to DBeaver: In DBeaver: Driver Manager, select New...
Model-based and Model-free in Reinforcement Learning (RL)
I was going through some predictive analytics reading and came across these interesting terms, "Model-Based and Model-Free", and found the below excerpt from this pdf and thought of blogging it over here for...
Split a word based on case sensitive...
Received data like "AetnaMercyCareRBHA" in one of our records, and showing this data in the reports is not quite readable. So, the ask is to display the value like "Aetna Mercy Care RBHA"...
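One way to get from "AetnaMercyCareRBHA" to "Aetna Mercy Care RBHA" is a regular expression that inserts a space wherever a lowercase letter is directly followed by an uppercase one. The sketch below is in Python and is not necessarily the approach the post goes on to use; the function name `split_on_case` is an assumption.

```python
import re


def split_on_case(value):
    """Insert a space at every lowercase-to-uppercase boundary."""
    # The lookbehind/lookahead pair matches the empty position between
    # a lowercase letter and an uppercase letter, so no characters are consumed.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", value)


print(split_on_case("AetnaMercyCareRBHA"))
```

Runs of capitals such as "RBHA" stay together because the pattern only fires where a lowercase letter precedes the uppercase one.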
DBeaver Not Responding in macOS
Not quite often, but sometimes when either DBeaver or your macOS decides to frustrate you (particularly when you are already frustrated with work 😉), you will see DBeaver not starting up with...
sed - Add a line to all files
#below inserts a line at 1st row with the content "Added new line using sed " in all the files in the path
for i in *; do sed -i '1s/^/Added new line using sed \n/' "$i"; done
#below inserts a line at 3rd...
awk - Exclude or Include Records from a fixed-width file
The below command gets the records that have a value "new" in the given position (in this case the file was fixed-width). All other records which do not have the value "new" are ignored. awk '{if...
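The awk command itself is truncated in this excerpt, so here is the same include/exclude idea as a minimal Python sketch. The function name `filter_fixed_width`, the field position (start 5, length 3), and the sample records are assumptions; the 1-based `start` is chosen to mirror awk's `substr($0, start, length)`.

```python
def filter_fixed_width(lines, start, length, wanted="new", include=True):
    """Keep (include=True) or drop (include=False) lines whose fixed-width
    field at 1-based position `start` equals `wanted`."""
    for line in lines:
        # Convert the 1-based start to a 0-based slice.
        field = line[start - 1:start - 1 + length]
        if (field == wanted) == include:
            yield line


# Made-up fixed-width records: 4-char id, 3-char status, then a name.
records = ["0001new Alice", "0002old Bob", "0003new Carol"]
kept = list(filter_fixed_width(records, start=5, length=3))
dropped = list(filter_fixed_width(records, start=5, length=3, include=False))
print(kept)
print(dropped)
```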
Snowflake Constraint Column Details
At the time of this post, Snowflake did not have any dictionary view that gives the column names for constraints that are defined for a table. So, here is what I found to get that list: show primary...
Microsoft Teams Takes forever to upload a file
Got annoyed with Teams taking forever to upload a simple file. Found that clearing the Teams cache would help, and here are the steps to do so on Mac: 1] Quit Teams 2] Go to Finder and get to...
SQL Server - Get all Tables and Columns from ALL Databases
This query below gives you all tables with columns in all the databases in SQL Server: SET NOCOUNT ON DECLARE @AllTables table (DbName sysname, SchemaName sysname, TableName sysname, ColumnID sysname,...