Hadoop Big data – Technology, Blog

March 11, 2020 (updated June 7, 2020)

What is contract testing

What is contract testing?

Contract testing is popular in the microservices world. With only a small number of microservices, contract testing is not strictly necessary. However, companies such as Amazon or Netflix run a mesh of a very large number of microservices. In such cases it is extremely difficult for a developer or tester to keep the unit and automation test suites in step with changes to the microservices.
Contract testing is one of the best solutions currently available for this situation. In contract testing, a mocked service is created to represent the provider. Commercial and open-source tools are available to simulate it. In short, a contract is a set of predefined requests and responses created by the automation or development team for their testing.
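As a rough illustration of the idea above, the sketch below models a contract as a list of predefined request/response pairs and builds a mocked provider from it. It is a minimal sketch, not the API of any contract-testing tool: the names Interaction, Contract, createMockProvider and the /users paths are all made up for this example.

```typescript
// Hypothetical types: one agreed request/response pair, and a contract as a list of them.
interface Interaction {
  description: string;
  request: { method: string; path: string };
  response: { status: number; body: unknown };
}

type Contract = Interaction[];

// Example contract (assumed endpoints, for illustration only).
const userContract: Contract = [
  {
    description: "get an existing user",
    request: { method: "GET", path: "/users/1" },
    response: { status: 200, body: { id: 1, name: "Alice" } },
  },
  {
    description: "get a missing user",
    request: { method: "GET", path: "/users/999" },
    response: { status: 404, body: { error: "not found" } },
  },
];

// The mocked provider answers only with what the contract defines
// and fails loudly for any request the contract does not cover.
function createMockProvider(contract: Contract) {
  return (method: string, path: string): { status: number; body: unknown } => {
    for (const interaction of contract) {
      if (interaction.request.method === method && interaction.request.path === path) {
        return interaction.response;
      }
    }
    throw new Error("No interaction in contract for " + method + " " + path);
  };
}

const mockProvider = createMockProvider(userContract);
const found = mockProvider("GET", "/users/1");
const missing = mockProvider("GET", "/users/999");
console.log(found.status, missing.status);
```

The consumer team can run its tests against mockProvider instead of the real service. A request that is not in the contract throws, which points to a gap in the contract rather than failing silently.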

March 1, 2020 (updated June 7, 2020)

How to set up an Angular project

How to set up an Angular project and generate components?

Install the latest Node.js from nodejs.org
Open Git Bash, PowerShell ISE or CMD
Type the following commands:
Install the Angular CLI: npm install -g @angular/cli
Create the project: ng new my-newweb-app
Go to the project folder from the command line: cd my-newweb-app
Generate a component (optional): ng g c myFirstModule, or the long form ng generate component myFirstModule
Start the dev server and open the app in the browser: ng serve -o

Other programming notes

<select [(ngModel)]="value" (change)="selected()">

  <option *ngFor="let item of data" [ngValue]="item">{{item.name}}</option>
</select>

// Called on every change; value holds the selected item object
selected() {
  alert(this.value.name);
}

// Items offered in the dropdown
data: Array<{ id: number; name: string }> = [
  { id: 0, name: "Test" },
  { id: 1, name: "Test2" }
];

<div> <input ref-email placeholder="text here" /> </div>

February 26, 2020 (updated June 7, 2020)

What is ELK

What is ELK (Elasticsearch, Logstash, Kibana)?

The ELK stack is a combination of three open-source products:

Elasticsearch, a search and analytics engine.
Logstash, which collects the data and sends it to Elasticsearch for indexing. Logstash has a config file with input, filter and output sections. The config file looks similar to a JSON file.
Kibana, a visualization tool that provides a web-based GUI for the user. Users can design bar charts, plots and other reporting charts.
For Kibana and Elasticsearch to interact, both servers must be up and running. Logstash then sends the data to Elasticsearch for indexing.
Kibana then reads the data from Elasticsearch and visualizes it.

February 24, 2020 (updated June 7, 2020)

DevOps vs DevSecOps vs NetOps

What are DevOps, DevSecOps and NetOps?

DevOps is a set of tools, processes and practices for delivering software to customers with greater agility. It combines Dev (software development) and Ops (IT operations). DevOps shortens the development lifecycle and provides Continuous Integration and Delivery (CI/CD).
DevSecOps is DevOps with security practices included in the CI/CD flow.
NetOps is when network services are brought into the DevOps flow.

February 24, 2020 (updated June 7, 2020)

Jenkins job vs pipeline

What are Jenkins jobs and Jenkins pipelines?

Jenkins jobs and Jenkins pipelines are essentially the same; however, a pipeline is a more staged flow of jobs. A Jenkins pipeline is defined in a Jenkinsfile.
What is a Jenkinsfile?
A Jenkinsfile is a text file placed in the root directory of the project. The Jenkinsfile has multiple stages such as build, unit test, Sonar test, functional, regression, integration and performance testing, deployment, etc.
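To make the idea of a staged flow concrete, here is a small simulation of stages that run in order and stop at the first failure. This is not Jenkinsfile syntax (a real Jenkinsfile uses Jenkins' own pipeline DSL) and not how Jenkins is implemented. Stage, runPipeline and the stage outcomes are hypothetical names and values chosen for the illustration.

```typescript
// Hypothetical model of one pipeline stage: a name and a step that reports success.
interface Stage {
  name: string;
  run: () => boolean;
}

// Run stages in order; stop at the first failing stage so later stages never start.
function runPipeline(stages: Stage[]): { completed: string[]; failedAt: string | null } {
  const completed: string[] = [];
  for (const stage of stages) {
    if (!stage.run()) {
      return { completed, failedAt: stage.name };
    }
    completed.push(stage.name);
  }
  return { completed, failedAt: null };
}

// Assumed outcomes: the integration stage fails, so deployment is skipped.
const pipelineResult = runPipeline([
  { name: "build", run: () => true },
  { name: "unit test", run: () => true },
  { name: "integration test", run: () => false },
  { name: "deployment", run: () => true },
]);
console.log(pipelineResult);
```

The point of the staged flow is that a failure in an early stage prevents later stages, such as deployment, from running.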

November 30, 2017 (updated February 23, 2020)

Summary on Cloud, Machine learning, Artificial intelligence and other emerging technologies

Salesforce

Salesforce.com is an American cloud computing company.
Salesforce offers a Software as a Service (SaaS) platform for Customer Relationship Management (CRM).
It has a multi-tenant architecture and is sold by subscription.

The following are the application clouds in Salesforce CRM.

1. Sales Cloud
2. Service Cloud
3. Marketing Cloud
4. Data Cloud
5. App Cloud
6. Analytics Cloud
7. Community Cloud

Salesforce also offers Platform as a Service (PaaS) using Force.com sites.

People involved in Salesforce Implementation
1. End User (Customer)
2. Administrator
3. Developer
4. Consultant

The following is the list of Salesforce certifications.

1. Certified Administrator
2. Certified Advanced Administrator
3. Certified Sales Cloud Consultant
4. Certified Service Cloud Consultant
5. Certified Force.com Platform App Builder
6. Certified Force.com Platform Developer I
7. Certified Force.com Platform Developer II
8. Certified Technical Architect

What is Apex?

Apex is a programming language used only on the Salesforce platform.
It is an object-oriented language in which data types have to be declared.
It allows developers to execute flow control statements on the Force.com platform.
It enables developers to add business logic to most system events, including button clicks, related record updates and Visualforce pages.

Datatypes in Apex

Primitives

Apex primitives include the following datatypes.

Integer
Boolean
Decimal
Double
Date
Datetime
Time
String
Long
ID: any valid Salesforce.com ID.

sObjects
An sObject is any object that can be stored in the Force.com platform database.
Unlike a primitive variable, an sObject variable refers to a row of data in Salesforce, that is, a complete record held in a variable.

Hadoop MapReduce

Hadoop MapReduce is one of the core components of Hadoop and is a programming model for processing and generating large data sets. It uses parallel, distributed algorithms on a cluster. Hadoop MapReduce can handle large-scale data: petabytes and exabytes.
The MapReduce framework converts each record of input into a key/value pair.
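The key/value idea can be sketched with the classic word count example. The code below runs in a single process and only imitates the map, shuffle and reduce phases. It does not use the Hadoop API, and mapRecord, shuffle, reduceGroup and wordCount are names invented for this sketch.

```typescript
type KeyValue = [string, number];

// Map phase: turn one input record (a line of text) into (word, 1) pairs.
function mapRecord(line: string): KeyValue[] {
  const pairs: KeyValue[] = [];
  for (const word of line.toLowerCase().split(/\s+/)) {
    if (word.length > 0) {
      pairs.push([word, 1]);
    }
  }
  return pairs;
}

// Shuffle phase: group all values that share the same key.
function shuffle(pairs: KeyValue[]): { [key: string]: number[] } {
  // Null-prototype object so that words like "constructor" are ordinary keys.
  const groups: { [key: string]: number[] } = Object.create(null);
  for (const [key, value] of pairs) {
    if (!(key in groups)) {
      groups[key] = [];
    }
    groups[key].push(value);
  }
  return groups;
}

// Reduce phase: collapse the values of one key into a single result.
function reduceGroup(key: string, values: number[]): KeyValue {
  let total = 0;
  for (const value of values) {
    total += value;
  }
  return [key, total];
}

function wordCount(lines: string[]): { [word: string]: number } {
  let mapped: KeyValue[] = [];
  for (const line of lines) {
    mapped = mapped.concat(mapRecord(line));
  }
  const groups = shuffle(mapped);
  const result: { [word: string]: number } = Object.create(null);
  for (const key of Object.keys(groups)) {
    const [word, count] = reduceGroup(key, groups[key]);
    result[word] = count;
  }
  return result;
}

const counts = wordCount(["big data big cluster", "data node"]);
console.log(counts);
```

On a real cluster the map and reduce functions run in parallel on many nodes and the framework performs the shuffle. Here everything runs sequentially to keep the data flow visible.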

What is Blockchain and cryptocurrency

To understand blockchain and cryptocurrency, let us first look at how the current banking system works.

Current banking systems: when a user makes an online or ATM transaction, the centralized banking ledger verifies and confirms the authenticity of the accounts. For that work, the bank or third-party site charges the user.

Blockchain does not use a centralized ledger like a bank. Instead, the software uses a decentralized ledger spread across thousands of computers, and every transaction is updated in each copy of the ledger. That means everyone is aware of the transactions, rather than a centralized bank storing all the information and charging for it. Volunteer systems do the work of maintaining all the ledgers for the blockchain.

Blockchain uses cryptographic methods to protect the ledger information so that no one can modify or destroy it unnoticed.

The blockchain concept is used by cryptocurrencies, online voting systems, signature systems, agreement systems, etc.
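A simplified way to picture this protection: each block stores a hash of its own content together with the hash of the previous block, so changing an old entry breaks the chain. The sketch below is only an illustration under strong assumptions. toyHash is a simple non-cryptographic checksum standing in for a real cryptographic hash such as SHA-256, there is no network or consensus, and Block, addBlock and isChainValid are hypothetical names.

```typescript
interface Block {
  index: number;
  data: string;
  previousHash: string;
  hash: string;
}

// Toy stand-in for a cryptographic hash. NOT secure; a real chain would use e.g. SHA-256.
function toyHash(text: string): string {
  let hash = 7;
  for (let i = 0; i < text.length; i++) {
    hash = (hash * 31 + text.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash.toString(16);
}

function hashBlock(index: number, data: string, previousHash: string): string {
  return toyHash(index + "|" + previousHash + "|" + data);
}

// Append a block that points at the hash of the current last block.
function addBlock(chain: Block[], data: string): void {
  const index = chain.length;
  const previousHash = index > 0 ? chain[index - 1].hash : "0";
  chain.push({ index, data, previousHash, hash: hashBlock(index, data, previousHash) });
}

// A chain is valid if every block's hash matches its content
// and every block points at the actual hash of the block before it.
function isChainValid(chain: Block[]): boolean {
  for (let i = 0; i < chain.length; i++) {
    const block = chain[i];
    const expectedPrevious = i === 0 ? "0" : chain[i - 1].hash;
    if (block.previousHash !== expectedPrevious) {
      return false;
    }
    if (block.hash !== hashBlock(block.index, block.data, block.previousHash)) {
      return false;
    }
  }
  return true;
}

const ledger: Block[] = [];
addBlock(ledger, "alice pays bob 5");
addBlock(ledger, "bob pays carol 2");
const validBefore = isChainValid(ledger);

// Tamper with an old transaction without recomputing any hashes.
ledger[0].data = "alice pays bob 9";
const validAfter = isChainValid(ledger);
console.log(validBefore, validAfter);
```

In this sketch the check passes before the edit and fails after it, because the stored hash of the first block no longer matches its content. In a real blockchain, the many independent copies of the ledger are what make rewriting the hashes impractical.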

Top 20 cryptocurrencies 2017

bitcoin BTC
ethereum ETH
bitcoincash BCH
ethereumclassic ETC
litecoin LTC
einsteinium EMC2
dash DASH
ripple XRP
bitcoingold BTG
zcash ZEC
eos EOS
qtum QTUM
syscoin SYS
neo NEO
monero XMR
vertcoin VTC
iota IOT
powerledger POWR
omisego OMG
santiment SAN

December 30, 2016 (updated June 15, 2020)

Hadoop Big Data quick summary

Hadoop – is a Java-based programming framework that supports the processing of large data sets in a distributed computing environment.
Hadoop – has a file system whose design is based on the Google File System (GFS).
Hadoop – can scale out to thousands of nodes; this is the key to its performance.
Hadoop – includes the Hadoop Distributed File System (HDFS), which enables fast data transfer among the nodes.
Hadoop Configuration – Hadoop has three configuration modes: standalone, pseudo-distributed and fully distributed.
Hadoop MapReduce – is one of the core components of Hadoop and is a programming model for processing and generating large data sets. It uses parallel, distributed algorithms on a cluster and can handle large-scale data: petabytes and exabytes.
The MapReduce framework converts each record of input into a key/value pair.
Ubuntu Server – Ubuntu is a leading open-source platform. It helps users make use of their infrastructure when they want to deploy a cloud, a web farm or a Hadoop cluster.
Hadoop Distributed File System (HDFS) – is a block-structured, distributed file system.
Distributed Cache – is a Hadoop feature that caches files needed by applications.

Pig – is an Apache open-source project and one of the components of the Hadoop eco-system.
Pig – is a high-level data flow scripting language and runs on Hadoop clusters.
Pig – uses HDFS for storing and retrieving data and Hadoop MapReduce for processing Big Data.

Hive – is a data warehouse system for Hadoop.
Hive – facilitates ad hoc queries and aids analysis of data sets stored in Hadoop.
Hive – provides an SQL-like language called HiveQL (HQL).

Apache HBase – is a distributed, column-oriented database.
Apache HBase – is built on top of HDFS.
Apache HBase – is an open-source, distributed, versioned, non-relational database system.
Apache HBase – has two types of Nodes. 1. Master and 2. Region Server.

Cloudera – is a commercial vendor for deploying Hadoop in an enterprise.
Cloudera – offers Cloudera Manager for system management and Cloudera Navigator for data management.

ZooKeeper – is an open-source, high-performance coordination service for distributed applications.

Pivotal HD – is a commercially supported, enterprise capable distribution of Hadoop and it aims to accelerate data analytics projects.

Sqoop – is an Apache Hadoop ecosystem project. Sqoop is responsible for import and export operations between Hadoop and relational databases.

Apache Oozie – is a workflow scheduler system used to manage Apache Hadoop jobs/MapReduce jobs

Mahout – is a library of machine learning algorithms that helps with tasks such as clustering. Clustering allows the system to group various entities into separate clusters or groups based on certain characteristics or features.

Apache Cassandra – is an open-source, freely distributed, high-performance, extremely scalable and fault-tolerant post-relational database.
Apache Spark – is a powerful open-source processing engine, a general MapReduce-like engine used for large-scale data processing.

Apache Ambari – Apache Ambari is a completely open operational tool or framework for provisioning, managing, and monitoring Apache Hadoop clusters.
Kerberos – is a third-party authentication mechanism. It has a database of the users and services and their respective Kerberos passwords.

Java quick reference – Please click here

December 30, 2016 (updated February 23, 2020)

Hadoop MapReduce

Hadoop MapReduce – is one of the core components of Hadoop and is a programming model for processing and generating large data sets. It uses parallel, distributed algorithms on a cluster. Hadoop MapReduce can handle large-scale data: petabytes and exabytes.
The MapReduce framework converts each record of input into a key/value pair.

December 30, 2016 (updated February 23, 2020)

Hadoop Distributed File System (HDFS)

Hadoop Distributed File System (HDFS) – is a block-structured, distributed file system.

December 30, 2016 (updated February 23, 2020)

Hadoop Distributed Cache

Hadoop Distributed Cache – is a Hadoop feature that caches files needed by applications.
