Wednesday, November 28, 2012

CentOS (installed in VirtualBox) connects to the Internet automatically

1. Settings (VirtualBox):
Choose Network, set "Attached to" to NAT, check "Cable connected", and then press OK.

2. CentOS:
Click the network icon at the top right of the CentOS screen, then click VPN Connections.
You will find System eth0; click Edit, and then check "Connect automatically".

Now it should work.

Thursday, November 22, 2012

Logging to file with slf4j & How to use RollingFileAppender

The example can be downloaded from https://github.com/haifzhan/LoggingToFile.git and shows:

1. How to use the slf4j logger [the Maven dependency is in the pom.xml file].

2. How to use logback [see the logback.xml file].

3. How to use a logback rolling policy [see the logback.xml file].

Wednesday, October 31, 2012

Partition and format a hard disk

http://www.idevelopment.info/data/Unix/Linux/LINUX_PartitioningandFormattingSecondHardDrive_ext3.shtml

Cassandra & Opscenter [based on DataStax instructions]


Goal:
What we want to achieve:
We have 6 instances running on EC2. For our cassandra cluster, we need 3 data centers (DC1, DC2 and DC3), with 2 cassandra nodes (RAC1 and RAC2 respectively) in each data center. DC1 is located in us-west-2a, DC2 in us-west-2b and DC3 in us-west-2c. Once the cassandra cluster is running, we start Opscenter and opscenter-agent to monitor it.


Installation:
DataStax provides a very easy way to install cassandra, opscenter and opscenter-agent on CentOS:
Cassandra must be installed on all nodes.
Opscenter can be installed on one of the nodes; I installed it on the first of the 6 nodes.
Opscenter-agent must be set up on all nodes.


Configuration:
All configuration files:
  • /etc/cassandra/conf/cassandra.yaml
  • /etc/cassandra/conf/cassandra-topology.properties
  • /etc/opscenter/opscenterd.conf
  • /etc/opscenter/clusters/Default.conf    [created by yourself]
  • /var/lib/opscenter-agent/conf/address.conf
How does DataStax configure the above files, and how did we configure them?

IPs:
There are 6 nodes. node1 & 2 are in DC1, node3 & 4 are in DC2, and node5 & 6 are in DC3. Their internal IPs, from node1 to node6, are:

10.252.171.91 
10.253.0.234
10.249.30.92
10.249.7.76
10.244.155.181
10.244.164.144

We will discuss the first 4 configuration files in order:

cassandra.yaml:

initial_token: 
commitlog_directory: /commit
seeds: "10.252.171.91,10.249.30.92,10.244.155.181"
listen_address: 10.253.0.234
broadcast_address:
rpc_address:
endpoint_snitch: PropertyFileSnitch


Now, I'll explain them one by one.


initial_token:
We assigned initial_token values on all 6 nodes, but they did not seem to take effect as expected, so we assigned the token values manually using nodetool. The drawback is that whenever we restart cassandra on any of the nodes, we need to run removetoken with nodetool as well. [Starting the seed nodes first when starting cassandra may solve the "not using assigned value" problem; or perhaps cassandra was caching old data, because the first thing they advise is to rm -rf /var/lib/cassandra/data/*]

DataStax has its own way to generate tokens for a multiple data center cluster: http://www.datastax.com/docs/0.8/install/cluster_init#initializing-a-multi-node-or-multi-data-center-cluster
Here's what we got:

    "0": {
        "0": 0,
        "1": 85070591730234615865843651857942052864
    },
    "1": {
        "0": 56713727820156410577229101238628035242,
        "1": 141784319550391026443072753096570088106
    },
    "2": {
        "0": 28356863910078205288614550619314017621,
        "1": 113427455640312821154458202477256070485
    }
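
The token values above can be reproduced with a few lines of Python. This is only a sketch, not the DataStax tool itself: it assumes the RandomPartitioner token range of 2**127, spaces the nodes of each data center evenly around the ring, and shifts each data center by an offset. The offsets (0, one third and one sixth of the ring) are inferred from the numbers listed above, and the names dc_tokens and cluster_tokens are made up for this example.

```python
# Sketch: evenly spaced tokens per data center, shifted per DC.
# Assumption: RandomPartitioner, whose token range is 0 .. 2**127.
RING = 2 ** 127

# Assumed offsets, inferred from the values listed in this post
# (not taken from the DataStax tool): DC "0" -> 0, "1" -> RING/3, "2" -> RING/6.
OFFSETS = {"0": 0, "1": RING // 3, "2": RING // 6}


def dc_tokens(node_count, offset=0):
    """Return evenly spaced tokens for one data center, shifted by offset."""
    return [(i * RING // node_count + offset) % RING for i in range(node_count)]


def cluster_tokens(nodes_per_dc=2):
    """Return {dc_index: [token, ...]} for every data center in OFFSETS."""
    return {dc: dc_tokens(nodes_per_dc, off) for dc, off in OFFSETS.items()}


if __name__ == "__main__":
    for dc, tokens in cluster_tokens().items():
        for node, token in enumerate(tokens):
            print("DC", dc, "node", node, "initial_token:", token)
```

Each printed value is what would go into initial_token in cassandra.yaml on the corresponding node.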




commitlog_directory:
We allocated a dedicated disk for the commitlog. The commitlog grows very fast, so keeping it on its own disk ensures that cassandra's performance is not affected by its increasing size.
The disk must be partitioned and formatted before use.

seeds:
It is a comma-separated list, as you can see in our configuration. If you read it carefully, you will find that those 3 IPs belong to node1, node3 and node5, which are in DC1, DC2 and DC3 respectively, so each data center contributes one seed.

listen_address:
The listen_address is the local machine's internal IP address.

broadcast_address:
We leave it blank, which means it will be the same as listen_address.

rpc_address:
We leave it blank, which means it will be the same as listen_address.
The default value of rpc_address is "localhost"; with that value opscenter cannot connect to the cassandra cluster. Once we made it blank, opscenter worked fine with the cassandra cluster.

endpoint_snitch: 
We use PropertyFileSnitch, which lets us define our own data centers and racks.
It reads the cassandra-topology.properties file, which is our next step.


cassandra-topology.properties:
In this configuration file, we comment out all the default DCs and RACs, and set up our own like this:

# Our Own DCs and RACs

10.252.171.91=DC1:RAC1
10.253.0.234=DC1:RAC2
10.249.30.92=DC2:RAC1
10.249.7.76=DC2:RAC2
10.244.155.181=DC3:RAC1
10.244.164.144=DC3:RAC2

default=DC2:RAC1

The mapping is straightforward. One thing worth mentioning is default=DC2:RAC1: any node that is not listed in this file, for example a node newly added to the cluster, is assigned the default data center and rack.
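
To make the lookup concrete, here is a small Python sketch of how such a file maps an IP to a data center and rack, and how one could check that every data center has at least one seed. It is only an illustration of the file format shown above, not Cassandra's own parser; the function names are hypothetical.

```python
# Sketch: read a cassandra-topology.properties style mapping and check seeds.
# The function names are made up for illustration.
TOPOLOGY = """
# Our Own DCs and RACs
10.252.171.91=DC1:RAC1
10.253.0.234=DC1:RAC2
10.249.30.92=DC2:RAC1
10.249.7.76=DC2:RAC2
10.244.155.181=DC3:RAC1
10.244.164.144=DC3:RAC2

default=DC2:RAC1
"""
SEEDS = "10.252.171.91,10.249.30.92,10.244.155.181"


def parse_topology(text):
    """Return ({ip: (dc, rack)}, default) from the properties text."""
    nodes, default = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        dc, _, rack = value.strip().partition(":")
        if key.strip() == "default":
            default = (dc, rack)
        else:
            nodes[key.strip()] = (dc, rack)
    return nodes, default


def placement(ip, nodes, default):
    """Look up an IP; unlisted nodes fall back to the default entry."""
    return nodes.get(ip, default)


def dcs_without_seed(seeds, nodes, default):
    """Return the data centers that have no node in the seeds list."""
    seeded = set()
    for ip in seeds.split(","):
        place = placement(ip.strip(), nodes, default)
        if place is not None:
            seeded.add(place[0])
    return sorted({dc for dc, _ in nodes.values()} - seeded)


if __name__ == "__main__":
    nodes, default = parse_topology(TOPOLOGY)
    print(placement("10.249.7.76", nodes, default))
    print(dcs_without_seed(SEEDS, nodes, default))
```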



opscenterd.conf:
The following is exactly what we put in our opscenterd.conf file:

[webserver]
port = 8888
interface = 10.252.171.91

[logging]
# level may be TRACE, DEBUG, INFO, WARN, or ERROR
level = DEBUG


[agents]
use_ssl = false



One important thing: DataStax says the interface can be set to 0.0.0.0 and it will always work. BUT our own experience says: don't do that! Setting it to your machine's exact IP address is the best way. The problem we met is that when interface is 0.0.0.0, opscenter cannot find the cassandra cluster.
level = DEBUG helps you find out exactly what is going on.



/etc/opscenter/clusters/Default.conf    [created by yourself]
You create this configuration file yourself. Without it, Opscenter does not even know that the cassandra cluster exists, because it reaches the cluster via JMX and the thrift port. Here is what we did:

[jmx]
port = 7199

[cassandra]
seed_hosts = 10.252.171.91,10.249.30.92,10.244.155.181
api_port = 9160

The jmx port is the one we already know from the cassandra installation. use_ssl = false (in the [agents] section of opscenterd.conf) disables SSL.
seed_hosts is exactly the same as the seeds in cassandra.yaml; this helps opscenter find our existing cassandra cluster. api_port is the thrift port (also known as rpc_port in cassandra.yaml).
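
Since seed_hosts has to match the seeds in cassandra.yaml, it can help to generate the cluster configuration from the same seeds string instead of typing it twice. The following Python sketch does that with the standard configparser module; build_cluster_conf is a made-up helper name, and the default ports are simply the ones used in this post.

```python
# Sketch: build the text of the Opscenter cluster configuration from the
# same seeds string used in cassandra.yaml. build_cluster_conf is a
# hypothetical helper, not part of Opscenter.
import configparser
import io


def build_cluster_conf(seeds, jmx_port=7199, api_port=9160):
    """Return INI text with [jmx] and [cassandra] sections."""
    hosts = ",".join(s.strip() for s in seeds.split(",") if s.strip())
    conf = configparser.ConfigParser()
    conf["jmx"] = {"port": str(jmx_port)}
    conf["cassandra"] = {"seed_hosts": hosts, "api_port": str(api_port)}
    buf = io.StringIO()
    conf.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the text; redirect it into the cluster configuration file yourself.
    print(build_cluster_conf("10.252.171.91,10.249.30.92,10.244.155.181"))
```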

You can also define [jmx], [cassandra] and api_port in opscenterd.conf; when you run opscenter, it will generate the clusters folder and Default.conf for you.

So far so good: opscenter and the cassandra cluster should now work properly. Then turn all of them off, restart cassandra first, and then start Opscenter.

Open a browser on your local machine, type in "https://external-ip:8888"
external-ip: The public IP of the machine where you install Opscenter.


The next step is to set up opscenter-agent; the best way is to set it up automatically:
Then change address.conf by adding "use_ssl : 0" to the end of the file.

RESTART opscenter and opscenter-agent.





Alternatives:
If you want to use external IP addresses for the cassandra cluster, here is what you need to do:
1. Change listen_address and seeds to the machines' public IP addresses in the cassandra.yaml file.
2. Use external IP addresses in the cassandra-topology.properties file.
3. seed_hosts in Default.conf should be the same as the seeds in cassandra.yaml.



Pitfalls:
0. Use Oracle JRE 6. Java 7 is not recommended.

1. Install Cassandra, Opscenter and opscenter-agent all using rpm, or all using tar.gz; do not mix the two.

2. In all configuration files, all the IPs we used are internal IPs.

3. List the required ports explicitly for the firewall rules.

4. When starting cassandra, start the seed nodes first.

5. Perhaps cassandra was caching old data, because the first thing they advise is to rm -rf /var/lib/cassandra/data/* . You would do this if you changed the cluster_name, for example.

6. I think this is the reason the tokens didn't work before:


Purging Gossip State on a Node
Gossip information is also persisted locally by each node to use immediately next restart without having to wait for gossip. To clear gossip history on node restart (for example, if node IP addresses have changed), add the following line to the cassandra-env.sh file. This file is located in /usr/share/cassandra or /conf.
-Dcassandra.load_ring_state=false

7. When you are done editing in vi, hold down the "shift" key and press "z" twice (ZZ); that means save and exit.
If you only need to look at the contents of a file, use "less", not vi:
less /etc/cassandra/conf/cassandra.yaml
That will *never* affect the application, but vi certainly can.
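
For pitfall 3, a rough way to check from another machine that the firewall lets the ports through is to try a TCP connection to each of them. The Python sketch below only covers the ports that appear in this post (7199 for JMX, 9160 for thrift, 8888 for the Opscenter web interface); other ports, such as the inter-node storage port, are not listed here and must be added from your own configuration. The helper names are made up for the example.

```python
# Sketch: test whether TCP ports on a node are reachable.
# Only the ports named in this post are listed; extend the dict as needed.
import socket

REQUIRED_PORTS = {"JMX": 7199, "Thrift": 9160, "Opscenter web": 8888}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=REQUIRED_PORTS):
    """Return {name: reachable} for every port in ports."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # 10.252.171.91 is node1 in this post; replace it with your own node.
    for name, ok in check_ports("10.252.171.91").items():
        print(name, "reachable" if ok else "NOT reachable")
```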











Tuesday, October 30, 2012

ssh-keygen No-Password SSH login

Assume we have two systems, one called frog and one called fish, and you want to log in to fish from frog without a password.

Here's what you should do.

You're already logged in to frog.
# ssh-keygen -t dsa
Follow the prompts (leave the passphrase empty), and it will generate two files under .ssh/
called id_dsa and id_dsa.pub.
Then copy the contents of id_dsa.pub and append it to the file .ssh/authorized_keys on fish, and save it.
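
The last step (appending the public key on fish) is easy to get wrong by hand, for example by pasting the key twice or leaving the file with loose permissions. The following Python sketch shows one careful way to do the append; add_public_key is a hypothetical helper, and the 0700/0600 permissions are the ones sshd commonly expects rather than something stated in this post.

```python
# Sketch: append a public key line to an authorized_keys file without
# duplicating it. add_public_key is a made-up helper name.
import os


def add_public_key(pub_key_line, authorized_keys_path):
    """Append the key if it is not present yet; return True if it was added."""
    key = pub_key_line.strip()
    ssh_dir = os.path.dirname(authorized_keys_path)
    if ssh_dir:
        os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    content = ""
    if os.path.exists(authorized_keys_path):
        with open(authorized_keys_path) as f:
            content = f.read()
    if key in (line.strip() for line in content.splitlines()):
        return False  # key already authorized
    with open(authorized_keys_path, "a") as f:
        if content and not content.endswith("\n"):
            f.write("\n")  # keep one key per line
        f.write(key + "\n")
    os.chmod(authorized_keys_path, 0o600)  # assumption: sshd wants strict modes
    return True


if __name__ == "__main__":
    # Run on fish, after copying id_dsa.pub over from frog.
    with open(os.path.expanduser("~/id_dsa.pub")) as f:
        add_public_key(f.read(), os.path.expanduser("~/.ssh/authorized_keys"))
```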


Opscenter on Cassandra



This is JUST a draft: I will document all details later...


Installing Cassandra on CentOS:


Locations of Configuration Files:


Installing Opscenter on CentOS:
link: http://www.datastax.com/docs/opscenter/install/install_rhel
Advanced Configuration for Opscenter:
link: http://www.datastax.com/docs/opscenter/configure/configure_opscenter_adv

4 important configuration files:
 cassandra(version 1.1.6):
1. cassandra.yaml
2. cassandra-topology.properties
opscenter(version 2.1.2):
1. opscenterd.conf
2. /etc/opscenter/clusters/Default.conf   [ this is the most important part for opscenter, if you do not have  ]

important things to remember:
I leave broadcast_address and rpc_address blank, and set listen_address to the local machine's internal IP address. It works fine.

Opscenter always says no agent is connected:
solution:
We should disable SSL on both opscenter and opscenter-agent.

opscenter cannot connect with cassandra:

1. We should not mix installation methods, which means we cannot install both the tar.gz and the rpm at the same time; that always gives bad results.
2. rpc_address must not be localhost. It should be a numeric IP address.
3. cluster_name was "Test Cluster"; when we changed it to "Starscriber Cluster", cassandra stopped working.
4. At the beginning, we set listen_address to the internal IP and broadcast_address to the external IP address, and it did not work.

5. Opscenter only needs to be installed on one of the cassandra nodes.

6. The commitlog should have a separate disk. Once you have chosen this disk, partition it, format it, mount it, and add it to the /etc/fstab file (so that it is mounted every time the system starts).

7. initial_token???

8. We have 6 cassandra nodes: 2 in DC1, 2 in DC2 and 2 in DC3. For our seeds, we only put one IP of each DC into the seeds list (the seeds list in cassandra.yaml, and for opscenter in /etc/opscenter/clusters/cluster.conf).

9. cassandra owner don't forget this part!!!



Tuesday, September 25, 2012

Apache Cassandra Failover and load balancing

I have a 3-node cluster with RF=3, W=ONE and R=QUORUM. When one node is down, the other two nodes keep working properly, and when the downed node comes back, it receives the updates it missed.

To get load balancing, you only need to change one line of the code (from the last tutorial) to:
//load balancing build a connection pool
cassandraHostConfigurator = new CassandraHostConfigurator("192.168.0.1:9160,192.168.0.2:9160,192.168.0.3:9160");
Hector will do load balancing internally.

To solve failover problem: please check here.
Be aware of the relationship between Replication factor and Consistency Level, please check here.

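W + R > N, where N is the replication factor, W is the number of replicas that must acknowledge a write and R is the number that must answer a read. Note that with N = 3, W = ONE (1 replica) and R = QUORUM (2 replicas) give W + R = 3, which is not greater than N; QUORUM for both reads and writes (2 + 2 = 4) does satisfy the rule.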
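
A minimal sketch of this check in Python, assuming only the three consistency levels ONE, QUORUM and ALL. The function names are made up for illustration and are not part of Hector or Cassandra.

```python
# Sketch: check whether a write/read consistency level pair satisfies
# W + R > N for a given replication factor N.
def replicas_required(level, rf):
    """Number of replicas that must respond for a consistency level."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # a majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError("unsupported consistency level: " + level)


def overlaps(write_level, read_level, rf):
    """True if every read is guaranteed to touch a replica of the last write."""
    return replicas_required(write_level, rf) + replicas_required(read_level, rf) > rf


if __name__ == "__main__":
    print(overlaps("ONE", "QUORUM", 3))     # False: 1 + 2 is not > 3
    print(overlaps("QUORUM", "QUORUM", 3))  # True: 2 + 2 > 3
    print(overlaps("ONE", "ALL", 3))        # True: 1 + 3 > 3
```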

I'm using cassandra (1.1.5), so I modified the code from
http://ac31004.blogspot.ca/2010/08/consistencylevel-in-hector-and.html to:



import me.prettyprint.cassandra.service.OperationType;
import me.prettyprint.hector.api.ConsistencyLevelPolicy;
import me.prettyprint.hector.api.HConsistencyLevel;

// W+R > N .  where N is Replication Factor
public final class MyConsistencyLevel implements ConsistencyLevelPolicy
{
    @Override
    public  HConsistencyLevel get(OperationType op)
    {
       switch (op)
       {
          case READ:
              return HConsistencyLevel.QUORUM;
          case WRITE:
              return HConsistencyLevel.ONE;
          default:
              return HConsistencyLevel.QUORUM;
       }
    }

    @Override
    public HConsistencyLevel get(OperationType op, String cfName)
    {
       return HConsistencyLevel.QUORUM;
    }
}

Add or modify the following code in CassandraTest.java (you can find it in the last tutorial):

       ConsistencyLevelPolicy consistencyLevelPolicy = new MyConsistencyLevel();

        Keyspace kpo = HFactory.createKeyspace(keySpace, cluster);
        // set CL
        kpo.setConsistencyLevelPolicy(consistencyLevelPolicy);



Apache Cassandra Java Program(Hector)


Hector is a high-level Java client for Apache Cassandra.
If you want to know more about its features, click here.

I'll show you the Java code for connecting to a cassandra database.
The following code inserts and reads data.

Main.java

import java.util.concurrent.LinkedBlockingQueue;

public class Main
{
    // row keys to process, handed over to CassandraTest
    private static LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<Integer>();
    private static int numThreads;

    public static void main(String[] args)
    {
        String host = "192.168.0.1";
        int port = 9160;
        //cluster name
        String cluster = "Test Cluster";
        //keyspace name
        String keySpace = "keyspace_demo";
        //column family
        String colFamily = "colFamily_demo";

        // enqueue the row keys 10..19
        for (int i = 10; i < 20; i++)
        {
            queue.offer(i);
        }

        CassandraTest cassandraTest = new CassandraTest(host, port, cluster, keySpace, colFamily, queue);
        cassandraTest.run();
    }
}




CassandraTest.java


import java.util.concurrent.LinkedBlockingQueue;

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.serializers.IntegerSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.exceptions.HectorException;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class CassandraTest
{
    private final String host;
    private final int port;
    private final String cCluster;
    private final String keySpace;
    private String colFamily;
    private String columnName = "number";
    private LinkedBlockingQueue<Integer> queue;

    private CassandraHostConfigurator cassandraHostConfigurator;
    private Cluster cluster;
    // row key: Integer, column name: String, column value: Integer
    private Mutator<Integer> mutator;
    private ColumnQuery<Integer, String, Integer> columnQuery;
    private QueryResult<HColumn<String, Integer>> result;

    public CassandraTest(String _host, int _port, String _cCluster, String _keySpace, String _colFamily, LinkedBlockingQueue<Integer> _queue)
    {
        host = _host;
        port = _port;
        cCluster = _cCluster;
        keySpace = _keySpace;
        colFamily = _colFamily;
        queue = _queue;

        setup();
    }

    private void setup()
    {
        cassandraHostConfigurator = new CassandraHostConfigurator(host + ":" + port);
        cluster = HFactory.getOrCreateCluster(cCluster, cassandraHostConfigurator);
        Keyspace kpo = HFactory.createKeyspace(keySpace, cluster);
        mutator = HFactory.createMutator(kpo, IntegerSerializer.get());
        columnQuery = HFactory.createColumnQuery(kpo, IntegerSerializer.get(), StringSerializer.get(), IntegerSerializer.get());
    }

    //read and write
    public void run()
    {
        while (!queue.isEmpty())
        {
            int i = queue.poll();
            try
            {
                int numberValue = 0;
                // read from column family
                try
                {
                    columnQuery.setColumnFamily(colFamily).setKey(i).setName(columnName);
                    result = columnQuery.execute();
                    // get the column value
                    numberValue = result.get().getValue();
                }
                catch (Exception e)
                {
                    // column does not exist yet (result.get() is null): keep 0
                }

                // write the incremented value into the column family
                mutator.insert(i, colFamily, HFactory.createColumn(columnName, numberValue + 1, StringSerializer.get(), IntegerSerializer.get()));
            }
            catch (HectorException e)
            {
                System.out.println("HectorException-" + i + ": " + e.getMessage());
            }
        }
    }
}


Apache Cassandra Cluster Setup


This tutorial is to build a 3-node cassandra cluster.

Step1: Download apache-cassandra and install it on your operating system.[more details in the previous tutorial]

Step2: Modify the configuration file, cassandra.yaml.
You can find it under ../conf/cassandra.yaml
There are only three things you need to focus on if you just want to build a simple cassandra cluster:
     1. listen_address
     2. rpc_address
     3. seeds
     We'll take a look at them one by one.
1. listen_address. Here is its description:

# Address to bind to and tell other Cassandra nodes to connect to. You
# _must_ change this if you want multiple nodes to be able to
# communicate!

# Leaving it blank leaves it up to InetAddress.getLocalHost(). This
# will always do the Right Thing *if* the node is properly configured
# (hostname, name resolution, etc), and the Right Thing is to use the
# address associated with the hostname (it might not be).
#
# Setting this to 0.0.0.0 is always wrong.
listen_address: 
So leaving listen_address blank is a good choice.
2. rpc_address. Here is its description:
# Address to broadcast to other Cassandra nodes
# Leaving this blank will set it to the same value as listen_address
# broadcast_address: 1.2.3.4

# The address to bind the Thrift RPC service to -- clients connect
# here. Unlike ListenAddress above, you *can* specify 0.0.0.0 here if
# you want Thrift to listen on all interfaces.
# Leaving this blank has the same effect it does for ListenAddress,
# (i.e. it will be based on the configured hostname of the node).
rpc_address: 
After reading the description, leaving it blank is fine.
3. seeds. This part you have to edit. Here is its description:
# any class that implements the SeedProvider interface and has a
# constructor that takes a Map of parameters will do.
seed_provider:
    # Addresses of hosts that are deemed contact points. 
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring.  You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "127.0.0.1"
You only need to change the last line, - seeds: "127.0.0.1". If your 3 IP addresses are 192.168.0.1, 192.168.0.2 and 192.168.0.3, your last line should be
- seeds: "192.168.0.1,192.168.0.2,192.168.0.3"

Step3: Restart cassandra, and make sure its ports are reachable from the other servers.
For a quick test you can turn iptables off with this command in a terminal:
# sudo service iptables stop


Step4: Now check whether it works, using nodetool:
# ../bin/nodetool -h 192.168.0.1 -p 7199 ring
Address         DC      Rack    Status State   Load        Owns    Token                                       
                                                                   127605887595351923798765477786913079296     
192.168.0.1    DC1     r1      Up     Normal  17.3 MB     33.33%  0                                           
192.168.0.2    DC1     r1      Up     Normal  17.4 MB     33.33%  42535295865117307932921825928971026432      
192.168.0.3    DC1     r1      Up     Normal  37.2 MB     33.33%  85070591730234615865843651857942052864 

Step5: Write new data to your cassandra database, and check that all servers get the same data.

Tuesday, September 18, 2012

Install Apache Cassandra on CentOS


Install Apache Cassandra on CentOS:
1. Download and extract cassandra:
# cd /opt
# wget ftp://apache.sunsite.ualberta.ca/pub/apache/cassandra/1.1.5/apache-cassandra-1.1.5-bin.tar.gz
# tar -xzf apache-cassandra-1.1.5-bin.tar.gz
2. Create directories for the following keywords in the cassandra.yaml file:
data_file_directories
commitlog_directory
saved_caches_directory
# sudo mkdir -p /var/lib/cassandra/data
# sudo mkdir -p /var/log/cassandra
# sudo mkdir -p /var/lib/cassandra/saved_caches
It is better to use separate disks for commitlog and data
# sudo mkdir -p /dev/shm/cassandra/commitlog
Modify the configuration file (cassandra.yaml) and make sure the directories you just created match the paths in the configuration file.
Edit log4j-server.properties and make sure the path matches the log directory created by the commands above:
log4j.appender.R.File=/var/log/cassandra/system.log


3. Create a soft link:
# ln -s /opt/apache-cassandra-1.1.5 /opt/cassandra
4. Make cassandra a service:
# sudo vim /etc/init.d/cassandra
and copy and paste the following script into /etc/init.d/cassandra
-------------------------------------------------------

#!/bin/bash
# init script for Cassandra.
# chkconfig: 2345 90 10
# description: Cassandra
# script slightly modified from
# http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/

. /etc/rc.d/init.d/functions

CASS_HOME=/opt/cassandra
CASS_BIN=$CASS_HOME/bin/cassandra
CASS_LOG=/var/log/cassandra/system.log
CASS_USER="root"
CASS_PID=/var/run/cassandra.pid
# name used in the "Starting ..." / "Stopping ..." / usage messages
prog="cassandra"

if [ ! -f $CASS_BIN ]; then
  echo "File not found: $CASS_BIN"
  exit 1
fi

RETVAL=0

start() {
  if [ -f $CASS_PID ] && checkpid `cat $CASS_PID`; then
    echo "Cassandra is already running."
    exit 0
  fi
  echo -n $"Starting $prog: "
  daemon --user $CASS_USER $CASS_BIN -p $CASS_PID >> $CASS_LOG 2>&1
  # capture the exit status of daemon itself, before sleeping
  RETVAL=$?
  usleep 500000
  if [ "$RETVAL" = "0" ]; then
    echo_success
  else
    echo_failure
  fi
  echo
  return $RETVAL
}

stop() {
  # check if the process is already stopped by seeing if the pid file exists.
  if [ ! -f $CASS_PID ]; then
    echo "Cassandra is already stopped."
    exit 0
  fi
  echo -n $"Stopping $prog: "
  if kill `cat $CASS_PID`; then
    RETVAL=0
    echo_success
  else
    RETVAL=1
    echo_failure
  fi
  echo
  [ $RETVAL = 0 ]
}

status_fn() {
  if [ -f $CASS_PID ] && checkpid `cat $CASS_PID`; then
    echo "Cassandra is running."
    exit 0
  else
    echo "Cassandra is stopped."
    exit 1
  fi
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  status)
    status_fn
    ;;
  restart)
    stop
    usleep 500000
    start
    ;;
  *)
    echo $"Usage: $prog {start|stop|restart|status}"
    RETVAL=3
esac

exit $RETVAL
------------------------------------------------------
//end of service script
Start or stop the cassandra service:
# sudo chmod +x /etc/init.d/cassandra
# sudo service cassandra start
# sudo service cassandra stop
Start cassandra automatically on reboot:
# sudo chmod +x /etc/init.d/cassandra
# sudo chkconfig --add cassandra
# sudo chkconfig cassandra on


Thursday, August 30, 2012

Install MySQL and reset root password

Install MySQL 

Using yum to install MySQL on CentOS is very easy:
yum install mysql mysql-server mysql-devel
To use MySQL with PHP, you should also install:
yum install php-mysql
Make mysqld a service that starts when you boot your system:
# chkconfig mysqld on
# service mysqld start

To reset your root password
1. Stop mysql using the command:
# sudo service mysqld stop
2. Start the mysql daemon, skipping the grant tables which store the passwords:
# mysqld_safe --skip-grant-tables &
3. Connect to mysql without a password:
# mysql --user=root mysql
4. Reset the root password:
mysql> update user set Password=PASSWORD('new_password') where user='root';
mysql> flush privileges;
mysql> exit;
Now restart mysqld normally (# sudo service mysqld restart) and you can log in to mysql using the new password.