Posts

iowait - Houston we have a problem - or?

The iowait metric is about as troublesome to interpret as free memory on Unix. From time to time there is a major issue where the servers cannot cope with demand and the applications see high iowait. I always stumble when I try to explain that the metric doesn't necessarily tell you the truth. It seems like everyone knows that iowait is the time spent by the CPU waiting for IO to complete, and that sounds bad. Back in the day it was bad: there was one CPU and it couldn't do anything until the IO completed. But nowadays we have more cores, i.e. other processes can continue on cores that aren't blocked by the single IO wait. To add insult to injury, CPU performance has improved faster than disk performance. SSDs are still fairly expensive, at least in larger sizes, even though they represent a gigantic leap in performance. This means that we have two factors (CPU speed and multiple cores) that mitigate the issue o...
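To make the metric less abstract, here is a minimal sketch of how an aggregate iowait percentage can be derived from two samples of the kernel's CPU counters. It assumes a Linux system where /proc/stat exposes an aggregate "cpu" line with the counters user, nice, system, idle, iowait, irq, softirq and steal; the function name iowait_pct is made up for this illustration.

```shell
#!/bin/sh
# iowait_pct "<cpu line 1>" "<cpu line 2>"
# Each argument is an aggregate "cpu ..." line as found in /proc/stat (Linux assumed).
# Fields after the label: user nice system idle iowait irq softirq steal ...
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user..steal (fields 2-9); guest time is already part of user.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            w[NR] = $6      # iowait is the fifth counter, i.e. field 6
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print "0.0"; exit }
            # Share of all CPU time in the interval that was accounted as iowait.
            printf "%.1f\n", 100 * (w[2] - w[1]) / dt
        }'
}

# Usage on a live system: sample twice, a few seconds apart.
# a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
# iowait_pct "$a" "$b"
```

Note that this is the figure averaged over all cores, which is exactly why it can look alarming or harmless depending on how many cores are idle while one waits for IO.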

To be or not to be - hacked - a "visitor"

It took its time, or I didn't even notice :-), before I got a visitor. And when it finally happened it was not as exciting as I had hoped. So where did the attempts originate? In a typical week you would see something like this:
1.93.24.0/24 | CN | DXTNET Beijing Dian-Xin-Tong Network Technologies Co., Ltd.
1.93.0.0/16 | CN | CHINA169-BJ CNCGROUP IP network China169 Beijing Province Network
1.93.0.0/16 | CN | CNIX-AP China Networks Inter-Exchange
120.194.0.0/16 | CN | CMNET-V4HENAN-AS-AP Henan Mobile Communications Co.,Ltd
120.236.0.0/16 | CN | CMNET-GUANGDONG-AP China Mobile communications corporation
122.154.0.0/16 | TH | CAT-AP The Communication Authoity of Thailand, CAT
182.73.0.0/16 | IN | BBIL-AP BHARTI Airtel Ltd.
222.33.0.0/16 | CN | CTTNET China TieTong Telecommunications Corporation
88.198.0.0/16 | DE | HETZNER-AS Hetzner Online AG
91.236.116.0/24 | NL | P...

To be or not to be - Hacked

After Christmas I learned that we had a suspected break-in at one of our production sites. The incident occurred just before Christmas, and I was dumbfounded to find out about it only after the holidays, and even more so to learn that the incident had been closed with an inconclusive result. I don't know if it is just me, but in my book the suspicion alone is a "stop the world" operations event. You are not content with that, and you just don't leave it in an inconclusive state... The reason we suspected a break-in was that /var, /root and /sbin were gone from one of the machines. Yes, there are subtler ways to hide your doings, but this did remove any potential traces and did cause some havoc. Although I must confess that I would expect more from anyone capable of penetrating us, either in the capacity for destruction or in the subtlety of their presence. ...

Ubuntu 13.04 and Neo4j in just a few simple steps

Ubuntu 13.04 from ( http://www.ubuntu.com/start-download?distro=desktop&bits=32&release=latest ) There is really no reason to write this, since the instructions for installing both Ubuntu 13.04 and Neo4j are sufficient. But then you always seem to end up with a tweak somewhere, so... Just step through the Ubuntu 13.04 installation screens and fill in the values you need to change. Reboot and you should have a brand new Ubuntu installation to play with. Now let's add Neo4j. Fire up a terminal and run sudo -s to get root privileges. From this point, execute the following commands as root:
    apt-get update
    apt-get upgrade
    wget -O - http://debian.neo4j.org/neotechnology.gpg.key | apt-key add -
    echo 'deb http://debian.neo4j.org/repo stable/' > /etc/apt/sources.list.d/neo4j.list
    apt-get update
    apt-get install neo4j
    service neo4j-service status
Should p...

Part 2.7 - Disaster Recovery with SRM & VNC

Bubblebridge is in place, but since it's a full-fledged CentOS 6.3 desktop distro, why not have proper access to it? So I thought I would enable VNC on it.
Installation
Just to make sure, check that VNC, or rather tigervnc, is installed by executing the following:
    rpm -q tigervnc-server
    rpm -q tigervnc
If not present, the following should cure that:
    yum install tigervnc-server tigervnc
Set up VNC user accounts. Unless you want to run as an already existing user, the following creates a user (or, if repeated, users) for you. As root:
    useradd <username>
    passwd <username>
In my case I'll go with the bubblebridge user, so the above is already taken care of. The next step is to edit the server configuration file /etc/sysconfig/vncservers, adding the following at the end:
    VNCSERVERS="2:<username>"
    VNCSERVERARGS[2]="-geometry 800x600"
Now it's time to set the VNC password for the users by su-ing from root to each user:
    su - <username>
    vncpasswd
    exit
Check that the server starts and...
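As a small sketch of the configuration step, the two lines for /etc/sysconfig/vncservers can be generated from a display number, a user name and a geometry. The helper name vnc_server_config is hypothetical, and using bubblebridge as the VNC user is an assumption based on the text; the display number 2 and the 800x600 geometry are the values mentioned above.

```shell
#!/bin/sh
# vnc_server_config DISPLAY USER GEOMETRY
# Prints the two settings that tie a VNC display to a user and set its screen size.
vnc_server_config() {
    display=$1
    user=$2
    geometry=$3
    # Maps display number to the account that owns the session.
    printf 'VNCSERVERS="%s:%s"\n' "$display" "$user"
    # Per-display arguments, indexed by the same display number.
    printf 'VNCSERVERARGS[%s]="-geometry %s"\n' "$display" "$geometry"
}

# Usage as root (appends to the config file; check the result before restarting the service):
# vnc_server_config 2 bubblebridge 800x600 >> /etc/sysconfig/vncservers
```

Generating the lines rather than typing them keeps the display number consistent between the two settings, which is an easy thing to get wrong by hand.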

Part 2.5 - Disaster Recovery with SRM and vSphere Replication

Networking Revised
After adding a few more NICs to bubblebridge for the production VLANs, I realize that this is starting to become ridiculous. 50 VLANs means 51 NICs in bubblebridge... There has to be a better way! So in my case I believe that this alternative setup for bubblebridge is more maintainable. In this scenario I create a VLAN trunk for all VLANs in the recovery setup. This means that there only has to be one NIC attached to the VLAN trunk in the bubblebridge server. Unfortunately it's still a ton of configuration to do. This is the alternative setup and the steps to reach it. As root, fix the 70-persistent-net.rules file.
# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.
# PCI device 0x15ad:0x07b0 (vmxnet3)
SUBSYSTEM=="net"...
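To illustrate the trunk idea of one NIC carrying many VLANs, here is a rough sketch of how tagged subinterface definitions could be generated on a CentOS 6 style system. This is not necessarily the configuration the post goes on to use: the helper name vlan_ifcfg, the parent interface eth1 and the VLAN IDs are all assumptions made for the example, and IP settings are left out.

```shell
#!/bin/sh
# vlan_ifcfg PARENT VLAN_ID
# Prints a minimal ifcfg body for a tagged subinterface such as eth1.101.
vlan_ifcfg() {
    parent=$1
    vlan_id=$2
    # DEVICE names the subinterface; VLAN=yes asks the network scripts to tag it.
    printf 'DEVICE=%s.%s\nVLAN=yes\nBOOTPROTO=none\nONBOOT=yes\n' "$parent" "$vlan_id"
}

# Usage as root, with hypothetical VLAN IDs 101 and 102 on a hypothetical trunk NIC eth1:
# for id in 101 102; do
#     vlan_ifcfg eth1 "$id" > "/etc/sysconfig/network-scripts/ifcfg-eth1.$id"
# done
```

With a loop like this, 50 VLANs become 50 small generated files on one NIC instead of 50 extra NICs.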

Part 2 - Disaster Recovery with SRM and vSphere Replication

In the previous article we went through the installation and configuration of the SRM and vSphere infrastructure. The time has now come to actually do some tests and fail over some VMs. In the simple scenario I expect everything to go smoothly, but there are a few things that I'm concerned about at this point, since the protected environment I'm ultimately failing over isn't so simple. It has multiple dvSwitches, VLANs, a load balancer, a firewall, LDAP, a set of test drivers to verify the integrity of the system, and access to the system in a sensible way for administrators, so there are a few things that need to be resolved.
Protected Setup
Test Failover
The small test
In this scenario I do a test failover of a single machine and verify that it starts and that I can log into it. The steps are:
Set up the machine running RHEL 5.5.
Install VMware Tools
Kick off the vSphere Replication
Create a Protection Group and ... no that didn...