Tuesday, March 12, 2013

Monitoring traffic rate on Apache Web (httpd)


One of the main tasks when working on performance is to place monitors all along your application's data flow.
I've used HitCounter (legacy LivePerson code), Jamon (thanks to Dmitry Voronov) and Yammer Metrics (thanks to Michael Rushanik) within my Tomcat REST application (a detailed post with my conclusions will come soon, hopefully).

But if you also want to check your Apache web server and obtain the traffic rate it currently handles, or the current number of open connections, this post is for you.

First of all, the status page is ugly and not really graphical, but it exists. It looks like this (the bold emphasis is mine):
http://tlvvp1/server-status  (you can also use http://tlvvp1/server-status?auto for a machine-readable version)

Apache Server Status for tlvvp1

Server Version: Apache/2.2.15 (Unix)
Server Built: Feb 13 2012 22:31:42

Current Time: Tuesday, 12-Mar-2013 10:29:54 EDT
Restart Time: Tuesday, 12-Mar-2013 10:25:10 EDT
Parent Server Generation: 0
Server uptime: 4 minutes 43 seconds
Total accesses: 1248792 - Total Traffic: 2.9 GB
CPU Usage: u122.3 s123.1 cu0 cs0 - 86.7% CPU load
4410 requests/sec - 10.5 MB/second - 2503 B/request
202 requests currently being processed, 15 idle workers

KKKKKKKKKKKK.KKKKKKWKKK_WKK.K._KWKKKK...KKKK.KKKKKKKK_.KKKKK_KKK
KKK_KKKKKKK_._KKK_.K_KKKK_KKKKKKKK__CKKKWKKK.KKLKKKKKKKKWKKKKKKK
KK.KKKKKKKKKKKKK.KKKKKKKKKK.KWWKKK.KKWKKKKKKKKK..K...KKK.KKKKKKK
KKW.KKKKKKKWKKKKKKKWKK_K.K.KKWKKKKKK.K_K.KKKKKKKK_K.............
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................

Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

Srv    PID    Acc    M    CPU     SS    Req    Conn    Child    Slot    Client    VHost    Request
0-0    4399    173/6605/6605    K     1.25    0    4    422.9    15.77    15.77     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
1-0    4400    162/6594/6594    K     1.16    0    1    396.0    15.74    15.74     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
2-0    4401    139/6571/6571    K     1.41    0    4    339.8    15.69    15.69     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
3-0    4402    131/6563/6563    K     1.25    0    3    320.2    15.67    15.67     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
4-0    4403    175/6607/6607    K     1.26    0    2    427.8    15.77    15.77     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
5-0    4404    153/6585/6585    K     1.28    0    4    374.0    15.72    15.72     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
6-0    4405    131/6563/6563    K     1.39    0    3    320.2    15.67    15.67     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
7-0    4406    155/6587/6587    K     1.29    0    4    378.9    15.72    15.72     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
8-0    4407    155/6587/6587    K     1.24    0    1    378.9    15.72    15.72     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
9-0    4408    125/6757/6757    K     1.27    0    3    305.5    16.13    16.13     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
10-0    4409    147/6579/6579    K     1.28    0    4    359.3    15.70    15.70     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
11-0    4410    120/6151/6151    K     1.26    0    4    293.3    14.75    14.75     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
12-0    -    0/0/1207    .     0.30    262    0    0.0    0.00    2.88     ::1    vipr-CI.tlv.lpnet.com    OPTIONS * HTTP/1.0
13-0    4412    153/6585/6585    K     1.31    0    1    374.0    15.72    15.72     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
14-0    4413    136/6568/6568    K     1.25    0    2    332.4    15.68    15.68     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
15-0    4414    145/6578/6578    K     1.28    0    3    354.4    15.70    15.70     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
16-0    4415    138/6570/6570    K     1.27    0    1    337.3    15.68    15.68     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
17-0    4416    143/6575/6575    K     1.35    0    4    349.5    15.69    15.69     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
18-0    4417    148/6580/6580    K     1.32    0    3    361.8    15.71    15.71     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
19-0    4418    163/6595/6595    W     1.31    0    0    398.4    15.74    15.74     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
20-0    4419    186/6417/6417    K     1.18    0    2    454.6    15.32    15.32     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
21-0    4420    187/6217/6217    K     1.23    0    3    457.1    14.84    14.84     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
22-0    4421    183/6213/6213    K     1.23    0    2    447.3    14.83    14.83     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
23-0    4422    0/6030/6030    _     1.22    0    3    0.0    14.39    14.39     192.168.13.8    vipr-CI.tlv.lpnet.com    GET /account/site1/visitorProfiles?visitorID=user1&visitor-gene
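
The scoreboard shown above is easier to read once it is summarized per state. Here is a minimal Python sketch, assuming you have copied the scoreboard text out of the page; the function name tally_scoreboard and the short sample string are my own illustrations, not part of Apache.

```python
from collections import Counter

# Meaning of each scoreboard character, copied from the Scoreboard Key above.
SCOREBOARD_KEY = {
    "_": "Waiting for Connection",
    "S": "Starting up",
    "R": "Reading Request",
    "W": "Sending Reply",
    "K": "Keepalive (read)",
    "D": "DNS Lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}


def tally_scoreboard(scoreboard):
    """Count how many worker slots are in each state."""
    # Drop whitespace so a multi-line scoreboard can be pasted as-is.
    chars = "".join(scoreboard.split())
    counts = Counter(chars)
    return {SCOREBOARD_KEY.get(c, "Unknown (%s)" % c): n
            for c, n in counts.items()}


if __name__ == "__main__":
    sample = "KKW_K.\nK_C..."  # hypothetical, shortened scoreboard
    for state, n in sorted(tally_scoreboard(sample).items()):
        print("%-35s %d" % (state, n))
```

A high share of "Keepalive (read)" slots, as in the page above, suggests that most workers are held by idle keep-alive connections rather than by requests in progress.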



To enable this server-status page, do the following:

In /etc/httpd/conf/httpd.conf on your Apache web server, make sure the following line is uncommented:
LoadModule status_module modules/mod_status.so



Then create the following file, /etc/httpd/conf.d/status.conf:
ExtendedStatus On
<Location /server-status>
    SetHandler server-status
    Order Allow,Deny
    Allow from all
</Location>


In addition, you can use the balancer-manager, which shows how the load balancing is handled (how many nodes are in your proxy, the factor of each of them, etc.).
In your application's conf.d/your-cluster-definition.conf, add:

<Location /balancer-manager>
    SetHandler balancer-manager
    Order Allow,Deny
    Allow from all
</Location>

You can then view it at http://tlvvp1:80/balancer-manager/ (use the right port, if customized) and you'll see something like:


Load Balancer Manager for tlvvp1
Server Version: Apache/2.4.3 (Unix)
Server Built: Dec 26 2012 10:54:47

LoadBalancer Status for balancer://vipr-ci
MaxMembers   StickySession   DisableFailover   Timeout   FailoverAttempts   Method       Path       Active
3 [3 Used]   (None)          Off               0         2                  bybusyness   /account   Yes

Worker URL                           Route   RouteRedir   Factor   Set   Status    Elected   Busy   Load   To   From
ajp://tlvvpcas3.tlv.lpnet.com:8009                        1        0     Init Ok   0         0      0      0    0
ajp://tlvvpcas4.tlv.lpnet.com:8009                        1        0     Init Ok   0         0      0      0    0
ajp://tlvvpcas2.tlv.lpnet.com:8009                        1        0     Init Ok   0         0      0      0    0





Then restart your web server:
service LPApache restart   [or whatever your service is called]


Once Apache is up and running, you can access the server-status page at this URL:
http://tlvvp1/server-status
(replace the host name [tlvvp1] with your own)
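
If you would rather collect these numbers from a script than read the page, the ?auto variant returns plain "Key: value" lines. Below is a rough Python sketch, not a finished monitor. The function names are mine, and the field names in the usage note (BusyWorkers, IdleWorkers, ReqPerSec) are an assumption about what your mod_status version emits, so check them against your own output.

```python
from urllib.request import urlopen


def parse_status_auto(text):
    """Turn 'Key: value' lines from server-status?auto into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not 'Key: value'
        key, value = key.strip(), value.strip()
        try:
            stats[key] = float(value)
        except ValueError:
            stats[key] = value  # non-numeric, e.g. the scoreboard string
    return stats


def fetch_status(url="http://tlvvp1/server-status?auto", timeout=5):
    """Fetch and parse the status page; replace tlvvp1 with your host."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_status_auto(resp.read().decode("ascii", "replace"))
```

Calling fetch_status() every few seconds and logging, for example, the BusyWorkers and ReqPerSec entries (if your server reports them under those names) gives a crude traffic-rate history.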



Tuesday, March 5, 2013

java.net.SocketException: Too many open files

I recently hit this nasty exception on my Linux machine during a load test I had launched:
java.net.SocketException: Too many open files

After some Google searching and a consultation with my colleague in production, I learned the following:

Type
ulimit -a

[root@tlvcib7 ~]# ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 62827
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

This means that I'm allowed to open up to 256 file descriptors.

To modify it in your own shell, run:
ulimit -n 65536
From then on, in that shell and the processes started from it, the limit is set to 65536 (64*1024).
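
The same limit can be read, and within the hard limit changed, from inside a process. Here is a small Python sketch using the standard resource module (Unix only); the helper names are mine. As with ulimit -n, the change applies only to the current process and its children.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def set_soft_limit(target):
    """Set the soft limit, capped at the hard limit; return the value used."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process cannot go above its hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    print("soft=%d hard=%d" % open_file_limits())
```

The soft value is what ulimit -n prints by default; raising the hard limit itself needs root or an entry in limits.conf.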

You can also define it in
/etc/security/limits.conf


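For everyone
*       hard    nproc           65536
*       soft    nproc           65536

or more specifically for a given group (the @ prefix denotes a group)

@student        hard    nproc           20
@faculty        soft    nproc           20

or as implemented in Shark @ LP (the lines above didn't do what I expected: nproc limits the number of processes, whereas nofile is the item that limits open files)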

root hard nofile 32768
web hard nofile 32768

root soft nofile 32768
root soft nproc 32768
web soft nproc 32768
web soft nofile 32768
web          -    nofile          32768
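
Since nproc and nofile entries are easy to mix up, here is a rough Python sketch that lists only the entries affecting open files. It assumes the usual four-column layout (domain, type, item, value) and ignores comments; the function names are hypothetical, and it does not read the extra files under /etc/security/limits.d/.

```python
def parse_limits(text):
    """Parse limits.conf-style text into a list of entry dicts."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) != 4:
            continue  # blank or malformed line
        domain, limit_type, item, value = fields
        entries.append({"domain": domain, "type": limit_type,
                        "item": item, "value": value})
    return entries


def open_file_entries(entries):
    """Keep only the entries that limit open file descriptors."""
    return [e for e in entries if e["item"] == "nofile"]


if __name__ == "__main__":
    with open("/etc/security/limits.conf") as f:
        for e in open_file_entries(parse_limits(f.read())):
            print(e["domain"], e["type"], e["value"])
```

A type of "-" in the output means the entry sets both the soft and the hard limit.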




The file descriptors in use are listed under
/proc/<pid>/fd/

For example, if your Java process ID is 13336:


 ls /proc/13336/fd
0    105  112  12   127  134  141  149  156  163  170  178  185  192  2    206  213  220  228  235  242  25   259  27  34  41  49  56  63  70  78  85  92
1    106  113  120  128  135  142  15   157  164  171  179  186  193  20   207  214  221  229  236  243  250  26   28  35  42  5   57  64  71  79  86  93
10   107  114  121  129  136  143  150  158  165  172  18   187  194  200  208  215  222  23   237  244  251  260  29  36  43  50  58  65  72  8   87  94
100  108  115  122  13   137  144  151  159  166  173  180  188  195  201  209  216  223  230  238  245  252  261  3   37  44  51  59  66  73  80  88  95
101  109  116  123  130  138  145  152  16   167  174  181  189  196  202  21   217  224  231  239  246  253  262  30  38  45  52  6   67  74  81  89  96
102  11   117  124  131  139  146  153  160  168  175  182  19   197  203  210  218  225  232  24   247  254  263  31  39  46  53  60  68  75  82  9   97
103  110  118  125  132  14   147  154  161  169  176  183  190  198  204  211  219  226  233  240  248  255  264  32  4   47  54  61  69  76  83  90  98
104  111  119  126  133  140  148  155  162  17   177  184  191  199  205  212  22   227  234  241  249  258  266  33  40  48  55  62  7   77  84  91  99
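
Counting those entries by eye is tedious, so here is a minimal Python sketch that does it. It assumes a Linux /proc filesystem; the function names and the proc_root parameter are my own. Reading another user's process usually requires root.

```python
import os


def list_open_fds(pid, proc_root="/proc"):
    """Return the descriptor numbers currently open in process `pid`."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    # Entry names are descriptor numbers; sort them numerically, not as text.
    return sorted(int(name) for name in os.listdir(fd_dir))


def count_open_fds(pid, proc_root="/proc"):
    """Return how many file descriptors process `pid` has open."""
    return len(list_open_fds(pid, proc_root))


if __name__ == "__main__":
    # Inspect this script's own process as a quick demonstration.
    print(count_open_fds(os.getpid()))
```

Comparing count_open_fds(13336) with the "open files" value from ulimit -a during a load test shows how close the process is to the limit.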