
Issues

● Process crashes
● Stuck daemons
● Broken network connections
● Memory leaks
● File system errors
Solutions?
● How to deal with such problems?
● How to debug them?
● You can debug Ruby...
● You can't debug black boxes!
Plan of attack
● Debugging UNIX processes
● Networking issues
● At Ruby-level
● At C-level
● File system issues
● Approach
Ruby is a UNIX process
ps — UNIX tool for process monitoring
ps abilities
● View all processes for a specific:
● User
● Group
● Terminal
● PID
● Parent
● View all threads
● View a tree view of the process list
ps usage
sudo ps aux | grep ruby — find the PIDs of all Ruby processes
sudo ps auxH — show all processes with their threads
sudo ps auxf — tree view of all system processes
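As a rough Ruby-side equivalent of `ps aux | grep ruby`, the sketch below parses `ps aux` text into records and filters them by command. The names `PsEntry`, `parse_ps_aux` and `SAMPLE_PS` are made up for this illustration; the sample rows are taken from the output shown on the next slides, and the code assumes the usual 11-column `ps aux` layout.

```ruby
# Columns of `ps aux`: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
PsEntry = Struct.new(:user, :pid, :stat, :command)

def parse_ps_aux(output)
  entries = output.each_line.map do |line|
    fields = line.split(" ", 11)   # limit 11: COMMAND may itself contain spaces
    next unless fields.size == 11 && fields[1] =~ /\A\d+\z/   # skips the header row
    PsEntry.new(fields[0], fields[1].to_i, fields[7], fields[10].chomp)
  end
  entries.compact
end

# Sample rows copied from the slides (hypothetical constant name).
SAMPLE_PS = <<~PS
  USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
  cris 21067 0.2 4.0 140236 125756 pts/0 Sl+ 12:56 1:19 ruby script/server --debugger
  root 1649 0.0 0.0 4664 736 ? Ss May17 0:00 nginx: master process /usr/sbin/nginx
  www-data 1650 0.0 0.0 5200 1824 ? S May17 0:02 nginx: worker process
PS

ruby_pids = parse_ps_aux(SAMPLE_PS).select { |e| e.command.include?("ruby") }.map(&:pid)
puts ruby_pids.inspect
```

On a live machine the same function could be fed real output captured by running `ps aux` through Ruby's backtick operator.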
ps aux | grep ruby
cris@home:/home/cris ps aux |grep ruby
cris 21067 0.2 4.0 140236 125756 pts/0 Sl+ 12:56 1:19 ruby
script/server --debugger
cris@home:/home/cris ps aux|grep firefox
cris 19296 43.1 8.0 424828 249048 ? Rl 10:43 256:04
/usr/lib/firefox-3.5.9/firefox
cris@home:/home/cris ps aux|grep nginx
root 1649 0.0 0.0 4664 736 ? Ss May17 0:00 nginx: master
process /usr/sbin/nginx
www-data 1650 0.0 0.0 5200 1824 ? S May17 0:02 nginx: worker
process
ps auxH
cris@home:/home/cris ps auxH|egrep 'ruby|firefox|nginx'
root 1649 0.0 0.0 4664 736 ? Ss May17 0:00 nginx: master process
www-data 1650 0.0 0.0 5200 1824 ? S May17 0:02 nginx: worker process
cris 19296 41.8 8.1 424828 250992 ? Rl 10:43 250:13 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 10:43 0:02 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 10:43 0:01 /usr/lib/firefox-3.5.9/firefox
cris 19296 1.3 8.1 424828 250992 ? Rl 10:43 8:15 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 10:43 0:00 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 10:43 0:00 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 10:43 0:04 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 10:43 0:00 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 12:44 0:00 /usr/lib/firefox-3.5.9/firefox
cris 19296 0.0 8.1 424828 250992 ? Sl 15:01 0:00 /usr/lib/firefox-3.5.9/firefox
cris 21067 0.2 4.0 140236 125756 pts/0 Sl+ 12:56 1:00 ruby script/server --debugger
cris 21067 0.0 4.0 140236 125756 pts/0 Rl+ 12:56 0:19 ruby script/server --debugger
ps auxf
cris@home:/home/cris ps auxf
postgres 23499 0.0 0.0 42148 1164 ? S Feb27 0:56 /usr/bin/postgres -D /var/lib/postgresql
postgres 25278 0.0 0.3 42280 6976 ? Ss Apr09 0:01 \_ postgres: writer process
postgres 25279 0.0 0.0 42148 816 ? Ss Apr09 0:01 \_ postgres: wal writer process
postgres 25280 0.0 0.0 42280 1012 ? Ss Apr09 0:01 \_ postgres: autovacuum launcher process
postgres 25281 0.0 0.0 13604 936 ? Ss Apr09 0:09 \_ postgres: stats collector process
www-data 312 0.0 0.0 2556 572 ? Ss Mar02 4:41 redis-server /etc/redis.conf
www-data 13956 0.0 0.0 0 0 ? Z 17:43 0:00 \_ [redis-server] <defunct>
www-data 423 0.0 0.1 6556 2880 ? Sl 13:54 0:00 PassengerNginxHelperServer passenge
www-data 438 0.0 0.6 22512 12196 ? S 13:54 0:00 \_ Passenger spawn server
www-data 10722 1.0 4.3 91968 77148 ? S 17:27 0:09 \_ Passenger ApplicationSpawner:
www-data 442 0.0 0.0 6000 800 ? Ss 13:54 0:00 nginx: master process
www-data 451 0.0 0.1 6300 1836 ? S 13:54 0:00 \_ nginx: worker process
www-data 467 0.0 0.1 6300 1840 ? S 13:54 0:00 \_ nginx: worker process
www-data 9651 0.0 0.0 1892 512 ? Ss 14:37 0:00 QUEUE=default rake resque:run
www-data 9685 0.1 4.3 91960 77680 ? S 14:37 0:13 \_ resque-1.5.0: Waiting for default
www-data 9661 0.0 0.0 1888 504 ? Ss 14:37 0:00 QUEUE=uploader rake resque:run
www-data 9696 0.1 4.3 91956 77704 ? S 14:37 0:12 \_ resque-1.5.0: Waiting for uploader
kill — send signal to process
kill usage
kill -N PID — common form

kill -9 PID — send KILL signal

kill -HUP PID — send HUP signal

man kill — for list of all signals
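The same signals can be sent and received from Ruby itself. The sketch below is only an illustration (the constant `RECEIVED_SIGNALS` is invented here): it installs a HUP handler with `Signal.trap`, sends HUP to its own process with `Process.kill`, and looks up signal numbers in `Signal.list`. It assumes a UNIX-like system.

```ruby
RECEIVED_SIGNALS = []   # hypothetical name, used only in this sketch

# Without a handler, the default action for HUP would terminate the process.
Signal.trap("HUP") { RECEIVED_SIGNALS << "HUP" }

# Equivalent of running `kill -HUP PID` from a shell, aimed at ourselves.
Process.kill("HUP", Process.pid)
sleep 0.1   # give the interpreter a moment to run the handler

puts "handled: #{RECEIVED_SIGNALS.inspect}"
puts "KILL has number #{Signal.list['KILL']}, ABRT has number #{Signal.list['ABRT']}"
```

KILL cannot be trapped, which is why `kill -9` works even on a process that ignores everything else.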


kill usage example
cris@home:/home/cris ps aux | grep ruby
cris 23987 20.7 1.6 65968 51208 pts/0 Sl+ 21:01 0:08 ruby
script/server --debugger
cris@home:/home/cris kill -ABRT 23987
cris@home:/home/cris ps aux | grep ruby
cris@home:/home/cris
kill for Passenger debugging

ps aux|grep Passenger

kill -ABRT PID — print backtrace in log


killall name — kill by process name, system-wide

killall ruby — kills all ruby processes


How does Ruby produce side effects in the OS?
Ruby is a handy wrapper around
UNIX API system calls
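A small sketch of that idea: each Ruby call below should correspond to one of the system calls named in the comments. The method name `write_and_read_back` and the file name `demo.txt` are invented for the example, and the exact calls (for instance `open` versus `openat`) depend on the platform and Ruby version.

```ruby
require "tmpdir"

def write_and_read_back(path, text)
  # open(2) or openat(2), then write(2), then close(2) when the block ends
  File.open(path, "w") { |file| file.syswrite(text) }

  input = File.open(path, "r")           # open(2) or openat(2)
  data = input.sysread(text.bytesize)    # read(2)
  input.close                            # close(2)
  data
end

Dir.mktmpdir do |dir|                    # mkdir(2) on entry, cleanup on exit
  puts write_and_read_back(File.join(dir, "demo.txt"), "hello syscalls\n")
end
```

Running such a script under strace should show these calls among the interpreter's own startup activity.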
strace — UNIX system call tracing tool
strace as a system call spy
Allows monitoring of all UNIX system calls:
● Any filesystem operations
● Any thread operations
● Any signals activity
● Any socket operations
● Any side-effect which can be done via system call
strace debugging use-cases
● Monitoring process activity
● Monitoring filesystem activity
● Monitoring network activity
● Investigating why a process crashed
● Investigating why a process is stuck
strace usage

strace -s 1000 -tt -p PID

where:
-s — maximum string size to print
-tt — prefix each line with a timestamp (microsecond precision)
-p — PID of the process to monitor
strace usage example
cris@home:/home/cris strace -tt -s 1000 -p 23987
Process 23987 attached - interrupt to quit
09:03:38.907564 select(8, [7], [], [], NULL) = ? ERESTARTNOHAND (To be restarted)
09:04:01.141896 --- SIGABRT (Aborted) @ 0 (0) ---
Process 23987 detached
strace filtering via grep
cris@home:/home/cris strace -tt -s 1000 -p 679 2>&1 | grep -v sigprocmask
Process 679 attached - interrupt to quit
09:15:43.725633 select(8, [7], [], [], NULL) = 1 (in [7])
09:15:48.007594 accept(7, {sa_family=AF_INET, sin_port=htons(38510),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
09:15:48.007890 fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR)
09:15:48.007956 fstat64(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
09:15:48.008090 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0xb7896000
09:15:48.008167 _llseek(5, 0, 0xbf9fa68c, SEEK_CUR) = -1 ESPIPE (Illegal seek)
09:15:48.008248 fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR)
09:15:48.008304 fstat64(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
09:15:48.008430 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0xb7895000
09:15:48.008496 _llseek(5, 0, 0xbf9fa68c, SEEK_CUR) = -1 ESPIPE (Illegal seek)
09:15:48.008749 setsockopt(5, SOL_TCP, TCP_CORK, [1], 4) = 0
09:15:48.010250 select(6, [5], [], [], {0, 0}) = 1 (in [5], left {0, 0})
09:15:48.010734 gettimeofday({1274336148, 10755}, NULL) = 0
09:15:48.011137 select(8, [5 7], [], [], NULL) = 1 (in [5])
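When grep is not enough, a trace like the one above can be summarized with a few lines of Ruby. This is a sketch only: `STRACE_LINE`, `parse_strace_line`, `syscall_counts` and `SAMPLE_TRACE` are names invented here, the pattern assumes the `-tt` line format shown above, and lines it does not recognize (attach messages, signals, wrapped continuations) are simply skipped.

```ruby
# time, syscall name, arguments, result (format produced by `strace -tt`)
STRACE_LINE = /\A(?<time>\d{2}:\d{2}:\d{2}\.\d+) (?<name>\w+)\((?<args>.*)\)\s+= (?<result>.+)\z/

def parse_strace_line(line)
  m = STRACE_LINE.match(line.strip)
  m && { time: m[:time], name: m[:name], args: m[:args], result: m[:result] }
end

def syscall_counts(output)
  calls = output.each_line.map { |line| parse_strace_line(line) }.compact
  calls.each_with_object(Hash.new(0)) { |call, counts| counts[call[:name]] += 1 }
end

# Lines copied from the trace above.
SAMPLE_TRACE = <<~TRACE
  Process 679 attached - interrupt to quit
  09:15:43.725633 select(8, [7], [], [], NULL) = 1 (in [7])
  09:15:48.007890 fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR)
  09:15:48.008167 _llseek(5, 0, 0xbf9fa68c, SEEK_CUR) = -1 ESPIPE (Illegal seek)
  09:15:48.008248 fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR)
TRACE

puts syscall_counts(SAMPLE_TRACE).inspect
failed = SAMPLE_TRACE.each_line.map { |l| parse_strace_line(l) }.compact
                     .select { |call| call[:result].start_with?("-1") }
puts failed.map { |call| "#{call[:name]} failed: #{call[:result]}" }
```

Counting calls by name gives a quick picture of what a process spends its time on, and the failed calls are often the first place to look.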
Networking: HTTP

RFC #2616 (HTTP 1.1)

http://tools.ietf.org/html/rfc2616
HTTP issues

● Session
● Cookies
● Mime-types
● Encoding
● AJAX-requests
HTTP investigation tools
Client Side:
Firefox: Firebug plugin (F12)
Safari, Chrome: Developer Tools (Ctrl-Shift-i)
Opera: DragonFly (Ctrl-Shift-i)
Server Side:
telnet, nc — good at server-side
curl, wget — more complex HTTP-requests
Firebug for HTTP-debug
Debugging HTTP with WebKit
Debugging HTTP with Opera
All text-oriented protocols can be
debugged via telnet
Telnet
● HTTP
● Memcache
● Redis
● POP3
● STOMP
Debugging HTTP via telnet
cris@home:/home/cris telnet google.com.ua 80
Trying 74.125.87.103...
Connected to google.com.ua.
Escape character is '^]'.
GET http://www.google.com.ua/ HTTP/1.0
Host: www.google.com.ua

HTTP/1.0 200 OK
Date: Mon, 17 May 2010 20:37:38 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=windows-1251
Set-Cookie: PREF=ID=a2699d840401b148:TM=1274128658:LM=1274128658:S=nCiLvHHjeNDpnIFo;
expires=Wed, 16-May-2012 20:37:38 GMT; path=/; domain=.google.com.ua
Server: gws
X-XSS-Protection: 1; mode=block

<!doctype html><html><head><meta http-equiv="content-type" content="text/html;


charset=windows-1251"><title>Google</title><script>window.google={kEI:
====================== CUT =================================
tachEvent("onload",l);google.timers.load.t.prt=(f=(new Date).getTime());
})();
</script>Connection closed by foreign host.
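The telnet session above can be reproduced from Ruby with a plain TCP socket, which is handy when the request has to be scripted. This is a minimal sketch: `raw_http_get` and `serve_once` are names invented here, and the throwaway local server exists only so the example runs without network access.

```ruby
require "socket"

# Send a hand-written HTTP/1.0 request and return the raw response text.
def raw_http_get(host, port, path = "/")
  TCPSocket.open(host, port) do |sock|
    sock.write("GET #{path} HTTP/1.0\r\nHost: #{host}\r\n\r\n")
    sock.read   # the server closes the connection when the response is complete
  end
end

# Throwaway one-request server, for illustration only.
def serve_once(server, body)
  Thread.new do
    client = server.accept
    loop do                                   # skip the request line and headers
      line = client.gets
      break if line.nil? || line == "\r\n"
    end
    client.write("HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n#{body}")
    client.close
  end
end

server = TCPServer.new("127.0.0.1", 0)        # port 0: the OS picks a free port
worker = serve_once(server, "hello")
puts raw_http_get("127.0.0.1", server.addr[1])
worker.join
server.close
```

Because the response is returned as raw text, the status line, headers and body can all be inspected exactly as the server sent them.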
Debugging Resque/Redis via telnet
cris@home:/home/cris telnet localhost 6379
Trying 127.0.0.1...
Connected to localhost.localdomain.
Escape character is '^]'.
monitor
+OK
monitor
lpop resque:queue:forwarder
lpop resque:queue:mail
set resque:worker:home:4150:mail 111
{"payload":{"args":[],"class":"Workers::Importer"},"queue":"mail","run_at":"Mon
May 17 21:29:05 +0000 2010"}
ltrim resque:queue:mail 1 0
incrby resque:stat:processed 1
incrby resque:stat:processed:home:4141:forwarder 1
del resque:worker:home:4141:forwarder
lpop resque:queue:forwarder
lpop resque:queue:statistics
incrby resque:stat:processed 1
incrby resque:stat:processed:home:4150:mail 1
del resque:worker:home:4150:mail
Networking: TCP/IP
TCP/IP
● Communicate via sockets
● Socket states:
● LISTEN
● ESTABLISHED
● FIN_WAIT2 / CLOSE_WAIT / TIME_WAIT
● CLOSED
● Socket = IP + PORT (Internet socket)
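The "IP + PORT" pairing is easy to see from Ruby. In the sketch below (the method names `endpoint` and `socket_pair_demo` are invented for the example) a server socket listens on a free local port, a client connects, and both ends of the established connection report their own and their peer's address.

```ruby
require "socket"

def endpoint(addrinfo)
  "#{addrinfo.ip_address}:#{addrinfo.ip_port}"
end

def socket_pair_demo
  server = TCPServer.new("127.0.0.1", 0)    # now in LISTEN state; port 0 = any free port
  client = TCPSocket.new("127.0.0.1", server.local_address.ip_port)
  accepted = server.accept                   # connection is now ESTABLISHED on both ends
  {
    listening:     endpoint(server.local_address),
    client_local:  endpoint(client.local_address),
    client_remote: endpoint(client.remote_address),
    server_remote: endpoint(accepted.remote_address)
  }
ensure
  [accepted, client, server].compact.each(&:close)
end

socket_pair_demo.each { |role, address| puts "#{role}: #{address}" }
```

The client's remote endpoint is the server's listening endpoint, and the server's view of the peer is the client's local endpoint; the same pairs appear as the Local Address and Foreign Address columns in netstat.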
netstat — network activity viewer
netstat
● Shows all Internet sockets
● Shows all Unix-domain sockets
● Shows the state of each connection
● Shows all active services and the ports they use
netstat usage

sudo netstat -anp

-a — show listening and non-listening sockets


-n — skip name resolution (faster)
-p — show the PID and name of the program
--inet — show only Internet sockets
Sockets in LISTEN state

cris@home:/home/cris sudo netstat -anp --inet | head


Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:5672 0.0.0.0:* LISTEN 1691/beam.smp
tcp 0 0 127.0.0.1:6379 0.0.0.0:* LISTEN 1774/redis-server
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1539/nginx
tcp 0 0 0.0.0.0:4369 0.0.0.0:* LISTEN 1665/epmd
tcp 0 0 0.0.0.0:3000 0.0.0.0:* LISTEN 30643/ruby
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN 16747/postgres
tcp 0 0 127.0.0.1:4000 0.0.0.0:* LISTEN 355/ruby
Sockets in ESTABLISHED state

cris@home:/home/cris sudo netstat -anp --inet|grep ':80 '


tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1539/nginx
tcp 0 0 127.0.1.1:80 127.0.1.1:43896 ESTABLISHED 1540/nginx: worker
tcp 0 0 127.0.1.1:43896 127.0.1.1:80 ESTABLISHED 24408/firefox
tcp 0 0 127.0.1.1:43900 127.0.1.1:80 ESTABLISHED 24408/firefox
tcp 0 0 127.0.1.1:43899 127.0.1.1:80 ESTABLISHED 24408/firefox
tcp 0 0 127.0.1.1:43904 127.0.1.1:80 ESTABLISHED 24408/firefox
tcp 0 0 127.0.1.1:80 127.0.1.1:43894 ESTABLISHED 1540/nginx: worker
tcp 0 0 127.0.1.1:80 127.0.1.1:43899 ESTABLISHED 1540/nginx: worker
tcp 0 0 127.0.1.1:80 127.0.1.1:43900 ESTABLISHED 1540/nginx: worker
tcp 0 0 127.0.1.1:43894 127.0.1.1:80 ESTABLISHED 24408/firefox
tcp 0 0 127.0.1.1:80 127.0.1.1:43904 ESTABLISHED 1540/nginx: worker
HTTP Sockets after some time

cris@home:/home/cris sudo netstat -anp --inet|grep ':80 '


tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1539/nginx
tcp 0 0 127.0.1.1:80 127.0.1.1:43896 FIN_WAIT2 -
tcp 1 0 127.0.1.1:43896 127.0.1.1:80 CLOSE_WAIT 24408/firefox
tcp 1 0 127.0.1.1:43900 127.0.1.1:80 CLOSE_WAIT 24408/firefox
tcp 1 0 127.0.1.1:43899 127.0.1.1:80 CLOSE_WAIT 24408/firefox
tcp 1 0 127.0.1.1:43904 127.0.1.1:80 CLOSE_WAIT 24408/firefox
tcp 0 0 127.0.1.1:80 127.0.1.1:43894 FIN_WAIT2 -
tcp 0 0 127.0.1.1:80 127.0.1.1:43899 FIN_WAIT2 -
tcp 0 0 127.0.1.1:80 127.0.1.1:43900 FIN_WAIT2 -
tcp 1 0 127.0.1.1:43894 127.0.1.1:80 CLOSE_WAIT 24408/firefox
tcp 0 0 127.0.1.1:80 127.0.1.1:43904 FIN_WAIT2 -
HTTP sockets in TIME_WAIT state

cris@home:/home/cris sudo netstat -anp|grep ':80 '


tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1539/nginx
tcp 0 0 127.0.1.1:80 127.0.1.1:43896 TIME_WAIT -
tcp 0 0 127.0.1.1:80 127.0.1.1:43894 TIME_WAIT -
tcp 0 0 127.0.1.1:80 127.0.1.1:43899 TIME_WAIT -
tcp 0 0 127.0.1.1:80 127.0.1.1:43900 TIME_WAIT -
tcp 0 0 127.0.1.1:80 127.0.1.1:43904 TIME_WAIT -
netstat debugging use-cases
● Check network connection state
● Detect which port a service uses
● Check service port binding
● Check for DoS
● Check for broken connections
● Check on leaked connections
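For the leaked-connection and DoS checks, counting sockets per state is usually more telling than reading the raw listing. The sketch below does that for text in the `netstat -anp --inet` format; `connection_states` and `SAMPLE_NETSTAT` are hypothetical names, and the sample rows are taken from the listings above.

```ruby
# Count TCP sockets per state for one port, from `netstat -an` style text.
def connection_states(netstat_output, port)
  counts = Hash.new(0)
  netstat_output.each_line do |line|
    proto, _recv_q, _send_q, local, foreign, state = line.split
    next unless proto.to_s.start_with?("tcp") && state
    next unless local.end_with?(":#{port}") || foreign.end_with?(":#{port}")
    counts[state] += 1
  end
  counts
end

SAMPLE_NETSTAT = <<~NETSTAT
  tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1539/nginx
  tcp 0 0 127.0.1.1:80 127.0.1.1:43896 FIN_WAIT2 -
  tcp 1 0 127.0.1.1:43896 127.0.1.1:80 CLOSE_WAIT 24408/firefox
  tcp 1 0 127.0.1.1:43900 127.0.1.1:80 CLOSE_WAIT 24408/firefox
  tcp 0 0 127.0.1.1:80 127.0.1.1:43900 FIN_WAIT2 -
  tcp 0 0 127.0.0.1:6379 0.0.0.0:* LISTEN 1774/redis-server
NETSTAT

puts connection_states(SAMPLE_NETSTAT, 80).inspect
```

A CLOSE_WAIT count that keeps growing usually means the local application is not closing sockets after the peer has hung up.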
How to see what is going on on the
wire?
tcpdump — traffic monitoring tool
tcpdump

● Low-level debugging of network activity
● Shows all packets on a selected TCP/IP connection
● Requires understanding of the protocol being debugged
tcpdump

TCP/IP: RFC #793

http://www.faqs.org/rfcs/rfc793.html
tcpdump

sudo tcpdump -iany -A -s 0 tcp port 3000

where:
-i — interface
-s — snapshot length: bytes captured per packet (old versions default to only 68 bytes; 0 captures whole packets)
-A — print in ASCII
tcp port 3000 - a filter expression (see man pcap-filter)
tcpdump in action
cris@home:/home/cris sudo tcpdump -iany -A -s 0 tcp port 3000
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
22:52:15.133580 IP localhost.53478 > localhost.3000: Flags [S], seq 19847865, win 32792, options [mss
16396,sackOK,TS val 23239272 ecr 0,nop,wscale 6], length 0
E..<@.@.@...........................;=....@....
.b.h........
22:52:15.133595 IP localhost.3000 > localhost.53478: Flags [S.], seq 18122975, ack 19847866, win 32768,
options [mss 16396,sackOK,TS val 23239272 ecr 23239272,nop,wscale 6], length 0
E..<..@.@.<...............................@....
.b.h.b.h....
22:52:15.133607 IP localhost.53478 > localhost.3000: Flags [.], ack 1, win 513, options [nop,nop,TS val
23239272 ecr 23239272], length 0
E..4@.@.@..................................
.b.h.b.h
22:52:15.133630 IP localhost.53478 > localhost.3000: Flags [P.], seq 1:1423, ack 1, win 513, options
[nop,nop,TS val 23239272 ecr 23239272], length 1422
E...@.@.@..`...............................
.b.h.b.hGET /users/4?_=1274298735118 HTTP/1.0
X-Real-IP: 127.0.1.1
X-Forwarded-For: 127.0.1.1
Host: 127.0.0.1
Connection: close
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100401 Ubuntu/9.10 (karmic)
Firefox/3.5.9
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
X-Requested-With: XMLHttpRequest
Referer: http://127.0.0.1/users
Cookie: __utma=38596661.1128088787.1267800550.1274193594.1274197735.24;
tcpdump debugging use-cases
● Check network activity on selected
port/interface
● Check whether we really use some connection
● Check what exactly we send
● Capture all data that is sent
tcpdump advanced usage example

tcpdump -iany -A -s 0 'tcp port 80 and ((tcp[13] & 2 == 2) or tcp[13] & 1 == 1 or tcp[13] & 8 == 8)'
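In this filter, `tcp[13]` is the flags byte of the TCP header, and the masks 2, 1 and 8 select the SYN, FIN and PSH bits. The Ruby sketch below spells that out; `TCP_FLAGS`, `tcp_flag_names` and `matches_filter?` are names invented for the illustration.

```ruby
# Bit values of the TCP flags byte (offset 13 in the TCP header).
TCP_FLAGS = {
  "FIN" => 0x01, "SYN" => 0x02, "RST" => 0x04,
  "PSH" => 0x08, "ACK" => 0x10, "URG" => 0x20
}.freeze

def tcp_flag_names(flags_byte)
  TCP_FLAGS.select { |_name, bit| (flags_byte & bit) != 0 }.keys
end

# Same test as the filter: SYN set, or FIN set, or PSH set.
def matches_filter?(flags_byte)
  (flags_byte & 2) == 2 || (flags_byte & 1) == 1 || (flags_byte & 8) == 8
end

[0x02, 0x12, 0x10, 0x18, 0x11].each do |byte|
  puts format("0x%02x %-10s matches=%s", byte, tcp_flag_names(byte).join("+"), matches_filter?(byte))
end
```

The effect is that connection setup, data-carrying pushes and connection teardown are shown, while bare ACK packets are filtered out.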
Wireshark — tcpdump with GUI
● Allows capturing and analyzing packet traffic
● Has a nice GUI
● Many configuration options
● Many filters available
● You should know your protocol (read the RFCs)
● Has a console version: tshark
● Can interoperate with tcpdump
Wireshark at work
From novice to seasoned
● puts (O__o)
● irb
● script/console
● log files
● ruby-debug
irb and script/console

● Execute commands directly


● Check loaded files:
● $" - list of all loaded files
● $: - search path for loaded files
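Both variables also have readable aliases, `$LOADED_FEATURES` and `$LOAD_PATH`. The sketch below uses them to answer two common questions: was a library actually loaded, and which file would a `require` pick up. The helper names `loaded_feature?` and `resolve_in_load_path` are invented here, and the lookup only considers plain `.rb` files.

```ruby
require "time"   # any standard library file works as an example

# Was a file with this base name loaded? ($LOADED_FEATURES is the long name of $")
def loaded_feature?(name)
  $LOADED_FEATURES.any? { |path| File.basename(path, ".rb") == name }
end

# Which file would `require name` find first? ($LOAD_PATH is the long name of $:)
def resolve_in_load_path(name)
  $LOAD_PATH.map { |dir| File.join(dir.to_s, "#{name}.rb") }
            .find { |candidate| File.file?(candidate) }
end

puts "time loaded? #{loaded_feature?('time')}"
puts "time resolves to #{resolve_in_load_path('time').inspect}"
puts "#{$LOADED_FEATURES.size} files loaded, #{$LOAD_PATH.size} directories searched"
```

This is useful when the wrong version of a file is being picked up from an unexpected directory.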
ruby-debug
● Forget about silly IDE
● Production server has no GUI
● Master it
● Use it
ruby-debug usage

require 'rubygems'
require 'ruby-debug'

debugger
puts "hello debug world"
Ruby-debug: commands
n — execute next statement
s — step into the statement
p expression — evaluate and print 'expression'
l — list current sources
Enter — repeat last command
h — help
h command — help for particular commands
ruby-debug with Rails

Run in command line: script/server --debugger


Put in sources: debugger
C-level of Ruby
● Interpreter crashes
● Issues with threading
● Memory leaks
● Understanding the internals
C-level of Ruby
● Ruby Hacking Guide:
http://rhg.rubyforge.org/
http://github.com/tmm1/ruby-hacking-guide/tree/master/en/
● Aman Gupta about Threads:
http://www.slideshare.net/tmm1/threaded-awesome-1922719
● Aman Gupta about Garbage Collector:
http://www.scribd.com/doc/27174770/Garbage-Collection-and-the-Ruby-Heap
● Ruby C sources:
http://www.ruby-lang.org/en/community/ruby-core/
C-level quiz
● nil.id, true.id, false.id
● 0.id, 1.id, …
● nil.id issue
● When do we block on something?
● Why does the interpreter leak when working with big chunks of data?
gdb — C-level debugging
● Allows attaching to a live process
● Allows inspecting memory
● Allows debugging C code
● Allows debugging native threads
● Allows investigating core dumps
gdb main weakness

You need strong knowledge of the Ruby VM
gdb.rb by Aman Gupta

Debugging Ruby at the C level with the comfort of the Ruby level
gdb.rb
● Allows inspecting the state of all executing threads
● Helps to deal with memory leaks
● Shows where your thread is stuck!
gdb.rb
gem install gdb.rb

Git: http://github.com/tmm1/gdb.rb

Slides:
http://www.slideshare.net/tmm1/debugging-ruby
Division by half
Division by half
● Memory leaks
● Performance bottlenecks
● Process crash
Division by half: mongrel leaks
1. Divide by half on controllers level
2. Divide by half on actions level
3. Divide by half on sources level
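The three steps above are the same bisection applied at different granularity. A minimal sketch of one round, assuming there is exactly one culprit and that the problem reproduces whenever the culprit is among the enabled items (the method name `bisect` and the controller names are hypothetical):

```ruby
# Narrow a list of suspects down to one by repeatedly testing half of them.
# The block receives a subset and must return true if the problem still
# reproduces with only that subset enabled.
def bisect(candidates)
  suspects = candidates.dup
  while suspects.size > 1
    half = suspects.first(suspects.size / 2)
    suspects = yield(half) ? half : suspects.drop(half.size)
  end
  suspects.first
end

controllers = %w[users orders reports uploads sessions]
checks = 0
culprit = bisect(controllers) do |enabled|
  checks += 1
  enabled.include?("uploads")   # stand-in for "run the app and watch memory"
end
puts "culprit: #{culprit}, found in #{checks} checks"
```

In practice the block would be a manual experiment, such as disabling half of the controllers and watching whether memory still grows; the number of experiments grows only logarithmically with the number of suspects.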
Memory leaks
Memory leaks in the wild
● Usage of method 'load'
● Working with big data arrays
● Selecting big data arrays from DB
load 'file'

Loads the source file again on every call and doesn't clean up what the previous load defined
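One way this can show up, sketched under stated assumptions: the loaded file registers something in a long-lived structure, so every extra `load` adds another copy that is never released. The global `$loaded_handlers`, the method `load_repeatedly` and the file name `plugin.rb` are invented for the illustration.

```ruby
require "tmpdir"

$loaded_handlers = []   # hypothetical long-lived registry

# Write a tiny source file and load it several times, as a reloader might.
def load_repeatedly(times)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "plugin.rb")
    File.write(path, "$loaded_handlers << ('x' * 1024)\n")
    times.times { load path }   # the file body runs again on every call
  end
  $loaded_handlers.size
end

puts "registry entries after 3 loads: #{load_repeatedly(3)}"
```

With `require` the file body would run only once per process, because the path is recorded in `$LOADED_FEATURES`.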
Big data chunks
Conservative Garbage Collector
● Random pointers from C stack to big data
chunks
● Memory chunks for object heaps
http://izumi.plan99.net/blog/index.php/2007/10/12/how-the-ruby-heap-is-implemented/
Memory leak detecting tools
Ruby-level:
bleak_house gem
gdb.rb gem

C-level:
Valgrind
File system debugging
● Leaked file descriptors (forgetting to close a file)
● Leaked tempfiles
● Monitoring read/write file operations
● Monitoring filesystem activity
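The first item in the list above is the most common one in Ruby code. The sketch below contrasts the two styles (the method names `read_leaky` and `read_safely` are invented): opening a file without a block leaves the descriptor open until someone closes it or the object is garbage collected, while the block form closes it as soon as the block returns.

```ruby
require "tempfile"

# Leaky style: the descriptor stays open after the method returns.
def read_leaky(path)
  file = File.open(path)
  [file.read, file]
end

# Safe style: File.open with a block closes the file when the block ends,
# even if the block raises.
def read_safely(path)
  handle = nil
  data = File.open(path) do |file|
    handle = file
    file.read
  end
  [data, handle]
end

Tempfile.create("fd-demo") do |tmp|
  tmp.write("some data")
  tmp.flush
  _data, leaked = read_leaky(tmp.path)
  _data, closed = read_safely(tmp.path)
  puts "leaky handle closed?  #{leaked.closed?}"
  puts "block handle closed?  #{closed.closed?}"
  leaked.close   # clean up the descriptor we leaked on purpose
end
```

In a long-running daemon, descriptors leaked this way appear as a steadily growing list of open files for the process.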
lsof — list open files
lsof — netstat for files
lsof
● Shows all open files for a particular:
● User
● Process
● Directory or any device
● And also all open sockets
● Shows who is working with a given file
● And much more... (see man lsof)
lsof or netstat

sudo lsof -i -U -n

works like:

netstat -anp --inet


sudo lsof -p PID
cris@home:/home/cris sudo lsof -p 1540
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 1540 www-data cwd DIR 8,5 4096 2 /
nginx 1540 www-data rtd DIR 8,5 4096 2 /
nginx 1540 www-data txt REG 8,5 573192 226325 /usr/sbin/nginx
nginx 1540 www-data mem REG 8,5 83608 676 /lib/libz.so.1.2.3.3
- - - - - - - - - - - - - - - - - - - - - CUT - - - - - - - - - - - - - - - - - - - - - - -
nginx 1540 www-data mem REG 8,5 38504 219471 /lib/tls/i686/cmov/libnss_ni...
nginx 1540 www-data mem REG 8,5 113320 64 /lib/ld-2.10.1.so
nginx 1540 www-data mem REG 8,5 1315612 263329 /lib/i686/cmov/libcrypto.so...
nginx 1540 www-data DEL REG 0,9 5994 /dev/zero
nginx 1540 www-data 0u CHR 1,3 0t0 1232 /dev/null
nginx 1540 www-data 1u CHR 1,3 0t0 1232 /dev/null
nginx 1540 www-data 2w REG 8,5 7551063 238136 /var/log/nginx/error.log
nginx 1540 www-data 3w REG 8,5 1576266 240871 /var/log/nginx/error.log.1
(deleted)
nginx 1540 www-data 4w REG 8,5 7551063 238136 /var/log/nginx/error.log
nginx 1540 www-data 5w REG 8,5 1794602 197711 /var/log/nginx/access.log
nginx 1540 www-data 6w REG 8,5 0 238137 /var/log/nginx/local.access.log
nginx 1540 www-data 7u IPv4 5993 0t0 TCP *:www (LISTEN)
nginx 1540 www-data 9u unix 0xed792c00 0t0 5996 socket
nginx 1540 www-data 10u 0000 0,7 0 549 anon_inode
strace for monitoring file operations
strace on nginx
cris@home:/home/cris sudo strace -p 1540
Process 1540 attached - interrupt to quit
open("/var/www/nginx-default/index.html", O_RDONLY|O_LARGEFILE) = 11
fstat64(11, {st_mode=S_IFREG|0644, st_size=151, ...}) = 0
pread64(11, "<html>\n<head>\n<title>Welcome to "..., 151, 0) = 151
writev(6, [{"HTTP/1.1 200 OK\r\nServer: nginx/0"..., 225}, {"7a\r\n", 4},
{"\37\213\10\0\0\0\0\0\0\3", 10}, {"m\216M\n\200 \
20\205\367\235\302\274\200\264\237\274Fk\265A\245\311\1\31\250n\337D-[=x"..., 112},
{"\r\n0\r\n\r\n", 7}], 5) = 358
write(5, "127.0.0.1 - - [15/May/2010:09:54"..., 176) = 176
close(11) = 0
open("/var/www/nginx-default/favicon.ico", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file
or directory)
write(8, "2010/05/15 09:54:17 [error] 1540"..., 219) = 219
writev(6, [{"HTTP/1.1 404 Not Found\r\nServer: "..., 186}, {"84\r\n", 4},
{"\37\213\10\0\0\0\0\0\0\3", 10},
{"\263\311(\311\315\261\343\345\262\311HML\261\263)\311,\311I\265310Q\360\313/Qp\313/"...,
122}, {"\r\n0\r\n\r\n", 7}], 5) = 329
write(5, "127.0.0.1 - - [15/May/2010:09:54"..., 187) = 187
Process 1540 detached
Log Files
Good Log is the best debugging tool
Dealing with Log Files
● Where is logfile located?
● What if logging isn't configured?
● What if we have several log files?
● How to monitor changes in logfile?
Logfile: where are you?

sudo lsof -p PID


sudo lsof -p PID

cris@home:/home/cris sudo lsof -p 1540


COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
- - - - - - - - - - - - - - - - - - - - - CUT - - - - - - - - - - - - - - - - - - - - - - -
nginx 1540 www-data 0u CHR 1,3 0t0 1232 /dev/null
nginx 1540 www-data 1u CHR 1,3 0t0 1232 /dev/null
nginx 1540 www-data 2w REG 8,5 7551063 238136 /var/log/nginx/error.log
nginx 1540 www-data 3w REG 8,5 1576266 240871 /var/log/nginx/error.log.1
(deleted)
nginx 1540 www-data 4w REG 8,5 7551063 238136 /var/log/nginx/error.log
nginx 1540 www-data 5w REG 8,5 1794602 197711 /var/log/nginx/access.log
nginx 1540 www-data 6w REG 8,5 0 238137 /var/log/nginx/local.access.log
nginx 1540 www-data 7u IPv4 5993 0t0 TCP *:www (LISTEN)
nginx 1540 www-data 9u unix 0xed792c00 0t0 5996 socket
nginx 1540 www-data 10u 0000 0,7 0 549 anon_inode
Logfile: logging is off

sudo strace -p PID -e write -s 1000

where:
-s — number of bytes to show
-e expression — expression for event
filtering
Logfile: dealing with group

tail -f access.log error.log local.access.log


Logfile: monitoring with luxury
Old way: tail -f access.log
New way: less +F access.log

Advanced form: less -nrf +F file.log


where:
-n — skip line-counting
-r — show raw control characters (keeps colors)
-f — force opening even files with binary characters
+F — less command that switches into monitoring (follow) mode
less +F logfile
● Ctrl-C — break into view mode
● Shift-G — move to the end of file
● Shift-F — back to monitoring
Approach
● A bug is a lesson, not a punishment
● Dig one level deeper
● Improve your knowledge of your tools
● Improve your debugging environment
● Be patient: bugs don't like nervous, scared programmers
● Give me a way to reproduce it and it will be fixed
● Compare a working setup with the broken one
● If nothing helps, forget about the issue and try to do something unrelated
● Practice
