Production - Diagnostics Guide

Revision History:
Date        Author                                 Change Summary         Version
04/14/2013  Shrinivas Narayani / Sanjaka Malinda   Created the document.  1.0


Contents
1 Introduction
2 Step 1: Database Issue
3 Step 2: Application Issue
4 Step 4: Denial of Service Attack
5 How to Use Log4View to Analyze Guru.com Application Logs
  5.1 Message View
  5.2 Message Details
  5.3 Logger Tree
  5.4 Search Messages
  5.5 Filter View
  5.6 Message Chart (Version 3.x)
6 How to Use Log Parser to Analyze IIS logs
7 How to Use Debug Diag
  7.1 Steps to check any application crash report on debug diag
  7.2 Analyze Dump file
  7.3 Archive/Delete Dump file

Last Saved 2013-10-18 15:26:00 by Shrinivas Narayani


1 Introduction
This document describes the diagnostic steps to take during a site outage.


2 Step 1: Database Issue


When the site goes out of service, especially when all the instances on the ELB show as out of service, the database is
typically the primary cause of the outage.
Symptoms:
1. Loss of connection between the web server and RDS: An error message similar to the following is logged in the Guru
log file for the projects that execute SQL queries. See How to Use Log4View to Analyze Guru.com Application Logs.
"A network-related or instance-specific error occurred while establishing a connection to SQL Server"
Resolution:
Try connecting to RDS directly using Query Analyzer.
If you cannot connect to the database using Query Analyzer, contact AWS support to fix the connectivity issue.
2. Timeout errors: An error message similar to the following is logged in the Guru log file. See How to Use Log4View
to Analyze Guru.com Application Logs.
"Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding."

Resolution:
Run the sp_who command to check for long-running queries. TBD
Check the output of the trace flag. (See enabling the trace flag.) TBD
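
As a rough illustration of telling the two symptoms apart, the sketch below counts log lines that contain either error message quoted in this step. The function name and the sample line layout are hypothetical; the real Guru log format is not described here, so only the quoted message text is taken from this guide.

```python
# Message fragments taken from the two symptoms described in this step.
CONNECTIVITY_MARKER = "A network-related or instance-specific error occurred"
TIMEOUT_MARKER = "Timeout expired"


def classify_db_errors(lines):
    """Count log lines that match each database symptom (hypothetical helper)."""
    counts = {"connectivity": 0, "timeout": 0}
    for line in lines:
        if CONNECTIVITY_MARKER in line:
            counts["connectivity"] += 1
        elif TIMEOUT_MARKER in line:
            counts["timeout"] += 1
    return counts


if __name__ == "__main__":
    # Illustrative lines only; the prefix before each message is an assumption.
    sample = [
        "ERROR A network-related or instance-specific error occurred "
        "while establishing a connection to SQL Server",
        "ERROR Timeout expired. The timeout period elapsed prior to "
        "completion of the operation or the server is not responding.",
    ]
    print(classify_db_errors(sample))
```

A high connectivity count points toward the first resolution (check RDS connectivity), while a high timeout count points toward the second (look for long-running queries).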

3 Step 2: Application Issue


Symptoms:
The site is not accessible from the local machine itself (that is, from the web server).
Application errors are logged in Event Viewer.
Resolution:
Check Debug Diag for any crash reports. Under normal conditions the dump file count should be zero. If the count has
increased, an application pool has crashed.
An application crash may not have a quick resolution, so reset IIS or restart the web server.
Move the dump files to a fixed location (TBD) to bring the dump file count in Debug Diag back to zero.
Analyze the dump to find a resolution.

See also: How to Use Debug Diag.


4 Step 4: Denial of Service Attack


Symptoms:
One or more servers fail to respond.
The site is slow.
Causes:
Concurrent requests from a particular IP, which could be a bad crawler, spam harvester, or dictionary attacker.
Resolutions:
Use Log Parser to analyze the latest IIS logs (from the time of the montis alert).
Check the request pattern and response status. If the log contains many requests for invalid URLs with 404 status
codes, the requesting client IP could be a bad IP.
Look up the past activity of that IP at https://www.projecthoneypot.org/search_ip.php.
Based on the activity history, ban the IP from accessing the site in the future.
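
The check described above can also be sketched outside Log Parser. The following is a minimal sketch, assuming the IIS logs are in W3C extended format (space-separated values with a `#Fields:` header line) and include the `c-ip` and `sc-status` fields. The function name and both thresholds are hypothetical and would need tuning against real traffic.

```python
from collections import Counter


def suspicious_clients(log_lines, min_hits=100, min_404_ratio=0.5):
    """Return (ip, hits, not_found) for clients with many hits that are mostly 404s."""
    fields = []
    hits, not_found = Counter(), Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        ip = row.get("c-ip")
        if ip is None:
            continue
        hits[ip] += 1
        if row.get("sc-status") == "404":
            not_found[ip] += 1
    return sorted(
        (ip, n, not_found[ip])
        for ip, n in hits.items()
        if n >= min_hits and not_found[ip] / n >= min_404_ratio
    )
```

Any IP the function returns is only a candidate. Its history should still be checked before it is banned.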

5 How to Use Log4View to Analyze Guru.com Application Logs


Log4View is installed on the prod file-store machine. Its configuration files are located on the desktop of that machine.
Ex: Opening Prod-Guru Log opens the Guru logs from all the web servers.
Similarly, there are configuration files for GuruPayment logs, OpenId logs, etc.
Log4View can be set to show real-time logs by clicking the play button, as shown in Figure 1.
The default configuration can be changed to filter the logs based on the criteria provided. Ex: Log files can be filtered
to show the logs for a particular duration, as shown in Figure 2.


Figure 1 Application Configuration Shortcuts


Figure 2: Main Window

Figure 3: Popup Menu

Figure 4: Color configuration


5.1 Message View
The Message View shows a list of all selected messages. Message layout and color can be freely customized. The
context menu offers a powerful selection of different filter techniques.

5.2 Message Details
All details of the currently selected message are shown here. Resizable text boxes allow even the inspection of
complex formatted log output such as call stacks.

5.3 Logger Tree
The logger tree shows all loggers in hierarchical order and permits setting log level filters for each logger
individually. The logger tree shows the selected log level of each logger.

5.4 Search Messages
Use wildcards or regular expressions to find log messages.

5.5 Filter View
Here you can manage additional filters (based on wildcards or regular expressions), which go beyond the
limitations of logger, log level, or time-based filtering.

5.6 Message Chart (Version 3.x)
Visualize loggers as a state chart, line chart, or histogram.

6 How to Use Log Parser to Analyze IIS logs


The program is installed at C:\Users\guru\Desktop\LPS and also has a shortcut (LPS) on the desktop of the file-store
machine.


Figure 5: Main Window of Log Parser Studio.

A. Click this button to open IIS logs. Add one or more IIS logs from one or more web servers. Every web server has a
shared folder, logs (\\10.0.1.21\logs), which maps to c:\inetpub\wwwroot\IISlogs.
B. Click this button to execute the query. Enter the query in the text area (E) before executing it.
C. Tabs for creating multiple queries.
D. Panel that shows the query result.
E. Text area for entering custom queries to filter the logs.

Click this button to get the chart view of the result.

Note: More information is available in the file C:\Users\guru\Desktop\LPS\LPS_Manual.pdf.


Customized Query
Important custom queries are listed below. They are also documented in Query404_500.txt under the
C:\Users\guru\Desktop\LPS folder.

Find the URL, server IP, client IP, status (200, 404, etc.) and time for a specific time period
================================================================================


SELECT cs-uri-stem as URL, s-ip as Server1, c-ip as Client, sc-status as code, time as Time
FROM '[LOGFILEPATH]' where (time > '06:36:03') and (time < '06:41:53') and (date = '2013-04-01')
order by Time

----------------------------------------------------------------------------------------------------
Find each client IP address and its hit count for a specific time period
========================================================================
SELECT c-ip as Client, count(c-ip)
FROM '[LOGFILEPATH]' where (time > '19:20:03') and (date = '2013-04-01')
group by c-ip order by count(c-ip)

----------------------------------------------------------------------------------------------------
Find Pages with 500 errors
==========================
SELECT cs-uri-stem as Url, sc-status as code, COUNT(cs-uri-stem) AS Hits FROM '[LOGFILEPATH]'
WHERE (sc-status >= 500) GROUP BY cs-uri-stem, code ORDER BY Hits DESC

----------------------------------------------------------------------------------------------------
Find 404 Requests
=================
SELECT cs-uri-stem as Url, sc-status as code, COUNT(cs-uri-stem) AS Hits FROM '[LOGFILEPATH]'
WHERE (sc-status = 404) GROUP BY cs-uri-stem, code ORDER BY Hits DESC

----------------------------------------------------------------------------------------------------
Find the Slowest Pages
======================
SELECT TOP 100 cs-uri-stem AS Url, MIN(time-taken) as [Min], AVG(time-taken) AS [Avg], MAX(time-taken) AS [Max],
count(time-taken) AS Hits FROM '[LOGFILEPATH]' GROUP BY Url ORDER BY [Avg] DESC

----------------------------------------------------------------------------------------------------
Find Pages with 500 errors on specific time period
==================================================
SELECT cs-uri-stem as Url, sc-status as code, COUNT(cs-uri-stem) AS Hits FROM '[LOGFILEPATH]'
WHERE (sc-status >= 500) and (time > '06:41:53') and (time < '06:45:53') and (date = '2013-03-28')
GROUP BY cs-uri-stem, code ORDER BY Hits DESC

----------------------------------------------------------------------------------------------------
Find 404 Requests on specific time period
=========================================


SELECT cs-uri-stem as Url, sc-status as code, COUNT(cs-uri-stem) AS Hits FROM '[LOGFILEPATH]'
WHERE (sc-status = 404) and (time > '06:41:53') and (time < '06:45:53')
and (date = '2013-03-28') GROUP BY cs-uri-stem, code ORDER BY Hits DESC

----------------------------------------------------------------------------------------------------
Find the Slowest Pages on specific time period
==============================================
SELECT TOP 100 cs-uri-stem AS Url, MIN(time-taken) as [Min], AVG(time-taken) AS [Avg], MAX(time-taken) AS [Max],
count(time-taken) AS Hits FROM '[LOGFILEPATH]' where (time > '06:41:53')
and (time < '06:45:53') and (date = '2013-03-28') GROUP BY Url ORDER BY [Avg] DESC

----------------------------------------------------------------------------------------------------
Find the URL and status (200, 404, etc.) for a specific time period
===================================================================
SELECT cs-uri-stem AS Url, sc-status
FROM '[LOGFILEPATH]' where (time > '06:36:03') and (time < '06:41:53') and (date = '2013-03-28')

----------------------------------------------------------------------------------------------------
Find each URL and its hit count for a specific time period
==========================================================
SELECT cs-uri-stem as Url, COUNT(cs-uri-stem) AS Hits FROM '[LOGFILEPATH]'
WHERE (time > '06:41:53') and (date = '2013-03-28') GROUP BY cs-uri-stem ORDER BY Hits DESC
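
The time-window queries in this section differ only in their date, time, and status values, so the query text can be generated rather than edited by hand during an incident. The sketch below is one possible way to do that. The function name and its parameters are hypothetical, and the output is meant to be pasted into the Log Parser Studio query text area. Validating the inputs first catches malformed dates such as a missing hyphen.

```python
from datetime import datetime


def build_error_query(date, start, end, min_status=500):
    """Build a Log Parser query for error pages in a time window (hypothetical helper)."""
    # strptime raises ValueError on malformed input, e.g. '2013-0328'.
    datetime.strptime(date, "%Y-%m-%d")
    t_start = datetime.strptime(start, "%H:%M:%S")
    t_end = datetime.strptime(end, "%H:%M:%S")
    if t_start >= t_end:
        raise ValueError("start time must be earlier than end time")
    return (
        "SELECT cs-uri-stem AS Url, sc-status AS code, COUNT(cs-uri-stem) AS Hits "
        "FROM '[LOGFILEPATH]' "
        f"WHERE (sc-status >= {int(min_status)}) "
        f"AND (time > '{start}') AND (time < '{end}') "
        f"AND (date = '{date}') "
        "GROUP BY cs-uri-stem, code ORDER BY Hits DESC"
    )


if __name__ == "__main__":
    print(build_error_query("2013-03-28", "06:41:53", "06:45:53"))
```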

7 How to Use Debug Diag


Debug Diag is installed on all the web servers. It is used to capture dump files when an application crashes. The tool is
configured to capture application crashes for DefaultAppPool, used by the marketplace web application, and for
ClassicAppNetPool, used by the virtual directories (GuruFiles, Upload and Video).

7.1 Steps to check any application crash report on debug diag

1. Click Start > Programs > DebugDiag to open the tool.


2. Click the Rules tab.
3. Check the user dump count. The user dump count should be zero if the site is functioning normally.
4. If the dump file count has increased, analyze the dump files using the following steps.

7.2 Analyze Dump file

Click the Advanced Analysis tab in Debug Diag.
Click Add Data Files to add the latest dump file located under the DumpPath specified on the Rules tab.
Click Start Analysis.

Figure 6 : Debug diag

7.3 Archive/Delete Dump file

After the analysis is complete, move the dump file to another location in order to bring the dump file count back to
zero.
Note that the size of a dump file is 1 GB. Files should be deleted periodically in order to maintain disk space.
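
The move-then-prune routine above could be scripted along the following lines. This is only a sketch under stated assumptions: the function name, the 30-day retention period, and the `.dmp` file extension are assumptions, and the dump and archive folders must be supplied by the operator because the archive location is still TBD in this guide.

```python
import shutil
import time
from pathlib import Path


def archive_dumps(dump_dir, archive_dir, max_age_days=30, now=None):
    """Move new dump files into archive_dir and delete archived ones past retention."""
    dump_dir, archive_dir = Path(dump_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400

    # Delete archived dumps older than the retention period to free disk space.
    deleted = []
    for old in archive_dir.glob("*.dmp"):
        if old.stat().st_mtime < cutoff:
            old.unlink()
            deleted.append(old.name)

    # Move the remaining dumps out of the Debug Diag dump folder.
    moved = []
    for dump in dump_dir.glob("*.dmp"):
        shutil.move(str(dump), str(archive_dir / dump.name))
        moved.append(dump.name)
    return sorted(moved), sorted(deleted)
```

It should be run only after the dump has been analyzed, since the retention step deletes files permanently.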
