Debugging cgi-scripts

From genomewiki

See also:
* [http://genome-source.cse.ucsc.edu/gitweb/?p=kent.git;a=blob;f=src/product/README.debug README.debug] in the source tree
* [https://lists.soe.ucsc.edu/pipermail/genome-mirror/2010-March/001677.html Debug the cgi-scripts with GDB]
Latest revision as of 17:50, 10 August 2016

== Debugging with GDB ==

Complete instructions:

Make sure you have compiled with -ggdb and without optimizations (so that all variables are visible in the debugger) by adding

 export COPT="-O0 -ggdb"

to your .bashrc (if using bash), or the equivalent to your .cshrc (if using csh or tcsh):

  setenv COPT "-O0 -ggdb"

You might need to run make clean; make cgi afterwards. Also make sure that the CGIs use the right hg.conf. Run

 export HGDB_CONF=<PATHTOCGIS>/hg.conf

Then:

 cd cgi-bin
 gdb --args hgc 'hgsid=4777921&c=chr21&o=27542938&t=27543085&g=pubsDevBlat&i=1000235064'

Remember the single quotes around the parameter string, and do not include the question mark from the URL in your browser.
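For example, the leading question mark can be stripped off a string copied from the browser with shell parameter expansion (a sketch, reusing the example parameters from above):

```shell
# Query string copied from the browser, still carrying the leading '?':
copied="?hgsid=4777921&c=chr21&o=27542938&t=27543085&g=pubsDevBlat&i=1000235064"
# Drop everything up to and including the first '?':
params="${copied#*\?}"
# This is the string to hand, single-quoted, to gdb's run command:
echo "$params"
```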

To get a stacktrace of the place where it's aborting:

 break errAbort
 run
 where
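The same break/run/where sequence can also be scripted non-interactively with gdb's -batch and -ex options; a sketch, reusing the hgc example parameters from above:

```
gdb -batch -ex 'break errAbort' \
    -ex "run 'hgsid=4777921&c=chr21&o=27542938&t=27543085&g=pubsDevBlat&i=1000235064'" \
    -ex where hgc
```

The -ex commands run in order after hgc is loaded, so the breakpoint is set before run; in batch mode gdb exits when the commands finish.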

== Get coredumps from CGIs ==

Add this to the Apache virtualhost CGI-BIN directory config to make errabort.c produce coredumps instead of errAbort error messages. You can then call gdb with

 gdb /usr/local/apache/cgi-bin/hgTracks 

Add this to the main Apache config /etc/apache2/apache2.conf to make Apache allow coredumps:

 CoreDumpDirectory /tmp

If you're on Ubuntu, the apport crash handler intercepts core dumps; you'll need to deactivate it or change its config (/etc/default/apport).

 sudo vi /etc/default/apport
 and add
 enabled=1

 sudo vi /etc/sysctl.conf
 and add
 kernel.core_pattern=/usr/local/dump/core.%e.%p.%s.%t
 fs.suid_dumpable=2
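The %-specifiers in kernel.core_pattern are expanded by the kernel (see the core(5) man page): %e is the executable name, %p the PID, %s the signal number, %t the time of the dump. A sketch with made-up values, emulating the substitution to see what the resulting filename looks like:

```shell
# Hypothetical values for a crashed hgTracks process:
exe=hgTracks; pid=12345; sig=6; time=1470851400
pattern="/usr/local/dump/core.%e.%p.%s.%t"
# Emulate the kernel's %-substitution with sed:
echo "$pattern" | sed -e "s/%e/$exe/" -e "s/%p/$pid/" -e "s/%s/$sig/" -e "s/%t/$time/"
```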


 sudo vi /etc/security/limits.conf
 add
 root soft core unlimited
 root hard core unlimited
 www-data soft core unlimited
 www-data hard core unlimited
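These limits only take effect for new sessions; you can check the effective core file size limit in a shell with ulimit:

```shell
# Print the current core file size limit; '0' means core dumps are disabled,
# 'unlimited' means they are allowed:
ulimit -c
```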

Don't forget to restart Apache.

== Finding memory problems with valgrind ==

Sometimes the program crashes in random places because the stack or other data structures have been corrupted by rogue code. You need valgrind to find the buggy code.

Run the program like this:

 valgrind --tool=memcheck --leak-check=yes pslMap ~max/pslMapProblem.psl ~max/pslMap-dm3-refseq.psl out.temp

== CGI is too slow ==

First thing to try: Add the "measureTiming=1" parameter to the CGI call.
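For example, appending the parameter to a CGI URL from the shell (the URL below is a made-up example):

```shell
url="http://localhost/cgi-bin/hgTracks?db=hg19"   # hypothetical CGI call
# Append measureTiming=1, using '&' if a query string already exists, '?' otherwise:
case "$url" in
  *\?*) url="$url&measureTiming=1" ;;
  *)    url="$url?measureTiming=1" ;;
esac
echo "$url"
```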

If you still have no idea, you can press Ctrl-C and look at where it's stuck. Or run gprof, which shows how much CPU time each function takes, or valgrind, which also accounts for most of the I/O time.

If you cannot ctrl-c because it's a CGI that needs very special POST parameters, you can attach to a running CGI to see where it's stuck:

 sudo gdb /usr/local/apache/cgi-bin/hgLiftOver `pidof hgLiftOver`
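A one-shot variant that just prints the backtrace and exits, using gdb's -p option to attach to the PID:

```
sudo gdb -batch -ex bt -p `pidof hgLiftOver`
```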

== Profiling with gprof ==

First, recompile with the -pg gcc option added, e.g. by adding this to your .bashrc:

  export COPT='-ggdb -pg'

Running the programs now will create a file gmon.out in the current working directory.

Run hgTracks (e.g. through Apache), go to the cgi-bin directory and run gprof on the newly created gmon.out file:

 gprof hgTracks gmon.out | less

hgTracks with the default tracks gave me this today:

 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ms/call  ms/call  name     
  17.65      0.06     0.06  1145954     0.00     0.00  hashLookup
   8.82      0.09     0.03   281068     0.00     0.00  cloneString
   5.88      0.11     0.02   113781     0.00     0.00  hashAdd
   5.88      0.13     0.02   113781     0.00     0.00  hashAddN
   5.88      0.15     0.02    67666     0.00     0.00  lmCloneString
   4.41      0.17     0.02                             lmCloneMem
   2.94      0.18     0.01  1055248     0.00     0.00  hashFindVal

== Profiling with valgrind ==

Gprof shows you only CPU time. If you're stuck in I/O somewhere, gprof won't show it. Press Ctrl-C a few times (best) or use valgrind again:

 valgrind --tool=callgrind --dump-instr=yes --simulate-cache=yes --collect-jumps=yes hgTracks
 callgrind_annotate callgrind.out.<yourPID> | less

The tool KCachegrind allows better inspection of the results than callgrind_annotate, but is a GUI program. It's on the big dev VM.

== How to set the right hg.conf for CGIs on the command line ==

There are two ways: change to the cgi-bin directory, or stay in src/hg/hgTracks and use the HGDB_CONF variable described above to direct hgTracks to the right hg.conf.

Otherways, Angie sez:

I actually want this setting from my ~/.hg.conf:

 udc.cacheDir=/data/tmp/angie/udcCacheCmdLine

Otherwise, since my gdb is running as angie not apache, there is a permissions error when trying to update udcCache files. But then, when I run hgTracks on the command line, I generally run in ~/kent/src/hg/hgTracks not /usr/local/apache/cgi-bin-angie. So changing the hgConfig logic to look for ./hg.conf would not affect my gdb usage. (my ~/kent/src/hg/trash is a symlink to /usr/local/apache/trash, and hg/.gitignore has trash and */ct/*)

BTW this is the entire non-comment contents of my ~/.hg.conf :

 include /usr/local/apache/cgi-bin-angie/hg.conf
 db.user=XXXX
 db.password=XXXX
 udc.cacheDir=/data/tmp/angie/udcCacheCmdLine

So there is very little difference between cgi-bin-angie/hg.conf and ~/.hg.conf. Why not use your ~/.hg.conf for gdb debugging?