Ensembl data load: Difference between revisions
From genomewiki
Revision as of 15:50, 13 September 2010
Load Repeatmasker file
- To make things easier, let's set a little shortcut:
export DBSPEC="-dbhost 127.0.0.1 -dbuser ens-training -dbport 3306 -dbname mouse37_mini_ref -dbpass workshop"
- Run RepeatMasker on a FASTA file:
RepeatMasker -species mouse -qq -dir <full_path_to_output_directory> $HOME/workshop/genebuild/test_seqs/test_sequence_to_repeatmask.fa
- Create a "dummy analysis file" which will simply select the sequences to analyse (here: contigs), e.g. create a file submit_ana.conf:
[SubmitContig]
module=Dummy
input_id_type=CONTIG
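One way to create this file from the shell is a here-document. This is only a sketch: it writes submit_ana.conf into the current directory, with one key per line under the section header.

```shell
# Write the dummy analysis definition to submit_ana.conf in the current directory.
# The quoted 'EOF' stops the shell from expanding anything inside the block.
cat > submit_ana.conf <<'EOF'
[SubmitContig]
module=Dummy
input_id_type=CONTIG
EOF
```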
- Load the "dummy analysis"
$HOME/cvs_checkout/ensembl-pipeline/scripts/analysis_setup.pl $DBSPEC -read -file submit_ana.conf
- Define the real analysis, e.g. repeatmask_ana.conf
[RepeatMask]
db=repbase
db_version=0129
db_file=repbase
program=RepeatMask
program_version=3.1.8
program_file=/path/to/repmasker/RepeatMask
parameters=-nolow -species mouse -s
module=RepeatMask
gff_source=RepeatMask
gff_feature=repeat
input_id_type=CONTIG
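Before loading, it can help to check that the file has the section header and each key on its own line. The helper below is a hypothetical convenience, not part of ensembl-pipeline: list_analyses prints each section name together with the module it declares.

```shell
# Hypothetical helper (not part of ensembl-pipeline): print "section: module"
# for every [section] in an analysis config file that has a module= line.
list_analyses() {
  awk -F= '
    /^\[.*\]$/ { section = substr($0, 2, length($0) - 2); next }
    $1 == "module" { print section ": " $2 }
  ' "$1"
}
```

For example, list_analyses repeatmask_ana.conf should print one line for the RepeatMask section if the file is laid out as above.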
- Load the analysis into the MySQL database:
$HOME/cvs_checkout/ensembl-pipeline/scripts/analysis_setup.pl $DBSPEC -read -file repeatmask_ana.conf
- See what happened:
SELECT * from analysis\G
*************************** 1. row ***************************
analysis_id: 1
created: 2010-09-13 16:50:16
logic_name: SubmitContig
db: NULL
db_version: NULL
db_file: NULL
program: NULL
program_version: NULL
program_file: NULL
parameters: NULL
module: Dummy
module_version: NULL
gff_source: NULL
gff_feature: NULL
*************************** 2. row ***************************
analysis_id: 2
created: 2010-09-13 16:14:11
logic_name: RepeatMask
db: repbase
db_version: 0129
db_file: repbase
program: RepeatMask
program_version: 3.1.8
program_file: /path/to/repmasker/RepeatMask
parameters: -nolow -species mouse -s
module: RepeatMask
module_version: NULL
gff_source: RepeatMask
gff_feature: repeat
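The query above is typed at a mysql prompt. To run it straight from the shell with the same connection details, the DBSPEC flags can be translated into mysql client options. The function name dbspec_to_mysql_args is made up for this sketch, and it assumes DBSPEC is a flat list of flag/value pairs whose values contain no spaces.

```shell
# Hypothetical helper: translate a DBSPEC string (-dbhost, -dbport, -dbuser,
# -dbpass, -dbname) into options understood by the mysql command-line client.
dbspec_to_mysql_args() {
  # Unquoted on purpose: split the flat "flag value flag value" string into words.
  set -- $1
  host="" port="" user="" pass="" name=""
  while [ "$#" -ge 2 ]; do
    case "$1" in
      -dbhost) host=$2 ;;
      -dbport) port=$2 ;;
      -dbuser) user=$2 ;;
      -dbpass) pass=$2 ;;
      -dbname) name=$2 ;;
    esac
    shift 2
  done
  # The database name goes last, as a positional argument.
  printf '%s\n' "--host=$host --port=$port --user=$user --password=$pass $name"
}
```

With DBSPEC set as at the top of this section, the check could then be run as: mysql $(dbspec_to_mysql_args "$DBSPEC") -e 'SELECT * FROM analysis\G'. Passing the password on the command line is fine for a workshop database but exposes it to other users on a shared machine.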