C-Hunter

2007.05.01 Gangman Yi

C-Hunter is a new clustering algorithm that combines knowledge of gene function derived from the Gene Ontology with the organization of genes on chromosomes. The C-Hunter program needs a set of basic data files to run. Data sets for eight species (AT, CE, DM, DR, EC, HS, MM and SC), including the Data/Map files, GO, gene2accession and gene2go, are supplied with the C-Hunter program. If you want to use newer data sets, you can download them from the respective websites and rebuild the files. The C-Hunter program can be compiled under Unix, Linux or Windows (Cygwin), provided the compiler supports the STL.

Installation

tar -zxvf C_Hunter_v.1.2.tar.gz
cd C_Hunter_v.1.2
./install

Procedure for preparing data sets

If you plan to use a pre-made dataset, you don’t need to perform this procedure.

1. Download the go.obo text file (includes Molecular Function, Biological Process and Cellular Component) from http://www.geneontology.org/page/download-ontology
2. Download gene2accession from NCBI: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
3. Download gene2go from NCBI: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
4. Make the chromosome list file (a tab-separated text file)
- 1st_Column = Chromosome Number
- 2nd_Column = Accession
- 3rd_Column = GI
5. Run “obo2scheme.py” with the Gene Ontology data from step 1
- Usage : python obo2scheme.py go.obo (file from #1)
- The output file name will be “scheme.GO.data”
6. Run “Make_GO_Map_Data.php” with gene2accession, gene2go and Chr_list
- Usage : php Make_GO_Map_Data.php gene2accession, gene2go, Chr_list, Output file name
- This conversion program makes the map files and the GO data files.
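
The chromosome list file described above (chromosome number, accession and GI, separated by tabs) can be written and sanity-checked with a few lines of Python. This is only a minimal sketch, not part of the C-Hunter distribution: the function names write_chr_list and check_chr_list are hypothetical, the row values are placeholders rather than real identifiers, and it assumes the file has no header row, which the format description does not state either way.

```python
import csv
import io


def write_chr_list(rows, handle):
    """Write (chromosome, accession, gi) rows as tab-separated text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for chromosome, accession, gi in rows:
        writer.writerow([chromosome, accession, gi])


def check_chr_list(handle):
    """Return a list of (line_number, message) problems found in the file."""
    problems = []
    for line_number, line in enumerate(handle, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 3:
            problems.append(
                (line_number,
                 "expected 3 tab-separated columns, found %d" % len(fields))
            )
        elif any(not field.strip() for field in fields):
            problems.append((line_number, "empty column"))
    return problems


# Placeholder values for illustration only; these are not real identifiers.
example_rows = [
    ("1", "ACC_EXAMPLE_1", "111"),
    ("2", "ACC_EXAMPLE_2", "222"),
]

# An in-memory buffer stands in for the real Chr_list file here.
buffer = io.StringIO()
write_chr_list(example_rows, buffer)
chr_list_text = buffer.getvalue()

print(chr_list_text, end="")
print(check_chr_list(io.StringIO(chr_list_text)))
```

To produce a real file, pass an open file object instead of the in-memory buffer, for example open("Chr_list", "w", newline=""). Running the check before Make_GO_Map_Data.php may help catch rows with a missing column early.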

Identifying clusters of functionally related genes in genomes
