Fundamentals of XML and BSML

Table 2.3 Main attributes of the Sequence element

Attribute     Description
comment       Usually used to indicate a displayable description of the sequence record. See also the title attribute.
db-source     Used to identify a public database, such as GenBank, EMBL, or the DNA Database of Japan (DDBJ). See also the ic-acckey attribute.
ic-acckey     An accession number used to uniquely identify a sequence record within the international consortium of nucleotide sequence databases. The consortium consists of GenBank, the EMBL Nucleotide Sequence...

Getting Started with BSML

Listing 2.1 The SARS virus, encoded in BSML

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- SARS Coronavirus Urbani, complete genome. -->
    <!-- Accession Number AY278741 -->
    <Bsml>
      <Definitions>
        <Sequences>
          <Sequence id="AY278741" length="29727">
            <Seq-data>
              For brevity, sequence is truncated.
            </Seq-data>
          </Sequence>
        </Sequences>
      </Definitions>
    </Bsml>

inspect and interact with BSML documents. This makes for much more exciting and interactive examples. Listing 2.1...
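As a rough illustration of what inspecting a BSML document programmatically can look like, the sketch below reads the id and length attributes of each Sequence element from a trimmed copy of Listing 2.1. It uses Python's standard xml.etree.ElementTree module; the function name list_sequences and the constant BSML_DOC are hypothetical names introduced here, not part of BSML or of the original text.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of Listing 2.1. The Seq-data text is the placeholder
# from the listing, not real sequence data.
BSML_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<Bsml>
  <Definitions>
    <Sequences>
      <Sequence id="AY278741" length="29727">
        <Seq-data>For brevity, sequence is truncated.</Seq-data>
      </Sequence>
    </Sequences>
  </Definitions>
</Bsml>"""


def list_sequences(bsml_bytes):
    """Return (id, length) pairs for every Sequence element found.

    Hypothetical helper: assumes each Sequence carries both an id and
    a numeric length attribute, as in Listing 2.1.
    """
    root = ET.fromstring(bsml_bytes)
    return [
        (seq.get("id"), int(seq.get("length")))
        for seq in root.iter("Sequence")
    ]


if __name__ == "__main__":
    for seq_id, length in list_sequences(BSML_DOC.encode("utf-8")):
        print(seq_id, length)
```

The document is passed as bytes so that the encoding named in the XML declaration is honored by the parser.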

Using NCBI EFetch and XML SAX

Now that you understand the basics of XML SAX, you can apply this knowledge to dynamically retrieve and parse sequence data from NCBI. Fortunately, NCBI provides a web service called EFetch that simplifies the process of retrieving sequence records. EFetch is an example of a REST-based web service (for details on REST-based web services, refer to Chapter 9). In a nutshell, client applications connect to EFetch via HTTP and specify search criteria with a set of URL parameters....
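The two ideas in this passage, building an EFetch request from URL parameters and handing the response to a SAX handler, can be sketched as follows. This is a minimal sketch in Python rather than the chapter's own code. The parameter values (db=nuccore, rettype=gb, retmode=xml) and the GBSeq_* element names are assumptions about the GenBank XML that EFetch returns; build_efetch_url, GBSeqHandler, parse_record, and SAMPLE are hypothetical names, and SAMPLE is a hand-written stand-in for a real response so that the example runs without network access.

```python
import xml.sax
from urllib.parse import urlencode

EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore"):
    """Build an EFetch URL; the parameter values are assumptions."""
    params = {"db": db, "id": accession, "rettype": "gb", "retmode": "xml"}
    return EFETCH_BASE + "?" + urlencode(params)


class GBSeqHandler(xml.sax.ContentHandler):
    """Collect the text of a few elements as SAX events arrive."""

    # Element names assumed from the GenBank XML (GBSeq) format.
    WANTED = {"GBSeq_locus", "GBSeq_length", "GBSeq_definition"}

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current = None
        self._buffer = []

    def startElement(self, name, attrs):
        if name in self.WANTED:
            self._current = name
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._current is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self._current:
            self.record[name] = "".join(self._buffer).strip()
            self._current = None


def parse_record(xml_bytes):
    """Run the handler over an XML document and return what it found."""
    handler = GBSeqHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.record


# Hand-written sample that mimics the assumed response shape.
SAMPLE = b"""<GBSet><GBSeq>
<GBSeq_locus>AY278741</GBSeq_locus>
<GBSeq_length>29727</GBSeq_length>
<GBSeq_definition>SARS Coronavirus Urbani, complete genome.</GBSeq_definition>
</GBSeq></GBSet>"""

if __name__ == "__main__":
    print(build_efetch_url("AY278741"))
    print(parse_record(SAMPLE))
```

To try this against the live service, the bytes returned by urllib.request.urlopen(build_efetch_url("AY278741")).read() could be passed to parse_record in place of SAMPLE, provided the response matches the assumed element names.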
