Chem-Bio Informatics for Physicists
By Tsuguchika Kaminuma, Ph.D. (kaminuma@cbi.or.jp)

Contents
Introduction                        Lecture Materials                 Websites of bio-informatics
Session 1.  Views of Life
Session 2.  Molecular Computation
Session 3.  Sequence, Sequence, and Sequence!
Session 4.  3D Structure of Biomolecules
Session 5.  Modeling Cell World
Session 6.  Frontiers of Bio Sciences
Session 7.  Quest for Drugs and Safety Control of Chemicals
Session 8.  Genome Based Clinical Medicine
Session 9.  Collaborative Projects
Session 10.  Books for Physicists who are interested in biology
Appendix.

A note for lectures to be delivered under the auspices of the Institute of Physics, National Center for Natural Science and Technology, during 25-29 November 2002 in Hanoi, and under the auspices of the Center for Bio-Medical Physics during 2-4 December in Ho Chi Minh City, Vietnam


Introduction

Why am I giving these lectures?

In 1981 I founded a multidisciplinary research society called the Chem-Bio Informatics Association, which has since become the Chem-Bio Informatics Society. Its members are researchers from universities, national institutions, and industry laboratories. The society's areas of interest cover:

1.      Molecular Computing

2.      Molecular Recognition

3.      Bioinformatics and Computational Biology

4.      Data Analyses of Genome Wide Experiments

5.      Information and Computing Infrastructure for Pharmacology and Toxicology

6.      Disease Modeling

7.      Other topics including emerging IT and wet technologies.

It offers monthly seminars, organizes annual meetings, and publishes an online journal called the CBI Journal.

Computational Chemistry and Bioinformatics

Although “bioinformatics” has become very popular, we have emphasized the importance of this discipline together with molecular computing. Molecular computing is the heart of computational chemistry, which has been accelerated continuously by rapid advances in computing power. In fact, bioinformatics is deeply related to molecular computing, and these two disciplines play vital roles in advancing

1.      Biological Sciences

2.      Drug Development via Computer-Aided Drug Design

3.      Safety Control of Chemicals via Computational Toxicology

4.      Environmental Problems such as bioremediation and clean energy.

Many leading computer companies regard the biomedical field as their next big market, and a new word, “BioIT”, has been coined.

Possible Research Projects in Vietnam

If we consider chemical computing and bioinformatics in Vietnam, two important subjects emerge:

1.      Computational Toxicology for dioxin and other chemicals

2.      Biochemical Prospecting

There can be little question about the importance of the first subject. The second may need some explanation. By biochemical prospecting I mean research that searches for useful chemicals, and for organisms that contain useful chemical ingredients or provide useful materials for food, drugs, or other purposes. The hunt for medicinal plants and other useful plants is a good example of this kind of research.

In Japan, so-called Chinese Traditional Medicine is still used routinely. The problems with Chinese Traditional Medicine are its complex ingredients and its lack of evidence in the modern medical sense. Drugs approved under modern regulation generally consist of a single active ingredient. Even a drug consisting of a single chemical may hit multiple targets (biomolecules). It is therefore extremely difficult to prove the effects of multi-chemical agents by modern laboratory experiments and clinical trials.

The same problem exists in proving the efficacy and the dangers (side effects) of foods, designer foods (functional foods), and supplements. However, methodologies developed in the field of rational drug design are gradually finding their way into these neighboring sciences. Two research groups, one in Singapore and one in China, recently published papers on these problems. Moreover, thanks to genome information and the emerging powerful techniques for genome-wide simultaneous measurement (gene chips, proteomics, and metabolomics), it is becoming realistic to attack these problems scientifically.

But for that you must

1.                organize a multidisciplinary research team consisting of both wet-experiment experts and theoretical and computational specialists in chemical computing and bioinformatics,

2.                assemble good hardware and software tools and integrate them into a powerful infrastructure for your research, and

3.                maintain good contact with advanced research groups.

My lectures may give you the basic knowledge needed to think about such projects, and the CBI Society members would be good potential collaborators on them.

Purpose of this Lecture

1. Introduce physics graduate students and researchers in other fields to emerging biological sciences and technologies, and show them that there are many interesting problems that can be approached by those who have a sound background in theoretical model building and computation.

2. Introduce information and computing resources in computational chemistry, bioinformatics, biological computing, and biomedical sciences, and show how to use them.

3. Stimulate collaborations between experimental researchers and theoretical researchers in biosciences and biotechnologies in order to start new projects in Vietnam.

4. Suggest further collaboration of the participants with the members of The Chem-Bio Informatics Society in Japan, and give hints on planning projects for Vietnam researchers.


Lecture Materials
Almost all of the lecture materials have been selected from websites around the world, edited, and placed on the website of the Chem-Bio Informatics Society, http://www.cbi.or.jp/exp/cbi/vietnam/02_lecture.html
or on the local website of IOP, http://thule/smp/CBI Digital TL Materias.htm
Participants are advised to download the lecture materials before attending the lectures. Highly recommended materials are marked with “$”.

Basic Materials

The following materials have been selected as the most basic for my lectures. They are assigned reading.

1. Introduction to modern biology

The Road to DNA (download from web$)

MIT Biology Hypertext, Chap. Chemistry Review, Large Molecule, Cell Biology, Central Dogma, Prokaryote Genetics and Gene expression (download from web$)

 

2. Developing Chemical Databases

Nakano’s note on Chemical Database (e-mail)

X. Qiao and others, A 3D Structure Database of Components from Chinese Traditional Medical Herbs (paper copy, sent by EMS)

Kaminuma, Vietnam Medicinal Plant DB (will bring)

 

3. Molecular Calculation

NIH Molecular Model Tutorial (web$)

Introduction to Macromolecular Simulation (web$)

CHARMM Tutorial (web$)

Papers on Fragment Molecular Orbital Method (e-mail)

 

4. Bioinformatics

Bioinformatics Tool Guide (e-mail)

User’s Guide to the Human Genome (web$)

 

5. ADME-Tox and Computational Toxicology

LCBRA (web$)

Koyano’s paper on dioxin (web$)

ADME QSAR: L. Afzelius and S. Ekins papers (web$)

 

6. Computer-aided Drug Design

National Institute on Drug Abuse Research Monograph Series 134, Medication Development: Drug Discovery, Database, and Computer-Aided Drug Design (web$)

R. Abagyan’s paper (copy sent), H.A. Carlson’s paper (web$)

 

7. PHII Project 

Kaminuma/CBI (e-mail)

Univ. of Singapore, Bioinformatics Group (web$)

X. Chen, CLiBE paper (paper copy)

 

8. Pathway/Network to Disease

Nuclear Receptor dependent pathways and networks (e-mail)

 

9. C. elegans

CERS (web, OHP)


References
Reference documents (books and papers) and On-line Reference Sites are given at the end of each session for further study.

Exercises:
Quizzes and problems can be found in the website materials. They are highly recommended for further study.
 
Session 1.  Views of Life

Biologist’s and Chemist’s View of Life

The Cell Theory
All living organisms are built up of cells.
1839  Microscopic observations by Schleiden and Schwann (German)
1860  Hereditary transmission through the sperm and egg

Mendelian Laws
Discovery of Genes: Each gene can exist in a variety of different forms called alleles. Each parent passes one copy of the gene for each hereditary trait to each of its offspring. Later it was found that the physical basis for this behavior is the distribution of homologous chromosomes during meiosis.
1865  Gregor Mendel presented his work (published in 1866)
1900  Mendel’s work was rediscovered by Hugo de Vries, Carl Correns, and Erich von Tschermak; William Bateson became its leading advocate.

Theory of Evolution 
Darwin’s and Wallace’s theory of evolution by natural selection:
Today’s complex plants and animals are derived by a continuous evolutionary progression from the first primitive organisms.
Alfred Russel Wallace (British) 

1859  Charles Darwin, On the Origin of Species

Genetic information is contained in, and transmitted by, DNA.

1943  Oswald Avery (Canadian/American) used the pneumonia bacterium.

X-ray Crystallography

1912  Bragg solved the structure of NaCl at the Cavendish Lab.

1937  Max Perutz started hemoglobin analysis under Bernal

1947  Kendrew started work on the muscle protein myoglobin

1951  Pauling proposed that a helical configuration (later called the alpha helix) would be an important element of protein structure.

1953  Complementary Double Helix Structure Model of DNA by Crick and Watson

1959  First protein structures were solved by Perutz and Kendrew

Technological Breakthrough

In 1953 an essential breakthrough occurred in X-ray crystallography: attaching heavy atoms to protein molecules made it possible to work logically from the diffraction data to the correct structure.

Advances in electronic computers made it possible to carry out the heavy calculations required for crystallographic data analysis.

A Physicist’s view

1943 Series of lectures at Trinity College in Dublin

Erwin Schrödinger, What is Life?, Cambridge Univ. Press, 1944

His book recruited many brilliant young physicists to biology after the war. The book still stimulates many theoretically minded researchers, including biologists such as Gerald Edelman.

A Mathematician’s view

All that can be calculated can be calculated by a Turing Machine.

Alan Turing, Turing Machine as a Model of Computer
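
For readers who have not met the idea before, the following is a minimal sketch of a one-tape Turing machine simulator. It is only an illustration: the function name run_turing_machine, the rule-table layout, and the example machine INCREMENT_RULES (which adds one to a binary number) are assumptions made for this sketch and do not come from Turing's paper or from the lecture materials.

```python
from collections import defaultdict


def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape as a string.

    rules maps (state, symbol) -> (new_state, symbol_to_write, move),
    where move is -1 (left), +1 (right) or 0 (stay).
    The machine stops when it reaches the state "halt".
    """
    cells = defaultdict(lambda: blank, enumerate(tape))
    head = len(tape) - 1  # start on the rightmost symbol
    for _ in range(max_steps):
        if state == "halt":
            break
        # Read the current cell, then write, change state and move.
        state, cells[head], move = rules[(state, cells[head])]
        head += move
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)


# Hypothetical example machine: add 1 to a binary number.
INCREMENT_RULES = {
    ("start", "1"): ("start", "0", -1),  # 1 + carry = 0, carry moves left
    ("start", "0"): ("halt", "1", 0),    # 0 + carry = 1, done
    ("start", "_"): ("halt", "1", 0),    # ran off the left end: new digit
}

print(run_turing_machine(INCREMENT_RULES, "1011"))  # 1011 + 1 = 1100
```

The whole "program" is the small rule table; the tape plays the role of memory. This is the sense in which the next section compares a long tape with DNA or RNA.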

An Informatics view

A self-reproducing machine needs a long tape, like DNA or RNA. Biological organisms are just like molecular Turing machines!

John von Neumann (compiled by Arthur W. Burks), Theory of Self-Reproducing Automata, University of Illinois Press, 1966

Role of Experimental Physics in Modern Biology

Measurement of Structures of Living Systems

  Optical Microscope (Nomarski Optics), Electron Microscope

X-ray crystallography

SOR (Synchrotron Orbital Radiation)

NMR (Nuclear Magnetic Resonance)

Mass Spectrometry (MS), AMS (Accelerator MS)

Classification of Life

Taxa : Eucaryotes (fungi, plants, animals), Archaea, Eubacteria

Procaryote vs. Eucaryote

Uni-cellular organism vs. Multicellular organism

Model organisms

Bacteria/E. coli (Escherichia coli), Yeast (unicellular eucaryote), Worm/C. elegans (Caenorhabditis elegans), Fly/Drosophila melanogaster, Vertebrate/Zebrafish and Pufferfish, Mammalian/Rat and Mouse, Plants/Arabidopsis thaliana and Rice, Homo sapiens/Human

Molecules in Life

There are four basic types of small molecules in life, from which the macromolecules are built:
Sugars, fatty acids, amino acids, nucleotides

Structures in Cells 

Membrane

Cytoplasm

Nucleus

Mitochondria/Chloroplasts

Functions of Cells

Genome, the programming code of cells

Protein Synthesis: Transcription and translation of the genetic codes

  DNA Replication

  Other biosynthesis and metabolism
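
As a rough illustration of the transcription and translation steps listed above, here is a minimal sketch in Python. It assumes the input is the coding (sense) strand of DNA, so that transcription reduces to replacing T with U, and it uses the standard genetic code. The function names transcribe and translate and the short example sequence are hypothetical choices for this sketch; real genes involve promoters, splicing, and reading-frame detection, which are ignored here.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# "*" marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """DNA -> mRNA, assuming coding_dna is the coding (sense) strand."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> protein (one-letter codes), reading from position 0
    and stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence, not taken from any real gene.
dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGT"
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))  # MAIVMGR
```

The codon table is only 64 entries long, which is one reason the "genome as program code" picture is attractive to people with a computing background.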

Energy Conversion

  ATP is the currency of various bio-energies.

Molecular Communications

  Phosphorylation by kinase vs. dephosphorylation by phosphatase

References and Reference Sites

MIT Biology Hypertextbook

NIGMC Digital Textbooks on Life Sciences

DOE Primer on Molecular Genetics

WWW Virtual Library of Cell Biology

The American Society for Cell Biology

Cell Biology Education

B. Alberts et al. , Molecular Biology of the Cell, Garland, 1994

B. Alberts et al. Essential Cell Biology: An Introduction to the Molecular Biology of the Cell, Garland, 1998

 

 

Session 2. Molecular Computation

Molecular Representations
Molecular Registration
   CAS No : Chemical Abstracts Service Registry Number

Molecular Formula and Molecular Drawing
ChemDraw
  3D atomic coordinates and Molecular Graphics
   CCDC (Cambridge Crystallographic Data Centre)
   RasMol

Molecular Structure and Nomenclature 

   MDL, ISIS
   ChemFinder

Chemical/Molecular Database

Molecular Computation
What can we compute? 
Structure and Reactivity

  Big Commercial Vendor: Accelrys

Classical models : Molecular Mechanics

  AMBER : Force Field Calculation

  MacroModel/MOPAC: Schrödinger

Quantum chemistry: semi-empirical and ab initio MO methods

  MOPAC
  GAMESS
 GAUSSIAN
  FMOM : Fragment Molecular Orbital Method

Molecular Dynamics
  CHARMM

Reference Sites
WWW Computational Chemistry Resources
NLM Chemical Information

Exercises for Session 2 

I. Develop a chemical database and put it on the web.
Find some examples of chemical databases that have 3D structure data.
 Drug Database

Carcinogenic chemicals
Endocrine Disruptors
Attributes of chemicals
  Names and ID numbers, CAS Registry Numbers

  Molecular formula and structure representation

  3D atomic coordinates
Generate all possible structures of the dioxins.
EXCEL to ACCESS
Put chemical database on the web.

A Chemical database of Vietnam Medicinal Plants
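
As a hint for the exercise "Generate all possible structures of the dioxins", the sketch below enumerates the distinct chlorine substitution patterns of dibenzo-p-dioxin. It assumes the usual numbering of the eight substitutable carbon positions (1, 2, 3, 4, 6, 7, 8, 9) and treats two patterns as the same compound when one of the in-plane symmetry operations of the planar molecule maps one onto the other. The names canonical and dioxin_congeners are hypothetical, and the output is only a list of position sets, not 3D coordinates.

```python
from itertools import combinations

# Substitutable ring positions of dibenzo-p-dioxin (assumed numbering).
POSITIONS = (1, 2, 3, 4, 6, 7, 8, 9)

# How each symmetry operation of the planar skeleton permutes positions.
SYMMETRIES = [
    {p: p for p in POSITIONS},                          # identity
    {1: 4, 2: 3, 3: 2, 4: 1, 6: 9, 7: 8, 8: 7, 9: 6},  # flip about long axis
    {1: 9, 2: 8, 3: 7, 4: 6, 6: 4, 7: 3, 8: 2, 9: 1},  # flip about O...O axis
    {1: 6, 2: 7, 3: 8, 4: 9, 6: 1, 7: 2, 8: 3, 9: 4},  # 180-degree rotation
]


def canonical(pattern):
    """Return the lexicographically smallest equivalent position set."""
    return min(tuple(sorted(sym[p] for p in pattern)) for sym in SYMMETRIES)


def dioxin_congeners():
    """Map number of chlorines (1..8) -> sorted list of unique patterns."""
    result = {}
    for n_chlorines in range(1, len(POSITIONS) + 1):
        unique = {canonical(c) for c in combinations(POSITIONS, n_chlorines)}
        result[n_chlorines] = sorted(unique)
    return result


congeners = dioxin_congeners()
for n_chlorines, patterns in congeners.items():
    print(n_chlorines, len(patterns))
print("total:", sum(len(p) for p in congeners.values()))  # total: 75
```

The total of 75 agrees with the known number of polychlorinated dibenzo-p-dioxin congeners, and the pattern (2, 3, 7, 8) among the tetrachloro compounds corresponds to 2,3,7,8-TCDD. Each position set could then become one row of the chemical database, to which names, CAS Registry Numbers, and 3D coordinates are added from other sources.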


 
Session 3. Sequence, Sequence, and Sequence!

Life and Computer
Similarities and differences between living organisms and computers
The central dogma of molecular biology
A computer is a Turing machine

A history of the interplay between computer technology and life science


Genome-The Code of Life

Success of the Human Genome Project
Advances in sequencing technology
The first complete genome sequence: the virus φX174
Sequencing of model organisms

Where can we find genome sequence data, and how can we use it?
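
Sequence data downloaded from public genome databases is very often distributed as plain text in FASTA format: a header line beginning with ">" followed by lines of sequence. As a first step in "using" such data, the sketch below reads FASTA text and computes the GC content of each record. The function names parse_fasta and gc_content and the two tiny records are hypothetical and invented for illustration; they are not real database entries.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into {record_name: sequence}.

    The record name is the first word after ">" on the header line.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = (line[1:].split() or ["unnamed"])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def gc_content(sequence):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Hypothetical example input, not a real database record.
example = """>demo_seq made-up record for illustration
ATGC
GGCC
>second
AATT
"""

for name, seq in parse_fasta(example).items():
    print(name, len(seq), gc_content(seq))
```

For real work, established libraries handle the many variants of sequence file formats; this sketch is only meant to show that the data are simple text that a physicist can process with a few lines of code.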



Reference Sites
Access to Genome Databases
A User’s Guide to the Human Genome
Bioinformatics, Cold Spring Harbor Laboratory Press
 
Session 4. 3D Structure of Biomolecules

Protein Structure: PDB

Structural Genomics-A Post-Genome Challenge


Molecular Graphics and Modeling for Biomolecules
  UCSF Computer Graphics Lab
  UIUC Theoretical Biophysics Group
  NIH The Center for Molecular Modeling

Docking Study of Xenobiotic Chemicals and Target Biomolecules
  AutoDock : The Scripps Research Institute

Peptide-Protein Interaction
DOT : San Diego Supercomputer Center (SDSC)

Simulation of Protein Folding -A Grand Challenge in Chemical Computing
IBM Blue Gene Project


Session 5. Modeling Cell World

Genome Wide Simultaneous Measurements
DNA chip and Microarray
Proteomics
Metabolomics/Metabonomics
Protein-Protein Interaction

Mapping Molecular Interactions
Biochemical Synthesis and Metabolism Maps
Cell Signaling Pathways and Networks
Systems Theory
  Cell Simulator : Virtual Cell







Session 6. Frontiers of Bio Sciences

Developmental Biology-The True Mystery of Life

Development : The process from zygote to adult

Two big events in life - gastrulation and neurulation

C. elegans, the best-understood multicellular organism

Developmental genes, proteins, chemicals, and pathways

  Gilbert, Developmental Biology
  Lewis Wolpert, Principles of Development 2nd, Oxford Univ. Press, 2002


Endocrine System
  Endocrine Disruptor Hypothesis
    Colborn,
  IPCS, Global Assessment of the State-of-the-Science of Endocrine Disruptors
  Chemical Database for Endocrine Disruptors

Neural System and Brain


Immune System


Science of Cancer
  NCI Tutorial
Abnormal cell proliferation
Apoptosis – Programmed cell death
  Chemical Carcinogen
    NCI Database
    IARC (International Agency for Research on Cancer) Monographs
 
Session 7. Quest for Drugs and Safety Control of Chemicals
   
ADME : Absorption, Distribution, Metabolism, Excretion
Fate of Drugs and Xenobiotic Chemicals in living organisms

Good effects and bad effects are two sides of the same coin.

The Concept of Receptors and Ligands

Internal Targets of Xenobiotic Chemicals

Target hunting – search for disease related genes

QSAR when the targets are unknown
  CoMFA?

Docking study when the targets are known

Virtual Screening – Computer aided HTS



Session 8. Genome Based Clinical Medicine

From Cell Models to Physiological Models
  Cancer
  Cardiovascular Disease
  Obesity
Diabetes

Genetic Variation and Personalized Medicine
SNPs/Microsatellites
Pharmacogenomics and FDA Policy






Session 9. Collaborative Projects
   
In this last session I would like to propose some ideas for further collaboration between Vietnamese researchers and Japanese researchers who are members of the Chem-Bio Informatics Society. The two collaborative projects described below are deeply interrelated. I think the most important resources for these projects are talented and well-trained researchers, and I have great hopes for your group in this respect.

1.   CBI Grand Challenges
I am still an active member of the society I founded, the Chem-Bio Informatics Society. Supported by 37 companies (mostly in the pharmaceutical and computer industries), the society is growing in size and in influence in both the academic and industry sectors in Japan. The society will hold its fourth annual meeting in Tokyo during 17-19 September. I suggest that your group establish a similar nonprofit, research-oriented complex of academia, industry, and government. Such a complex may contribute not only to the scientific community but also to industry and business in Vietnam. Your group may send some of its researchers to CBI Society member researchers for training and collaboration on the following topics:
(1) Large scale molecular computing
Programming for Fragment MO Method
      PC/Linux Clusters, Grid Computing
(2) Chemical substance databases and QSAR
(3) Virtual Screening: Focused Library and Docking Study
(4) Micro AI: Genome Wide Measurement Data Interpreter
(5) Disease Modeling: such as obesity or diabetes
(6) Computational Toxicology: for dioxins and other chemicals
   The Society's home page (www.cbi.or.jp) has been poor in content, but it will be enriched in the future.

2.   Biochemical Prospecting
The term "Chemical Prospecting" means searching for useful natural chemicals, such as the ingredients of medicinal plants. By "Biochemical Prospecting" I mean a research project that searches for useful plants and other organisms, such as marine organisms, and for their useful chemical components. In addition to medicinal plants, the food industry is now looking at "Functional Foods" or "Designer Foods" that contain active compounds proved to be good for human health.
Last December when I visited Hanoi I had interesting discussions with Dr. Le Thi Xuan on medicinal plant hunting. I set as our first collaborative goal the production of digital files on Vietnamese medicinal plants based on two books. The first is the two-volume English book that Dr. Le Thi Xuan gave me, and the second is a book on Vietnamese traditional medicine that Mr. Hidaka showed me. I have asked some of my assistants to produce digital files containing the Latin and Vietnamese names, the source plant names, and the structures of the chemicals described in the two books. We could input neither the Vietnamese characters nor the English text. We did this work only as a trial, and would like to leave the additional work to our Vietnamese colleagues if possible. Since our files are only for internal use, we have not dealt with copyright at this time, but I am interested in discussing with the publishers how to get permission to use the material in the future.
In a wider perspective this kind of work falls under what is called "chemical prospecting" or "biochemical prospecting", the search for useful natural products. I am very much interested in this subject, but unless we have sufficient funds it is unrealistic to pursue such a project. I have been trying to obtain government funding for this topic, but so far I have not succeeded.
Some websites that we consider important on this subject are linked from the CBI Society home page, including:
-   WHO Report on Traditional Medicine
-   Commercial Companies successfully working on this subject
-   Asian research groups working on this subject
Their information is highly useful.


 
Appendix

1. Where can you study bioinformatics ?
2. New Topics in Informatics and Computing
3. Food Safety and Health Centers

Session 10. Books for Physicists who are interested in biology

Below is Kaminuma’s personal collection of books that are highly recommended for physicists who are interested in life. These books touch on such important and fundamental topics as how physicists should approach biology, life as a computing machine, life as a developing machine, and how life acquired these properties through the long history of evolution.

John Maddox, What Remains To Be Discovered, Macmillan, London, 1998

Freeman Dyson, The Future of Physics, Physics Today, 1970, also in F. Dyson, From Eros to Gaia, Pantheon Books, NY, 1992, pp.151-159

John von Neumann, The Computer and the Brain, Yale Univ., New Haven, 1958

Manfred Eigen and Ruthild Winkler, Laws of the Game-How Principles of Nature Govern Chance, Princeton Univ. Press, 1993 (translated from German edition, Naturgesetze steuern den Zufall, R. Piper & Co., Verlag, Munich, 1965)

Richard P. Feynman (J.G. Hey and R.W. Allen eds.), Feynman Lectures on Computation, Addison Wesley, 1996

Richard P. Feynman, There’s Plenty of Room at the Bottom, A Talk to the American Physical Society on December 29, 1959 at Caltech, also in Richard P. Feynman (Jeffrey Robbins ed.), The Pleasure of Finding Things Out, Perseus Pub., 1999, pp.117-139

Murray Gell-Mann, The Quark and the Jaguar-Adventures in the Simple and the Complex, Little, Brown and Company, London, 1994

Stuart Kauffman, At Home in the Universe-The Search for the Laws of Self-Organization and Complexity, Oxford Univ. Press, 1995

Peter Coveney and Roger Highfield, Frontiers of Complexity-The Search for Order in a Chaotic World, Random House, NY, 1995

Kevin Kelly, Out of Control-The New Biology of Machines, Fourth Estate Limited, London, 1994

Roger Penrose, The Emperor’s New Mind-Concerning Computers, Minds, and the Laws of Physics, Oxford Univ. Press/Penguin Books, 1989

Roger Penrose, Shadows of the Mind, Oxford Univ. Press, 1994

John H. Holland, Hidden Order-How Adaptation Builds Complexity, Addison-Wesley, 1995

Ian Stewart, Life’s Other Secret-The New Mathematics of the Living World, Penguin Books Ltd., London, 1998

Richard Dawkins, The Selfish Gene (new edition), Oxford Univ. Press, NY, 1989 (First edition in 1976)

Richard Dawkins, The Extended Phenotype-The Long Reach of the Gene, Oxford Univ. Press, 1982

Richard Dawkins, The Blind Watchmaker-Why the evidence of evolution reveals a universe without design, Norton, 1987

Christopher Wills, The Wisdom of the Genes-A New Pathway to Evolution, HarperCollins, 1989

Robert J. Richards, The Meaning of Evolution, University of Chicago Press, 1992

George C. Williams, Natural Selection-Domains, Levels, and Challenges, Oxford Univ. Press, 1992

Stephen Jay Gould, Wonderful Life-The Burgess Shale and the Nature of History, Norton, NY, 1989

Stephen Jay Gould, Life’s Grandeur-The Spread of Excellence from Plato to Darwin, Random House, 1997

Simon Conway Morris, The Crucible of Creation-The Burgess Shale and the Rise of Animals, Oxford Univ. Press, 1998

Official Address
Tsuguchika Kaminuma,
Rm 301, Iida Building, 4-3-16 Yoga, Setagaya-ku, Tokyo, 158-0097, Japan
Phone 81-3-5491-2403, FAX 81-3-5491-5462

Curriculum vitae of Tsuguchika Kaminuma
Dr. Kaminuma is currently working as a freelance researcher. He is also a board member of the Chem-Bio Informatics Society and the president of his own company, Biodynamics, Inc.
Dr. Kaminuma was born in Kanagawa Prefecture, Japan, in 1940. He finished his undergraduate work at International Christian University in Tokyo in 1964, and received a Master of Science degree in Physics from Yale University in 1966 and a Ph.D. in Physics from the University of Hawaii in 1970. He worked on pattern recognition as a research assistant to Prof. Michael Satoshi Watanabe during 1966-1971 at the University of Hawaii. He worked on computer applications in biomedicine at a laboratory of Hitachi Inc. (1971-1976), at Tokyo Metropolitan Institute of Medical Science (1976-1989), and at National Institute of Medical Sciences (1989-2001), from which he retired in March 2001. He has taught at several universities, including University of Yamaguchi, University of Tokyo, University of Tokai, and Nara Advanced Science and Technology Graduate School. He founded the Chem-Bio Informatics Society and its journal, and worked for IPCS (International Program on Chemical Safety) of WHO as a coordinator of Japanese researchers and as a Program Advisory Board member. He has published many research papers and written books in both Japanese and English.
