## INFO 295: Mathematical Methods for Information Science

### Tue/Thu 2:55-4:10 PM

Information Science Building, 301 College Ave, Room 130

4 credits, S/U Optional

Co-requisite: Math 231 or equivalent

This course teaches basic mathematical concepts in information
modeling. Topics to be covered include graph theory, discrete
probability, finite automata, Markov models and hidden Markov models.
We will be guided by examples and applications from various areas of
information science such as: the structure of the Web, genome
sequences, natural languages, and signal processing.

**Professor:** Paul Ginsparg (301 College Ave, 5-7371,
ginsparg@cs.cornell.edu)

**Office hours**: Wednesdays 2-3 PM (or by appointment)

**Course website:** http://courses.cit.cornell.edu/cis295_2004fa/ (this page)

# Note: This is the course site for Fall 2004

The course site for Fall 2007 is here

### Lecture 1 (Thu 26 Aug 04)

Intro, overview of topics to be discussed. Background discussion of
Set
Theory.

(See also
CIS 295 Spring 2004)
### Lecture 2 (Tue 31 Aug 04)

Probability and Counting 1
(Probability space, events, binomial coefficients, conditional probability)
### Lecture 3 (Thu 2 Sep 04)

Probability and Counting 2
(Bayes law, random variables, expectation)
### Lecture 4 (Tue 7 Sep 04)

Continued
Probability and Counting 2,
discussed www-admin spam,
introduced A plan for spam
(Graham, 2002).

**Problem Sets 1,2**
### Lecture 5 (Thu 9 Sep 04)

Finish
Probability and Counting 2,
continue A plan for spam
(specifically "naive Bayes"
and combining
probabilities).
### Lecture 6 (Tue 14 Sep 04)

Finish A plan for spam,
plus more details on "naive Bayes".
### Lecture 7 (Thu 16 Sep 04)

Some final comments on combining probabilities
( p(L or R | O) vs p(O | L and R) ).

Started Graph Theory 1 (Basic definitions, isomorphisms, paths)
### Lecture 8 (Tue 21 Sep 04)

More Graph Theory 2 (Eulerian and Hamiltonian circuits)

(See also Hamiltonian circuit)
### Lecture 9 (Thu 23 Sep 04)

More Graph Theory 3 (Trees, planarity, colorings)

**Problem Sets 3,4**
### Lecture 10 (Tue 28 Sep 04)

More Graph Theory 4 (Graph algorithms)

Mentioned
Vannevar Bush, As We May Think, Atlantic Monthly, July 1945
### Lecture 11 (Thu 30 Sep 04)

The World Wide Web: A very short personal history, Tim Berners-Lee

The Anatomy of a
Large-Scale Hypertextual Web Search Engine (Brin/Page, 1998), equivalent
html
### Lecture 12 (Tue 5 Oct 04)

internet != www.
Illustrated Perron-Frobenius theorem, continued Brin/Page (1998),
illustrated
calculation of page rank and other features used in ordering heuristics,
mentioned
"Google bombing" (NYTimes, 22 Jan 04)
### Lecture 13 (Thu 7 Oct 04)

Authoritative
sources in a hyperlinked environment (Kleinberg, 1998)

digression on small world networks:
Collective dynamics of 'small-world' networks (Watts/Strogatz, 1998)

Quantitative methods applied to social networks.
See Who is the best connected Scientist? (Newman, 2003)
### (12 Oct 04, fall break)

### Lecture 14 (Thu 14 Oct 04)

Finite State Automata 1 (alphabet, DFA, regular sets)
### Lecture 15 (Tue 19 Oct 04)

Some comments on recessive X-linked carrier
probabilities (e.g., Duchenne Muscular Dystrophy) and relation to problem
of casino with some loaded dice.
Discussion of counting equal heads/tails coin flips
using Stirling's formula (anticipating the voting
problem).
### Lecture 16 (Thu 21 Oct 04)

Finite State Automata 2 (NFA, subset construction)
### Lecture 17 (Tue 26 Oct 04)

midterm
### Lecture 18 (Thu 28 Oct 04)

more on Finite State Automata 2 (NFA, subset construction)

(comment on analog vs digital computers),
example of subset construction
### Lecture 19 (Tue 2 Nov 04)

Finite State Automata 3 (Regular expressions)
### Lecture 20 (Thu 4 Nov 04)

review of some midterm answers.
connection between DFA's and regular expressions, DFA's for language
parsing

**Problem Sets 5,6**
### Lecture 21 (Tue 9 Nov 04)

review rest of midterm answers.
construction of automaton for A* from automaton for language A.
### Lecture 22 (Thu 11 Nov 04)

ultimately periodiocity and constraints on lengths of elements of regular sets.
more examples of language parsing
### Lecture 23 (Tue 16 Nov 04)

continue examples of language parsing.
completion of finite state automata and regular expressions
(examples).
start discussion of
Markov Chains 1
(Basic definitions, transition matrix)
### Lecture 24 (Thu 18 Nov 04)

introductory parts from
A Tutorial on Hidden Markov Models and Selected Applications in Speech
Recognition (Rabiner, 1989), 257-259

See also "Biological Sequence Analysis", by Durbin, Eddy, Krogh, Mitchison
(ISBN: 0521629713) pp 46-48 (Class handout)
### Lecture 25 (Tue 23 Nov 04)

6 color problem

Markov Chains 2 (Ergodic chains, stationary distributions)

Start HMMs from Rabiner review, pp 259-261 (linked from previous lecture)

**Problem Set 7**
### Lecture 26 (Tue 30 Nov 04)

Continue HMMs from Rabiner review, pp 261-263 (linked from lecture 24)
### Lecture 27 (Thu 2 Dec 04)

Worked out example of Viterbi by hand.

More comments on HMMs from Rabiner review, pp 263-264
(and sketched pp 275-284) and Durbin et al pp 48-57 (earlier class handout).
### Final Exam

Take Home: available noon Tues 7 Dec, due 1 p.m. Fri 10 Dec (should take no
more than a few hours...)

Background text for Lectures 2,3: Chpt 4 of Rosen, *Discrete Mathematics
and its Applications* (on reserve at engineering library, also used by
cs 280)

Background text for Lectures 14,16,18,19:
D. Kozen, Automata and Computability, pp 1-54.

Background text for Lectures 23-25
Sheldon Ross, *Stochastic Processes*

#### Advising Notes

This course is a required course for the new Information Science (IS)
major in the College of Arts and Sciences and the College of
Agriculture and Life Sciences. It can also be used to satisfy the
math/stats course requirement for the Information Science
minor/concentration.

INFO295 and CS280 may not both be taken for credit towards the IS major.