Text Processing

Instructor: Chakravarthy Bhagvati
Class Timings: Tuesday 5:00-6:00 PM, Wednesday 4:00-6:00 PM (R7 LHC)

Syllabus

I. Text Representation [July 21 - 27]

  • ASCII Representation
  • Unicode Representation
  • ISCII (for Indian Scripts)

II. Low Level Text Processing [July 28 - August 18]

  • Searching
  • Comparison
  • String Matching
  • Similarity Between Strings
  • Dictionary

III. Information Retrieval [August 18 - ongoing]

  • Definitions (Document Unit, Token, Stop Words, Term, etc.)
  • Tokenization
  • Single Word Query, Phrase Query
  • Scoring (Word Level and Document Level)
  • Syllabus will be added as classes commence

IV. Advances in Text Processing

  • Syllabus will be added as classes commence

Subject Material and Links


Assignments and Minors

Note: Dates are only indicative and may change.