Which chapter would you like to explore?

Chapter 1: Parsing and Transforming Text Files

File parsing is a computational method of reading a file piece by piece. In most parsing routines, the pieces can be matched against a pattern and then extracted or modified. This chapter will explore basic text file parsing techniques and scripts.
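As a minimal sketch of the idea (not one of the chapter's scripts), the Python below reads a text stream line by line, matches each line against a pattern, and extracts the matching pieces. The `key: value` line format, the `FIELD` pattern, and the `parse_fields` name are assumptions made for illustration.

```python
import io
import re

# Assumed line format for this sketch: "key: value".
FIELD = re.compile(r"^(\w+):\s*(.*)$")

def parse_fields(handle):
    """Read a text stream line by line and collect the key/value pairs."""
    fields = {}
    for line in handle:
        match = FIELD.match(line.rstrip("\n"))
        if match:
            # Extract the two captured pieces; lines that do not match are skipped.
            key, value = match.groups()
            fields[key] = value.strip()
    return fields

# An in-memory stream stands in for an open text file.
sample = io.StringIO("name: aspirin\nnot a field line\ndose: 81 mg\n")
parsed = parse_fields(sample)
print(parsed)
```

Because the loop handles one line at a time, the same function should work on a file opened with `open(...)` without loading the whole file into memory.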

Chapter 2: Utility Scripts

Utility scripts are small programs that perform a specific task very efficiently. Potential uses include file compression, file searching, and image conversion. This chapter will introduce some general utility scripts that will be applied in later chapters.

Chapter 3: Viewing and Modifying Image Files

Everyone who deals with data will eventually need a simple way of representing their data in images. There are many different image formats and operations for modifying images. This chapter will present some scripts that introduce the user to viewing and modifying images.

Chapter 4: Indexing Text

Text indexing is a method for organizing textual data that increases the speed of textual searches and processing. In this chapter, you will learn the rudiments of text indexing. These fundamental algorithms will be encountered again in Chapter 14, Autocoding.
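To make the speed-up concrete, here is a small sketch of one common indexing structure, an inverted index that maps each word to the lines where it occurs. It is an illustration only, not necessarily the algorithm the chapter uses; the `build_index` name and the sample lines are assumptions.

```python
import re
from collections import defaultdict

def build_index(lines):
    """Map each lowercase word to the sorted line numbers that contain it."""
    index = defaultdict(set)
    for number, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z]+", line.lower()):
            index[word].add(number)
    return {word: sorted(numbers) for word, numbers in index.items()}

# Made-up sample text, one entry per line.
lines = ["Adenoma of the colon", "Carcinoma of the colon", "Normal lung"]
index = build_index(lines)
print(index["colon"])
```

Once the index is built, finding every line that mentions a word is a single dictionary lookup rather than a scan of the full text.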

Chapter 5: The National Library of Medicine’s Medical Subject Headings (MeSH)

Nomenclatures are comprehensive repositories of domain terminologies. A well-organized, comprehensive nomenclature can be used to annotate and index any information in any document, and permits that information to be retrieved and merged with relevant information contained in other documents. MeSH (Medical Subject Headings) is a nomenclature of medical terms available from the U.S. National Library of Medicine (NLM). In this chapter, we will explore scripts that interact with MeSH data available from the NLM website.

Chapter 6: The International Classification of Diseases

The International Classification of Diseases (ICD) is a nomenclature of the diseases occurring in humans, with each listed disease assigned a unique identifying code. In this chapter, we will explore scripts that create ICD dictionaries. ICD will also play a role in later chapters.

Chapter 9: PubMed

PubMed is the U.S. National Library of Medicine’s public search engine for millions of citations from the medical literature. In addition, PubMed can serve as an excellent data source for savvy informaticists. In this chapter, we will explore scripts that interact with PubMed.

Chapter 11: Developmental Lineage Classification and Taxonomy of Neoplasms

Creating an accurate and complete nomenclature is important, since misuse of medical terminology can lead to medical errors. Nomenclatures are often large and detailed thanks to electronic storage. In this chapter, we will explore one such nomenclature system, the Developmental Lineage Classification and Taxonomy of Neoplasms.

Chapter 14: Autocoding

In the field of biomedical informatics, it is often necessary to extract medical terms from text and attach a nomenclature concept code to each extracted term. This can often be done automatically using software. This chapter will explore software-automated medical coding.
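The extract-and-code step can be sketched as a longest-match lookup of word sequences against a nomenclature. This is a simplified illustration under stated assumptions, not the chapter's autocoder: the `NOMENCLATURE` table, its `DEMO-` codes, and the `autocode` function are all invented for the example.

```python
import re

# Toy nomenclature; the terms are real words but the codes are invented.
NOMENCLATURE = {
    "renal cell carcinoma": "DEMO-001",
    "carcinoma": "DEMO-002",
    "adenoma": "DEMO-003",
}

def autocode(text, nomenclature=NOMENCLATURE, max_words=3):
    """Return (term, code) pairs, preferring the longest matching phrase."""
    words = re.findall(r"[a-z]+", text.lower())
    found = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at position i first, then shorter ones.
        for size in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + size])
            if phrase in nomenclature:
                found.append((phrase, nomenclature[phrase]))
                i += size  # skip past the words already coded
                break
        else:
            i += 1  # no term starts here; move on one word
    return found

coded = autocode("Biopsy shows renal cell carcinoma and an adenoma.")
print(coded)
```

Trying the longest phrase first keeps "renal cell carcinoma" from being coded as the more general "carcinoma".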

Chapter 15: Text Scrubber for Deidentifying Confidential Text

Medical records contain confidential and sensitive information. Often, medical data must be deidentified before being used. This chapter will explore how medical text can be deidentified with the aid of software. 

Chapter 16: Web Pages and CGI Scripts

There are many network protocols for exchanging information over the Internet and for using remotely located applications. This chapter will explore a script that retrieves information from the Internet using one such protocol.

Chapter 18: Describing Data with Data Using XML

XML is a method for marking up files so that every piece of data is surrounded by bracketed text that describes the piece of data. Markup allows us to convey any message as XML. This chapter will explore XML and scripts that can parse XML files.
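For a concrete picture, the sketch below parses a tiny XML record with Python's standard `xml.etree.ElementTree` module and pairs each piece of data with the tag that describes it. The record and its tag names (`record`, `patient_age`, `diagnosis`) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; each value is wrapped in a tag that describes it.
document = """<record>
  <patient_age units="years">45</patient_age>
  <diagnosis>adenoma</diagnosis>
</record>"""

root = ET.fromstring(document)

# Pair every child tag (the description) with its text (the data).
described = {child.tag: child.text for child in root}

# Attributes carry further description of the data, such as units.
units = root.find("patient_age").get("units")

print(described, units)
```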
