Part I: Digital Texts, Digital Social Science
1. Social Science and the Digital Text Revolution
   Learning Objectives; Introduction; History of Text Analysis; Risk and Rewards of Text Mining for the Social Sciences; Social Data from Digital Environments; Theory and Metatheory; Ethics of Text Mining; Organization of This Volume
2. Research Design Strategies
   Learning Objectives; Introduction; Levels of Analysis; Strategies for Document Selection and Sampling; Types of Inferential Logic; Approaches to Research Design

Part II: Text Mining Fundamentals
3. Web Crawling and Scraping
   Learning Objectives; Introduction; Web Statistics; Web Crawling; Web Scraping; Software for Web Crawling and Scraping
4. Lexical Resources
   Learning Objectives; Introduction; WordNet; Roget's Thesaurus; Linguistic Inquiry and Word Count; General Inquirer; Wikipedia; Downloadable Lexical Resources and APIs
5. Basic Text Processing
   Learning Objectives; Introduction; Tokenization; Stopword Removal; Stemming and Lemmatization; Text Statistics; Language Models; Other Text Processing; Software for Text Processing
6. Supervised Learning
   Learning Objectives; Feature Representation and Weighting; Supervised Learning Algorithms; Evaluation of Supervised Learning; Software for Supervised Learning

Part III: Text Analysis Methods from the Humanities and Social Sciences
7. Thematic Analysis, QDAS, and Visualization
   Learning Objectives; Thematic Analysis; Qualitative Data Analysis Software; Visualization Tools
8. Narrative Analysis
   Learning Objectives; Introduction; Conceptual Foundations; Mixed Methods of Narrative Analysis; Automated Approaches to Narrative Analysis; Future Directions; Specialized Software for Narrative Analysis
9. Metaphor Analysis
   Learning Objectives; Introduction; Theoretical Foundations; Qualitative Metaphor Analysis; Mixed Methods of Metaphor Analysis; Automated Metaphor Identification Methods; Software for Metaphor Analysis

Part IV: Text Mining Methods from Computer Science
10. Word and Text Relatedness
   Learning Objectives; Introduction; Theoretical Foundations; Corpus-based and Knowledge-based Measures of Relatedness; Software and Datasets for Word and Text Relatedness; Further Reading
11. Text Classification
   Learning Objectives; Introduction; Applications of Text Classification; Representing Texts for Supervised Text Classification; Text Classification Algorithms; Bootstrapping in Text Classification; Evaluation of Text Classification; Software and Datasets for Text Classification
12. Information Extraction
   Learning Objectives; Introduction; Entity Extraction; Relation Extraction; Web Information Extraction; Template Filling; Software and Datasets for Information Extraction and Text Mining
13. Information Retrieval
   Learning Objectives; Introduction; Theoretical Foundations; Components of an Information Retrieval System; Information Retrieval Models; The Vector-Space Model; Evaluation of Information Retrieval Models; Web-Based Information Retrieval; Software and Datasets for Information Retrieval
14. Sentiment Analysis
   Learning Objectives; Introduction; Theoretical Foundations; Lexicons; Corpora; Tools; Future Directions; Software and Datasets for Word and Text Relatedness
15. Topic Models
   Learning Objectives; Introduction; Digital Humanities; Political Science; Sociology; Software for Topic Modeling

Part V: Conclusions
16. Text Mining, Text Analysis, and the Future of Social Science
   Introduction; Social and Computer Science Collaboration

Gabe Ignatow is Professor of Sociology and Director of Graduate Studies at the University of North Texas. His research interests are mainly in the areas of sociological theory, digital research methods, cognitive social science, and the philosophy of social science. His most recent books are Text Mining and An Introduction to Text Mining, both coauthored with Rada Mihalcea (University of Michigan). He is also a coeditor, with Wayne Brekhus (University of Missouri), of the Oxford Handbook of Cognitive Sociology.

Rada Mihalcea is a professor of computer science and engineering at the University of Michigan. Her research interests are in computational linguistics, with a focus on lexical semantics, multilingual natural language processing, and computational social sciences. She serves or has served on the editorial boards of the following journals: Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, Research on Language and Computation, IEEE Transactions on Affective Computing, and Transactions of the Association for Computational Linguistics. She was a general chair for the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL, 2015) and a program cochair for the Conference of the Association for Computational Linguistics (2011) and the Conference on Empirical Methods in Natural Language Processing (2009). She is the recipient of a National Science Foundation CAREER award (2008) and a Presidential Early Career Award for Scientists and Engineers (2009). In 2013, she was made an honorary citizen of her hometown of Cluj-Napoca, Romania.

Text Mining and Analysis is a comprehensive book that deals with
the latest developments of text mining research, methodology, and
applications. An excellent choice for anyone who wants to learn how
these emerging practices can benefit their own research in an era
of Big Data. -- Kenneth C. C. Yang
This is a clear, comprehensive and thorough description of new text mining techniques and their applications: a "must" for students and social researchers who wish to understand how to tackle the challenges raised by Big Data. -- Aude Bicquelet