We live in an age of information. Information is embedded everywhere around us: in the devices we wear, the tools we use, and the media we watch or hear, both in the workplace and at home. New information is added every second to the interconnected web of devices that is the Internet. This exponential data explosion makes knowledge available to end-users and gives them the ability to connect, share, create, and develop ideas and businesses together. However, it also brings what is known as Information Overload, where we are bombarded with so much information that it actually makes us less productive. The volume of scientific knowledge has outpaced our ability to manage it. In this work we investigate the tools and techniques necessary to create an Information Extraction system that lays the foundation for Semantic Web agents and for more complex systems that understand, rather than just index, the ever-increasing volume of digital data. You will find this and much more in the book Information Extraction from Semi-structured and Unstructured Sources (Stefan Daniel Dumitrescu).