Getting Started with Beautiful Soup

Vineeth G. Nair

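  • Publisher: Packt Publishing
  • Publication date: 2014-01-27
  • List price: $1,660
  • Member price: $1,577 (5% off)
  • Language: English
  • Pages: 130
  • Binding: Paperback
  • ISBN: 1783289554
  • ISBN-13: 9781783289554
  • Overseas special-order title (requires separate checkout)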

Product description

Learn how to extract information from websites using Beautiful Soup and the Python urllib2 module. This practical, hands-on guide covers everything you need to know to get a head start in website scraping.

Overview

  • Learn about the features of Beautiful Soup with Python
  • Extract information from Google's home page
  • Understand how to use a simple method to extract information from websites using Beautiful Soup and the Python urllib2 module
  • Master searching, navigation, content modification, encoding, and output methods quickly and efficiently
  • Try out the example code and get to grips with Beautiful Soup easily

In Detail

Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping. Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need without writing excess code for an application. It doesn't take much code to write an application using Beautiful Soup.
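As a rough illustration of the navigating, searching, and modifying described above, here is a minimal sketch. It is not taken from the book: it assumes the `bs4` package (Beautiful Soup 4) is installed, uses Python's built-in `html.parser`, and works on a small made-up HTML snippet.

```python
from bs4 import BeautifulSoup

# A tiny, invented HTML document used only for illustration.
HTML = """
<html><body>
<h1 id="title">Sample page</h1>
<ul>
  <li class="item"><a href="/a">First</a></li>
  <li class="item"><a href="/b">Second</a></li>
</ul>
</body></html>
"""

# Parse the markup into a tree using the standard-library parser.
soup = BeautifulSoup(HTML, "html.parser")

# Searching: collect the href attribute of every <a> tag.
links = [a["href"] for a in soup.find_all("a")]

# Navigating: step from the first list item up to its parent tag.
first_item = soup.find("li", class_="item")
parent_name = first_item.parent.name

# Modifying: replace the text inside the <h1> tag.
soup.h1.string = "Renamed page"
heading_text = soup.h1.get_text()

print(links, parent_name, heading_text)
```

The whole task fits in a handful of calls (`find_all`, `find`, `.parent`, `.string`), which is the kind of brevity the description refers to.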

Getting Started with Beautiful Soup is a practical guide to Beautiful Soup using Python. The book starts by walking you through installation and then introduces each feature of Beautiful Soup with simple examples, including sample Python code and, where they aid understanding, diagrams and screenshots. It then addresses the problem of getting data out of a website and offers a straightforward solution built around a real website and sample code.

Getting Started with Beautiful Soup covers the different ways to install Beautiful Soup on both Linux and Windows systems. You will then learn about searching, navigating, content modification, encoding support, and output formatting, with sample Python code for each example that you can try out yourself. This book is a practical guide to scraping information from websites. If you want to learn how to scrape pages from websites efficiently, then this book is for you.

What you will learn from this book

  • Learn how to scrape HTML pages from websites
  • Implement a simple method to scrape any website with the help of developer tools, the Python urllib2 module, and Beautiful Soup
  • Learn how to search for information within an HTML/XML page
  • Modify the contents of an HTML tree
  • Understand encoding support in Beautiful Soup
  • Learn about the different types of output formatting
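The fetching, encoding, and output-formatting topics listed above might look roughly like the following sketch. It is an illustration under stated assumptions rather than the book's own code: the book targets the Python 2 `urllib2` module, whose URL-opening functionality lives in `urllib.request` in Python 3, and `fetch_soup` is a hypothetical helper name introduced here. The encoding and output steps run on a hard-coded byte string, so no network access is needed to try them.

```python
from urllib.request import urlopen  # Python 3 home of urllib2's urlopen

from bs4 import BeautifulSoup


def fetch_soup(url):
    """Hypothetical helper: download a page and parse it into a tree."""
    with urlopen(url) as response:
        return BeautifulSoup(response.read(), "html.parser")


# Encoding support: these bytes are Latin-1, so the parser is told
# explicitly which encoding to use when decoding them.
raw = b"<p>caf\xe9</p>"
soup = BeautifulSoup(raw, "html.parser", from_encoding="latin-1")
text = soup.p.get_text()

# Output formatting: the same tag rendered three different ways.
pretty = soup.prettify()                         # indented, human-readable str
as_utf8 = soup.p.encode("utf-8")                 # bytes in a chosen encoding
as_entities = soup.p.decode(formatter="html")    # non-ASCII as HTML entities

print(text)
print(pretty)
print(as_utf8)
print(as_entities)
```

Calling `fetch_soup` with a real URL would return a tree that can be searched and modified the same way as one built from a string.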

Approach

This book is a practical, hands-on guide that takes you through the techniques of web scraping using Beautiful Soup.

Who this book is written for

Getting Started with Beautiful Soup is great for anybody who is interested in website scraping and extracting information. However, a basic knowledge of Python, HTML tags, and CSS is required for better understanding.
