Selectors with CSS

CSS selectors are patterns that match specific elements in an HTML or XML document. Scrapy, a Python web scraping framework, supports them through the css() method on Selector and response objects, which makes them a concise and flexible way to extract data from web pages.

Here are some examples of CSS selectors and their usage in Scrapy:

Select all div elements: div

from scrapy import Selector
html = '<div><p>First paragraph.</p></div><div><p>Second paragraph.</p></div>'
selector = Selector(text=html)
# Select all div elements
divs = selector.css('div')

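Here css('div') matches both div elements in the document. The css() method returns a SelectorList rather than plain strings, so the result can be queried further or serialized with get() and getall().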

Select all elements with class name example: .example

from scrapy import Selector
html = '<div class="example">This is an example.</div><p>This is not an example.</p>'
selector = Selector(text=html)
# Select all elements with class name "example"
elements = selector.css('.example')

Here css('.example') matches only the div, because the p element has no class attribute. A class selector matches any element whose class list contains the given name, regardless of its tag.

Select the text content of the first p element inside a div element: div p:first-of-type::text

from scrapy import Selector
html = '<div><p>First paragraph.</p><p>Second paragraph.</p></div>'
selector = Selector(text=html)
# Select the text content of the first p element inside a div element
text = selector.css('div p:first-of-type::text').get()

Here the selector narrows the match to the first p element inside the div, and get() returns the string 'First paragraph.'. The ::text pseudo-element selects the element's text nodes rather than the element itself. It is a Scrapy extension and is not part of standard CSS.

Simple selectors can be combined using combinators (such as the descendant and child combinators), attribute selectors, pseudo-classes, and pseudo-elements to build more specific selections. For more information on CSS selectors, refer to the W3C Selectors Level 3 specification at https://www.w3.org/TR/selectors-3/.
