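Beautiful Soup find_all_next() Method

1. Method Description

In the Beautiful Soup library, the find_all_next() method finds every PageElement that matches the given criteria and appears after the current element in document order. The search is not limited to siblings: it also covers the current element's own descendants and anything that follows at any nesting level. The method returns Tag or NavigableString objects and accepts the same filter arguments as find_all() (name, attrs, string, limit, and keyword arguments).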
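
To make "after the current element" concrete, here is a minimal sketch. The markup is hypothetical and exists only for illustration; find_next_siblings() is shown purely as a contrast.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration.
html = "<div><p id='a'>one</p><p id='b'>two</p></div><span>three</span>"
soup = BeautifulSoup(html, "html.parser")

start = soup.find("p", id="a")

# Every Tag that starts after <p id="a"> in document order,
# regardless of nesting level.
names = [tag.name for tag in start.find_all_next()]
print(names)  # ['p', 'span']

# find_next_siblings() is narrower: only siblings under the same parent.
sibling_names = [tag.name for tag in start.find_next_siblings()]
print(sibling_names)  # ['p']
```

The <span> sits outside the <div>, so it is not a sibling of the starting <p>, yet find_all_next() still reaches it.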

2. Syntax

find_all_next(name, attrs, string, limit, **kwargs)

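3. Parameters

name: a filter on tag names.
attrs: a dictionary of filters on attribute values.
string: a filter on the text of NavigableString objects.
limit: stop searching once this many matches have been found.
kwargs: additional keyword arguments, each treated as a filter on the attribute of the same name (for example, id='nm').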
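
The following sketch shows one way each parameter might be used. The HTML snippet and variable names are assumptions made for illustration, not part of the original examples.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration.
html = """
<h1>Title</h1>
<p class="note">first note</p>
<p>plain text</p>
<p class="note">second note</p>
"""
soup = BeautifulSoup(html, "html.parser")
start = soup.h1

# name: only <p> tags that come after <h1>
all_p = start.find_all_next("p")

# attrs: additionally require class="note"
notes = start.find_all_next("p", attrs={"class": "note"})

# string: match text nodes instead of tags (NavigableString results)
texts = start.find_all_next(string=re.compile("note"))

# limit: stop after the first match
first_only = start.find_all_next("p", limit=1)

# kwargs: class_ is the keyword form of the class attribute filter
notes_kw = start.find_all_next("p", class_="note")

print(len(all_p), len(notes), len(texts), len(first_only), len(notes_kw))
```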

4. Return Value

The method returns a ResultSet containing PageElements (Tag or NavigableString objects). If nothing matches, the ResultSet is empty.
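
A short sketch of working with the returned ResultSet, again on hypothetical markup. It relies on ResultSet being a list subclass, so the usual list operations apply.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical markup, used only for illustration.
html = "<p>alpha</p><b>beta</b>"
soup = BeautifulSoup(html, "html.parser")

# Without a string filter, only Tag objects are returned.
tags = soup.p.find_all_next()

# With string=True, the matches are NavigableString objects.
strings = soup.p.find_all_next(string=True)

# A search with no matches gives an empty (falsy) ResultSet.
none_found = soup.b.find_all_next("table")

print(type(tags).__name__, isinstance(tags, list))
print([tag.name for tag in tags])
print([str(s) for s in strings])
print(len(none_found))
```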

5. Examples

Example 1

Using index.html as the HTML document for this example, we first locate the <form> tag and then use find_all_next() to collect every tag that follows it.

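from bs4 import BeautifulSoup

# Parse index.html; the with block closes the file after parsing
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# Start at the first <form> tag and collect every tag that follows it
tag = soup.form
tags = tag.find_all_next()
print(tags)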

Output:

[<input id="nm" name="name" type="text"/>, <input id="age" name="age" type="text"/>, <input id="marks" name="marks" type="text"/>]

Example 2

Here we pass a filter to find_all_next() so that it collects only the tags after <form> whose id is either nm or age.

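from bs4 import BeautifulSoup

# Parse index.html; the with block closes the file after parsing
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# A list value matches any tag whose id equals one of the listed items
tag = soup.form
tags = tag.find_all_next(id=['nm', 'age'])
print(tags)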

Output:

[<input id="nm" name="name" type="text"/>, <input id="age" name="age" type="text"/>]

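Example 3

If we look at the tags that follow the <body> tag, the result includes the <h1> tag and the <form> tag, which contains the three input elements. The inputs therefore appear twice in the output: once as part of the <form> tag and once as separate matches.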

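from bs4 import BeautifulSoup

# Parse index.html; the with block closes the file after parsing
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# Collect every tag after <body> and print each one on its own line
tag = soup.body
tags = tag.find_all_next()
for t in tags:
    print(t)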

Output:

<h1>Yoagoa</h1>
<form>
<input id="nm" name="name" type="text"/>
<input id="age" name="age" type="text"/>
<input id="marks" name="marks" type="text"/>
</form>
<input id="nm" name="name" type="text"/>
<input id="age" name="age" type="text"/>
<input id="marks" name="marks" type="text"/>
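
The contents of index.html are not shown above. The sketch below uses an inline string that is only assumed to resemble that file, reconstructed from the outputs of the three examples, so the results can be reproduced without a file on disk.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for index.html, inferred from the example outputs.
html = """<html>
<body>
<h1>Yoagoa</h1>
<form>
<input type="text" id="nm" name="name"/>
<input type="text" id="age" name="age"/>
<input type="text" id="marks" name="marks"/>
</form>
</body>
</html>"""
soup = BeautifulSoup(html, "html.parser")

# Tags after <body>: <form> is matched as a whole, and each <input>
# inside it is matched again as a later element in document order.
names = [tag.name for tag in soup.body.find_all_next()]
print(names)  # ['h1', 'form', 'input', 'input', 'input']

# Tags after <form>: only the three inputs remain.
after_form_ids = [tag["id"] for tag in soup.form.find_all_next()]
print(after_form_ids)  # ['nm', 'age', 'marks']
```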