How do I alter scrapes with GrabzIt's Web Scraper API?

To tighten the integration between GrabzIt's Web Scraper and your application, use GrabzIt's Web Scraper API to alter the properties of a web scrape programmatically.

Changing the target of a web scrape

The example below shows how to change the seed URLs and the main target URL of a scrape.

C#:

GrabzItScrapeClient client = new GrabzItScrapeClient("Sign in to view your Application Key", "Sign in to view your Application Secret");
Target target = new Target();
//Set the target URL
target.URL = "http://www.example.com";
List<string> seedUrls = new List<string>();
seedUrls.Add("http://www.example.com/news.html");
seedUrls.Add("http://www.example.com/about.html");
seedUrls.Add("http://www.example.com/contactus.html");
//Set the seed URLs
target.SeedURLs = seedUrls.ToArray();
//Enter the id of the scrape you want to alter along with the target object
client.SetScrapeProperty("59421f049e3d991318d35e49", target);

PHP:

$client = new GrabzItScrapeClient("Sign in to view your Application Key", "Sign in to view your Application Secret");
$target = new GrabzItTarget();
//Set the target URL
$target->SetURL("http://www.example.com");
$seedUrls = array();
$seedUrls[] = "http://www.example.com/news.html";
$seedUrls[] = "http://www.example.com/about.html";
$seedUrls[] = "http://www.example.com/contactus.html";
//Set the seed URLs
$target->SetSeedURLs($seedUrls);
//Enter the id of the scrape you want to alter along with the target object
$client->SetScrapeProperty("59421f049e3d991318d35e49", $target);

Python:

#Assumes the GrabzIt package exposes these modules, as the qualified names below imply
from GrabzIt import GrabzItScrapeClient, GrabzItTarget

client = GrabzItScrapeClient.GrabzItScrapeClient("Sign in to view your Application Key", "Sign in to view your Application Secret")
target = GrabzItTarget.GrabzItTarget()
#Set the target URL
target.url = "http://www.example.com"
seedUrls = []
seedUrls.append("http://www.example.com/news.html")
seedUrls.append("http://www.example.com/about.html")
seedUrls.append("http://www.example.com/contactus.html")
#Set the seed URLs
target.seedURLs = seedUrls
#Enter the id of the scrape you want to alter along with the target object
client.SetScrapeProperty("59421f049e3d991318d35e49", target)
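
Before sending a new target, it may be worth checking the URLs locally. The sketch below is not part of the GrabzIt library: `clean_seed_urls` is a hypothetical helper, written only with Python's standard library, that drops duplicate entries and rejects anything that is not an absolute http(s) URL. What the scraper itself accepts as a seed URL is not stated here, so treat the rule as an assumption.

```python
from urllib.parse import urlsplit


def clean_seed_urls(urls):
    """Return the absolute http(s) URLs from `urls`, without duplicates.

    Hypothetical helper, not part of the GrabzIt API. Order is preserved.
    Raises ValueError on the first entry that is not an absolute http(s) URL.
    """
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError("not an absolute http(s) URL: %r" % url)
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


seed_urls = clean_seed_urls([
    "http://www.example.com/news.html",
    "http://www.example.com/about.html",
    "http://www.example.com/news.html",  # duplicate, dropped
])
```

The resulting list could then be assigned to the target's seed URLs in the same way as the hand-built list in the Python example.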

Changing variables in a web scrape

In the example below we set a variable to an array of string names; however, the variable could also be set to any other type of simple data.

C#:

GrabzItScrapeClient client = new GrabzItScrapeClient("Sign in to view your Application Key", "Sign in to view your Application Secret");
Variable variable = new Variable("names");
List<string> names = new List<string>();
names.Add("Tom");
names.Add("Dick");
names.Add("Harry");
//Add each name to the variable's array
foreach (string name in names)
{
    variable.AddArrayItem(name);
}
//Enter the id of the scrape you want to alter along with the variable object
client.SetScrapeProperty("59421f049e3d991318d35e49", variable);

PHP:

$client = new GrabzItScrapeClient("Sign in to view your Application Key", "Sign in to view your Application Secret");
$variable = new GrabzItVariable("names");
$names = array();
$names[] = "Tom";
$names[] = "Dick";
$names[] = "Harry";
//Set the variable value
$variable->SetValue($names);
//Enter the id of the scrape you want to alter along with the variable object
$client->SetScrapeProperty("59421f049e3d991318d35e49", $variable);

Python:

#Assumes the GrabzIt package exposes these modules, as the qualified names below imply
from GrabzIt import GrabzItScrapeClient, GrabzItVariable

client = GrabzItScrapeClient.GrabzItScrapeClient("Sign in to view your Application Key", "Sign in to view your Application Secret")
variable = GrabzItVariable.GrabzItVariable("names")
names = []
names.append("Tom")
names.append("Dick")
names.append("Harry")
#Set the variable value
variable.value = names
#Enter the id of the scrape you want to alter along with the variable object
client.SetScrapeProperty("59421f049e3d991318d35e49", variable)

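Within the scrape instructions, the variable value can then be read in the normal way with the Global.get method, as shown below.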

var names = Global.get("names");

The names array can then be used as normal in the scrape instructions.
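
Because a variable is meant to hold simple data, a local check before calling SetScrapeProperty may catch mistakes early. The sketch below assumes that "simple" means a string, number or boolean, or a flat list of those. That definition and the `is_simple_value` helper are illustrative assumptions, not something taken from the GrabzIt API.

```python
# Scalar types treated as "simple" in this sketch (an assumption).
SIMPLE_TYPES = (str, int, float, bool)


def is_simple_value(value):
    """Return True if `value` is a scalar or a flat list/tuple of scalars.

    Hypothetical helper; the definition of "simple" used here is an
    assumption made for illustration only.
    """
    if isinstance(value, SIMPLE_TYPES):
        return True
    if isinstance(value, (list, tuple)):
        return all(isinstance(item, SIMPLE_TYPES) for item in value)
    return False


names = ["Tom", "Dick", "Harry"]
if not is_simple_value(names):
    raise TypeError("variable value must be simple data")
```

A nested list or a dictionary would fail this check, which is a cue to flatten the data before setting it as the variable's value.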